Introduction
Most load applied to the ankle is transferred through the talar dome [1], and motion of the ankle joint primarily occurs in the sagittal plane as plantar- and dorsiflexion of the tibiotalar joint. A normal range of motion (ROM) is 10-20° of dorsiflexion and 40-55° of plantarflexion [2]. A decrease in this ROM makes it harder to maintain a normal gait cycle and can lead to compensation elsewhere in the lower limb [2]. Many degenerative and inflammatory conditions, injuries and neurological conditions result in reduced ROM of the ankle.
The use of patient-reported outcome measures (PROMs) in medical practice is increasing. PROMs are primarily used to evaluate groups of patients, e.g., in treatment series or randomised studies and for quality control, screening and patient treatment monitoring [3].
There are numerous foot and ankle PROMs [4]. An analysis of the 17 most commonly used questionnaires revealed that none evaluates ROM of the ankle directly [5]. Only the Munich Ankle Questionnaire evaluates ankle ROM [6], using schematic drawings for assessment. However, it does not evaluate ankle dorsiflexion with the knee flexed, and therefore it cannot separate a tight gastrocnemius muscle from conditions in the ankle joint as a reason for reduced dorsiflexion.
The need for patient physical attendance in the outpatient clinic for treatment monitoring and research can be reduced if ankle ROM is assessed through a pictorial questionnaire. Therefore, this study aimed to develop and validate a comprehensive tool to assess ROM in the ankle as a complement to information from existing PROMs that primarily evaluate the subjective ankle condition. We hypothesised that a pictorial tool would be able to identify patients who have reduced ankle motion and need attention, for instance, in relation to post-operative follow-up.
Methods
Development and validation of the Copenhagen Ankle Range of Motion Scale
The regional ethical committee waived the need for ethical approval as the study involved no interventions.
A pictorial questionnaire was developed using colour side-view photographs of an ankle, showing various degrees of extension and flexion with straight and flexed knees. The photographs showed the ankle’s end-range motion during maximal dorsiflexion, with the knee either fully extended or flexed at 30°, in 15° increments, and during maximal plantarflexion (with the knee fully extended) in 10° increments. To address content and face validity (relevance, coverage and understandability [7, 8]) of the photographs, eight medical professionals (two orthopaedic nurses, two resident doctors, three foot and ankle surgeons and one sports surgeon) discussed every photograph, including the colour of the socks and the lighting, after which the photographs were adjusted. A provisional questionnaire was developed, and 19 patients with various foot or ankle disorders attending the outpatient clinic were invited to complete and comment on the scale. Their input was discussed among the group of medical professionals, and the scale was adjusted accordingly, for example with regard to the layout, the colour of the socks, the lighting and the size of the photographs. It was then ready for validation.
The Copenhagen Ankle Range of Motion Scale (CARS) (Supplementary Figure 1) consisted of 12 photographs, showing three movements of the ankle: plantarflexion (PF) with an extended knee (six images), dorsiflexion with an extended knee (DFEK) (three images) and dorsiflexion with flexed knee (DFFK) (four images). For each of the three movements, the patient had to select the photo that illustrated the maximal ROM in their ankle.
The validity of CARS was evaluated with agreement and correlation analyses. Validity here refers to the degree to which patients and healthcare professionals (an experienced nurse and an experienced foot/ankle surgeon) selected the same photos, and to how well the patients’ choices of photos correlated with the dorsi- and plantarflexion [9] measured by the nurse and the doctor.
Inter-rater reliability of the measurements made by a nurse and a doctor - meaning the extent to which both agreed on the ROM of the ankle [10] - was evaluated by agreement analysis.
Study population
A total of 102 patients with a wide range of foot and ankle pathologies consented to participate in the validity study, which ran from 28 September to 21 December 2020 in connection with their consultations at Bispebjerg Hospital’s Foot and Ankle Outpatient Clinic.
When they checked in at the outpatient clinic, the participants received CARS in paper format. They were left alone in an undisturbed room to complete the questionnaire. Relatives were allowed to accompany and help, but the patients received no help or explanation from any healthcare professional regarding the questionnaire.
When the patient had completed CARS, the foot/ankle surgeon entered the room and completed a separate CARS for the ankle in focus without discussing the completion of the questionnaire with the patient. The surgeon then measured the ankle ROM using a predefined technique. To identify the lower leg axis, a line from the middle of the fibular head to the middle of the lateral malleolus was drawn with a marker. One arm of a universal goniometer [11] was positioned along this line and the other arm along the sole of the foot (Figure 1). Three active movements were measured in the ankle: maximal active dorsiflexion and maximal PF with an extended knee, and maximal active dorsiflexion with a flexed knee achieved during maximum lunge in a standing position.
Finally, a trained nurse completed a third CARS questionnaire and measured the ankle ROM, adopting the procedure described above. The doctor and the nurse did not know each other’s measurement results.
Statistical analysis
We calculated descriptive statistics (mean, standard deviation, frequencies and 95% CI). Parametric statistics were used because the data were normally distributed.
The gold standard goniometer measurement for the individual patient for each of the three ROMs was defined as the mean of the values obtained by the doctor and nurse.
To relate the pictorial choices in the questionnaire to the goniometer measurements and to define a “calculated correct option”, we used the interquartile range of the goniometer measurements from both raters and assigned it to the corresponding pictorial choice. Thus, each pictorial choice was assigned a range of degrees within which the interquartile range of both the doctor’s and the nurse’s goniometer measurements fell.
In cases where we encountered minimal ROM pictorial choices that had a limited number of subjects, a predefined increment in degrees for that specific ROM was chosen.
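The assignment of degree ranges to pictorial choices can be sketched as follows. This is a minimal illustration under stated assumptions, not the analysis code used in the study: the function names `iqr_ranges` and `correct_option`, the sample readings, the quartile method and the rule of picking the nearest range when a measurement falls between two ranges are all hypothetical.

```python
from statistics import quantiles

def iqr_ranges(records):
    """Map each pictorial option to the (Q1, Q3) range of its goniometer readings.

    records: (picture_option, degrees) pairs pooled over both raters.
    """
    by_option = {}
    for option, degrees in records:
        by_option.setdefault(option, []).append(degrees)
    ranges = {}
    for option, values in sorted(by_option.items()):
        if len(values) < 2:
            # Too few readings for quartiles; the text describes falling
            # back to a predefined increment in such cases.
            continue
        q1, _median, q3 = quantiles(values, n=4, method="inclusive")
        ranges[option] = (q1, q3)
    return ranges

def correct_option(ranges, gold_standard_degrees):
    """Return the option whose range contains, or lies nearest to, the value.

    The nearest-range rule is an assumption made for this sketch.
    """
    def distance(bounds):
        low, high = bounds
        if low <= gold_standard_degrees <= high:
            return 0.0
        return min(abs(gold_standard_degrees - low),
                   abs(gold_standard_degrees - high))
    return min(ranges, key=lambda option: distance(ranges[option]))

# Hypothetical readings: (picture option chosen by a rater, measured degrees).
sample = [(1, 10), (1, 12), (1, 14), (1, 16),
          (2, 25), (2, 27), (2, 29), (2, 31)]
ranges = iqr_ranges(sample)

# Gold standard for one patient: mean of the doctor's and nurse's readings.
gold = (27.0 + 29.0) / 2
print(ranges, correct_option(ranges, gold))
```

With real data, adjacent ranges may overlap, in which case a tie-breaking rule would have to be defined explicitly.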
For agreement, we used weighted kappa, Bland-Altman plots and intraclass correlation coefficient (ICC) in combination or separately, depending on the variable [12]. For correlation between picture selection and goniometer measurements, we calculated the Pearson correlation coefficient.
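As an illustration of the agreement statistic for ordinal picture choices, a weighted kappa can be computed as in the sketch below. The text does not state which weighting scheme was used, so linear weights are an assumption here, and the function name `weighted_kappa` and the example ratings are hypothetical.

```python
def weighted_kappa(rater_a, rater_b, n_categories):
    """Linearly weighted Cohen's kappa for ratings coded 1..n_categories."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(rater_a)
    k = n_categories
    # Observed joint proportions of (rater A category, rater B category).
    observed = [[0.0] * k for _ in range(k)]
    for a, b in zip(rater_a, rater_b):
        observed[a - 1][b - 1] += 1 / n
    row = [sum(observed[i]) for i in range(k)]
    col = [sum(observed[i][j] for i in range(k)) for j in range(k)]
    disagree_obs = 0.0
    disagree_exp = 0.0
    for i in range(k):
        for j in range(k):
            weight = abs(i - j) / (k - 1)  # linear disagreement weight
            disagree_obs += weight * observed[i][j]
            disagree_exp += weight * row[i] * col[j]
    if disagree_exp == 0:
        return float("nan")  # undefined when both raters use one category only
    return 1.0 - disagree_obs / disagree_exp

# Hypothetical picture choices by two raters on a three-picture section.
print(round(weighted_kappa([1, 2, 3, 3], [1, 2, 3, 2], 3), 3))
```

Quadratic weights would penalise choices that are two or more pictures apart more heavily and would generally give a different value.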
Data were processed in RStudio.
Trial registration: not relevant.
Results
The mean patient age was 50.9 years (range: 17-87, standard deviation (SD): ± 15.2), with 31 males and 71 females.
Five patients marked two options at the extreme ends of the PF section. When asked to explain this, the patients stated that they had understood the six pictures as two separate sets of three, and they suggested marking the pictures with numbers to avoid this misunderstanding (this change was implemented in CARS after the validation process). These patients were excluded from the analysis. One patient had misunderstood two questions and marked the maximal motion at the wrong extreme of the scale. This mistake is relevant for the evaluation of the understandability of the scale. Therefore, we calculated the outcome parameters for all remaining patients, including the patient with this obvious misunderstanding of the scale.
The mean difference in goniometer measurements made by the doctor and the nurse for PF was 0.9° (95% CI: –0.4 to 2.3°). For DFEK, it was 0.1° (95% CI: –1.1 to 1.4°), and for DFFK, it was –0.8° (95% CI: –1.9 to 0.4°). The ICC results and standard deviations are presented in Table 1. The Bland-Altman plots in Supplementary Figure 1 illustrate the agreement between the recorded goniometer readings.
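The quantities behind a Bland-Altman comparison of two raters can be sketched as below. This is a simplified illustration: the function name `bland_altman` and the readings are hypothetical, and the confidence interval uses a normal approximation (1.96 standard errors), which is an assumption rather than the method reported in the study.

```python
from math import sqrt
from statistics import mean, stdev

def bland_altman(doctor, nurse):
    """Return bias, 95% limits of agreement and an approximate 95% CI of the bias."""
    if len(doctor) != len(nurse) or len(doctor) < 2:
        raise ValueError("need two paired series with at least two readings")
    diffs = [d - n for d, n in zip(doctor, nurse)]
    bias = mean(diffs)          # mean difference between the raters
    sd = stdev(diffs)           # sample SD of the differences
    se = sd / sqrt(len(diffs))  # standard error of the mean difference
    return {
        "bias": bias,
        "limits_of_agreement": (bias - 1.96 * sd, bias + 1.96 * sd),
        "bias_ci95": (bias - 1.96 * se, bias + 1.96 * se),
    }

# Hypothetical plantarflexion readings in degrees for four patients.
result = bland_altman([40, 45, 50, 55], [39, 46, 48, 55])
print(result)
```

The limits of agreement describe where most individual differences between raters are expected to fall, whereas the confidence interval describes the uncertainty of the mean difference.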
The agreement between the doctor and the nurse in relation to the picture selection was substantial (Table 2), while the agreement between the calculated correct answer and the patient’s choice was fair to moderate (Table 2). One patient chose the first pictorial option in PF (minimal PF), and one patient chose the second. The correct picture was chosen by 40.5% of patients, 94% were within one pictorial option of the correct option, five patients were two pictorial options from the correct option, and one patient was at the opposite extreme from the correct answer (Supplementary Table 1). The weighted kappa for PF was 0.33 (95% CI: 0.15-0.51), meaning that agreement was fair.
In the DFEK, only three patients chose the first picture (maximal equinus). In total, 77% of patients chose the correct pictorial option, 99% were within one pictorial option of the correct option, and one patient was two pictorial options from the correct option (Supplementary Table 1). The weighted kappa for DFEK was 0.5 (95% CI: 0.33-0.67).
Only three patients chose the first pictorial option (maximal equinus) for the DFFK. A total of 64% of patients chose the correct pictorial option, and 95% were within one pictorial option of the correct option (Supplementary Table 1). The weighted kappa for DFFK was moderate (0.4; 95% CI: 0.22-0.58).
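The percentages of exact and near-exact choices reported for the three movements can be derived from paired option numbers as in the following sketch. The function name `option_agreement` and the example choices are hypothetical and serve only to show the arithmetic.

```python
def option_agreement(patient_choices, correct_options):
    """Percentage of patients choosing the correct picture or one within one step."""
    if len(patient_choices) != len(correct_options) or not patient_choices:
        raise ValueError("inputs must be non-empty and of equal length")
    distances = [abs(p - c) for p, c in zip(patient_choices, correct_options)]
    n = len(distances)
    return {
        "exact": 100 * sum(d == 0 for d in distances) / n,
        "within_one": 100 * sum(d <= 1 for d in distances) / n,
    }

# Hypothetical picture numbers for four patients.
summary = option_agreement([1, 2, 3, 4], [1, 3, 3, 2])
print(summary)
```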
Patient choices did correlate with the mean goniometer measurements, but it was not possible to identify cut-off points in terms of the progression of the pictorial patient selection and the corresponding increase in the obtained ROM in goniometer measurement (Table 3). The Pearson correlation coefficient between picture selection and the mean goniometer measurements for patients was 0.56 (95% CI: 0.40-0.68) for PF, 0.53 (95% CI: 0.37-0.66) for DFEK and 0.51 (95% CI: 0.35-0.64) for DFFK. Selections made by the doctor and the nurse correlated better with the goniometer measurements than those made by patients, with coefficients of 0.64-0.81 for the doctor and 0.67-0.78 for the nurse.
In the Supplementary Files, the Danish validated version of CARS is presented with numbers added to the individual pictures after validation, as suggested by patients. We have also added an English translation, which has been neither translated in a standardised way nor validated. Thus, the English version is not valid for scientific or daily clinical use.
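A Pearson coefficient and an approximate 95% CI can be obtained as in the sketch below. The Fisher z-transformation is one common way to form such an interval; whether it was the method used in the study is not stated, so it is an assumption, as are the function names and the sample size of 97 (102 participants minus the five excluded).

```python
from math import atanh, sqrt, tanh

def pearson_r(x, y):
    """Pearson correlation coefficient of two equally long numeric series."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two series of equal length, at least two values")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def fisher_ci(r, n, z_crit=1.96):
    """Approximate CI for a correlation r from n pairs via Fisher's z."""
    z = atanh(r)
    se = 1 / sqrt(n - 3)
    return tanh(z - z_crit * se), tanh(z + z_crit * se)

# Assumed sample size; r is the coefficient reported for PF.
low, high = fisher_ci(0.56, 97)
print(round(low, 2), round(high, 2))
```

Under these assumptions the interval is roughly 0.41 to 0.68, which is close to the interval reported for PF.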
Discussion
This study shows that the CARS questionnaire can be used to obtain an indication of ankle ROM without the physical attendance of the patient. It was developed and adapted with the involvement of relevant professionals and patients. It was validated thoroughly, and the correlation between patients’ pictorial choices and the mean goniometer measurements made by professionals was fair to moderate [13]. However, doctors and nurses were able to select the picture corresponding to the gold standard goniometer measurements more accurately than patients, and there was good agreement between the pictorial choices made by doctors and nurses [14]. In particular, the middle spectrum of motion was difficult for patients to identify in the pictures, as choices 4 and 5 in PF and 2 and 3 in dorsiflexion with a flexed knee were often used for the same degree of motion (Table 3). One patient misunderstood the scale and turned the extremes around, indicating severe reduction in ROM while the ankle was, in fact, moving normally (Table 3). The numbering of pictures, added in response to the uncertainties identified during testing of CARS, can hopefully avoid such mistakes. For the individual patient, the risk of overlooking a severe reduction in motion is therefore low when CARS is used as a measurement tool.
In clinical practice, a visual estimate is often used instead of a goniometer measurement to describe ankle ROM, and how this estimate is converted to degrees may differ from person to person. Even when a goniometer is used, the interobserver agreement for ankle ROM measures is poor [11], whereas it was moderate for CARS. Using CARS in daily practice might therefore give a more uniform indication of ankle ROM.
Table 3 shows unclear thresholds for the pictorial choices in relation to goniometer measurements. This means that CARS failed to define cut-off points between each pictorial selection and the corresponding goniometer measurement, despite the correlation between them, and it may therefore be meaningful to reduce the number of photos for PF and DFFK to three choices each and for DFEK to two. A similar reduction in the number of pictorial choices was suggested in relation to the Copenhagen Knee ROM Scale, as it was difficult for patients to choose between pictures with a less than 15° increment [15]. In addition, modifying the pictures with additional patient input may enhance the correlation between CARS and goniometer measurements in version 2.0.
Based on the validity assessment, we suggest that the first and second pictorial options in PF and DFFK, and the first pictorial option in DFEK, be used to identify patients who may need attention because of severe stiffness of the ankle joint.
Although goniometers demonstrate good intra-rater reliability, inter-rater reliability consistently falls short [11]. One reason for this is that the landmarks used to position the goniometer are to some degree covered by soft tissue, the texture of which varies between patients. The use of an inclinometer has been suggested to increase measurement reliability [16]. However, this is more time-consuming and may not be practical in a busy clinical setting. In addition, clinical measurements are probably performed with less standardised methods than in the present study, which will contribute to more inconsistency between raters and may lower inter-rater reliability. The standardised nature of CARS makes it less susceptible to local variation in the precision of the measurement. If CARS is used as an outcome measure in clinical studies, the unclear thresholds between pictures will increase the risk of a type II error. However, this is also the case if goniometer measures from several observers are used [11].
Limitations
It is a significant limitation that we were unable to identify and include patients with poor ankle ROM, which means that it was not possible to calculate the sensitivity, specificity and positive/negative predictive values of the tool. Also, test-retest reliability was not assessed. It is suggested that these issues be addressed in a future CARS validation study.
Conclusions
The Copenhagen Ankle ROM Scale questionnaire can be used to obtain an indication of ankle ROM without the physical attendance of the patient and can be used to standardise ankle ROM measures in clinical practice. However, because the thresholds that distinguish between pictorial options and the corresponding goniometer measurements are weak, individual CARS measurements must be regarded as rough indications of ROM. It was not possible to calculate the sensitivity, specificity and positive/negative predictive values of the tool. Further development with the involvement of patients with poor ROM may improve the validity of CARS version 2.0.
Correspondence Michael Rindom Krogsgaard. E-mail: Michael.Rindom.Krogsgaard@Regionh.dk
Accepted 9 May 2025
Published 22 July 2025
Conflicts of interest none. All authors have submitted the ICMJE Form for Disclosure of Potential Conflicts of Interest. These are available together with the article at ugeskriftet.dk/dmj
References can be found with the article at ugeskriftet.dk/dmj
Cite this as Dan Med J 2025;72(8):A06240381
doi 10.61409/A06240381
Open Access under Creative Commons License CC BY-NC-ND 4.0
References
1. Michael JM, Golshani A, Gargac S, Goswami T. Biomechanics of the ankle joint and clinical outcomes of total ankle replacement. J Mech Behav Biomed Mater. 2008;1(4):276-294. https://doi.org/10.1016/j.jmbbm.2008.01.005
2. Brockett CL, Chapman GJ. Biomechanics of the ankle. Orthop Trauma. 2016;30(3):232-238. https://doi.org/10.1016/j.mporth.2016.04.015
3. Greenhalgh J. The applications of PROs in clinical practice: what are they, do they work, and why? Qual Life Res. 2009;18(1):115-123. https://doi.org/10.1007/s11136-008-9430-6
4. Riskowski JL, Hagedorn TJ, Hannan MT. Measures of foot function, foot health, and foot pain: American Academy of Orthopedic Surgeons Lower Limb Outcomes Assessment: Foot and Ankle Module (AAOS-FAM), Bristol Foot Score (BFS), Revised Foot Function Index (FFI-R), Foot Health Status Questionnaire (FHSQ), Manchester Foot Pain and Disability Index (MFPDI), Podiatric Health Questionnaire (PHQ), and Rowan Foot Pain Assessment (ROFPAQ). Arthritis Care Res (Hoboken). 2011;63(suppl 11):S229-S239. https://doi.org/10.1002/acr.20554
5. Hansen CF, Obionu KC, Comins JD, Krogsgaard MR. Patient-reported outcome measures for ankle instability. An analysis of 17 existing questionnaires. Foot Ankle Surg. 2022;28(3):288-293. https://doi.org/10.1016/j.fas.2021.04.009
6. Greve F, Braun KF, Vitzthum V, et al. The Munich Ankle Questionnaire (MAQ): a self-assessment tool for a comprehensive evaluation of ankle disorders. Eur J Med Res. 2018;23(1):46. https://doi.org/10.1186/s40001-018-0344-7
7. Comins JD, Brodersen J, Siersma V, et al. How to develop a condition-specific PROM. Scand J Med Sci Sports. 2021;31(6):1216-1224. https://doi.org/10.1111/sms.13868
8. Mokkink LB, Prinsen CA, Bouter LM, et al. The COnsensus-based Standards for the selection of health Measurement INstruments (COSMIN) and how to select an outcome measurement instrument. Braz J Phys Ther. 2016;20(2):105-113. https://doi.org/10.1590/bjpt-rbf.2014.0143
9. Streiner DL, Norman GR, Cairney J. Health measurement scales: a practical guide to their development and use. Oxford University Press, 2014. https://doi.org/10.1093/med/9780199685219.001.0001
10. Lange RT. Inter-rater reliability. In: Kreutzer JS, DeLuca J, Caplan B, eds. Encyclopedia of clinical neuropsychology. New York: Springer, 2011:1348. https://doi.org/10.1007/978-0-387-79948-3_1203
11. Youdas JW, Bogard CL, Suman VJ. Reliability of goniometric measurements and visual estimates of ankle joint active range of motion obtained in a clinical setting. Arch Phys Med Rehabil. 1993;74(10):1113-1118. https://doi.org/10.1016/0003-9993(93)90071-H
12. Watson PF, Petrie A. Method agreement analysis: a review of correct methodology. Theriogenology. 2010;73(9):1167-1179. https://doi.org/10.1016/j.theriogenology.2010.01.003
13. Akoglu H. User's guide to correlation coefficients. Turk J Emerg Med. 2018;18(3):91-93. https://doi.org/10.1016/j.tjem.2018.08.001
14. Koo TK, Li MY. A guideline of selecting and reporting intraclass correlation coefficients for reliability research. J Chiropr Med. 2016;15(2):155-163. https://doi.org/10.1016/j.jcm.2016.02.012
15. Mørup-Petersen A, Holm PM, Holm CE, et al. Knee osteoarthritis patients can provide useful estimates of passive knee range of motion: development and validation of the Copenhagen Knee ROM Scale. J Arthroplasty. 2018;33(9):2875-2883.e3. https://doi.org/10.1016/j.arth.2018.05.011
16. Dickson D, Hollman-Gage K, Ojofeitimi S, Bronner S. Comparison of functional ankle motion measures in modern dancers. J Dance Med Sci. 2012;16(3):116-125. https://doi.org/10.1177/1089313X1201600305