By Eirini Saloniki, Research Associate at the Centre for Health Services Studies & Personal Social Services Research Unit, University of Kent.
Health preferences research is now a substantial field in its own right, known not least for its contribution to assessing health and care services by measuring the outcomes that people experience when using them. Well-established, comprehensive indicators are used to measure these outcomes, with people placing different values on the different ways that services can affect their quality of life.
Preference studies have traditionally used face-to-face (paper-and-pencil or computer-assisted) interviews to gather such data. Despite their good completion rates and reliability, face-to-face interviews are costly and time-consuming. Internet surveys have therefore been gaining popularity, allowing response times to be recorded accurately and groups of respondents to be targeted quickly at lower cost. Nevertheless, internet surveys have also been criticised, with poor data quality and difficulty in achieving sample representativeness being the main concerns. Can the advantages and disadvantages of these data collection methods be a source of variation in the results? Yes: not only because of how someone responds, but also because of who responds to each survey.
Several studies in environmental and health economics have compared preferences elicited from face-to-face interviews and internet surveys, but the evidence is mixed. Most of these studies used the discrete choice experiment (DCE) technique to elicit preferences, while their sampling processes varied considerably. Another technique – best-worst scaling (BWS) – known for its lower cognitive burden compared to the DCE, has not been used to compare preferences elicited from different data collection methods. Nor have preferences been compared across the different methods in the context of long-term care.
In this paper, for the first time, we compare preferences elicited from face-to-face and internet surveys for a BWS task using the Adult Social Care Outcomes Toolkit (ASCOT) service user measure. ASCOT measures social care-related quality of life (SCRQoL) across eight domains (accommodation cleanliness and comfort, safety, food and drink, personal care, control over daily life, social participation and involvement, dignity, occupation and employment), and has been recommended by NICE for use in the economic evaluation of social care services. This paper is part of a larger study (EXCELC) that aimed to establish relative preferences for care-related outcomes for people using long-term care.
The BWS experiment involving the ASCOT measure was included in a face-to-face (n=500) and an online (n=1001) survey, each completed by a sample targeted to be representative of the general population in England. Every respondent in both surveys was presented with a set of eight hypothetical scenarios. Each scenario contained eight attributes reflecting the eight ASCOT domains. In each scenario, the respondent was first asked to select the best (or most preferred) choice, with the selected choice then greyed out. The same process was repeated for the worst (or least preferred), second best, and second worst choices. Each respondent therefore made 32 choices in total (8 scenarios × 4 choices per scenario).
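The sequential "pick, grey out, repeat" structure of the task can be sketched as follows. This is a hypothetical illustration (not the study's survey software): it assumes a utility-maximising respondent with made-up utilities for the eight domains, and shows how each pick removes an option from the remaining set.

```python
import random

# The eight ASCOT domains shown as attributes in each scenario.
ASCOT_DOMAINS = [
    "accommodation cleanliness and comfort", "safety", "food and drink",
    "personal care", "control over daily life",
    "social participation and involvement", "dignity", "occupation",
]

def best_worst_sequence(utilities):
    """Return the four picks (best, worst, second best, second worst)
    implied by a utility-maximising respondent, mimicking the grey-out
    process: each selected option is removed before the next choice."""
    remaining = dict(utilities)  # domain -> (hypothetical) utility
    picks = []
    for choice in ("best", "worst", "second best", "second worst"):
        select = max if "best" in choice else min
        domain = select(remaining, key=remaining.get)
        picks.append((choice, domain))
        del remaining[domain]    # greyed out for subsequent choices
    return picks

# One scenario with made-up utilities; 8 scenarios x 4 picks = 32 choices.
random.seed(1)
utils = {d: random.random() for d in ASCOT_DOMAINS}
for choice, domain in best_worst_sequence(utils):
    print(f"{choice:12s} -> {domain}")
```

The elimination step matters analytically: the second-best choice is made from a reduced set of six options, so each pick can be modelled as a separate choice among the options still available.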
We used a multinomial logit framework to analyse the data, first checking that it was appropriate to pool the two datasets (internet and face-to-face). The initial pooled model assumed that there were no differences in observable characteristics between the two samples (i.e. that no taste heterogeneity exists), but included a parameter to control for scale differences between the two datasets. An additional (pooled) model controlled for differences in the sample composition of the two datasets as well as scale heterogeneity between different groups of respondents across the two datasets. All models took into account the repeated nature of the data.
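The idea behind the initial pooled model can be sketched in a few lines. This is a deliberately simplified illustration, not the paper's estimation code: it uses synthetic data, a generic choice set rather than the BWS structure, and a single relative scale parameter mu (fixed to 1 for the face-to-face sample, estimated for the internet sample) multiplying the systematic utility.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.special import logsumexp

# Synthetic data: 200 choice tasks, 4 alternatives, 3 attributes.
rng = np.random.default_rng(0)
n_tasks, n_alts, n_attrs = 200, 4, 3
beta_true = np.array([1.0, -0.5, 0.8])  # made-up taste parameters
mu_true = 0.7                           # made-up internet scale

X = rng.normal(size=(n_tasks, n_alts, n_attrs))
source = rng.integers(0, 2, size=n_tasks)      # 0 = face-to-face, 1 = internet
scale = np.where(source == 1, mu_true, 1.0)
V = scale[:, None] * (X @ beta_true)           # scaled systematic utility
probs = np.exp(V) / np.exp(V).sum(axis=1, keepdims=True)
choice = np.array([rng.choice(n_alts, p=p) for p in probs])

def neg_loglik(theta):
    """Pooled MNL: common betas, relative scale mu for the internet data."""
    beta, mu = theta[:n_attrs], theta[n_attrs]
    s = np.where(source == 1, mu, 1.0)
    V = s[:, None] * (X @ beta)
    logp = V - logsumexp(V, axis=1, keepdims=True)
    return -logp[np.arange(n_tasks), choice].sum()

fit = minimize(neg_loglik, x0=np.ones(n_attrs + 1), method="BFGS")
print("beta_hat:", fit.x[:n_attrs].round(2), "mu_hat:", fit.x[n_attrs].round(2))
```

An estimated mu well below 1 would indicate noisier (higher error variance) internet choices relative to face-to-face ones; the paper's second model additionally interacts scale with respondent characteristics and accounts for repeated choices per respondent, which this sketch omits.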
The results from our first pooled model revealed several small differences in preferences for SCRQoL across the two methods of data collection, with half of the coefficient differences compared (15 out of 30) statistically significant at the 5% level. The number of significant differences fell substantially, to five, when further controlling for observable and unobservable characteristics between the two samples. The remaining (small in value) significant differences were mostly at attribute levels indicating higher needs. A number of significant scale effects were also identified – for instance, participants were less certain when making their worst choices than when making their best choices, which relates to framing effects reported in the preferences literature.
Overall, we find fairly similar preferences for SCRQoL between the two methods of data collection, suggesting that we can be confident enough in the internet results from a practical point of view. There is scope for future studies to not only explore a different sampling frame, but also consider providing sufficient clarifications to internet respondents in an attempt to minimise the level of uncertainty in the choice process for this group.
For a copy of the full paper please see: http://dx.doi.org/10.1007/s11136-019-02172-2