Cookies on this website

We use cookies to ensure that we give you the best experience on our website. If you click 'Accept all cookies' we'll assume that you are happy to receive all cookies and you won't see this message again. If you click 'Reject all non-essential cookies' only necessary cookies providing core functionality such as security, network management, and accessibility will be enabled. Click 'Find out more' for information on how to change your cookie settings.

It is now widely accepted that multiple imputation (MI) methods properly handle the uncertainty of missing data over single imputation methods. Several standard statistical software packages, such as SAS, R and STATA, have standard procedures or user-written programs to perform MI. The performance of these packages is generally acceptable for most types of data. However, it is unclear whether these applications are appropriate for imputing data with a large proportion of zero values resulting in a semi-continuous distribution. In addition, it is not clear whether the use of these applications is suitable when the distribution of the data needs to be preserved for subsequent analysis. This article reports the findings of a simulation study carried out to evaluate the performance of the MI procedures for handling semi-continuous data within these statistical packages. Complete resource use data on 1060 participants from a large randomized clinical trial were used as the simulation population from which 500 bootstrap samples were obtained and missing data imposed. The findings of this study showed differences in the performance of the MI programs when imputing semi-continuous data. Caution should be exercised when deciding which program should perform MI on this type of data.

Original publication





Stat Methods Med Res

Publication Date





243 - 258


Bias, Computer Simulation, Data Interpretation, Statistical, Randomized Controlled Trials as Topic, Software, United Kingdom