Dear all,

I have been attempting to use multiple imputation (MI) to handle missing data 
in my study. I use the mice package in R for this. The deeper I get into this 
process, the more I realize I first need to understand some basic concepts 
which I hope you can help me with.

For example, let us consider two arbitrary variables in my study that have the 
following missingness pattern:

Variable 1 available, Variable 2 available: 51 (of 118 observations, 43.2%)
Variable 1 available, Variable 2 missing: 37 (31.4%)
Variable 1 missing, Variable 2 available: 10 (8.5%)
Variable 1 missing, Variable 2 missing: 20 (16.9%)
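
To make the pattern above concrete, here is a minimal base-R sketch of how such a table could be produced. The data frame `dat` and the column names `v1` and `v2` are hypothetical stand-ins filled with random numbers, not the study data; only the counts per pattern mirror the list above.

```r
# Hypothetical stand-in data: 118 rows arranged to reproduce the four
# missingness patterns listed above (the values themselves are random).
set.seed(1)
n  <- 118
v1 <- rnorm(n)
v2 <- rnorm(n)
v2[52:88]  <- NA   # 37 rows: Variable 1 available, Variable 2 missing
v1[89:98]  <- NA   # 10 rows: Variable 1 missing, Variable 2 available
v1[99:118] <- NA   # 20 rows: both variables missing
v2[99:118] <- NA
dat <- data.frame(v1, v2)

# Cross-tabulate missingness of the two variables (TRUE = missing).
pattern <- table(v1_missing = is.na(dat$v1), v2_missing = is.na(dat$v2))
print(pattern)

# The same table as percentages of all 118 observations.
print(round(100 * prop.table(pattern), 1))
```

For real data with more than two variables, `md.pattern()` from the mice package gives a comparable overview.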

I am interested in the correlation between Variable 1 and Variable 2.

Q1. Does it even make sense for me to use MI (or any other method, really) to 
impute my missing data when such large fractions are missing?

Plot 1 (http://imgur.com/KFV9y&CmV1sl) provides a scatter plot of these example 
variables in the original data. The correlation coefficient is r = -0.34 
(p = 0.016).

Q2. I notice that correlations between variables in the imputed data (pooled 
estimates over all imputations) are much weaker and less significant than the 
correlations in the original data. For this example, the pooled estimates for 
the imputed data show r = -0.11 and p = 0.22.

Since this seems to happen in all the variable combinations that I have looked 
at, I would like to know if MI is known to have this behavior, or whether this 
is specific to my imputation. 
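
For reference, one common way to pool correlations across imputations is to apply Fisher's z transformation to each per-imputation correlation and then combine the results with Rubin's rules. The sketch below is only an illustration of that approach under stated assumptions: `pool_cor` is a hypothetical helper, the values in `r_imp` are made up and are not results from my study, and the within-imputation variance is taken to be 1 / (n - 3), the usual large-sample variance of Fisher's z.

```r
# Pool per-imputation correlations via Fisher's z and Rubin's rules.
# r: vector of correlations, one per imputed data set
# n: number of observations in each imputed data set
pool_cor <- function(r, n) {
  m     <- length(r)
  z     <- atanh(r)                   # Fisher z transformation
  q_bar <- mean(z)                    # pooled estimate on the z scale
  u_bar <- 1 / (n - 3)                # within-imputation variance of z
  b     <- var(z)                     # between-imputation variance
  t_var <- u_bar + (1 + 1 / m) * b    # total variance (Rubin's rules)
  df    <- (m - 1) * (1 + u_bar / ((1 + 1 / m) * b))^2
  t_val <- q_bar / sqrt(t_var)
  list(r  = tanh(q_bar),              # back-transform to the r scale
       t  = t_val,
       df = df,
       p  = 2 * pt(-abs(t_val), df))
}

# Made-up per-imputation correlations, for illustration only.
r_imp <- c(-0.05, -0.18, -0.02, -0.21, -0.09)
res   <- pool_cor(r_imp, n = 118)
print(res)
```

The between-imputation variance `b` is the term that grows when correlations vary a lot from one imputed data set to the next, which widens the pooled standard error and raises the pooled p-value.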

Q3. When going through the imputations, the distributions of the individual 
variables (min, max, mean, etc.) match the original data. However, 
correlations and least-squares line fits vary quite a bit from imputation to 
imputation (see Plot 2, http://imgur.com/KFV9yl&CmV1s). Is this normal?

Q4. Since my results differ considerably between the original and imputed 
data, which one should I trust?

Thank you for your help in advance.
Tina
--

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
