I agree in principle with Rod, but I think it is misleading to say "it is better". To my understanding, it is possible to say what is correct and what is incorrect. So I would say:
1) If we use conditional mean imputation, it is incorrect to condition on Y.
2) If we use multiple imputation, it is incorrect not to condition on Y.

The basic argument for 1) is that if we fit a generalized linear model with the identity link function (e.g. if we use OLS to fit a regression model), and if our estimates of the conditional means are consistent (i.e. if the model we have used to estimate these mean values is correctly specified), then we obtain consistent estimates of the regression coefficients. If we condition on Y, we lose this property. In the case of other link functions, like the logit link used in logistic regression, we lose the consistency properties, but the papers mentioned by Frank Harrell suggest that the bias is negligible. These papers also suggest that one can trust the confidence intervals as long as one avoids extreme constellations. One must of course be aware that conditional mean imputation does not solve all problems; for example, one cannot trust the RMSE or the R^2.

The basic argument for 2) is that the theory of multiple imputation requires this. I am not an expert in this field, but I am pretty sure that avoiding conditioning on Y yields biased estimates of the regression coefficients.

Werner

--
Werner Vach
Institut for Statistik og Demografi (Department of Statistics and Demography)
Syddansk Universitet (University of Southern Denmark)
Mailing address: Campusvej 55, DK-5230 Odense M, Denmark
Visitors' address: Sdr. Boulevard 23A, 3rd floor, 5000 Odense C
Phone: +45 65 50 33 83
Fax: +45 65 95 77 66
email: [email protected]

From allen_bingham <@t> fishgame.state.ak.us Fri May 24 10:34:33 2002
From: allen_bingham <@t> fishgame.state.ak.us (Allen Bingham)
Date: Sun Jun 26 08:24:59 2005
Subject: IMPUTE: Re: beginner with MI _ help!!!!
In-Reply-To: <[email protected]>
Message-ID: <001001c20338$81bd8b90$4f9e3...@sfrtsbingham>

FYI,

This paper is available for download from Joe Schafer's site, at:
http://www.stat.psu.edu/~jls/mbr.pdf for the Adobe pdf version
or:
http://www.stat.psu.edu/~jls/mbr.ps for a postscript version.

Allen B.

-----Original Message-----
From: [email protected] [mailto:[email protected]] On Behalf Of Jeff Wayman
Sent: Thursday, May 23, 2002 1:28 PM
To: [email protected]
Cc: [email protected]
Subject: IMPUTE: Re: beginner with MI _ help!!!!

Hi,

You might try "Multiple imputation for multivariate missing-data problems: a data analyst's perspective", by Schafer and Olsen (1998). It won't solve all your problems, but it is a good general reference and should move you forward, anyway.

Jeff

>Hi
>
>I'm a complete novice with regard to MI and have been reading around the
>subject to see if it could be utilised for my PhD project. However, I have
>been unable to find any information (that is clear and relatively easy for
>a non-statistician to understand) or examples on how to combine the results
>from ANOVAs after MI.
>
>I understand the regression stuff to a certain extent, because the
>estimands are straightforward, but with ANOVA I'm not sure which output
>stats are to be combined, how they should be combined, and how to account
>for the additional uncertainty from the multiply imputed datasets. This is
>likely to be complicated by the fact that I want to conduct repeated
>measures ANOVAs (one within, one between, unbalanced design), which will
>produce many different effects. Further, how do I combine the results of the
>post hoc comparison tests?
>
>I've tried to get the answers from the Rubin and Schafer books, but I can't
>really get my head around them without feeling that I need to take a stats
>course. At the moment I'm feeling immensely dense, and am wondering if all
>the extra effort required for MI is worth it.
>This is especially so when there are many different types of analyses that I
>am required to conduct on my imputed datasets (a large number of ANOVAs,
>correlations, logistic regression, multiple regression, eventually SEM).
>
>I need to get some simple examples and explanations of combining the
>outcomes from the m analyses, for the different types of analyses, to make
>any progress. It all seemed so simple when I first read about MI, but now
>that I'm trying to implement the technique, I'm left wondering if the gains
>to be made by MI over single imputation are worth it. (If all else fails, I
>think the only option is to use one imputation and run the required
>analyses.)
>
>If anybody out there can help or guide me to a source which can explain in
>a simple and clear manner what I need to do and how, I'd be extremely
>grateful. Thanks in advance.
>
>Getting desperate and running out of time.
>
>Shash
>
>Shashivadan Hirani
>University College London Medical School
>Department of Psychiatry and Behavioural Science
>2nd Floor, Wolfson Building
>48 Riding House Street
>London W1N 8AA
>
>Tel: 020 7679 9309
>Mobile: 07736 129648
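The pooling step asked about above is given by Rubin's rules for a scalar parameter: fit the analysis model on each of the m imputed datasets, average the point estimates, and combine the within- and between-imputation variances. A minimal sketch in Python (the estimates and standard errors are made-up numbers, purely to illustrate the arithmetic):

```python
import math

def pool_rubin(estimates, std_errors):
    """Combine m analyses of a scalar parameter via Rubin's rules.

    estimates  -- point estimates from the m imputed-data analyses
    std_errors -- their standard errors
    Returns (pooled estimate, pooled standard error).
    """
    m = len(estimates)
    q_bar = sum(estimates) / m                    # pooled point estimate
    u_bar = sum(se ** 2 for se in std_errors) / m # within-imputation variance
    b = sum((q - q_bar) ** 2 for q in estimates) / (m - 1)  # between-imputation
    t = u_bar + (1 + 1 / m) * b                   # total variance
    return q_bar, math.sqrt(t)

# Hypothetical regression coefficients from m = 5 imputed datasets:
est = [0.52, 0.48, 0.55, 0.50, 0.45]
se = [0.10, 0.11, 0.10, 0.12, 0.10]
q, pooled_se = pool_rubin(est, se)
print(q, pooled_se)  # pooled estimate about 0.50, pooled SE about 0.114
```

Note that this pools a single scalar parameter (a regression coefficient, a post hoc contrast); multi-degree-of-freedom tests such as an overall ANOVA F statistic need the multivariate combining rules discussed in the Rubin and Schafer references mentioned in this thread.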
