Hi.

I have a regression of Y on a bunch of Xs (always observed) and on Z 
(sometimes missing).

The Xs will be used to impute Z. But should Y also be used in imputing Z?

My reading of the literature suggests that this is not a problem and can
often be a good thing in terms of gaining precision. A colleague argues that
using the outcome to impute the predictor will bias the estimated effect of
that predictor in the main regression model. She argues that, by using Y,
"you're stacking the deck, so to speak", i.e., the imputation determines what
you'll find in the main regression model.

Is there a heuristic response to that concern?
(Or, if I'm wrong, please someone correct me!)
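
One way to probe the concern empirically is a small simulation. The sketch
below is only an illustration under assumptions that are mine, not taken from
any real data: one X, normal errors, a true Z coefficient of 1, missingness of
Z depending on X alone (so MAR holds), and a single stochastic regression
imputation rather than proper multiple imputation. All function and variable
names are hypothetical.

```python
import numpy as np

def ols(design, target):
    """Least-squares coefficients for target ~ design (intercept column included)."""
    coef, *_ = np.linalg.lstsq(design, target, rcond=None)
    return coef

def impute_and_fit(x, y, z, missing, use_y, rng):
    """Impute missing z by stochastic regression, then return the z coefficient
    from the main model y ~ 1 + x + z fitted on the completed data."""
    ones = np.ones_like(x)
    cols = [ones, x, y] if use_y else [ones, x]
    predictors = np.column_stack(cols)
    obs = ~missing
    # Imputation model fitted on the complete cases only.
    coef = ols(predictors[obs], z[obs])
    resid = z[obs] - predictors[obs] @ coef
    sd = resid.std(ddof=predictors.shape[1])
    # Fill in missing z with the prediction plus a random residual draw.
    z_completed = z.copy()
    z_completed[missing] = predictors[missing] @ coef + rng.normal(0.0, sd, missing.sum())
    return ols(np.column_stack([ones, x, z_completed]), y)[2]

def simulate(n=200_000, beta_z=1.0, seed=0):
    rng = np.random.default_rng(seed)
    x = rng.normal(size=n)
    z = 0.5 * x + rng.normal(size=n)
    y = 1.0 + x + beta_z * z + rng.normal(size=n)
    # Assumed MAR mechanism: probability that z is missing depends on x only.
    missing = rng.random(n) < 1.0 / (1.0 + np.exp(-x))
    return {
        "x_only": impute_and_fit(x, y, z, missing, False, rng),
        "x_and_y": impute_and_fit(x, y, z, missing, True, rng),
    }

results = simulate()
print(results)
```

In this setup the X-only estimate should come out well below the true value
of 1, because the imputed Z values carry no association with Y beyond what X
explains, while the estimate that also uses Y should land close to 1. If that
holds, leaving Y out is what distorts the main regression, not including it.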

Thanks,
cd

PS  Always assuming MAR of Z (i.e., given the observed data, missingness of Z
does not depend on the unobserved Z itself).



________________________________________________________________

Constantine Daskalakis, ScD
Assistant Professor,
Biostatistics Section, Thomas Jefferson University,
125 S. 9th St. #402, Philadelphia, PA 19107
    Tel: 215-955-5695
    Fax: 215-503-3804
    Email: [email protected]
    Webpage: http://www.kcc.tju.edu/Science/SharedFacilities/Biostatistics
