I agree. In one of our clinical studies, data based on blood work were missing because of some technical and informed-consent issues. But we had collected an auxiliary variable using a dietary questionnaire. It was a good predictor of the blood-work-based variable (r=0.6). Through multiple imputation we were able to reduce the fraction of missing information considerably. Often, people think about missing data only after the data collection is over. I think we need to anticipate potential missing data before the data collection and try to collect auxiliary variables that are predictive of the variables that are likely to have missing values.
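As a rough illustration of how much a predictor with r=0.6 can help, here is a minimal sketch, not a calculation from the study itself. It assumes a simplified setting: the target is the mean of a variable Y, Y and the auxiliary variable X are bivariate normal, X is fully observed, Y is missing completely at random for a fraction p of cases, and the sample is large. Under those assumptions the fraction of missing information is p(1 - r^2) / (1 - p r^2). The function name and the 30% missing rate are hypothetical, chosen only for the example.

```python
def fraction_missing_information(p_missing: float, r: float) -> float:
    """Large-sample fraction of missing information for the mean of Y.

    Assumes (for illustration only): Y missing completely at random with
    probability p_missing, a fully observed auxiliary variable X, and
    (X, Y) bivariate normal with correlation r.
    """
    if not 0.0 <= p_missing < 1.0:
        raise ValueError("p_missing must be in [0, 1)")
    if not -1.0 <= r <= 1.0:
        raise ValueError("r must be in [-1, 1]")
    r2 = r * r
    # Variance of the regression-based estimate of the mean, relative to
    # the complete-data variance, is (1 - p*r^2) / (1 - p); the fraction
    # of missing information is one minus the inverse of that ratio.
    return p_missing * (1.0 - r2) / (1.0 - p_missing * r2)


if __name__ == "__main__":
    p = 0.30  # hypothetical missing rate, not taken from the study
    for r in (0.0, 0.6, 0.9):
        fmi = fraction_missing_information(p, r)
        print(f"r = {r:.1f}: fraction of missing information = {fmi:.3f}")
```

With no auxiliary information (r=0) the fraction of missing information equals the missing rate itself; with r=0.6 and 30% missing it drops to about 0.215 in this simplified setting. The gain grows with both the correlation and the missing rate, which is one reason to plan for auxiliary variables before data collection.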
Raghu

On Mon, Apr 15, 2013 at 8:21 PM, David Judkins <[email protected]> wrote:

> I would say that it all depends. In Hunsicker's example, peak PRA sounds
> like it was excluded from the outcome space because of colinearity issues.
> This makes it an ideal adjunct variable to the imputation process.
>
> --Dave Judkins
>
> Sent from my iPhone
>
> On Apr 15, 2013, at 7:13 PM, "Paul von Hippel" <[email protected]>
> wrote:
>
> Let me correct my first sentence: What I meant to say is that Meng
> showed that MI imputation is still valid of auxiliary variables have been
> included in the imputation model. So it's a legitimate practice and, if
> its' not too much trouble, why not. But it probably won't make much
> difference.
>
>
> ------------------------------
> *From:* Paul von Hippel <[email protected]>
> *To:* [email protected]
> *Sent:* Monday, April 15, 2013 4:39 PM
> *Subject:* Re: "Accessory" variables in imputation
>
> Meng showed that MI imputation is still valid if auxiliary variables
> have been included in the analysis. In theory auxiliary variables can
> improve the estimates, but in practice they rarely help much. See the
> recent paper by Sarah Mustillo in Sociological Methods & Research.
>
>
> On Mon, Apr 15, 2013 at 4:27 PM, Hunsicker, Lawrence <
> [email protected]> wrote:
>
> Good afternoon, all:
>
> A question about the use of "accessory" variables in imputation. Consider
> for a moment a kidney transplant survival model in which one has data
> (among other things) on peak panel reactive antibody (peak PRA) and the PRA
> at the time of the actual transplant (current PRA). These actually measure
> different things, but they are obviously strongly correlated. Data are
> missing of some fraction of these covariates, but most of the time one or
> the other is available. Current PRA is considered to be the stronger
> predictor of transplant outcomes. One is developing a model in which one
> wants to limit the model df. So it has been decided that the final model
> will include current PRA but not peak PRA.
>
> I understand that the imputation model must include the outcome variable
> and also all of the covariates that will be used in the final analysis
> model. The question is whether one can/should include additional
> covariates (such as peak PRA) in the imputation model that WON'T be in the
> final analysis model. It would seem that inclusion of peak PRA in the
> imputation model might improve considerably the prediction of current PRA,
> the covariate that will be included in the final analysis model.
>
> Is this legitimate?
>
> Thanks in advance to any guidance from the listserv members.
>
> Larry Hunsicker
> Prof. Internal Medicine
> U. Iowa College of Medicine
>
>
> ________________________________
> Notice: This UI Health Care e-mail (including attachments) is covered by
> the Electronic Communications Privacy Act, 18 U.S.C. 2510-2521, is
> confidential and may be legally privileged. If you are not the intended
> recipient, you are hereby notified that any retention, dissemination,
> distribution, or copying of this communication is strictly prohibited.
> Please reply to the sender that you have received the message in error,
> then delete it. Thank you.
> ________________________________
>
>
> --
> Best wishes,
> Paul von Hippel
> Assistant Professor
> LBJ School of Public Affairs
> Sid Richardson Hall 3.251
> University of Texas, Austin
> 2315 Red River, Box Y
> Austin, TX 78712
> (512) 537-8112
>
>
> ------------------------------
> This message may contain privileged and confidential information intended
> solely for the addressee. Please do not read, disseminate or copy it unless
> you are the intended recipient. If this message has been received in error,
> we kindly ask that you notify the sender immediately by return email and
> delete all copies of the message from your system.
>

--
Trivellore Raghunathan (Raghu)
Chair and Professor of Biostatistics
School of Public Health
Room M4208
1415 Washington Heights
University of Michigan
Ann Arbor, MI 48109
Phone: (734) 615-9832
Fax: (734) 615-7068

"A good life is filled with selfless actions full of compassion knowing well that we are all one"
