Yes.  This makes sense.  If the variable used to impute missing data is already 
in the final model, using it to improve imputation of another covariate in the 
final model will probably add little (as Alan Zaslavsky also suggested).  (It 
does keep the case with the missing data in the final 
analysis and may help simply by increasing the effective N.)  But if the 
accessory variable isn't in the final model, either because of collinearity and 
a need to limit design df (as in my straw model) or because of known biases in 
the variable, as in your example below, then including this information in the 
imputation may have an important impact on the final analysis.

If you still have the data set that you refer to below, it might be instructive 
to present this as a demonstration of the potential benefits of using 
auxiliary covariates in imputation, even if there is no market for the 
underlying analysis itself.

Larry Hunsicker

From: David Judkins [mailto:david_judk...@abtassoc.com]
Sent: Tuesday, April 16, 2013 8:36 AM
To: Hunsicker, Lawrence
Cc: IMPUTE@LISTSERV.IT.NORTHWESTERN.EDU; paulvonhippel.utaus...@gmail.com
Subject: Re: "Accessory" variables in imputation

I think that the impact on the variance of target parameters of using a class 
of variables in the imputation will be stronger for the class of adjunct 
variables than for the class of causally prior covariates in the target model. 
Parallel or alternate outcomes are particularly good examples of this. People 
who favor nesting variables with nonresponse within flags for missingness as an 
alternative to imputation fail to realize these gains in precision and possibly 
in bias reduction. (Obviously, they cannot include parallel outcomes in their 
analytic models.)

It harkens back to one of the central themes in the debate between imputation 
and ANCOVA. The imputer frequently has access to a richer set of auxiliary 
information than does the downstream analyst. If we are shy about using that 
information in the imputation, then we have surrendered most of the advantage 
of imputation over the alternatives.

To give an example from my own work, I had a longitudinal sample of 8th graders 
with parent interviews for the fall of the normative freshman year of college. 
Parent nonresponse was high with the 4.5-year gap. The primary outcome of 
interest was college admission. We matched students to administrative datasets 
about college going.  We could not report the administrative data directly 
because of known biases (e.g. no coverage of children from families who do not 
require financial aid).  Using the match status as an adjunct variable in 
imputation of the parent responses, however, had a huge impact on the final 
estimates. In addition to strong variance reduction, we also discovered that 
parent nonresponse was strongly nonignorable. Those whose children did not go 
to college were far less likely to respond to the survey. I would send you a 
reference but unfortunately the evaluation was cancelled without a report.
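
Since no report exists, a toy simulation may help illustrate the mechanism. 
What follows is only a sketch under stated assumptions, not a reconstruction 
of the actual study: the outcome is binary, the auxiliary variable is a 
hypothetical proxy that agrees with the outcome 80% of the time, and response 
depends on the proxy alone (so the data are missing at random given the proxy). 
All names and parameter values are invented for illustration. The imputation 
is a simple within-cell conditional mean, not multiple imputation, so it 
speaks to bias only and not to variance.

```python
import random


def simulate(n=20000, seed=0):
    """Generate (y, a, observed) rows under the assumed toy model."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        # True outcome, e.g. whether the student went to college (assumed 50%).
        y = 1 if rng.random() < 0.5 else 0
        # Auxiliary proxy (e.g. administrative match status): agrees with y
        # 80% of the time. This agreement rate is an assumption.
        a = y if rng.random() < 0.8 else 1 - y
        # Response depends on the proxy only (assumed rates 0.8 and 0.4).
        observed = rng.random() < (0.8 if a == 1 else 0.4)
        rows.append((y, a, observed))
    return rows


def true_mean(rows):
    """Mean of y over everyone, which is unknowable in a real survey."""
    return sum(y for y, _, _ in rows) / len(rows)


def complete_case_mean(rows):
    """Mean of y among respondents only, ignoring the auxiliary variable."""
    obs = [y for y, _, r in rows if r]
    return sum(obs) / len(obs)


def aux_imputed_mean(rows):
    """Fill each missing y with the respondent mean in its auxiliary cell."""
    cell_mean = {}
    for level in (0, 1):
        vals = [y for y, a, r in rows if r and a == level]
        cell_mean[level] = sum(vals) / len(vals)
    filled = [y if r else cell_mean[a] for y, a, r in rows]
    return sum(filled) / len(filled)


rows = simulate()
print("true mean:          ", round(true_mean(rows), 3))
print("complete-case mean: ", round(complete_case_mean(rows), 3))
print("aux-imputed mean:   ", round(aux_imputed_mean(rows), 3))
```

Under these assumptions the complete-case mean overstates the outcome rate, 
because respondents are disproportionately those whose proxy equals 1, while 
the estimate imputed within proxy cells lands close to the full-sample mean. 
If response depended on the outcome itself rather than on the proxy, as 
appears to have been the case in the study, conditioning on the proxy would 
be expected to remove only part of the bias.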

--Dave Judkins