Cc: [email protected]
Subject: Imputation with regression

The problem in imputation under a model with interactions is that the
relationships are not linear, as is assumed under the normal model. Rubin
has commented (in a seminar, I don't know whether this is published) that
you might just stick in the product of variables as another variable and
then the imputation of the missing values would essentially use the linear
approximation to the product.  If the interaction is a very important part
of the model, this might not be such a great approximation.  Another approach might be
that used in IVEware (from U of Michigan), which uses a collection of
univariate regression models to impute in turn. This gives some additional
flexibility in specifying each model, for example by including some
interactions in predicting some variables from others, although the
collection of models is likely not to be entirely consistent.  Specifying
a consistent joint model with interactions in the conditional (regression)
models is not at all obvious.
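
To make those two ideas a bit more concrete, here is a rough sketch in
Python with NumPy and scikit-learn (my choice of tools, not anything from
this thread); scikit-learn's IterativeImputer runs chained univariate
regressions, roughly in the spirit of IVEware, and the product column is
simply appended as Rubin suggests:

    import numpy as np
    from sklearn.experimental import enable_iterative_imputer  # noqa: F401
    from sklearn.impute import IterativeImputer

    rng = np.random.default_rng(0)

    # Toy data with a real interaction: Y depends on X1, X2 and X1*X2.
    n = 500
    x1 = rng.normal(size=n)
    x2 = rng.normal(size=n)
    y = 1.0 + 2.0 * x1 - 1.0 * x2 + 1.5 * x1 * x2 + rng.normal(size=n)

    # Knock out roughly 30% of X1 at random.
    x1_obs = x1.copy()
    x1_obs[rng.random(n) < 0.3] = np.nan

    # "Stick in the product as another variable": the product column is
    # missing wherever X1 is, and the chained linear models impute both,
    # i.e. a linear approximation to the product.
    data = np.column_stack([x1_obs, x2, x1_obs * x2, y])
    completed = IterativeImputer(random_state=0).fit_transform(data)
    x1_imp, prod_imp = completed[:, 0], completed[:, 2]

Nothing in this setup forces the imputed product column to equal
x1_imp * x2 row by row, which is one concrete form of the inconsistency
mentioned above.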

On the centering of the variables in the interactions, you have given two
perfectly reasonable alternatives.  In one of them your published
statement would be "the effect of X1 when X2 is fixed at its population
mean value is Beta", while in the other you would say "the effect of X1 is
Beta when X2 is fixed at M", where M is the complete-data mean.  Both
statements are reasonable, and the two versions of Beta are unlikely to
differ much.  I incline somewhat
toward the latter since the estimated (but unknown) population mean has no
particular importance while the effect at a fixed value that is close to
the mean is readily interpretable.
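
To illustrate the centering point with the same toy data as above (again
just a sketch, with variable names of my own choosing): if X2 is centered
at the complete-data mean M before the product is formed, the coefficient
on X1 is exactly the effect of X1 when X2 is fixed at M.

    import numpy as np

    # Reusing x1, x2, y, n from the sketch above (complete data here).
    M = x2.mean()                 # complete-data mean of X2
    x2_c = x2 - M                 # center X2 at the fixed value M

    # With the interaction coded as X1*(X2 - M), the X1 coefficient is
    # the effect of X1 when X2 is held at M.
    X = np.column_stack([np.ones(n), x1, x2_c, x1 * x2_c])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    effect_of_x1_at_M = beta[1]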
