On Fri, 29 Oct 2004 14:50:26 -0400, Paul von Hippel <[email protected]> wrote:
> Here's a simplified version of my problem. I'd be grateful for any
> suggestions.
>
> Say X1 is a dummy variable with missing values, and I intend to regress
> Y on X1, X2, and X1X2. The usual and sound advice is that both X1 and
> X1X2 need to be imputed.
>
> But in imputing X1 and X1X2 I run into collinearity problems because X1
> explains 99% of the variation in X1X2. The reason is that X1X2 has no
> residual variation when X1=0. X1X2 does have substantial residual
> variation when X1=1, but cases with X1=1 make up only 15% of the data
> set.

It sounds like you're treating X1 as continuous in the imputation (which
is what I usually do, as well). If so, you can mean-center X1 (and X2)
before creating the product term. That should help quite a lot.

Pat

--
Patrick S. Malone, Ph.D., Research Scientist
Duke University Center for Child and Family Policy
North Carolina, USA
http://www.duke.edu/~malone
http://www.pubpol.duke.edu/centers/child/
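
A rough illustration of the mean-centering suggestion, as a minimal Python
sketch using NumPy. Everything in it is an assumption made for the sake of
the example and not taken from the actual data: the values are simulated
and complete (no missingness), X1 is a dummy equal to 1 in about 15% of
cases, X2 is drawn independently of X1 with mean 10 and SD 1, and the
names (x1, x2, r_squared) are hypothetical. It compares how much of the
product term X1 explains before and after centering.

```python
import numpy as np

# Simulated data; all distributional choices here are assumptions.
rng = np.random.default_rng(0)
n = 10_000
x1 = (rng.random(n) < 0.15).astype(float)     # dummy, about 15% ones
x2 = rng.normal(loc=10.0, scale=1.0, size=n)  # mean far from zero


def r_squared(a, b):
    """Squared Pearson correlation between two 1-D arrays."""
    return np.corrcoef(a, b)[0, 1] ** 2


# Product term built from the raw variables.
raw_product = x1 * x2

# Product term built after mean-centering both variables.
x1c = x1 - x1.mean()
x2c = x2 - x2.mean()
centered_product = x1c * x2c

r2_raw = r_squared(x1, raw_product)
r2_centered = r_squared(x1c, centered_product)

print(f"R^2 of X1 with raw product:      {r2_raw:.3f}")
print(f"R^2 of X1 with centered product: {r2_centered:.3f}")
```

Under these assumptions the raw product is almost entirely explained by
X1, while the centered product is nearly uncorrelated with it. With real
data that has missing values, the means would have to be computed from
the observed cases, and if X1 and X2 are related the reduction will be
smaller than in this simulation.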
