On Fri, 29 Oct 2004 14:50:26 -0400, Paul von Hippel <[email protected]>  
wrote:

> Here's a simplified version of my problem. I'd be grateful for any  
> suggestions.
>
> Say X1 is a dummy variable with missing values, and I intend to regress  
> Y on X1, X2, and X1X2. The usual and sound advice is that both X1 and  
> X1X2 need to be imputed.
>
> But in imputing X1 and X1X2 I run into collinearity problems because X1  
> explains 99% of the variation in X1X2. The reason is that X1X2 has no  
> residual variation when X1=0. X1X2 does have substantial residual  
> variation when X1=1, but cases with X1=1 make up only 15% of the data  
> set.

It sounds like you're treating X1 as continuous in the imputation (which
is what I usually do, as well). If so, you can mean-center X1 (and X2)
before creating the product term. Centering typically removes much of
the correlation between X1 and the product, so it should help quite a
lot.
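
To illustrate the effect of centering, here is a rough sketch on simulated
data, not your actual data. The numbers are assumptions chosen to mimic
your description: X1 equals 1 in about 15% of cases, and X2 is drawn
independently of X1 with mean 10 and SD 1. The helper name r_squared is
made up for this example. With real data, where X1 and X2 may be related
and X1 has missing values, the centering means would come from the
observed cases and the improvement may be smaller.

```python
import numpy as np


def r_squared(a, b):
    """Squared Pearson correlation between two 1-D arrays."""
    return np.corrcoef(a, b)[0, 1] ** 2


rng = np.random.default_rng(0)
n = 10_000

# Simulated predictors (assumed values, for illustration only).
x1 = (rng.random(n) < 0.15).astype(float)  # dummy, about 15% ones
x2 = rng.normal(10.0, 1.0, n)              # continuous, far from zero

# Product term built from the raw variables.
raw_product = x1 * x2

# Product term built after mean-centering both variables.
x1c = x1 - x1.mean()
x2c = x2 - x2.mean()
centered_product = x1c * x2c

r2_raw = r_squared(x1, raw_product)
r2_centered = r_squared(x1c, centered_product)

print(f"R^2 of X1 with raw product:      {r2_raw:.3f}")
print(f"R^2 of X1 with centered product: {r2_centered:.3f}")
```

In this setup the raw product is almost a rescaled copy of X1 because X2
sits far from zero, which is where the collinearity comes from. After
centering, the product no longer carries the mean of X2, so its
correlation with X1 should drop to near zero.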

Pat

-- 
Patrick S. Malone, Ph.D., Research Scientist
Duke University Center for Child and Family Policy
North Carolina, USA
http://www.duke.edu/~malone
http://www.pubpol.duke.edu/centers/child/
