Here's a simplified version of my problem. I'd be grateful for any suggestions.

Say X1 is a dummy variable with missing values, and I intend to regress Y
on X1, X2, and their interaction X1X2 (the product of X1 and X2). The usual
and sound advice is that both X1 and X1X2 need to be imputed.

But in imputing X1 and X1X2 I run into collinearity problems because X1 
explains 99% of the variation in X1X2. The reason is that X1X2 has no 
residual variation when X1=0. X1X2 does have substantial residual variation 
when X1=1, but cases with X1=1 make up only 15% of the data set.
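
To make the collinearity concrete, here is a minimal simulation sketch in
Python. It is not the actual data: the sample size, the distribution of X2
(normal with mean 10 and SD 1), and the function name interaction_r2 are all
assumptions chosen for illustration. Only the 15% share of X1=1 cases comes
from the description above. An R-squared near 99% arises when the mean of X2
is large relative to its spread.

```python
import numpy as np

def interaction_r2(n=10_000, p=0.15, x2_mean=10.0, x2_sd=1.0, seed=0):
    """Regress X1*X2 on the dummy X1 and report R-squared and the
    residual variance within each X1 group (all inputs are illustrative)."""
    rng = np.random.default_rng(seed)
    x1 = (rng.random(n) < p).astype(float)   # dummy, about 15% ones
    x2 = rng.normal(x2_mean, x2_sd, n)       # assumed distribution of X2
    x1x2 = x1 * x2                           # interaction term

    # With a single dummy regressor (plus intercept), the OLS fitted
    # values are simply the group means of the outcome.
    fitted = np.where(x1 == 1, x1x2[x1 == 1].mean(), x1x2[x1 == 0].mean())
    resid = x1x2 - fitted

    r2 = 1.0 - resid.var() / x1x2.var()
    return r2, resid[x1 == 0].var(), resid[x1 == 1].var()

r2, var_when_0, var_when_1 = interaction_r2()
print(f"R^2 of X1X2 on X1:          {r2:.3f}")
print(f"residual variance (X1 = 0): {var_when_0:.3f}")
print(f"residual variance (X1 = 1): {var_when_1:.3f}")
```

Under these assumed settings the residual variance is exactly zero in the
X1=0 group and sits entirely in the small X1=1 group, which is the pattern
described above.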

I've used two imputation programs -- MI and IVEware -- and neither can 
handle the collinearity. MI gives an error message and IVEware gives 
implausible imputations.

Again, I'd be grateful for any suggestions.

Thanks!
Paul von Hippel


