Under certain circumstances, I'm finding that my imputation model imputes 
outlying values. I'm not sure whether this problem is peculiar to the 
software I am using (IVEware), or whether similar problems would be 
expected from any software. Details follow.

I'm imputing test scores, demographics, and other variables for ~5000 
students clustered in ~300 schools. To account for the clustering, I am 
including the school ID variable in the imputation model.
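
In case it clarifies the setup, I think of "including school ID" as giving each school its own dummy (fixed effect) in the regression used to impute fall scores, roughly as in the Python sketch below. The data and variable names are invented for illustration; this is not IVEware syntax.

    import numpy as np
    import pandas as pd
    import statsmodels.api as sm

    # Made-up data in roughly the shape described: one row per student,
    # with school_id marking the cluster.  (Illustrative only; not IVEware.)
    rng = np.random.default_rng(0)
    n_students, n_schools = 5000, 300
    df = pd.DataFrame({
        "school_id": rng.integers(0, n_schools, n_students),
        "spring_reading": rng.normal(50, 10, n_students),
    })
    df["fall_reading"] = 0.8 * df["spring_reading"] + rng.normal(0, 5, n_students)

    # "Including school ID in the model" presumably amounts to one dummy per
    # school in the regression that predicts the variable being imputed.
    X = pd.get_dummies(df["school_id"], prefix="school", drop_first=True, dtype=float)
    X["spring_reading"] = df["spring_reading"]
    X = sm.add_constant(X)
    fit = sm.OLS(df["fall_reading"], X).fit()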

In a few schools, all students are missing scores for a fall reading test. 
In those schools, IVEware imputes the same score for each student. 
Typically the imputed score is one of the boundary values that I have 
imposed. If no boundary values are imposed, then the imputed scores are 
impossibly high or low.

Under these circumstances, a school's effect on fall reading scores cannot 
be estimated directly from its own students' data. The program appears to 
respond by assigning the school a very large effect, which ignores or 
swamps the predictive value of the other observed variables, such as the 
spring reading test.
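
To make that concrete, here is a small toy example (again plain Python with invented data, not IVEware) showing why the coefficient for an all-missing school is not identified by the observed cases, so any drawn value for it has to come from the software's defaults rather than from the data.

    import numpy as np
    import pandas as pd

    # Toy version of the problem (schools and sizes invented): three schools,
    # and every fall score in school "C" is missing.
    rng = np.random.default_rng(1)
    df = pd.DataFrame({
        "school": np.repeat(["A", "B", "C"], 50),
        "spring": rng.normal(50, 10, 150),
    })
    df["fall"] = 0.8 * df["spring"] + rng.normal(0, 5, 150)
    df.loc[df["school"] == "C", "fall"] = np.nan

    # The regression that imputes fall scores is fit on observed cases only.
    obs = df["fall"].notna()
    dummies = pd.get_dummies(df["school"], dtype=float)

    # Among the observed cases, school C's dummy is identically zero, so the
    # data say nothing about its coefficient.
    print(dummies.loc[obs].sum())   # A: 50, B: 50, C: 0

    # Before predicting missing scores, a proper imputer perturbs or draws the
    # regression coefficients.  With no observed cases in school C, the value
    # used for C's coefficient must come from somewhere other than the data
    # (a prior, a generalized inverse, or whatever default the software falls
    # back on), so the predicted fall scores for school C can be arbitrarily
    # extreme, consistent with imputations piling up at the imposed bounds.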

I wonder how I can get more plausible imputations from the model.

Best wishes,
Paul von Hippel

Paul von Hippel
Statistician
Department of Sociology / Initiative in Population Research
Ohio State University 
