Hi all,

I have some questions about imputation under logical constraints:

I am multiply imputing missing values for variables from an establishment 
survey using sequential regression.

Let's start with an easy case: I have to make sure that the condition 
y1<=y2 always holds for my imputed values. Here I compute y1 as a fraction 
of y2 for the observed part of the data and impute these fractions instead 
of the raw values. Whenever an imputed fraction falls outside the bounds 
[0%;100%], I simply redraw the value for this observation until the 
condition is fulfilled.
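
To make the redraw step concrete, here is a minimal Python sketch. It is 
only an illustration: the normal model for the fraction, the function name 
draw_fraction, and the numbers 0.8, 0.3 and 120 are assumptions made up for 
this example, not my actual imputation model.

```python
import random

def draw_fraction(mean, sd, rng, max_tries=1000):
    """Redraw a fraction from N(mean, sd) until it lies in [0, 1].

    mean and sd stand in for the prediction of the imputation model
    (hypothetical here); max_tries guards against an endless loop.
    """
    for _ in range(max_tries):
        f = rng.gauss(mean, sd)
        if 0.0 <= f <= 1.0:
            return f
    raise RuntimeError("no admissible draw within max_tries")

rng = random.Random(0)              # fixed seed, only for reproducibility
y2 = 120.0                          # observed (or already imputed) y2
y1 = draw_fraction(0.8, 0.3, rng) * y2   # back-transform the fraction to y1
```

Redrawing like this amounts to sampling from the model's distribution 
truncated to [0, 1], so y1<=y2 holds by construction.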

Any other ideas on how to do that?


It gets more difficult if I have to make sure that the condition 
y.total=y1+y2+y3 is fulfilled. If I just impute y1, y2, and y3 and then simply 
define y.total=y1+y2+y3, I expect that I will overestimate the total. 
Another idea would be to impute all the variables independently and then 
scale y1, y2, and y3 down so that the above condition is fulfilled. 
But I find neither of the two ideas satisfying. 
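
The second idea, adjusting y1, y2 and y3 so that they add up to the imputed 
y.total, could be sketched as follows. This is only a rough illustration 
that assumes a simple proportional adjustment; the function name 
rescale_to_total and the example numbers are made up.

```python
def rescale_to_total(parts, total):
    """Scale the components proportionally so that they sum to total."""
    s = sum(parts)
    if s == 0:
        raise ValueError("cannot rescale components that sum to zero")
    return [p * total / s for p in parts]

# Independently imputed values (hypothetical numbers): the parts sum to 60,
# while the imputed total is 90.
y1, y2, y3 = rescale_to_total([10.0, 20.0, 30.0], 90.0)
```

Note that the adjusted values are generally no longer integers, which is 
awkward for counts of employees.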

Are there other ways to do it?


Things get really tricky if the above conditions also have to be 
fulfilled for subpopulations. Say y.total is the total number of employees and 
y1, y2, and y3 are the numbers of employees at different levels of 
qualification. What if the survey also asks: how many of these employees are 
female?

Then I have to make sure that   y.total=y1+y2+y3 
                                y.total.f=y1.f+y2.f+y3.f
                                y.total.f<=y.total
                                y1.f<=y1
                                y2.f<=y2
                                y3.f<=y3
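
Just to pin down what has to hold, the six conditions can be written as a 
check on a single imputed record. This is a sketch only: it verifies the 
constraints but does not impute anything, and the function name 
violated_constraints, the dictionary layout, and the numbers are assumptions 
for illustration.

```python
def violated_constraints(rec, tol=1e-9):
    """Return the names of the constraints that the record violates."""
    checks = {
        "y.total=y1+y2+y3":
            abs(rec["y.total"] - (rec["y1"] + rec["y2"] + rec["y3"])) <= tol,
        "y.total.f=y1.f+y2.f+y3.f":
            abs(rec["y.total.f"]
                - (rec["y1.f"] + rec["y2.f"] + rec["y3.f"])) <= tol,
        "y.total.f<=y.total": rec["y.total.f"] <= rec["y.total"] + tol,
        "y1.f<=y1": rec["y1.f"] <= rec["y1"] + tol,
        "y2.f<=y2": rec["y2.f"] <= rec["y2"] + tol,
        "y3.f<=y3": rec["y3.f"] <= rec["y3"] + tol,
    }
    return [name for name, ok in checks.items() if not ok]

# Hypothetical record that satisfies all six conditions.
ok_rec = {"y.total": 100, "y1": 50, "y2": 30, "y3": 20,
          "y.total.f": 40, "y1.f": 25, "y2.f": 10, "y3.f": 5}
# Same record, but with more females than employees in group 1.
bad_rec = dict(ok_rec, **{"y1.f": 60})

ok_result = violated_constraints(ok_rec)
bad_result = violated_constraints(bad_rec)
```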


I am in real trouble here and any ideas or comments are highly appreciated.


Joerg

Institute for Employment Research
Nuremberg, Germany
