Here's a missing-data situation that I haven't run into before. X is one of
the variables that affect Y,
Y = a0 + a1*X + ... + error
but Y is one of the variables that affect whether I have information on X,
Pr(X missing) = F(b0 + b1*Y + ...)
Here F is a cumulative distribution function -- for example, normal or
logistic.
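To make the setup concrete, here is a minimal simulation sketch of the two equations. Everything specific in it is an assumption for illustration only: the parameter values (a0 = 1, a1 = 2, b0 = 0, b1 = 1), a single standard-normal X, a logistic F, and the helper names `simulate` and `ols`. It compares the regression on the full data with the regression on only the cases where X is observed.

```python
import numpy as np

def simulate(n=20000, a0=1.0, a1=2.0, b0=0.0, b1=1.0, seed=0):
    """Draw (X, Y) from the first equation, then delete X with a
    probability that depends on Y through a logistic F (illustrative values)."""
    rng = np.random.default_rng(seed)
    x = rng.normal(size=n)
    y = a0 + a1 * x + rng.normal(size=n)              # Y = a0 + a1*X + error
    p_missing = 1.0 / (1.0 + np.exp(-(b0 + b1 * y)))  # Pr(X missing) = F(b0 + b1*Y)
    missing = rng.random(n) < p_missing
    x_obs = np.where(missing, np.nan, x)              # X with holes punched in it
    return x, y, x_obs

def ols(x, y):
    """Least-squares fit of y on an intercept and x; returns (intercept, slope)."""
    design = np.column_stack([np.ones_like(x), x])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef

x, y, x_obs = simulate()
full = ols(x, y)                  # benchmark: what we would get with no missing X
keep = ~np.isnan(x_obs)           # cases where X is observed
cc = ols(x_obs[keep], y[keep])    # complete-case (listwise deletion) estimate

print("full data      (a0, a1):", full)
print("complete cases (a0, a1):", cc)
```

Because the cases are in effect selected on Y, the complete-case slope in a run like this should come out attenuated relative to the full-data slope, so simply dropping the cases with missing X does not solve the problem.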
I want to make efficient, unbiased estimates of the first equation's
regression parameters a_i.
Any suggestions most welcome.
Many thanks,
Paul von Hippel
Ohio State University