Actually, a more complete description of the situation would be this:
X is one of the variables that affects Y,
Y = a0 + Xa1 + ... + error
but Y is one of the variables that affects an indicator Z,
Pr (Z=1) = F ( b0 + Yb1 + ...)
and if Z=1 then I definitely can't observe X.
I suspect the correct procedure in this situation is the same as in the
simplified situation I described originally.
But I thought I should be thorough.
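
For concreteness, here is a minimal Python sketch of the setup above. The parameter values (a0=0, a1=1, b0=0, b1=2), the standard normal X and error, and the logistic choice for F are all assumptions made for illustration, not values from real data. It simulates the two equations, treats X as unobserved whenever Z=1, and compares the least-squares slope from all cases with the slope from only the cases where X is observed.

```python
import math
import random


def simulate(n=20000, a0=0.0, a1=1.0, b0=0.0, b1=2.0, seed=0):
    """Draw (x, y, z) rows; all parameter values are illustrative assumptions."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        x = rng.gauss(0.0, 1.0)                 # assumed distribution of X
        y = a0 + a1 * x + rng.gauss(0.0, 1.0)   # Y = a0 + X*a1 + error
        # Pr(Z=1) = F(b0 + Y*b1), with F taken to be the logistic CDF
        p_z = 1.0 / (1.0 + math.exp(-(b0 + b1 * y)))
        z = 1 if rng.random() < p_z else 0      # Z=1 means X is not observed
        rows.append((x, y, z))
    return rows


def ols_slope(pairs):
    """Least-squares slope of y on x for a list of (x, y) pairs."""
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    sxx = sum((x - mean_x) ** 2 for x, _ in pairs)
    return sxy / sxx


rows = simulate()
# Slope if X were observed for everyone (not available in practice).
full_slope = ols_slope([(x, y) for x, y, _ in rows])
# Slope from only the cases where X is observed (Z=0).
cc_slope = ols_slope([(x, y) for x, y, z in rows if z == 0])
print("slope, all cases:       ", round(full_slope, 3))
print("slope, X-observed cases:", round(cc_slope, 3))
```

In this simulated setting the slope from the X-observed cases falls well below the slope from all cases, so simply dropping the Z=1 cases does not look adequate for estimating a1.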
Thanks again,
Paul von Hippel
>On Thu, 16 May 2002, Paul von Hippel wrote:
>
> > Here's a missing-data situation that I haven't run into before. X is one of
> > the variables that affects Y,
> > Y = a0 + Xa1 + ... + error
> > but Y is one of the variables that affects whether I have information on X,
> > Pr (X missing) = F ( b0 + Yb1 + ...)
> > Here F is a cumulative distribution function -- for example, normal or
> > logistic.
> >
> > I want to make efficient, unbiased estimates of the first equation's
> > regression parameters a_i.
> >
> > Any suggestions most welcome.
> >
> > Many thanks,
> > Paul von Hippel
> > Ohio State University