Re: transformation of dependent variable in regression

2002-01-16 Thread Dennis Roberts

there is nothing stopping you (is there?) from trying several methods that 
seem like sensible possibilities ... and seeing what happens?

of course, you might find a transformation that works BEST (of 
those you try) with the data you have but ... A) that still might not be an 
"optimal" solution overall and/or B) it might be "best" with THIS data set but ... 
for other similar data sets it might not be

i think the first hurdle you have to hop over is ... does it make ANY sense 
WHATSOEVER to take the data you have collected (or received) and change the 
numbers from what they were in the first place? if the answer is YES to 
that, then my A and B comments apply but, if the answer is NO ... 
then neither A nor B seems justifiable

with 2 independent variables and 1 dependent variable ... you have the 
possibility of transforming 0 of them ... 1 of them ... 2 of them ... or 
all 3 and, these various combinations of what you do may well produce 
varying results ... one quick way to see that is sketched below
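a minimal sketch of that try-it-and-see idea, in python with numpy ... the 
toy data and the three candidate transforms here are my own invented 
example, not anything from the actual data set:

    import numpy as np

    # toy data standing in for the situation in the thread: two predictors
    # and one response whose square root happens to be linear and
    # homoscedastic (that setup is invented purely for illustration)
    rng = np.random.default_rng(0)
    x1 = rng.uniform(1, 10, 100)
    x2 = rng.uniform(1, 10, 100)
    y = (5 + 2*x1 + x2 + rng.normal(0, 1, 100))**2

    X = np.column_stack([np.ones_like(x1), x1, x2])

    # candidate transformations of the dependent variable
    transforms = {"none": lambda v: v, "sqrt": np.sqrt, "log": np.log}

    for name, f in transforms.items():
        ty = f(y)
        beta, *_ = np.linalg.lstsq(X, ty, rcond=None)
        fitted = X @ beta
        resid = ty - fitted
        # crude heteroscedasticity check: does residual spread track the fit?
        spread = np.corrcoef(fitted, np.abs(resid))[0, 1]
        print(f"{name:5s} corr(|resid|, fitted) = {spread:+.3f}")

on this toy data the sqrt row should show the spread measure closest to 
zero ... but, per point B above, that is a fact about this one data set, 
not a general rule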




_
dennis roberts, educational psychology, penn state university
208 cedar, AC 8148632401, mailto:[EMAIL PROTECTED]
http://roberts.ed.psu.edu/users/droberts/drober~1.htm






Re: transformation of dependent variable in regression

2002-01-16 Thread Robert J. MacG. Dawson



"Case, Brad" wrote:
> 
> > Hello.  I am hoping that my question can be answered by a statistical
> > expert out there!! (which I am not).  I am carrying out a multiple linear
> > regression with two independents.  It seems that a square root
> > transformation of the dependent variable effectively decreases
> > heteroscedasticity and "linearises" the data.  However, from what I have
> > read, transformations of the dependent variable introduce a bias into the
> > regression, producing improper estimates after back-transforming to "real"
> > units.  Does anybody out there have any knowledge of this problem, or have
> > a strategy for correcting for this type of bias?  Any help would be much
> > appreciated.  Thanks.

It depends on what you mean by a bias.

The OLS regression line minimizes a certain measure of badness-of-fit
over all linear fits to the data. If the dependent variable is
transformed, OLS fitting is done, and the data and line are transformed
back, the new fit will *not* be optimal by that criterion.
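A small numeric illustration of that point (a sketch in Python; the
square-root transform and the made-up data are assumptions for the
example): the curve obtained by OLS on sqrt(y) and squared back is, in
general, not the member of the same curve family that minimizes squared
error in the original y units.

    import numpy as np
    from scipy.optimize import curve_fit

    rng = np.random.default_rng(1)
    x = rng.uniform(1, 10, 200)
    y = (3 + 2*x + rng.normal(0, 1, 200))**2      # sqrt(y) is linear in x

    # fit 1: OLS on sqrt(y), back-transformed by squaring
    X = np.column_stack([np.ones_like(x), x])
    b, *_ = np.linalg.lstsq(X, np.sqrt(y), rcond=None)
    sse_back = np.sum((y - (X @ b)**2)**2)

    # fit 2: least squares in the original units, same curve family
    popt, _ = curve_fit(lambda x, a, c: (a + c*x)**2, x, y, p0=b)
    sse_direct = np.sum((y - (popt[0] + popt[1]*x)**2)**2)

    print(sse_direct <= sse_back)   # True: fit 2 wins by the y-units criterion

(By the transformed criterion, squared error in sqrt(y) units, fit 1 wins
instead; that is exactly the "which criterion do you want" question.)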

On the other hand, it will be optimal by some different criterion.  The
question is, which criterion do you want and why?  The answer "because
all the other researchers are using it" is not adequate. If all the
other researchers jumped off a cliff, etc., etc. ...  Nor is the
rhetorically loaded word "bias" a reason to avoid using a method.
Technically, it just means that the curve given isn't what another
method would have given.

The usual *informed* reason for OLS fitting is that with a
homoscedastic normal error model it is the maximum-likelihood estimate
for the parameters of the line. If your data do not support such an
error model, then that reason doesn't apply.
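Spelling out that standard reasoning (textbook algebra, not from the
original post): with the two-predictor model and iid normal errors,

    y_i = \beta_0 + \beta_1 x_{1i} + \beta_2 x_{2i} + \varepsilon_i,
    \qquad \varepsilon_i \sim N(0, \sigma^2),

the log-likelihood is

    \log L(\beta, \sigma^2) = -\frac{n}{2}\log(2\pi\sigma^2)
      - \frac{1}{2\sigma^2} \sum_{i=1}^{n}
        (y_i - \beta_0 - \beta_1 x_{1i} - \beta_2 x_{2i})^2,

so for any fixed \sigma^2, maximizing over \beta is exactly minimizing the
least-squares criterion \sum_i (y_i - \beta_0 - \beta_1 x_{1i} - \beta_2 x_{2i})^2.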

If after transforming the dependent variable the data *do* fit a
homoscedastic normal error model, then within that family of conditional
distributions the maximum likelihood choice *is* the one obtained by OLS
fitting to the transformed data. In other words, the reason that often
justifies OLS fitting justifies, in this case, precisely the transformed
fit that you obtained.

So if your transformed data fit the criteria for OLS fitting, fit them
and transform back, and don't worry about "bias". 
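In code the recommended workflow is just this (a sketch; the helper names
are mine, and the comment notes where the questioner's "bias" lives):

    import numpy as np

    def fit_sqrt_ols(X, y):
        """OLS fit to sqrt(y).  X is the n-by-3 design matrix (constant
        column plus the two predictors); returns coefficients on the
        transformed scale."""
        beta, *_ = np.linalg.lstsq(X, np.sqrt(y), rcond=None)
        return beta

    def predict_y_units(X, beta):
        """Back-transform by squaring the fitted values.  This estimates
        (E[sqrt(Y) | x])^2 rather than E[Y | x]; that gap is the "bias"
        in question, and under the transformed error model it is not a
        defect of the fit."""
        return (X @ beta)**2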

-Robert Dawson


=
Instructions for joining and leaving this list, remarks about the
problem of INAPPROPRIATE MESSAGES, and archives are available at
  http://jse.stat.ncsu.edu/
=