> 
> mike wrote:
> 
>> The error terms in the regression model are required to have normal
>> distributions with constant variance.  I understand how to test for
>> normality in SAS, but how do you test for homogeneity of variances in SAS?
>> Do you test the residuals or the original data for homogeneity of variance?

and Graeme replied:
> 
> The residuals since the assumptions rest on them. You can view the model as a
> filter that removes the systematic (explainable) variation in the original
> data.
> If the model is appropriate you should be left with a set of independent and
> identically distributed (IID) residuals. I'm not sure about the SAS procedure,
> but look for Levene's test or, if the normality assumption is satisfied,
> Bartlett's test may be appropriate. There are probably other homogeneity of
> variance tests available in SAS as well.
> 

The residuals are NOT independent.  There are n of them but they have only
n - k - 1 df (where k is the number of predictors), because the least-squares
fit imposes k + 1 linear constraints on them, so they are necessarily
correlated.  Not a big deal, other than that it makes formal tests of their
properties complicated.
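
To make the lost degrees of freedom concrete, here is a minimal sketch in
Python (not SAS) with one predictor, so k = 1.  The data values and the
helper name fit_line are made up for illustration.  The two sums it prints
are the constraints that least squares imposes on the residuals.

```python
# Minimal sketch: ordinary least squares with one predictor (k = 1).
# The data below are hypothetical and serve only as an illustration.

def fit_line(x, y):
    """Return (intercept, slope) of the least-squares line through (x, y)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1, 11.7]

intercept, slope = fit_line(x, y)
residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]

# The normal equations force both sums to zero (up to rounding error),
# so the n = 6 residuals carry only n - k - 1 = 4 degrees of freedom.
sum_resid = sum(residuals)
sum_x_resid = sum(xi * ei for xi, ei in zip(x, residuals))

print(f"sum of residuals:     {sum_resid:.2e}")
print(f"sum of x * residuals: {sum_x_resid:.2e}")
```

Because any two of the six residuals can be recovered from the other four
through these constraints, the residuals cannot be mutually independent.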

In any case, one is usually better off with simple graphical methods
because:
    a) if it is a big enough problem to cause trouble with the regression,
it is usually easy to see it in the diagnostic graphs
    b) the formal methods are sensitive to violations that are not
problematic (e.g., thicker tails than the normal are a problem but thinner
tails are not; a uniform distribution fails the formal normality tests but
isn't problematic as a distribution of the residuals)
    c) the formal tests are themselves often even less robust to assumption
violations than the regression model (e.g., Bartlett's test, which Graeme
suggests above, depends heavily on the normality assumption).


Usually two diagnostic plots will tell you a lot:

1.  normal quantile plot.  (PROC UNIVARIATE will produce this if asked.)
look for steep sections of the plot at either end; with the residuals on the
vertical axis, these indicate tails thicker than the normal.  deviations from
the line in the middle are almost never worth worrying about.

2.  some plot of the residuals against the predicted values.  many suggest
plotting Sqrt[Abs[Residual]] against Yhat.  any systematic trend of the
residuals to get larger as Yhat increases (the usual heteroscedasticity
problem) is easy to detect.  if it looks like a cloud of points, then
don't worry.
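
For anyone who wants to see what goes into these two plots, here is a rough
sketch in Python (not SAS) that computes the plotting coordinates.  The
function names, the simulated residuals, and the choice of Blom's plotting
positions are all my assumptions.  The correlation at the end is only a
crude numeric stand-in for looking at the second plot, not a formal test.

```python
import math
import random
from statistics import NormalDist

def normal_quantile_points(residuals):
    """Pair each sorted residual with its expected standard normal quantile.

    Assumes Blom's plotting positions (i - 0.375) / (n + 0.25); other
    conventions exist and give very similar plots.
    """
    n = len(residuals)
    std_normal = NormalDist()
    return [
        (std_normal.inv_cdf((i - 0.375) / (n + 0.25)), r)
        for i, r in enumerate(sorted(residuals), start=1)
    ]

def spread_points(fitted, residuals):
    """Pair each fitted value (Yhat) with sqrt(|residual|)."""
    return [(f, math.sqrt(abs(r))) for f, r in zip(fitted, residuals)]

def pearson(pairs):
    """Pearson correlation of (x, y) pairs, used as a crude trend summary."""
    n = len(pairs)
    mx = sum(a for a, _ in pairs) / n
    my = sum(b for _, b in pairs) / n
    sxy = sum((a - mx) * (b - my) for a, b in pairs)
    sxx = sum((a - mx) ** 2 for a, _ in pairs)
    syy = sum((b - my) ** 2 for _, b in pairs)
    return sxy / math.sqrt(sxx * syy)

# Simulated (hypothetical) fitted values and two sets of residuals:
# one with constant spread, one whose spread grows with Yhat.
random.seed(1)
fitted = [1 + 9 * i / 399 for i in range(400)]
resid_constant = [random.gauss(0, 1) for _ in fitted]
resid_growing = [random.gauss(0, f) for f in fitted]

# Coordinates for plot 1: (normal quantile, sorted residual).
qq = normal_quantile_points(resid_constant)

# Coordinates for plot 2, summarized by the trend of sqrt|e| against Yhat.
trend_constant = pearson(spread_points(fitted, resid_constant))
trend_growing = pearson(spread_points(fitted, resid_growing))

print(f"trend, constant variance: {trend_constant:+.2f}")
print(f"trend, growing variance:  {trend_growing:+.2f}")
```

Feed the pairs in qq, or the output of spread_points, to whatever plotting
tool you like.  The growing-variance residuals should show a clearly upward
trend, while the constant-variance ones should look like a shapeless cloud.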

a good reference is

Atkinson, A.C. (1985).  Plots, transformations, and regression: An
introduction to graphical methods of diagnostic regression analysis.
Oxford.

he explains how to make the plots, what they mean, and then how to go about
fixing the problems.  his advice is much more useful in my experience than
formal tests of normality and homogeneity of variance.

gary
[EMAIL PROTECTED]



