On Tue, 2 May 2000, Alan McLean wrote:

> 'No collinearity' *means* the X variables are uncorrelated!

This is not my understanding.  "Uncorrelated" means that the correlation 
between two variables is zero, or that the intercorrelations among 
several variables are all zero.   "Not collinear" means that there is not 
a linear dependency lurking among the variables (or some subset of them). 
"Uncorrelated" is a much stronger condition than "not collinear".

> The basic OLS method assumes the variables are uncorrelated 
> (as you say). 

Not as presented in, e.g., Draper & Smith;  who go to some trouble to 
show how one can produce from a set of correlated variables a set of 
orthogonal (= mutually uncorrelated) variables, and remark on the 
advantages that accrue if the X-matrix is orthogonal.  But it is clear 
that they expect predictors to be correlated as a general rule.
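
That orthogonalization step is easy to demonstrate.  A sketch, under 
the assumption that a QR decomposition may stand in for the 
Gram-Schmidt construction (numerically it amounts to the same thing;  
data and names again illustrative only):

import numpy as np

rng = np.random.default_rng(1)
n = 50
x1 = rng.normal(size=n)
x2 = 0.8 * x1 + rng.normal(size=n)        # deliberately correlated predictors
Xc = np.column_stack([x1, x2])
Xc = Xc - Xc.mean(axis=0)                 # center, so orthogonal => uncorrelated

Q, R = np.linalg.qr(Xc)                   # columns of Q are orthonormal

print(np.corrcoef(Xc, rowvar=False))      # off-diagonal clearly nonzero
print(np.round(Q.T @ Q, 10))              # identity:  orthogonal
print(np.round(np.corrcoef(Q, rowvar=False), 10))  # identity:  uncorrelated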

> In practice there is usually some correlation, but the estimates are 
> reasonably robust to this.  If there is *substantial* collinearity you 
> are in trouble.

If there is collinearity _at_all_ you are in trouble, for then the 
matrix X'X is singular and cannot be inverted;  further, if the
correlations among some of the predictors are high enough (= close enough 
to unity), a computing system with finite precision may be unable to 
detect the difference between a set of variables that are technically not 
collinear but are highly correlated, and a set of variables that _are_ 
collinear.  (E.g., X and X^4 are not collinear;  but if the range of X 
in the data is, say, 101 to 110, a plot of X^4 vs X will look very much 
like a straight line.)  For this reason various safety features are 
usually built into regression programs:  variables whose tolerance with 
respect to the other predictors (= 1 - R^2 when that variable is 
regressed on the other predictors) is lower than a certain threshold, 
or whose variance inflation factor -- the reciprocal of tolerance -- 
is above a corresponding threshold, are usually excluded from an analysis; 
although it is often possible to override the system defaults if one 
thinks it necessary.  The existence of such defaults is clear evidence 
that at least the persons responsible for system packages expected that 
variables would often have substantial intercorrelations.
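
Both phenomena are easy to exhibit.  A sketch of the X-vs-X^4 example 
above, together with the tolerance and VIF of X^4 given X (the 
regression arithmetic is spelled out by hand here;  any sensible 
package reports these figures directly):

import numpy as np

x = np.arange(101.0, 111.0)               # X over the range 101..110
x4 = x ** 4

print(np.corrcoef(x, x4)[0, 1])           # roughly 0.9994:  very nearly linear

# Tolerance of x4 given x is 1 - R^2 from regressing x4 on x;
# VIF is its reciprocal.
A = np.column_stack([np.ones_like(x), x]) # intercept plus x
beta, *_ = np.linalg.lstsq(A, x4, rcond=None)
resid = x4 - A @ beta
tol = (resid @ resid) / ((x4 - x4.mean()) @ (x4 - x4.mean()))
print(tol)                                # tolerance roughly 0.0013
print(1.0 / tol)                          # VIF in the high hundreds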

And if it were a requirement (= assumption) that predictors be 
uncorrelated, it would not be necessary to worry about inverting the 
p x p cross-product matrix X'X of the predictors:  the simple linear 
regression coefficient for
predicting Y from X_j alone would be unaffected by the presence of other 
predictors in the model.
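
As a quick check of that claim:  construct two predictors that are 
centered and exactly uncorrelated (via QR, as above;  the true 
coefficients 3 and -2 are arbitrary), and the multiple-regression 
slopes coincide with the one-at-a-time simple-regression slopes:

import numpy as np

rng = np.random.default_rng(2)
n = 200
Z = rng.normal(size=(n, 2))
Q, _ = np.linalg.qr(Z - Z.mean(axis=0))   # centered, exactly orthogonal columns
x1, x2 = Q[:, 0], Q[:, 1]
y = 3.0 * x1 - 2.0 * x2 + rng.normal(size=n)

# Full model:  y on x1 and x2 jointly, with intercept.
A = np.column_stack([np.ones(n), x1, x2])
b_full, *_ = np.linalg.lstsq(A, y, rcond=None)

# Simple regressions:  y on each predictor alone (slopes only).
b1 = (x1 @ (y - y.mean())) / (x1 @ x1)
b2 = (x2 @ (y - y.mean())) / (x2 @ x2)

print(b_full[1], b1)                      # identical to rounding error
print(b_full[2], b2)                      # identical to rounding error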
                                -- Don.
 ------------------------------------------------------------------------
 Donald F. Burrill                                 [EMAIL PROTECTED]
 348 Hyde Hall, Plymouth State College,          [EMAIL PROTECTED]
 MSC #29, Plymouth, NH 03264                                 603-535-2597
 184 Nashua Road, Bedford, NH 03110                          603-471-7128  


