On Wed, 3 May 2000, Alan McLean wrote in part:

> With regard to correlation and collinearity - I have become used to 
> 'explaining' collinearity to my classes in terms only of pairs of 
> explanatory variables, forgetting that the collinearity could involve a 
> set of three or more variables, and this 'pair-wise no collinearity' is, 
> as I understand it, equivalent to 'no linear correlation'. 

As I understand it, "no linear correlation" means "r = 0" for two 
variables, or  "R = I"  for the matrix R of intercorrelations among a 
larger number of variables.  I do not really understand (nor recognize) 
your "pair-wise no collinearity".  I understand "collinear" to mean (as 
Herman Rubin explained rather well) that "among the variables being 
considered there exists a linear dependency", a "linear dependency" 
being an ability to express one of the variables as a linear function of 
(some or all of) the other variables without error.
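
A miniature example may help (Python with numpy, my own illustration, not 
anything from Alan's note):  the third variable below is an exact linear 
function of the first two, and both the data matrix and the correlation 
matrix lose a rank as a result.

    import numpy as np

    rng = np.random.default_rng(0)
    x1 = rng.normal(size=50)
    x2 = rng.normal(size=50)
    x3 = 2.0 * x1 - x2      # exact linear function of x1 and x2

    X = np.column_stack([x1, x2, x3])
    print(np.linalg.matrix_rank(X))                     # 2, not 3
    print(np.linalg.det(np.corrcoef(X, rowvar=False)))  # ~0: R is singular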

> This suggests, incidentally, that 'not collinear' is stronger than 
> 'uncorrelated' (not *linearly* correlated) which doesn't agree with 
> your statement - is this so? It also suggests that 'collinearity' 
> means more than just 'correlated'. 

Well, you can't have both.  If "collinear" is a stronger condition than 
"correlated", as I understand is the case, then "not collinear" must be 
weaker than "uncorrelated".  As I have described above:  r = 0, or R = I, 
is a stronger condition than "not collinear", which permits r's different 
from 0 in the off-diagonal elements of the R matrix, so long as the 
pattern of correlations does not imply linear dependency(ies).
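
To put (made-up) numbers on that distinction:  in the sketch below the 
off-diagonal r's are well away from 0, yet R is of full rank, so the 
variables are correlated without being collinear.

    import numpy as np

    # Correlated but not collinear:  R != I, yet no linear dependency.
    R = np.array([[1.0, 0.6, 0.5],
                  [0.6, 1.0, 0.4],
                  [0.5, 0.4, 1.0]])
    print(np.linalg.matrix_rank(R))     # 3: full rank
    print(np.linalg.eigvalsh(R).min())  # smallest eigenvalue well above 0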

> A useful way of picturing the situation is that each variable 
> corresponds to an axis, the angles between the axes determined by the 
> correlation coefficient.  (I think, very uncertainly, that the 
> correlation coefficient is the cosine of the angle.) 

I have heard of such a representation.  My own visualization skills are 
not really up to this, for I do not know how to visualize more than three 
dimensions, and can therefore handle this sort of idea only up to three 
variables.  Interesting problems seem invariably to involve rather more 
than three variables.
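
For what it is worth, I believe Alan's uncertain recollection is right: 
when the variables are expressed as deviations from their means, r is 
exactly the cosine of the angle between the two data vectors.  A quick 
numerical check, in Python of my own devising:

    import numpy as np

    rng = np.random.default_rng(1)
    x = rng.normal(size=100)
    y = 0.7 * x + rng.normal(size=100)

    xc, yc = x - x.mean(), y - y.mean()     # deviations from the means
    cosine = xc @ yc / (np.linalg.norm(xc) * np.linalg.norm(yc))
    print(cosine, np.corrcoef(x, y)[0, 1])  # the same number, to rounding
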
> If variables are uncorrelated, the axes are orthogonal;  if they are 
> perfectly correlated, the axes are identical.  If there is a linear 
> combination between variables, the corresponding dimensions collapse to 
> a 'plane'.  (This is all happening in k dimensions.)  This corresponds 
> to the matrix X'X having rank less than k (for k variables) so leads (as 
> I understand it) to the collinearity problem.

This is in accord with my understanding;  though I would have said 
"hyperplane".

> In terms of the data, there is unlikely to be total collapse (just as a
> sample correlation of exactly zero is highly unlikely) but you might 
> get near collapse. 

Oh, indeed.  Consider the correlation between X and X^4 when restricted to 
100 < X < 120, for example.
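
Putting numbers to that example (my own quick check, with X taken at 200 
equally spaced points):

    import numpy as np

    x = np.linspace(100, 120, 200)    # X restricted to 100 < X < 120
    y = x ** 4
    print(np.corrcoef(x, y)[0, 1])    # r exceeds 0.99: near, not exact, collapse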

> For only two variables highly correlated, the axes are nearly  
> indistinguishable;  for three variables you will get a very low hill 
> (this is difficult to describe!).  
                                        Indeed.  ;-)

> The problem then is to decide whether or not to exclude variables - is 
> the hill high enough to count as three variables, or so low that one 
> variable should be excluded?
 
Determined in real computer programs by the "tolerance" of each variable, 
which I believe to be something like (1 - R^2) where R is the multiple 
correlation for predicting this variable from all the other predictors. 
Sometimes reported as a "variance inflation factor" ("VIF"), which is the 
reciprocal of tolerance.  For tolerance < some threshold near 0, the 
variable will be excluded.  The default tolerance threshold, like some 
other default values, can be altered if needed.
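
If it helps to make that concrete, here is a rough Python sketch of 
tolerance and VIF as I have just described them -- my own code, for 
illustration only, not what any particular package does internally:

    import numpy as np

    def tolerance_and_vif(X):
        """For each column of X: regress it on the other columns (plus an
        intercept); tolerance = 1 - R^2, VIF = 1 / tolerance."""
        n, k = X.shape
        results = []
        for j in range(k):
            y = X[:, j]
            others = np.column_stack([np.ones(n), np.delete(X, j, axis=1)])
            beta, *_ = np.linalg.lstsq(others, y, rcond=None)
            resid = y - others @ beta
            r2 = 1.0 - (resid ** 2).sum() / ((y - y.mean()) ** 2).sum()
            tol = 1.0 - r2
            results.append((tol, np.inf if tol == 0.0 else 1.0 / tol))
        return results

The least-squares fit with an intercept column is just one way of getting 
each R^2;  any regression routine would serve.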

In this connection you may be interested in my paper on detecting 
and interpreting interactions in multiple regression, which is one of the 
"White Papers" on the Minitab website:
        (http://www.minitab.com/resources/whitepapers)
Four predictors and all interactions led to apparent multicollinearity 
(what I call "spurious multicollinearity", since it's due to the range of 
each variable being restricted and bounded away from 0, not to the 
logical relationships among the variables);  you can see one of the 
predictors being excluded because its tolerance in combination with all 
the others was too low, and the exclusion being overruled by resetting 
the default tolerance threshold to a (ridiculously) lower level. 
You can also see the effect of removing the spurious multicollinearity by 
orthogonalizing the constructed variables.

> I think I stand by my original observation, that *in the data* there is 
> always some evidence of collinearity/correlation; 

Of correlation; not necessarily of collinearity.

> if this evidence is strong enough you have to reduce it by reselecting 
> the variables.
                        One certainly has to reduce it.  "Reselecting" 
(which includes, I presume, some elimination of variables) is one way to 
do so;  orthogonalizing is another.  Other approaches are sometimes used, 
such as re-expressing variables as deviations from their means;  this I 
take to be an approximate approach to orthogonalizing, although it is not 
always -- in fact, VERY seldom! -- recognized as such.
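
As a small illustration of that last point (my own sketch in Python, not 
taken from the paper):  when two predictors have ranges bounded away from 
0, their raw product is very nearly a linear combination of them, so the 
correlation matrix of the three is nearly singular;  building the product 
from deviations from the means removes most of that.

    import numpy as np

    rng = np.random.default_rng(2)
    x1 = rng.uniform(100, 120, size=200)   # ranges bounded away from 0
    x2 = rng.uniform(50, 70, size=200)

    def det_of_R(*cols):
        # Determinant of the correlation matrix; near 0 means near-collinearity.
        R = np.corrcoef(np.column_stack(cols), rowvar=False)
        return np.linalg.det(R)

    print(det_of_R(x1, x2, x1 * x2))      # close to 0: "spurious" multicollinearity
    x1c, x2c = x1 - x1.mean(), x2 - x2.mean()
    print(det_of_R(x1c, x2c, x1c * x2c))  # close to 1: nearly orthogonal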

> In your third paragraph you seem to be identifying collinearity with
> correlation - more precisely, that the problems with collinearity are 
> those of correlation - and to a large extent identifying 'the trouble' 
> that I spoke of. 

No.  I was identifying a problem in computation with finite precision, 
that one may not be able to distinguish between a pattern of correlations 
that happen to be inconveniently (and perhaps misleadingly) large, and a 
pattern of correlations reflecting an exact linear dependency (that is, a 
collinearity) in the predictors.  The _problem_ is with collinearity;  
the _evidence_ of possible collinearity lies in the R matrix.  It is not 
correct to _identify_ collinearity with correlation (or vice versa, if 
you deem the verb to be non-commutative!).
                                                -- Don.
 ------------------------------------------------------------------------
 Donald F. Burrill                                 [EMAIL PROTECTED]
 348 Hyde Hall, Plymouth State College,          [EMAIL PROTECTED]
 MSC #29, Plymouth, NH 03264                                 603-535-2597
 184 Nashua Road, Bedford, NH 03110                          603-471-7128  


