Hi

On Thu, 24 Jan 2002, Rich Ulrich wrote:

> On 24 Jan 2002 07:09:23 -0800, [EMAIL PROTECTED] (Rich Einsporn)
> wrote:
> > Jim Clark gave a fine answer to the question posed by Sangdon Lee.
> > However, I am curious about the correlation and R-square figures given by
> > Sangdon.  Apparently, the R-squares for the simple linear regressions on
> > X1 and X2 are (-.2)^2 = .04 and (.3)^2 = .09, but Sangdon says that the
> > R-sq for the multiple regression is "ONLY" 0.3.  I find this to be
> > surprisingly high, not low.  In the examples I see, the R-sq for the
> > combined model is at most the sum of the individual R-squares. Is it even
> > possible for the opposite to occur?
> 
> "Is it possible?"  Certainly.  The predictors have to be correlated 
> for that to happen; and these were, at 0.6.  (Plug all the r's  into
> the equation for multiple-R, and you can check his total.  I did not
> check because it looked feasible, to my eyeball.)
> 
> "Confounding".  It is more common for two predictors to be 
> highly correlated, and share their prediction variance, so 
> that the total R^2  is barely more than either one alone.  
> But these two variables, correlated 0.6 (which is pretty high),
> predict in opposite directions; so their joint prediction will be
> greater than the sum.

An example of the confounding that I use in class is the
relationship between study time and grades, which tends to be
weak unless intelligence is also included as a predictor.  The
effect occurs because higher intelligence tends to be associated
with LESS studying but HIGHER grades, which masks the positive
effects of studying.  The robust effect of studying emerges in a
multiple regression with both study time and intelligence (the
effect of intelligence is also strengthened), leading to a total
R^2 greater than the sum of the simple r^2s.  I'm not sure if the
terminology is standard, but I talk about the variables "masking"
each other's influence in the simple regressions/correlations.  
This effect for grades has been known at least since a classic
education study by May (1927 or so).  Ignorance of this complexity
led a Canadian newspaper, quite a few years ago, to report in big
bold letters "Want to get good grades? Don't study!", based of
course on the simple (weak) relationship between study time and
grades.
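
To make the masking concrete, here is a minimal sketch using the standard
two-predictor formula for R^2 in terms of the three correlations.  The
correlation values are hypothetical, chosen only to mimic the pattern
described above (weak study-grades correlation, negative study-intelligence
correlation); they are not taken from May's study or any other data set.

```python
def r_squared_two_predictors(r_y1, r_y2, r_12):
    """R^2 for y regressed on two predictors, computed from the three
    pairwise correlations (standard two-predictor formula)."""
    return (r_y1**2 + r_y2**2 - 2 * r_y1 * r_y2 * r_12) / (1 - r_12**2)

# Hypothetical correlations, for illustration only
r_grades_study = 0.10   # weak simple relationship between study time and grades
r_grades_iq = 0.40      # intelligence goes with higher grades
r_study_iq = -0.50      # intelligence goes with LESS studying

# Sum of the two simple r^2 values versus the multiple R^2
sum_simple = r_grades_study**2 + r_grades_iq**2
r2 = r_squared_two_predictors(r_grades_study, r_grades_iq, r_study_iq)

print(f"sum of simple r^2: {sum_simple:.2f}")
print(f"multiple R^2:      {r2:.2f}")
```

With these made-up values the sum of the simple r^2s is .17 while the
multiple R^2 is .28, so the joint prediction exceeds the sum of the parts.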

You can also see the point that Rich is making by thinking about
the quantity ry1 - ry2*r12, which appears as the numerator in a
number of multiple regression formulas (slopes, R, betas).  
Clearly the "true" (i.e., adjusted) relationship between y and x1
can be increased, decreased, or unaffected by controlling
statistically for the y-x2 and x1-x2 relationships, which is what
the above quantity does.
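
As a rough check on the numbers quoted earlier in the thread (ry1 = -.2,
ry2 = .3, r12 = .6), the sketch below plugs them into the usual formulas
for the standardized slopes with two predictors.  The function name is
just an illustrative choice.

```python
def standardized_slopes(r_y1, r_y2, r_12):
    """Standardized regression weights (betas) for two predictors.
    Each numerator has the form r_y1 - r_y2*r_12 discussed above."""
    denom = 1 - r_12**2
    beta1 = (r_y1 - r_y2 * r_12) / denom
    beta2 = (r_y2 - r_y1 * r_12) / denom
    return beta1, beta2

# Correlations as quoted in the thread
r_y1, r_y2, r_12 = -0.2, 0.3, 0.6

beta1, beta2 = standardized_slopes(r_y1, r_y2, r_12)

# With standardized variables, R^2 = beta1*r_y1 + beta2*r_y2
r2 = beta1 * r_y1 + beta2 * r_y2

print(f"beta1 = {beta1:.3f}, beta2 = {beta2:.3f}, R^2 = {r2:.3f}")
```

Both betas come out larger in absolute value than the simple correlations
(about -.59 and .66 versus -.2 and .3), and R^2 is about .32, which is
consistent with the reported 0.3 and well above .04 + .09 = .13.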

Best wishes
Jim

============================================================================
James M. Clark                          (204) 786-9757
Department of Psychology                (204) 774-4134 Fax
University of Winnipeg                  4L05D
Winnipeg, Manitoba  R3B 2E9             [EMAIL PROTECTED]
CANADA                                  http://www.uwinnipeg.ca/~clark
============================================================================



