On Sat, 20 Mar 2004 19:19:22 +0000 (UTC), [EMAIL PROTECTED] wrote:

> Eugene Gallagher <[EMAIL PROTECTED]> wrote:
> > Rich,
> >  I tend to agree with you about the potential abuse of stepwise 
> > multiple regression. However, it is widely used and I wouldn't label a 
> > study using stepwise as being of necessity flawed, even if the goal was 
> > to evaluate the relative importance of different explanatory variables. 
> >  For example, this week's Science has an article that has been widely 
> > reported in the popular press and the key analysis is a stepwise 
> > regression. One way of interpreting the paper is that the authors used 
> > stepwise to make their assessment of the importance of N deposition more 
> > objective. They didn't pick N deposition, the computer did.
> > 
Mike Babyak > 
> Not having seen the paper, I can't comment on the specifics of the
> application of stepwise there.  But certainly, given the simulation
> literature on stepwise, the fact that Science published such a paper
> only shows that they aren't aware of the problems with the procedure.
> The good intent or reputation of a journal or scientist still won't
> make the procedure better.  In most situations scientists encounter,
> there isn't "potential abuse", there's just pretty much by definition
> a badly overfitted model.
> 

Mike says that well.
Still, I might be a *little*  less harsh.
I haven't seen the paper, either.

Here is some more of what Gene posted --
"Of 20 variables measured to account for the variability in species 
richness, total deposition of inorganic N (Ndep, kg N ha^-1 y^-1) was
the most important predictor, explaining more than half of the
variation in the number of species per quadrat (Fig. 2A and Eq. 1)....
"  After accounting for N deposition, mean annual precipitation (MAP,
mm) explained an additional 8% of variability in species richness. A
further 5% was explained by the A horizon soil pH (Top pH, Fig. 2B)
and 3% by altitude (Alt, m). In total, 70% of the variability in
species richness could be explained by these four variables: ... "


Stepwise, I have said before, can give you a shorter list
of variables when you have a list where everything matters.

Especially, it can give you the *first*  variable, if one of them
stands out from the others.  In the above, Deposition does
account for a huge share of variance;  what is unstated (here,
at least) is whether any of the other (presumably correlated)
measures were anywhere close to that fraction, univariate.

Stepwise is *famous* for being really lousy at giving you
the number two and three and four when the relative shares
of variance are (for instance) 54, 8, 5, and 3.  If they were
searching for 'explanation' rather than a shorter prediction
equation, then the authors stumbled badly -- if the stepwise
result is all they relied on.  Again, I have not seen the paper,
so I want my aspersions to be read as being somewhat
hypothetical, or as being cast against the worst-case scenario.
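
For anyone who wants to see the instability for themselves, here is a
minimal sketch in Python (NumPy only).  Everything in it is an assumption
made for illustration: the data are simulated, not taken from the Science
paper; the sample size, the 20 correlated predictors, and the effect sizes
are made up; and forward_select() is a plain greedy R^2 search, not the
procedure of any particular stats package.  It refits the selection on
bootstrap resamples and counts which column is picked first and which is
picked second.

```python
import numpy as np


def r_squared(X, y):
    """R^2 of an ordinary least-squares fit of y on X plus an intercept."""
    A = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ coef
    return 1.0 - (resid @ resid) / np.sum((y - y.mean()) ** 2)


def forward_select(X, y, n_steps):
    """Greedy forward selection: each step adds the column that raises R^2 most."""
    chosen = []
    for _ in range(n_steps):
        remaining = [j for j in range(X.shape[1]) if j not in chosen]
        best = max(remaining, key=lambda j: r_squared(X[:, chosen + [j]], y))
        chosen.append(best)
    return chosen


def simulate(n=100, p=20, n_boot=200, seed=0):
    """Count how often each column is selected first and second across resamples."""
    rng = np.random.default_rng(seed)
    # Hypothetical data: p predictors that share a common factor,
    # so they are all moderately correlated with one another.
    common = rng.normal(size=(n, 1))
    X = 0.6 * common + 0.8 * rng.normal(size=(n, p))
    # Made-up effects: column 0 is strong, columns 1-3 are weak, the rest are zero.
    beta = np.zeros(p)
    beta[:4] = [1.0, 0.3, 0.25, 0.2]
    y = X @ beta + rng.normal(size=n)

    first = np.zeros(p, dtype=int)
    second = np.zeros(p, dtype=int)
    for _ in range(n_boot):
        idx = rng.integers(0, n, size=n)  # bootstrap resample of the rows
        order = forward_select(X[idx], y[idx], 2)
        first[order[0]] += 1
        second[order[1]] += 1
    return first, second


if __name__ == "__main__":
    first, second = simulate()
    print("times picked first: ", first)
    print("times picked second:", second)
```

If the first-pick counts pile up on one column while the second-pick
counts are scattered over several columns, that is the pattern described
above: the standout variable is found reliably, and the runners-up are not.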

-- 
Rich Ulrich, [EMAIL PROTECTED]
http://www.pitt.edu/~wpilib/index.html
 - I need a new job, after March 31.  Openings? -
