Geoffrey Poole wrote:
 >> Zar notes that the "standard error of estimate" (AKA "standard error
 >> of the regression") is a measure of the remaining variance in Y
 >> *after* taking into account the dependence of Y on X.

Bob O'Hara wrote:
 > Zar says that?  That's rubbish: the residual variance is the measure
 > of the remaining variance in Y after taking into account the
 > dependence of Y on X.

The way I read Zar, he starts with the regression residual sum of 
squares and divides it by the residual degrees of freedom, which yields 
the variance of the residuals (to which you refer).  If you take the 
square root of this value, you get what Zar refers to as the "standard 
error of estimate."
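
To make the arithmetic concrete, here is a minimal Python sketch of that 
calculation.  It assumes a simple linear regression of Y on a single X 
fitted by ordinary least squares, so the residual degrees of freedom are 
n - 2.  The function name and the toy numbers are mine, not Zar's.

```python
import math

def standard_error_of_estimate(x, y):
    """Return (residual variance, standard error of estimate) for an
    ordinary least-squares fit of y on a single predictor x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Residual sum of squares: squared vertical distances from the line.
    rss = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    # Two parameters were estimated (slope and intercept), hence n - 2.
    residual_variance = rss / (n - 2)
    return residual_variance, math.sqrt(residual_variance)

# Made-up values, for illustration only.
x = [1.0, 2.0, 3.0, 4.0]
y = [1.0, 3.0, 2.0, 4.0]
residual_variance, see = standard_error_of_estimate(x, y)
print(residual_variance, see)
```

The first number is the residual variance; the second, its square root, 
is the statistic Zar calls the "standard error of estimate."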

I suppose I was not careful in my wording when I called this statistic a 
"measure of variance."  I should have said a "measure of variation."

Geoffrey Poole wrote:
 >> However, since the magnitude of this value is proportional to the
 >> magnitude of the dependent variable...

Bob O'Hara wrote:
 > Again, rubbish: add 20 000 to all of your Y's, and the variances will
 > all be the same.  The only difference is that the estimated intercept
 > is 20 000 higher.

Yes, adding a constant to a distribution will not change the variance. 
In thinking about it, it does seem confusing for Zar to state: "The 
magnitude of [the 'standard error of estimate'] is proportional to the 
magnitude of the dependent variable, Y." (top of page 335, Fourth 
Edition). But before we dismiss Zar (and Dapson) as rubbish, let's 
consider real-world data that represent biological phenomena rather than 
purely contrived data (e.g., adding a constant to all Y values).
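
For what it's worth, the "add a constant" point is easy to check 
numerically, and so is the contrast with rescaling Y.  The sketch below 
uses made-up numbers and a helper function of my own naming; it assumes 
a simple least-squares line with n - 2 residual degrees of freedom.  
Whether rescaling is what Zar had in mind is only my guess.

```python
import math

def standard_error_of_estimate(x, y):
    """Square root of RSS / (n - 2) for a least-squares line of y on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    rss = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    return math.sqrt(rss / (n - 2))

# Made-up values, for illustration only.
x = [1.0, 2.0, 3.0, 4.0]
y = [1.0, 3.0, 2.0, 4.0]

see_original = standard_error_of_estimate(x, y)
# Shift: add 20 000 to every Y (Bob's example).
see_shifted = standard_error_of_estimate(x, [yi + 20000.0 for yi in y])
# Scale: multiply every Y by 10 (e.g., a change of measurement units).
see_scaled = standard_error_of_estimate(x, [yi * 10.0 for yi in y])

print(see_original, see_shifted, see_scaled)
```

The shifted value matches the original up to floating-point rounding, 
while the scaled value is ten times larger.  So the statistic ignores 
the location of Y but carries the units, and hence the scale, of Y.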

Consider the weight of animals, for instance.  The variance in weight 
for a large-bodied species (say, humans) is much higher than for mice, 
and higher for mice than fleas.  Even within a single species (again, 
e.g., humans), the variance in weight among adults is far greater than 
among infants.  When considering regressions that predict the weight of 
individuals, then, it seems to follow that the magnitude of the 
residuals is apt to increase with the average weight of individuals in 
the population.

Thus, couldn't biological factors (rather than any underlying 
mathematical formulation) drive a relationship between the "standard 
error of estimate" and the mean of the dependent variable?
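
Here is a small simulation of what I mean, offered only as a sketch.  
Everything in it is an assumption on my part: the three hypothetical 
populations, their mean weights, the 10% coefficient of variation, and 
the form of the dependence on X.  None of it is measured data.

```python
import math
import random

def standard_error_of_estimate(x, y):
    """Square root of RSS / (n - 2) for a least-squares line of y on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    rss = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    return math.sqrt(rss / (n - 2))

rng = random.Random(0)  # fixed seed so the run is repeatable
cv = 0.10               # assumed: scatter is 10% of the mean weight

results = {}
# Hypothetical populations with very different mean weights.
for label, mean_weight in [("small", 1.0), ("medium", 100.0), ("large", 10000.0)]:
    x = [rng.uniform(0.0, 1.0) for _ in range(200)]
    # Weight depends linearly on x; the noise SD scales with the mean.
    y = [mean_weight * (1.0 + 0.5 * (xi - 0.5)) + rng.gauss(0.0, cv * mean_weight)
         for xi in x]
    results[label] = (mean_weight, standard_error_of_estimate(x, y))

for label, (mean_weight, see) in results.items():
    print(label, mean_weight, see, see / mean_weight)
```

Under these assumptions the standard error of estimate grows with the 
mean weight, while its ratio to the mean stays near the assumed 10%.  
The proportionality comes from how the data were generated, not from the 
formula for the statistic.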

-Geoff Poole
