On 8/16/06, Anon. <[EMAIL PROTECTED]> wrote:
>
> Geoffrey Poole wrote:
> > Sarah:
> >
> > I think the reviewer comment has merit.
> >
> > I understand your problem as follows:  Your goal is to compare the
> > "usefulness" (not sure what you mean by "usefulness", but we'll go with
> > it...) of a regression across environmental conditions.  However, under
> > one set of environmental conditions the regression might be based on 10
> > points, but under another set of conditions, it might be based on 100
> > points.
> >
> > Unfortunately, even under the SAME environmental conditions, the SE of
> > the slope will decrease as the sample size increases.  Thus, if the
> > number of points varies across environmental conditions, you don't know
> > if changes in the SE of the slope are caused by differences in sample
> > size or differences in "usefulness" across conditions.
> >
> > In section 17.3 "Testing the significance of a regression" of Zar's
> > "Biostatistical Analysis" (pages 334-5 of fourth edition) there is a clue
> > that might help you with your dilemma...
> >
> > Zar notes that the "standard error of estimate" (AKA "standard error of
> > the regression") is a measure of the remaining variance in Y *after*
> > taking into account the dependence of Y on X.
> Zar says that?  That's rubbish: the residual variance is the measure of
> the remaining variance in Y after taking into account the dependence of
> Y on X.



Just to clear things up a bit... The terms "standard error of the estimate"
or "standard error of the regression" are used in Zar to represent the sqrt
of the residual variance.  They are not meant to be the same as the standard
error of the slope.  However, the standard error of the slope = standard
error of the estimate / sqrt(SSx),  where SSx is the sum of squared
deviations of X from its mean.
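
For anyone who wants to check that relation numerically, here is a minimal
Python sketch.  The function name and the five data points are made up for
illustration (they are not from Zar or from Sarah's data), and it assumes an
ordinary least-squares fit with n - 2 residual degrees of freedom.

```python
import math

def slope_standard_errors(x, y):
    """Return (slope, se_estimate, se_slope) for a least-squares fit y = a + b*x.

    Hypothetical helper written for this example only.
    """
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    # SSx: sum of squared deviations of X from its mean
    ss_x = sum((xi - x_bar) ** 2 for xi in x)
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = s_xy / ss_x
    intercept = y_bar - slope * x_bar
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    # "Standard error of the estimate": sqrt of the residual variance
    se_estimate = math.sqrt(ss_res / (n - 2))
    # Standard error of the slope, as in the relation above
    se_slope = se_estimate / math.sqrt(ss_x)
    return slope, se_estimate, se_slope

# Made-up illustrative data
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
slope, se_estimate, se_slope = slope_standard_errors(x, y)
print(slope, se_estimate, se_slope)
```

Because SSx grows as points are added, se_slope shrinks with sample size even
when se_estimate stays about the same, which is the sample-size issue raised
earlier in the thread.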



> > However, since the
> > magnitude of this value is proportional to the magnitude of the
> > dependent variable,
> Again, rubbish: add 20 000 to all of your Y's, and the variances will
> all be the same.  The only difference is that the estimated intercept is
> 20 000 higher.


Bob is definitely correct here.  I suspect that what Zar is referring to here
is that the standard error of the estimate is in the same units as the
dependent variable.  Hence, you can divide it by the mean to get a
"unitless" measure.
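
Bob's point about adding 20 000 is easy to confirm with a small Python
sketch.  Again the helper name and the data are invented for illustration,
and the fit is plain least squares with n - 2 residual degrees of freedom.

```python
import math

def fit(x, y):
    """Return (intercept, slope, se_estimate) for least-squares y = a + b*x.

    Hypothetical helper written for this example only.
    """
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    ss_x = sum((xi - x_bar) ** 2 for xi in x)
    slope = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / ss_x
    intercept = y_bar - slope * x_bar
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    return intercept, slope, math.sqrt(ss_res / (n - 2))

# Made-up illustrative data, then the same Y's shifted up by 20 000
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
y_shifted = [yi + 20000 for yi in y]

a, b, see = fit(x, y)
a_shifted, b_shifted, see_shifted = fit(x, y_shifted)

# Dividing by the mean of Y gives the "unitless" version
relative = see / (sum(y) / len(y))
relative_shifted = see_shifted / (sum(y_shifted) / len(y_shifted))

print(see, see_shifted)            # same up to rounding
print(a, a_shifted)                # intercept moves by the shift
print(relative, relative_shifted)  # the ratio is not shift-invariant
```

The slope and the standard error of the estimate do not change with the
shift; only the intercept does.  The ratio to the mean does change, so it is
only a sensible summary when zero is a meaningful origin for Y.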



> I might now have understood the original problem (possibly...).
>
> I think the idea is that in any single environment, one can regress two
> variables and get a fit etc.  But the question is: how well will this
> fit do in another environment?  The (actual) slope will probably be
> different between environments, and the more different they are, the
> less use it is to use the slope in one environment to predict in
> another.  The problem is the variation between the slopes in the
> different environments: obviously we can measure this variation by the
> standard deviation (or the variance!).
>
> In practice, I would suggest fitting a mixed model, where you allow the
> slope to vary randomly between environments.  Any decent stats package
> can do this: I think some people call them random regressions.  This
> will estimate the variation in slopes between environments, allowing for
> any differences in sample sizes in the different environments.  If the
> variance is small, then the predictions from one environment to another
> will be pretty good (obviously this depends a bit on the size of the
> regression coefficient: if it's zero, then there's no improvement anyway).

> I'll have to think a bit more about the best way of evaluating the
> importance of the variation in the slopes: the intuition is to ask how
> much better you do at predicting the value of a data point if you know
> which environment it was measured in, as compared to if it's a random
> environment.  Something similar to an intraclass correlation could be
> used.
>
> Incidentally, this is perhaps a good opportunity to plug this book:
> <http://www.stat.columbia.edu/~cook/movabletype/archives/2006/08/our_new_book_da.html>
> I read a draft in the spring and can heartily recommend it.  It covers
> the family of models that can be used for most statistical analyses I
> see in ecology (including the problem here!), in a practical way.

> And now to bed.
>
> Bob
>
> --
> Bob O'Hara
> Department of Mathematics and Statistics
> P.O. Box 68 (Gustaf Hällströmin katu 2b)
> FIN-00014 University of Helsinki
> Finland
>
> Telephone: +358-9-191 51479
> Mobile: +358 50 599 0540
> Fax:  +358-9-191 51400
> WWW:  http://www.RNI.Helsinki.FI/~boh/
> Journal of Negative Results - EEB: www.jnr-eeb.org
>


If Bob has indeed described the context of your question correctly, I would
second his suggestion of using a mixed model.  However, I realize nobody has
directly answered your original question.  I am not aware of a "standard
deviation of the slope" that is different from what most would call the
standard error of the slope.  A quick Google search for the term "standard
deviation of the slope" revealed that when the term is used, it is usually
defined by a formula that is simply the standard error of the slope.  Perhaps
your reviewer knows of something else.
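
To make the distinction concrete, here is a rough Python sketch of the
quantity Bob is after: the spread of the true slopes between environments, as
opposed to the standard error of one slope within an environment.  Please
note this is NOT a mixed model.  It is a crude moment-based stand-in that
fits each environment separately and subtracts the average sampling variance
from the observed variance of the slopes.  The function names, environment
labels, and data are all invented for illustration.

```python
import math
from statistics import mean, variance

def fit_slope(x, y):
    """Return (slope, se_slope) for a least-squares regression of y on x."""
    n = len(x)
    x_bar, y_bar = mean(x), mean(y)
    ss_x = sum((xi - x_bar) ** 2 for xi in x)
    slope = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / ss_x
    intercept = y_bar - slope * x_bar
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    se_slope = math.sqrt(ss_res / (n - 2) / ss_x)
    return slope, se_slope

def between_environment_sd(environments):
    """Crude moment estimate of the SD of true slopes across environments.

    environments: dict mapping a (hypothetical) label to an (x, y) pair.
    This is a rough stand-in for a random-slope mixed model, not a
    replacement for one.
    """
    fits = [fit_slope(x, y) for x, y in environments.values()]
    slopes = [b for b, _ in fits]
    sampling_var = mean(se ** 2 for _, se in fits)
    # Observed spread = true spread + sampling noise, so subtract the noise
    # and truncate at zero if the noise estimate exceeds the observed spread.
    between_var = max(0.0, variance(slopes) - sampling_var)
    return math.sqrt(between_var)

# Made-up illustrative data for three environments
environments = {
    "wet": ([1, 2, 3, 4, 5], [2, 4, 5, 4, 5]),
    "dry": ([1, 2, 3, 4, 5], [1, 3, 5, 7, 9]),
    "cold": ([1, 2, 3, 4, 5], [5, 4, 3, 2, 1]),
}
sd_slopes = between_environment_sd(environments)
print(sd_slopes)
```

A proper random-slope model weights the environments by how much information
each contributes and is the better tool for real data; the sketch is only
meant to show which quantity is being estimated.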
