Dear Tipsters,

I forward a nice technical response from my statistical colleague, 
Dale Stout.

Stuart

Question:


Wee thing that's bothering me: If in ANOVA, the F-ratio is

    (error variance + treatment variance)/error variance

how can F < 1?

This goes back to understanding how a variance based on a small
number of scores behaves compared with a variance based on a large
number.  The sampling distribution of a variance estimate is
positively skewed, and the fewer the scores, the stronger the skew.
So we GENERALLY expect a variance based on three scores to fall below
the population value more often than a variance based on 25 scores (in
fact this relates to the last question we talked about on S).  Now the
EXPECTED VALUE of F is CLOSE TO 1 (!!!!).  That's because the sample
VARIANCE is an UNBIASED ESTIMATOR; therefore, REGARDLESS of sample
size, the expected value of the variance is unaffected.  So when Ho IS
TRUE, the numerator and the denominator of the ratio
[error variance + treatment variance]/error variance have the SAME
expected value, even though the two variance estimates are based on
different information.  (Strictly, the expected value of the ratio
itself is df_error/(df_error - 2), which is a shade above 1.)

That is, the numerator is a variance estimate derived by first
estimating the variance of the SAMPLING DISTRIBUTION OF MEANS (using
three means in a standard simple one-way ANOVA example, so this
estimate is based on only a few members: 3).  We then multiply this
estimate by n (the sample size), because we know that the variance of
the sampling distribution of means is n times smaller than the
variance of the scores.  The denominator, 'error variance', is based
on a pooling of the group scores (the within-group estimate, which
makes for three estimates based on n scores in each group).  So if Ho
is true, the EXPECTED VALUES of the numerator and the denominator
will be equal.

BUT your question is, why are
we likely to get an F less than one?  Well, because we are not talking
about EXPECTED VALUES but ACTUAL estimates, we realize that the
numerator is based on only 3 members of the population of means (using
my one-way example above), so that variance estimate will, more often
than not, be SMALL.  When we multiply it by n, we are multiplying an
underestimate to get an estimate of the population variance, which
makes that estimate small too (means are less variable than scores by
a factor of 'n', and given that our estimate of the variance of the
sampling distribution of means is small, we get a small estimate for
the variance of scores).  The denominator (the within-group estimate)
is based on more information, more members, so as an estimate it will
not only be 'better', but it will likely be larger than the numerator
(say we have 20 people in each group; we are basing our estimate on
pooling variances based on 20 bits of information each).
Thus, in practice the F we OBTAIN will often be smaller than 1,
because the numerator tends to be smaller than the denominator... and
this is all because of the nature of sampling and the sample sizes
upon which the estimates are based.  If you want to talk about this, I
would be happy to show you how it all works: that the between- and
within-group estimates are INDEPENDENT estimates of the same variance
IF Ho IS TRUE.
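
The point about variance estimates based on few versus many scores can
be checked with a small simulation.  What follows is only a sketch in
Python and is not part of Dale's reply: the function names, the number
of replications and the normal population with a true variance of 1
are all assumptions made for illustration.  It compares estimates
based on 3 scores with estimates based on 25 scores.

```python
import random

def sample_variance(scores):
    """Unbiased sample variance (divides by n - 1)."""
    n = len(scores)
    mean = sum(scores) / n
    return sum((x - mean) ** 2 for x in scores) / (n - 1)

def summarize(n, reps=20000, seed=1):
    """Draw `reps` samples of n scores from a normal population whose true
    variance is 1 (an assumed population); return the average estimate and
    the share of estimates that fall below the true variance."""
    rng = random.Random(seed)
    estimates = [sample_variance([rng.gauss(0.0, 1.0) for _ in range(n)])
                 for _ in range(reps)]
    average = sum(estimates) / reps
    share_below = sum(e < 1.0 for e in estimates) / reps
    return average, share_below

for n in (3, 25):
    average, share_below = summarize(n)
    print(f"n = {n:2d}: average s^2 = {average:.3f}, "
          f"share below true variance = {share_below:.3f}")
```

If the argument holds, both averages should sit close to 1 (the
estimator is unbiased), while the share of underestimates should be
noticeably larger for n = 3 than for n = 25.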
      SO, IN OTHER WORDS, THE SAMPLING DISTRIBUTION OF THE F
STATISTIC IS POSITIVELY SKEWED.  ACCORDING TO THE CENTRAL LIMIT
THEOREM, THIS SAMPLING DISTRIBUTION WILL APPROACH NORMAL IF THE
DEGREES OF FREEDOM are sufficiently large.  In this case we would have
to increase the number of means used in making the between-group
estimate.  So, for all intents and purposes, the F-distribution should
be thought of as positively skewed, with a mean of about 1, if Ho is
true.  This means that, with only a few group means, more than half of
the F values we obtain are less than 1.
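
The claim that F is positively skewed when Ho is true, with a mean
near 1 but with most values below 1, can also be checked by
simulation.  This is a rough Python sketch, not part of the original
reply: it assumes three groups of 20 normally distributed scores with
no treatment effect, and the names f_ratio and simulate_f are made up
for illustration.

```python
import random

def f_ratio(groups):
    """One-way ANOVA F for equal-sized groups: MS_between / MS_within."""
    k = len(groups)
    n = len(groups[0])
    means = [sum(g) / n for g in groups]
    grand_mean = sum(means) / k
    # Variance of the k group means, multiplied by n so that it
    # estimates the variance of the scores.
    ms_between = n * sum((m - grand_mean) ** 2 for m in means) / (k - 1)
    # Pooled within-group variance: the average of the k group variances.
    ms_within = sum(
        sum((x - m) ** 2 for x in g) / (n - 1) for g, m in zip(groups, means)
    ) / k
    return ms_between / ms_within

def simulate_f(k=3, n=20, reps=10000, seed=2):
    """F ratios from `reps` experiments in which Ho is true (assumed setup:
    every group is drawn from the same normal population)."""
    rng = random.Random(seed)
    return [
        f_ratio([[rng.gauss(0.0, 1.0) for _ in range(n)] for _ in range(k)])
        for _ in range(reps)
    ]

fs = simulate_f()
mean_f = sum(fs) / len(fs)
median_f = sorted(fs)[len(fs) // 2]
share_below_1 = sum(f < 1.0 for f in fs) / len(fs)
print(f"mean F = {mean_f:.3f}, median F = {median_f:.3f}, "
      f"share of F below 1 = {share_below_1:.3f}")
```

A mean close to 1 together with a median well below 1 is the
signature of the positive skew described above.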

You asked: "Would you consider the F-test to be one-tailed?"  YES.
Because it is almost always positively skewed, and F has to be greater
than 1 to be significant..... anything less is generally regarded as
being what is most frequent.  We could argue that we should establish
a tail for really small F's, since we are not likely to keep getting
F's that are near zero by sampling alone.  Some argue that if we want
to show that a treatment produces group means that are exactly or
almost exactly the same, so that F's will be near zero, we could
calculate such probabilities for a two-tailed test.  These kinds of
tests are very infrequent in our literature, but certainly possible.
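
For the special case of three groups (2 numerator degrees of freedom)
the tail probabilities have a simple closed form, so the idea of a
lower tail for very small F's can be made concrete.  The sketch below
is an illustration only: it assumes 57 error degrees of freedom (three
groups of 20, as in the example above), which is not necessarily the
design behind the F of 0.01 in the original question, and the function
names are hypothetical.

```python
def f_cdf_df1_2(x, df_error):
    """P(F <= x) under Ho when the numerator has 2 degrees of freedom.

    Closed form that holds only for 2 numerator df:
        1 - (1 + 2x/df_error) ** (-df_error/2)
    """
    if x <= 0:
        return 0.0
    return 1.0 - (1.0 + 2.0 * x / df_error) ** (-df_error / 2.0)

def f_quantile_df1_2(p, df_error):
    """Inverse of f_cdf_df1_2: the F value with lower-tail probability p."""
    return (df_error / 2.0) * ((1.0 - p) ** (-2.0 / df_error) - 1.0)

df_error = 57  # assumed: 3 groups of 20 scores, so 3 * (20 - 1)

for f in (0.01, 0.05, 1.0):
    print(f"P(F <= {f}) = {f_cdf_df1_2(f, df_error):.4f}")

# Cut-offs for a hypothetical two-tailed test with 2.5% in each tail.
print(f"lower cut-off = {f_quantile_df1_2(0.025, df_error):.4f}")
print(f"upper cut-off = {f_quantile_df1_2(0.975, df_error):.4f}")
```

With other numbers of groups the closed form no longer applies and
tables or statistical software would be needed for the lower tail.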


I hope this all helps and makes ANOVA clear.... at least its logic.

Cheers,
Dale (LZS)   

Bishop's University Psychology Department Web Page:
http://www.ubishops.ca/ccc/div/soc/psy
___________________________________________________

Dale Stout
Box 5
Psychology Department
Bishop's University
Lennoxville, Quebec
J1M 1Z7
Phone (819) 822 - 9600  Ext: 2440
Fax   (819) 822 - 9661

> Date:          Mon, 16 Oct 2000 11:38:22 -0500
> From:          Mike Scoles <[EMAIL PROTECTED]>
> To:            [EMAIL PROTECTED]
> Subject:       Re: F-ratio - can anyone provide a simple explanation?

> Antoinette -
> 
> I am not sure that there is a simple explanation, but Fs that are much
> smaller than one can be an indicator of the wrong analysis for the
> experimental design.  The analysis that is described works for a completely
> randomized design.  If this analysis is used for a randomized-blocks or
> matched-groups design, Fs less than one are likely to occur.  Thinking about
> what happens to the error in those designs may help.
> 
> - Mike
> --
> "A.Hardy" wrote:
> 
> > The basic structure of the F-ratio for ANOVA is:
> > variance between treatment/variance within treatment
> > The source of the variance for these two components is:
> > Between treatment variance = 'Treatment Effect' + 'Individual
> > Differences' + 'Experimental Error';
> > Within treatment variance = 'Individual Differences' + 'Experimental
> > Error';
> > Therefore if there is no treatment effect the F-ratio will be around
> > 1.00 because the numerator & denominator are both measuring the same
> > variance ('Individual Differences' + 'Experimental Error').
> >
> > So far so good, but IF the variance WITHIN treatments is great and the
> > variance between treatments is negligible the F-ratio falls well below
> > 1.00 (I have some data which has resulted in the F-ratio being 0.01).
> >
> > I am trying to find a simple explanation for this that I can give to my
> > students, can anyone help?
> 
> ********* http://www.coe.uca.edu/psych/scoles/index.html ********
> * Mike Scoles                       *    [EMAIL PROTECTED]  *
> * Department of Psychology          *    voice: (501) 450-5418  *
> * University of Central Arkansas    *    fax:   (501) 450-5424  *
> * Conway, AR    72035-0001          *                           *
> *****************************************************************
> 
> 
> 

___________________________________________________
Stuart J. McKelvie, Ph.D.,                Phone: (819)822-9600
Department of Psychology,                 Extension 2402
Bishop's University,                      Fax: (819)822-9661
3 Route 108 East,
Lennoxville,                              e-mail: [EMAIL PROTECTED]
Quebec J1M 1Z7,
Canada.

___________________________________________________
