On Fri, 24 Mar 2000, Bernard Higgins wrote:

> 
> 
> Hi Bruce

Hello Bernard.

> 
> The point I was making is that when developing hypothesis tests, 
> from a theoretical point of view, the sampling distribution of the 
> test statistic from which critical values or p-values etc. are 
> obtained, is determined by the null hypothesis. We need a probability 
> model to enable us to determine how likely observed patterns are. 
> These probability models will often work well in practice even if we 
> relax the usual assumptions. When using distribution-free tests as 
> an alternative to a parametric test we may need to specify 
> restrictions in order that the tests can be considered "equivalent". 

Agreed.

> 
> In my view the t-test is fairly robust and will work well in most 
> situations where the distribution is not too skewed, and constant 
> variance is reasonable. Indeed I have no problems in using it for the 
> majority of problems. When comparing two independent samples using 
> t-tests, lack of normality and constant variance are often not too 
> serious if the samples are of similar size, always a good idea in 
> planned experiments.

Agreed here too.
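
For what it's worth, here is a quick simulation sketch of that
robustness (my own illustration, assuming numpy and scipy are
available; all the numbers are invented).  It checks the Type I error
rate of the two-sample t-test when both samples, of equal size, come
from the same skewed population:

        import numpy as np
        from scipy import stats

        rng = np.random.default_rng(0)
        n, reps, alpha = 20, 10_000, 0.05

        rejections = 0
        for _ in range(reps):
            # Both samples come from the same skewed (exponential)
            # population, so the null hypothesis of equal means is true.
            x = rng.exponential(scale=1.0, size=n)
            y = rng.exponential(scale=1.0, size=n)
            _, p = stats.ttest_ind(x, y)
            if p < alpha:
                rejections += 1

        # With equal sample sizes, the empirical rejection rate should
        # land close to the nominal 5% despite the skewness.
        print(rejections / reps)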

> 
> As you say, when samples are fairly large, some say 30+ or even 
> less, the sampling distribution of the mean can often be approximated 
> by a normal distribution (Central Limit Theorem), and hence an 
> (asymptotic) Z-test is frequently used. It would not, I think, be 
> strictly correct to call such a statistic t, although from a 
> practical point of view there may be little difference. The formal 
> definition of the single sample t-test is derived from the ratio of a 
> Standard Normal random variable to the square root of an independent 
> Chi-squared random variable divided by its degrees of freedom, and 
> does, in theory, require independent observations from a normal 
> distribution.


I think we are no longer in complete agreement here.  I am not a 
mathematician, but for what it's worth, here is my understanding of t- 
and z-tests:

        numerator   = (statistic - parameter under H0)
        denominator = SE(statistic)

        test statistic = z if SE(statistic) is based on the population SD
        test statistic = t if SE(statistic) is based on the sample SD

The most common 'statistics' in the numerator are Xbar and (Xbar1 - 
Xbar2); but others are certainly possible (e.g., for large-sample 
versions of rank-based tests).
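
In the single-sample case, for example, it works out as follows (a
minimal sketch of my own, assuming numpy and scipy; the data and the
"known" sigma are made up purely for illustration):

        import numpy as np
        from scipy import stats

        x = np.array([4.1, 5.2, 6.3, 5.8, 4.9, 5.5, 6.0, 5.1])  # made-up data
        mu0 = 5.0                      # hypothesized mean under H0
        n, xbar = len(x), x.mean()

        # z: the SE uses a known population SD (sigma = 0.7 is invented here)
        sigma = 0.7
        z = (xbar - mu0) / (sigma / np.sqrt(n))

        # t: the SE uses the sample SD instead
        s = x.std(ddof=1)
        t = (xbar - mu0) / (s / np.sqrt(n))

        # two-sided p-values: normal for z, t with n-1 df for t
        print(2 * stats.norm.sf(abs(z)), 2 * stats.t.sf(abs(t), df=n - 1))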

An assumption of both tests is that the statistic in the numerator has a
sampling distribution that is normal.  This is where the CLT comes into
play:  It lays out the conditions under which the sampling distribution of
the statistic is approximately normal--and those conditions can vary
depending on what statistic you're talking about.  But having a normal
sampling distribution does not mean that we can or should use a critical
z-value rather than a critical t when the population variance is unknown
(which is what I thought you were suggesting).  
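
To illustrate:  even for a population as skewed as the exponential,
the sampling distribution of Xbar is already close to normal around
n = 30.  A quick sketch (again my own, assuming numpy and scipy):

        import numpy as np
        from scipy import stats

        rng = np.random.default_rng(1)
        n, reps = 30, 10_000

        # 10,000 sample means from a skewed Exponential(1) population
        means = rng.exponential(scale=1.0, size=(reps, n)).mean(axis=1)

        # The CLT predicts approximately Normal(1, 1/sqrt(n)) here
        print(means.mean(), means.std())  # near 1 and 1/sqrt(30) = 0.183
        print(stats.skew(means))          # far below the population skew of 2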

As you say, one can substitute critical z for critical t when n gets
larger, because the differences become negligible.  But nowadays, most of
us are using computer programs that give us more or less exact p-values
anyway, so this is less of an issue than it once was. 
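
For example, a few lines of scipy (assuming it is at hand) show how
fast the two critical values converge:

        from scipy import stats

        # two-sided 5% critical values: t with n-1 df versus z
        for n in (5, 15, 30, 100, 1000):
            print(n, round(stats.t.ppf(0.975, df=n - 1), 3))
        print("z:", round(stats.norm.ppf(0.975), 3))   # 1.960

By n = 100 the difference is only about 0.02.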


Cheers,
Bruce
-- 
Bruce Weaver
[EMAIL PROTECTED]
http://www.angelfire.com/wv/bwhomedir/