(This note is largely in support of points made by Rich Ulrich and
Paul Swank.)

I disagree with the claim (expressed in several recent postings) that
z-tests are in general superseded by t-tests.  The t-test (in simple
one-sample problems) is developed under the assumption that independent
observations are drawn from a normal distribution (and hence the mean and
sample SD are independent and have specific distributional forms).
It is widely applicable because it is fairly robust against violations
of these assumptions.

However, there are also situations in which the t-test is clearly 
inferior to a z-test.  Consider first a set of measurements taken with
a measuring instrument whose sampling errors have a known standard
deviation (and approximately normal distribution).  In this case, with
only a few observations (say 1 or 2, to make the point very clear),
the z-based procedure that uses the known SD will give much more useful
tests or intervals than a t-based procedure (which estimates the SD from
the data at hand).
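
To put numbers on this, here is a minimal sketch in Python.  The
readings and the known SD of 1.0 are made up for illustration, and the
t critical value is hard-coded from a standard table rather than
computed.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev

# Hypothetical setup: an instrument whose error SD is known to be 1.0,
# and two readings taken with it.
KNOWN_SD = 1.0
readings = [9.8, 10.4]
n = len(readings)
xbar = mean(readings)

# z-based 95% interval: uses the known SD.
z_crit = NormalDist().inv_cdf(0.975)  # about 1.96
z_half = z_crit * KNOWN_SD / sqrt(n)

# t-based 95% interval: estimates the SD from the two readings.
# 12.706 is the tabled 0.975 quantile of the t distribution with 1 df.
T_CRIT_DF1 = 12.706
t_half = T_CRIT_DF1 * stdev(readings) / sqrt(n)

print(f"z interval: {xbar - z_half:.2f} to {xbar + z_half:.2f}")
print(f"t interval: {xbar - t_half:.2f} to {xbar + t_half:.2f}")
```

With these particular readings the t interval comes out much wider than
the z interval, even though the sample SD happens to be smaller than
the known SD.  With a single reading the t interval cannot be computed
at all, since stdev() needs at least two data points.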

Now consider estimation of a proportion.  Using the information that the
data consist only of 0's and 1's, and an approximate value of the
proportion, we can calculate an approximate standard error more
accurately (for p near 1/2) than we could without this information.  The
interval based on the usual variance formula p(1-p) and the z
distribution is therefore better than the one based on the t
distribution.  This is why (as Paul pointed out) everybody uses z-tests
in comparing proportions, not t-tests.  The same applies to
generalizations of tests of proportions as in logistic regression.
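
A small sketch of why an approximate value of p is enough near 1/2: the
standard error sqrt(p(1-p)/n) changes very little as p moves between
0.4 and 0.6.  The function names, the sample size of 100 and the count
of 45 successes are hypothetical, chosen only for illustration.

```python
from math import sqrt
from statistics import NormalDist

def proportion_se(p, n):
    """Standard error of a sample proportion under the binomial model."""
    return sqrt(p * (1 - p) / n)

def z_interval(successes, n, level=0.95):
    """Usual z-based interval using the plug-in variance p_hat(1 - p_hat)/n."""
    p_hat = successes / n
    z_crit = NormalDist().inv_cdf(0.5 + level / 2)
    half = z_crit * proportion_se(p_hat, n)
    return p_hat - half, p_hat + half

n = 100
# The SE is nearly flat in p around 1/2.
for p in (0.4, 0.5, 0.6):
    print(p, round(proportion_se(p, n), 4))

low, high = z_interval(45, n)
print(f"95% z interval for 45/100: {low:.3f} to {high:.3f}")
```

Going from p = 0.5 to p = 0.4 shrinks the standard error by only about
2%, so even a rough guess at p pins down the SE quite well in this
range.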

On the pedagogical issue, if you want to motivate the z-test all you need
is the formula for the variance of the mean and the fact (accepted without
proof in an elementary course) that a mean of normals is normal.  To get
to the t-distribution you need all of this and also have to talk about
the sampling distribution of the SE estimate in the denominator and how
the numerator and denominator combine to give yet another distribution,
one that is free of the mean and the nuisance parameter (a fact that
depends on subtle properties of the normal).
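
For reference, the two statistics in standard textbook notation (the
symbols are my choice here, not taken from the earlier postings), for n
independent normal observations with mean \mu and SD \sigma:

```latex
Z = \frac{\bar{X} - \mu}{\sigma / \sqrt{n}} \sim N(0, 1),
\qquad
T = \frac{\bar{X} - \mu}{S / \sqrt{n}}
  = \frac{Z}{\sqrt{V / (n-1)}} \sim t_{n-1},
\qquad
V = \frac{(n-1) S^2}{\sigma^2} \sim \chi^2_{n-1}.
```

The step that needs the subtle property is that Z and V are
independent, which holds for normal samples; \sigma cancels in the
ratio, which is why T does not depend on the nuisance parameter.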

One could take the cynical view that most intro students will get neither
of these, but short of that, the Z seems easier to motivate.  When I
taught out of Moore and McCabe, I usually tried to give some motivation
along these lines for the Z test/interval, and then when I got to the t I
waved my hands and said "when we estimate the variance instead of knowing
it in advance, the intervals have to be spread out a bit more as shown in
this table".

        Alan Zaslavsky
        Harvard Med School


