Stan Brown wrote:
> 
> On a quiz, I set the following problem to my statistics class:
> 
> "The manufacturer of a patent medicine claims that it is 90%
> effective(*) in relieving an allergy for a period of 8 hours. In a
> sample of 200 people who had the allergy, the medicine provided
> relief for 170 people. Determine whether the manufacturer's claim
> was legitimate, to the 0.01 significance level."
>

        A hypothesis test is set up ahead of time so that it can give a
definite answer of only one sort. In this case, we have (at least) three
distinct possibilities.

        (0) The "advertiser's test": we want a definite result that makes the
manufacturer look good, so Ha agrees with the manufacturers claim. To do
this the manufacturer's claim must be slightly restated as "our medicine
is *more*  than 90% effective"; as the exact 90% value has prior
probability 0 this is not a problem. H0 is actually the original claim;
and the hoped-for outcome is to reject it because the number of
successes is too large.  The manufacturer is not entitled to do a
1-tailed test just to  shrink the reported p-value. Using a 1-tailed
test is to say "I want all my Type I errors to be ones that let us get
away with inflated claims."  This is what the students did:

> But -- and in retrospect I should have seen it coming -- some
> students framed the hypotheses so that the alternative hypothesis
> was "the drug is effective as claimed." They had
>         Ho: p <= .9; Ha: p > .9; p-value = .9908.
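The students' figure can be reproduced with the usual
normal-approximation z-test for a proportion; a minimal sketch in
Python, using only the standard library (this assumes the normal
approximation rather than an exact binomial calculation):

```python
from statistics import NormalDist
from math import sqrt

# Normal-approximation z-test of H0: p <= 0.9 vs Ha: p > 0.9
p0, n, successes = 0.9, 200, 170
p_hat = successes / n                   # 0.85
se = sqrt(p0 * (1 - p0) / n)            # standard error under H0
z = (p_hat - p0) / se                   # about -2.357
p_value = 1 - NormalDist().cdf(z)       # upper-tail p-value
print(round(p_value, 4))                # 0.9908
```

With the sample proportion *below* 0.9, the upper-tail p-value is
necessarily huge, which is exactly why this framing can never make the
manufacturer look bad.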

Ethical behaviour is to do a two-tailed test, and report/act on a
rejection in either direction.


        It is not necessary to say "there is a difference but we don't know in
which direction"; a two-tailed test can legitimately have three outcomes
(reject low, reject high, not enough data). (There is a potential new
type of error in which we reject in the wrong tail; this *ought* to be
called a Type III error were the name not already taken [as a somewhat
misleading in-joke akin to the "Eleventh Commandment"] to mean "testing
the wrong hypothesis" or something similar.  It is easy to show that the
probability of this is low enough to ignore if alpha is even moderately
low, as the distance between tails is twice the distance from the mean
to the tail.)
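That bound is easy to check numerically. A sketch, assuming the usual
standard-normal test statistic: if the true standardized effect delta
pushes the statistic toward one tail, rejecting in the *wrong* tail
requires a deviation of z_crit + delta, so the wrong-tail probability is
at most alpha/2 and shrinks rapidly as delta grows:

```python
from statistics import NormalDist

nd = NormalDist()
alpha = 0.05
z_crit = nd.inv_cdf(1 - alpha / 2)      # about 1.96 for alpha = 0.05

# True standardized effect delta > 0 favours the upper tail; a
# wrong-tail rejection needs Z < -z_crit, a deviation of z_crit + delta.
for delta in (0.0, 0.5, 1.0, 2.0):
    wrong_tail = nd.cdf(-z_crit - delta)
    print(delta, wrong_tail)
```

Even at delta = 0 the wrong-tail probability is only alpha/2, and by
delta = 2 it is already of order 1e-5.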


        (1) The "consumer advocate's test": we want a definite result
that makes the manufacturer look bad, so H0 is the manufacturer's claim,
Ha is that the claim is wrong, and the p-value (if small) is to be used
as an indication of reason to believe H0 wrong. Using a one-sided test
here is akin to saying "I want all my Type I errors to be ones that make
the manufacturer look bad". Ethical behaviour here is to do a two-sided
test and report a result in either direction.


        (2) The "quality controller's test": H0 is the manufacturer's
claim, Ha is that the claim is wrong, and the p-value is to be used
to balance risks. Here, I think, a one-tailed test is legitimate.


        I claim that the consumer advocate and the manufacturer *should* be
doing the same test in situations 0 and 1. Both should be reporting a
p-value of 0.0184, both should be interpreting it as "the medicine is
less effective than claimed", and the manufacturer should take action by
either improving the product or modifying the advertisements.
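The 0.0184 figure can be checked directly; a minimal sketch in Python
using the normal-approximation z-test for a proportion (the two-tailed
p-value is twice the lower-tail area):

```python
from statistics import NormalDist
from math import sqrt

# Two-tailed z-test of H0: p = 0.9 against Ha: p != 0.9
p0, n, successes = 0.9, 200, 170
p_hat = successes / n
z = (p_hat - p0) / sqrt(p0 * (1 - p0) / n)   # about -2.357
two_tailed = 2 * NormalDist().cdf(-abs(z))   # double the tail area
print(round(two_tailed, 4))                  # 0.0184
```

Since 0.0184 exceeds the quiz's 0.01 significance level, the two-tailed
test does not reject at that level, though it points clearly in the
"less effective than claimed" direction.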

        -Robert Dawson


=================================================================
Instructions for joining and leaving this list and remarks about
the problem of INAPPROPRIATE MESSAGES are available at
                  http://jse.stat.ncsu.edu/
=================================================================
