In article <9deiug$l0h$[EMAIL PROTECTED]>,
Ronald Bloom  <[EMAIL PROTECTED]> wrote:

>Significance tests for 2x2 tables require that the single observed
>table be regarded as if it were, (under the null hypothesis of 
>"uniformity" or "independence") but a single instance drawn at  
>random from a universe of replicates.  Insofar as there are at 
>least three well-known distinct such "sample spaces" that one
>might arguably propose as reasonable models of the universe
>of replicates, different probability models for judging the
>"extremity" of the observed table, under the null hypothesis,
>do arise.

I can even provide more.  But from the standpoint of
classical statistics, it makes little difference.  From
the standpoint of decision theory it does, but then one
would not be doing anything like fixing a significance
level in the first place.

>This has given rise over the years to misunderstandings
>between proponents of different small-sample inferential tests
>of significance for 2x2 tables.  But the disputes seem largely to
>be due to the failure of the disputants to identify precisely
>that particular probability setup which is correct for the
>particular problem at hand.  

>At least three distinct such ways of regarding a given 2x2 table can 
>be distinguished:

>1.) both row and column marginals regarded as fixed, and under
>the null hypothesis of uniformity, the observed table is treated 
>as a random sample from the finite set of permutations of all
>2x2 tables satisfying that constraint.  This sample-space model
>gives rise to the hypergeometric distribution for the 
>probability of the observed table; thus the "Fisher Exact" test.

The advantage of this one is that an exact test of the
prescribed level can be produced.
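
To make the fixed-margins setup concrete, here is a minimal sketch in
Python (3.8 or later, for math.comb) of the hypergeometric probability
of a table and the resulting one-sided Fisher p-value.  The function
names are my own, and the choice of "large top-left cell" as the
extreme direction is an assumption made for illustration.

```python
from math import comb

def table_prob(a, b, c, d):
    """Hypergeometric probability of the 2x2 table [[a, b], [c, d]],
    given its row and column totals."""
    n = a + b + c + d
    return comb(a + b, a) * comb(c + d, c) / comb(n, a + c)

def fisher_one_sided(a, b, c, d):
    """P(top-left cell >= a) over all tables sharing the observed margins."""
    r1, r2, c1 = a + b, c + d, a + c
    total = 0.0
    for x in range(a, min(r1, c1) + 1):
        # The other three cells are determined by x and the fixed margins.
        total += table_prob(x, r1 - x, c1 - x, r2 - (c1 - x))
    return total

# Example table with all four margins equal to 4.
print(fisher_one_sided(3, 1, 1, 3))
```

Because the attainable p-values form a discrete set, the non-randomized
version of this test is generally conservative; hitting the prescribed
level exactly requires randomizing at the boundary.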

>2.) The two row (col) marginals are treated as independent; and the
>observed table under the null hypothesis is regarded as 
>being the result of two independent random samples from 
>identical binomial distributions.  The significance test used
>in this case is identical to the elementary test for the
>difference between two sample proportions.

This is a much more complicated testing situation than you
seem to think.  Because of the nuisance parameters, it is
essentially impossible to come up with a "natural" test
at the precise level, especially for small samples.
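
The effect of the nuisance parameter can be seen numerically.  The
sketch below (Python, standard library only) computes the exact
rejection probability of the usual pooled two-proportion z test when
both samples share the same success probability p.  The sample sizes,
the two-sided 5% critical value, and the function names are
assumptions chosen for illustration.

```python
from math import comb, sqrt

def binom_pmf(k, n, p):
    """Binomial probability of k successes in n trials."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

def z_rejects(x1, n1, x2, n2, z_crit=1.959964):
    """Two-sided pooled z test for equality of two proportions."""
    pooled = (x1 + x2) / (n1 + n2)
    se = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    if se == 0.0:
        return False  # all successes or all failures: never reject
    return abs((x1 / n1 - x2 / n2) / se) > z_crit

def actual_size(n1, n2, p):
    """Exact rejection probability under the null with common p."""
    return sum(
        binom_pmf(x1, n1, p) * binom_pmf(x2, n2, p)
        for x1 in range(n1 + 1)
        for x2 in range(n2 + 1)
        if z_rejects(x1, n1, x2, n2)
    )

for p in (0.05, 0.2, 0.5):
    print(p, actual_size(10, 10, p))
```

The printed sizes change with p, which is unknown, so the nominal
level is not the actual level of this test for small samples.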

>3.) Only the total cell sum T is regarded as fixed.  The 
>observed table, under the null hypothesis, is regarded as a 
>random draw of four cell values satisfying the constraint
>that their total T is specified.  This leads to a 
>multinomial distribution.                                  

>Each one of these probability setups 1-3 gives rise to a somewhat
>different small-sample inferential test.  In particular, 
>the schemes (1),(2),(3) give rise to distributions conditioned
>on 3, 2, and 1 fixed parameters respectively.

But these parameters are unknown.  Testing with nuisance
parameters is very definitely not easy, and exact tests
are hard to come by.  Even in other types of problems,
conditional tests are often used.  In fact, in many
practical problems, the sample size itself need not be
fixed.  It is not uncommon to use the number of
observations as if it were a fixed sample size, and it is
easy to give examples where this can be shown not to do
what is wanted.

>Since, for
>large cell values, the large-sample approximations to all
>of these distributions (apparently?) converge to the
>Chi-Squared distribution, it is only in situations with
>small cell sizes that the controversy over choice of
>probability model is of practical (?) import.

As long as the conditional probabilities are the same,
and one uses one of the scenarios you mentioned, the
distribution of the Fisher exact test given the marginals
is as stated.  Thus the probability that the test at a
given level rejects is precisely the stated level in all
of these cases, assuming that randomized testing is used.
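
As a check on this claim for the two-binomial scenario, here is a
minimal sketch in Python (standard library only).  It builds the
one-sided randomized conditional test and then computes its
unconditional rejection probability under two independent binomials
with a common p.  The function names, the one-sided direction, and the
sample sizes are assumptions for illustration; the multinomial
scenario is not coded here.

```python
from math import comb

def hyper_pmf(x, n1, n2, s):
    """P(X1 = x | X1 + X2 = s) under the null; it does not involve p."""
    return comb(n1, x) * comb(n2, s - x) / comb(n1 + n2, s)

def reject_prob(x1, n1, n2, s, alpha):
    """Rejection probability of the randomized conditional test
    (large x1 counts as extreme) given the column total s."""
    lo, hi = max(0, s - n2), min(n1, s)
    tail = 0.0                      # P(X1 > x | s), accumulated from the top
    for x in range(hi, lo - 1, -1):
        px = hyper_pmf(x, n1, n2, s)
        if tail + px > alpha:       # x is the boundary point
            if x1 > x:
                return 1.0
            # Randomize at the boundary so the conditional size is alpha.
            return (alpha - tail) / px if x1 == x else 0.0
        tail += px
    return 1.0  # only reached if alpha >= 1

def binom_pmf(k, n, p):
    """Binomial probability of k successes in n trials."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

def size_two_binomials(n1, n2, p, alpha):
    """Unconditional rejection probability under a common p."""
    return sum(
        binom_pmf(x1, n1, p) * binom_pmf(x2, n2, p)
        * reject_prob(x1, n1, n2, x1 + x2, alpha)
        for x1 in range(n1 + 1)
        for x2 in range(n2 + 1)
    )

for p in (0.1, 0.5, 0.9):
    print(p, size_two_binomials(6, 8, p, 0.05))
```

Since the conditional size is alpha for every value of the column
total, averaging over that total gives alpha whatever p is, up to
floating-point rounding.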

If one uses a decision approach, none of this is correct,
even if the Fisher model happens to be true.
-- 
This address is for information only.  I do not claim that these views
are those of the Statistics Department or of Purdue University.
Herman Rubin, Dept. of Statistics, Purdue Univ., West Lafayette IN 47907-1399
[EMAIL PROTECTED]         Phone: (765)494-6054   FAX: (765)494-0558