This is one of the areas in which we cannot be precise enough. An
observed statistic is not a random variable, but
a realization of a random variable. Random variables
are theoretical or mathematical constructs, which are never observed
directly. In frequentist statistics the random variable corresponds to the
framework of (hypothetical) replications; in Bayesian statistics,
to the (equally hypothetical) subjective beliefs.

Thus observed statistics do not have sampling distributions; the
corresponding random variables (of which we assume the statistics
are realizations) have sampling distributions, where the "sampling"
usually refers to a theoretical framework of repeated independent
trials.
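
To make the replication framework concrete, here is a minimal
simulation sketch in Python (the cell probabilities and sample size
are invented for illustration): repeated draws of a 2x2 table trace
out the sampling distribution of the random variable log(ad/bc), of
which any single observed value is one realization.

import numpy as np

rng = np.random.default_rng(0)

# Hypothetical cell probabilities for a 2x2 table (invented for illustration).
p = [0.2, 0.3, 0.3, 0.2]   # cells a, b, c, d
n = 200                    # observations per (hypothetical) replication
reps = 10000               # number of hypothetical replications

# Each replication yields a new table and hence a new value of log(ad/bc).
# The collection of these values approximates the sampling distribution
# of the random variable; one observed table gives a single realization.
a, b, c, d = rng.multinomial(n, p, size=reps).T
log_or = np.log(a * d) - np.log(b * c)

print(log_or.mean(), log_or.std())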

As for log(ad/bc): any differentiable function of the observed
proportions is asymptotically normal; this is the standard
delta-method (first-order Taylor expansion) argument. Of course,
given the above, this really should state that any differentiable
function of the random variables, of which the observed proportions
are supposedly realizations, is asymptotically normal (if suitably
normalized).
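
Concretely, the usual first-order calculation goes as follows (a
minimal sketch; the cell counts are invented for illustration):

import numpy as np

# Observed 2x2 table (counts invented for illustration).
a, b, c, d = 30, 20, 15, 35

# By the delta method, log(ad/bc) is asymptotically normal with
# standard error sqrt(1/a + 1/b + 1/c + 1/d), the first-order term
# of the moment expansion mentioned in the quoted text below.
log_or = np.log(a * d / (b * c))
se = np.sqrt(1/a + 1/b + 1/c + 1/d)

z = 1.96   # standard normal 97.5% quantile, for a two-sided 95% interval
lo, hi = log_or - z * se, log_or + z * se
print(f"OR = {np.exp(log_or):.2f}, "
      f"95% CI = ({np.exp(lo):.2f}, {np.exp(hi):.2f})")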

Statistics is useful because it provides a hypothetical framework
of replications, and thus it often is most useful in areas in which
actual replications are impossible or very expensive.

At 0:50 +0000 07/21/2000, Ron Bloom wrote:
>Rich Ulrich <[EMAIL PROTECTED]> wrote:
>>  Ron's post never showed up on my server.
>
>>  I especially agreed with the first paragraph of Steve's answer.
>
>>  No one so far has posted a response that recognizes the total
>>  innocence of the original question --
>>  This is not, "Why do we see two things that are almost identical?"
>>  This is, "Why do we see two things 'that don't mean anything to me'?"
>
>>  If you just want one number, why not "the p-level"? 
>
>>  With the chi squared, you need to have the DF before you can look up
>>  the p-level.  The chi squared is best used as a "test-statistic" but
>>  it is not complete all by itself. 
>
>>  The Odds Ratio is not even a "test-statistic", but rather, an Effect
>>  size.  Having the OR is like having a mean-difference where you also
>>  need N and Standard deviation before you have a test.  With OR, you
>>  have to have the N, plus the marginal Ns, in order to get a Test.
>
>
>   Of course the observed odds ratio *is* a test-statistic.
>Anything computed from the observed data, the data being
>themselves "random" quantities, is a statistical
>quantity; which is to say, it has a sampling distribution.
>
>   The typical epidemiological usage treats the log-Odds as if it
>were an asymptotically normal variate, with a standard deviation
>estimated from the cell values by the usual first-order term
>of a moment expansion.  They then proceed to place a "confidence"
>interval about the log-Odds, and thus about the Odds itself.
>
>In my mind that certainly *does* treat the observed Odds-ratio
>as a random variable, which is to say, as a statistic, whose
>role is as an "estimator" of a putative "population odds-ratio".
>
>Now, I have not gone through the math to see to what extent
>and why log(ad/bc) *is* (and under what conditions) an
>asymptotic Gaussian, but I suspect that the argument is
>similar to the one that establishes the asymptotic
>normality of the log-likelihood of the observed table.
>This, too, is analogous to the argument establishing
>asymptotic "chi-squared-ness" of the observed chi-squared.
>
>But that is a side-issue.  The original question I had
>was a heterogeneous one.
>
>Now I think I see part of the answer.  One can use chi-square
>and N (the total cell number) to look up a tail probability
>for the observed table, and use the odds-ratio as a measure of   
>effect size that is independent of N.  On the other hand,
>one *could* look the observed odds ratio up in a table
>and compute *its* chance probability relative to the
>"null" hypothesis that the true OR=1, and then use something
>related to chi-squared (normalized somehow to N) as
>the measure of effect size.  It seems to me that the
>choice of statistical indicator and effect-size indicator
>is to a large part a matter of convention.
>
>The other part of my question concerned the practice of
>software packages printing out a slew of p-values based
>on different algorithmic approaches to computing the *same*
>test-statistic.  They are then served up with their
>authors' names attached -- and the user is free to quote
>them all, as if the prestige of that learned assembly
>somehow inheres in his own individual decision.  The
>more voices in the room, perhaps the less anyone will
>notice one's own indecisiveness? 
>
>At all odds, I myself must now defer to the collective
>judgement...
>
>Ron

-- 
===
Jan de Leeuw; Professor and Chair, UCLA Department of Statistics;
US mail: 8142 Math Sciences Bldg, Box 951554, Los Angeles, CA 90095-1554
phone (310)-825-9550;  fax (310)-206-5658;  email: [EMAIL PROTECTED]
    http://www.stat.ucla.edu/~deleeuw and http://home1.gte.net/datamine/
============================================================================
          No matter where you go, there you are. --- Buckaroo Banzai
                   http://www.stat.ucla.edu/sounds/nomatter.au
============================================================================

