Re: The meaning of the p value

2001-02-02 Thread Herman Rubin

In article [EMAIL PROTECTED],
Robert J. MacG. Dawson [EMAIL PROTECTED] wrote:


> Bruce Weaver wrote:

>> Suppose you were conducting a test with someone who claimed to have
>> ESP, such that they were able to predict accurately which card would
>> be turned up next from a well-shuffled deck of cards.  The null
>> hypothesis, I think, would be that the person does not have ESP.  Is
>> this null false?

> Technically, the null hypothesis is that
>
>     P(card is predicted correctly) = 1/52
>
> - it is a statement about parameter values. Thus any bias affecting
> this probability, no matter how slight, would make Ho false - whether
> or not the subject had ESP.
>
> For instance, if the shuffling method tended to make a card slightly
> less likely to come up twice in a row than one would expect, *even by
> a few parts in a million*, and if the subject avoided such guesses,
> then Ho would indeed be false.
>
> -Robert Dawson

This indicates a problem almost completely ignored by those
using statistics; the hypothesis tested is almost never the
hypothesis claimed to be tested.  One cannot actually produce
a random sample with a given probability distribution; at
best, one can come close.

How much does this matter?  The indications from my paper
in the First Purdue Symposium are that if the effects are
small compared to the standard deviation of the usual
estimators, it does not make much difference; I believe
this holds more generally than for the question studied
there.  In the ESP problem above, detecting even a few
parts in a thousand would require on the order of a
million observations, so one can "get away" with it.
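
For a rough check on the order of magnitude, here is a minimal sketch
using the usual normal approximation for a test of P = 1/52; the level,
the power, and the bias sizes below are illustrative assumptions, not
figures from any actual experiment:

    # Approximate number of guesses needed to detect a bias of size delta
    # away from p0 = 1/52, via n ~ ((z_alpha + z_beta)^2 * p0*(1-p0)) / delta^2.
    # alpha = 0.05 (one-sided) and 80% power are assumptions for illustration.
    from scipy.stats import norm

    p0 = 1 / 52
    z_alpha = norm.ppf(0.95)   # one-sided 5% level
    z_beta = norm.ppf(0.80)    # 80% power

    for delta in (1e-2, 1e-3, 5e-4):
        n = (z_alpha + z_beta) ** 2 * p0 * (1 - p0) / delta ** 2
        print(f"bias delta = {delta:g}: about {n:,.0f} guesses")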

But this is not the case with fixing a p value.  Most
testing problems have the property that the appropriate
procedure to be used corresponds to a p value for that
problem AND THAT SAMPLE SIZE, but the p value to be used
depends quite substantially on the sample size.
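
A minimal sketch of how sharply the appropriate level can fall with the
sample size, in the simplest case one can construct: a simple null
mu = 0 against a simple alternative mu = delta, equal prior weight, and
0-1 loss, where the Bayes rule rejects when the sample mean exceeds
delta/2.  The values of delta and sigma are illustrative assumptions:

    # Significance level implied by the Bayes rule (reject when xbar > delta/2)
    # for H0: mu = 0 vs H1: mu = delta, equal priors, 0-1 loss, known sigma.
    from scipy.stats import norm

    delta, sigma = 0.5, 1.0    # assumed effect size and population SD
    for n in (10, 100, 1000, 10000):
        implied_alpha = norm.sf(n ** 0.5 * delta / (2 * sigma))
        print(f"n = {n:6d}: implied level = {implied_alpha:.2e}")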

-- 
This address is for information only.  I do not claim that these views
are those of the Statistics Department or of Purdue University.
Herman Rubin, Dept. of Statistics, Purdue Univ., West Lafayette IN47907-1399
[EMAIL PROTECTED] Phone: (765)494-6054   FAX: (765)494-0558


=
Instructions for joining and leaving this list and remarks about
the problem of INAPPROPRIATE MESSAGES are available at
  http://jse.stat.ncsu.edu/
=



Re: The meaning of the p value

2001-02-02 Thread Duncan Murdoch

On 2 Feb 2001 01:12:59 -0800, [EMAIL PROTECTED] (Will Hopkins) wrote:

> I've been involved in off-list discussion with Duncan Murdoch.  At one
> stage there I was about to retire in disgrace.  But sighs of relief...
> his objection is Bayesian.

Just to clarify, I don't think this is a valid summary of what I said.
What I said offline was just a longer version of what I said online in
[EMAIL PROTECTED].

Duncan Murdoch





Re: The meaning of the p value

2001-02-02 Thread Herman Rubin

In article [EMAIL PROTECTED],
Will Hopkins [EMAIL PROTECTED] wrote:
> I accept that there are unusual cases where the null hypothesis has a
> finite probability of being true, but I still can't see the point in
> hypothesizing a null, not in biomedical disciplines, anyway.
>
> If only we could replace the p value with a probability that the true
> effect is negative (or has the opposite sign to the observed effect).
> The easiest way would be to insist on one-tailed tests for everything.
> Then the p value would mean exactly that.  An example of two wrongs
> making a right.

If you want to say something about the probability that
a statement about the state of nature is true, it is
necessary to start with a prior distribution.  There
is no controversy about the use of Bayes Theorem to
get posterior distributions.
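
For concreteness, a minimal sketch of such a posterior calculation with
a conjugate normal prior; every number in it is an assumption made up
for illustration:

    # Posterior probability that a true effect is negative, from Bayes'
    # theorem with a normal prior and a normal likelihood (conjugate update).
    from scipy.stats import norm

    prior_mean, prior_sd = 0.0, 2.0   # assumed prior on the effect
    xbar, se = 1.0, 0.5               # assumed estimate and standard error

    post_var = 1 / (1 / prior_sd**2 + 1 / se**2)
    post_mean = post_var * (prior_mean / prior_sd**2 + xbar / se**2)

    print(f"P(effect < 0 | data) = {norm.cdf(0, post_mean, post_var**0.5):.4f}")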

But this has nothing to do with p values, except that,
within a given experimental situation, more extreme values
of one generally go with more extreme values of the other.
The correspondence does not hold across situations.






Re: The meaning of the p value

2001-02-02 Thread Jerry Dallal

Herman Rubin wrote and I marked up:
 
> There is no way to use the present p-value by itself correctly
> with additional data.

I think the "with" is a typographical error and that "without" was
intended.  I comment only because I like it and plan to use a
modified version of it as "the law".  Something along the lines of:

There is *no way* to use a P value by itself correctly.  It must be
accompanied by additional data.

I already say this, but not as succinctly.





Re: The meaning of the p value

2001-02-02 Thread Herman Rubin

In article [EMAIL PROTECTED],
Will Hopkins [EMAIL PROTECTED] wrote:
> I've been involved in off-list discussion with Duncan Murdoch.  At one
> stage there I was about to retire in disgrace.  But sighs of relief...
> his objection is Bayesian.  OK.  The p value is a device to put in a
> publication to communicate something about precision of an estimate of
> an effect, under the assumption of no prior knowledge of the magnitude
> of the true value of the effect.

By itself, the p value communicates nothing about the
precision of anything.

> If we assume no prior knowledge of the true value, then my claim
> stands:  the p value for a one-tailed test is the probability of an
> opposite true effect--any true effect opposite in sign or impact to
> that observed.

This is likewise false.  For a translation parameter with a
uniform prior it is correct, but only in that too often
assumed, and unreasonable, situation.  Using this prior to
represent "no prior knowledge" may lead to reasonable
actions, but for deciding whether the new or the old is better,
the p-value to use becomes 0.50.
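
A sketch of the special case being conceded, for a normal mean with
known standard error: under the (improper) uniform prior, the posterior
probability of a wrong sign reproduces the one-tailed p value exactly.
The estimate and standard error below are illustrative assumptions:

    # With a flat prior, the posterior for mu is N(xbar, se^2), so
    # P(mu < 0 | data) coincides with the one-tailed p value for H0: mu = 0.
    from scipy.stats import norm

    xbar, se = 1.2, 0.6                              # assumed estimate and SE
    one_tailed_p = norm.sf(xbar / se)                # test of mu = 0 vs mu > 0
    posterior_neg = norm.cdf(0, loc=xbar, scale=se)  # flat-prior posterior
    print(one_tailed_p, posterior_neg)               # both 0.02275...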

> I can't see how a Bayesian perspective dilutes or invalidates this
> interpretation.  The same Bayesian perspective would make you
> re-evaluate the p value under its conventional interpretation.

There is no way to use the present p-value by itself correctly
with additional data.

> In other words, if you have some other reason for believing that the
> true value has the same sign as the observed value, reduce the p value
> in your mind.  Or if you believe it has opposite sign, increase it.

With composite hypotheses, one cannot simply use Bayes factors.

> If we are stuck with p values, then I believe we should start showing
> one-tailed p values, along with 95% confidence limits for the effect.

The only reason p values are used as they are is that they have
become religion.





Re: The meaning of the p value

2001-02-02 Thread Herman Rubin

In article [EMAIL PROTECTED],
Jerry Dallal  [EMAIL PROTECTED] wrote:
> Herman Rubin wrote and I marked up:
>
>> There is no way to use the present p-value by itself correctly
>> with additional data.
>
> I think the "with" is a typographical error and that "without" was
> intended.  I comment only because I like it and plan to use a
> modified version of it as "the law".  Something along the lines of:
>
> There is *no way* to use a P value by itself correctly.  It must be
> accompanied by additional data.
>
> I already say this, but not as succinctly.

There are two ways of reading my statement.  By itself,
your interpretation is quite correct.  But my intention
was to consider what happens when additional experiments
are available; in that case I know of no reasonable way
to use the p value, as a summary of the data that produced
it, together with the further information.





Re: The meaning of the p value

2001-02-01 Thread Robert J. MacG. Dawson



Will Hopkins wrote:
 
> I accept that there are unusual cases where the null hypothesis has a
> finite probability of being true, but I still can't see the point in
> hypothesizing a null, not in biomedical disciplines, anyway.
>
> If only we could replace the p value with a probability that the true
> effect is negative (or has the opposite sign to the observed effect).
> The easiest way would be to insist on one-tailed tests for everything.
> Then the p value would mean exactly that.  An example of two wrongs
> making a right.

No, a one-tailed test doesn't work; it is still computed using the null
value. To find what you want you need Bayesian techniques... but then
(if your prior distribution is valid) you can answer the question you
*really* wanted to answer - "what is the probability that the effect
exists?"
Or even "what is the distribution of the parameter value?"

-Robert Dawson





Re: The meaning of the p value

2001-02-01 Thread Will Hopkins

I've been involved in off-list discussion with Duncan Murdoch.  At one 
stage there I was about to retire in disgrace.  But sighs of relief... his 
objection is Bayesian.  OK.  The p value is a device to put in a 
publication to communicate something about precision of an estimate of an 
effect, under the assumption of no prior knowledge of the magnitude of the 
true value of the effect.  If we assume no prior knowledge of the true 
value, then my claim stands:  the p value for a one-tailed test is the 
probability of an opposite true effect--any true effect opposite in sign or 
impact to that observed.

I can't see how a Bayesian perspective dilutes or invalidates this 
interpretation.  The same Bayesian perspective would make you re-evaluate 
the p value under its conventional interpretation.  In other words, if you 
have some other reason for believing that the true value has the same sign 
as the observed value, reduce the p value in your mind.  Or if you believe 
it has opposite sign, increase it.

If we are stuck with p values, then I believe we should start showing 
one-tailed p values, along with 95% confidence limits for the 
effect.  Both of these are far, far easier to understand than hypothesis 
testing and statistical significance.  Put a note in the Methods saying 
something like: "The p values, which were all derived from one-tailed 
tests, represent the probability that the true value of the effect is 
opposite in sign (correlations; differences or changes in means) or impact 
(relative risks, odds ratios) to that observed."
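
A minimal sketch of that style of reporting, computed from made-up
summary statistics (the mean difference, standard error, and degrees of
freedom are all assumptions):

    # One-tailed p value plus a 95% confidence interval from summary statistics.
    from scipy.stats import t

    diff, se_diff, df = 4.2, 1.9, 28       # assumed summary statistics
    one_tailed_p = t.sf(diff / se_diff, df)
    half_width = t.ppf(0.975, df) * se_diff
    print(f"one-tailed p = {one_tailed_p:.4f}, "
          f"95% CI = ({diff - half_width:.1f}, {diff + half_width:.1f})")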

Will






Re: The meaning of the p value

2001-01-31 Thread Herman Rubin

In article p04330120b69d3c8cd6ea@[139.80.121.126],
Will Hopkins [EMAIL PROTECTED] wrote:
> At 4:17 PM -0600 30/1/01, Jay Warner wrote:
>> A technically correct conclusion is:  The sample of 100 has a mu
>> different from 100.  There is a 0.08 probability (or 0.02, or
>> 0.008) that this statement is false.
>
> Have I not said the same thing?  As p gets small, we are more
> confident that the null hypothesis is not valid.
>
> I haven't followed this thread closely, but I would like to state the
> only valid and useful interpretation of the p value that I know.  If
> you observe a positive effect, then p/2 is the probability that the
> true value of the effect is negative.  Equivalently, 1-p/2 is the
> probability that the true value is positive.

This is true in the translation-parameter case if one has a
uniform prior.  That is not always justifiable; one might
think there is a reasonable possibility that the null
hypothesis is close to being correct, and in that case the
statement is wrong.
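
A sketch of that point: once the prior puts an atom of probability on
the null, the posterior probability of the null can stay large even
when the one-tailed p value looks small.  All the numbers below are
assumptions chosen for illustration:

    # Prior: mass 0.5 at mu = 0, the rest spread as N(0, tau^2).
    # Under the alternative the marginal of xbar is N(0, se^2 + tau^2).
    from scipy.stats import norm

    xbar, se, tau = 1.2, 0.6, 3.0
    like_null = norm.pdf(xbar, 0, se)
    like_alt = norm.pdf(xbar, 0, (se**2 + tau**2) ** 0.5)

    post_null = 0.5 * like_null / (0.5 * like_null + 0.5 * like_alt)
    print(f"one-tailed p = {norm.sf(xbar / se):.4f}")   # about 0.023
    print(f"P(H0 | data) = {post_null:.2f}")            # about 0.43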

> The probability that the null hypothesis is true is exactly 0.  The
> probability that it is false is exactly 1.

I know of no "real" situation in which the null hypothesis, as
stated in connection with the distribution of observations,
could be correct.

> Estimation is the name of the game.  Hypothesis testing belongs in
> another century--the 20th.  Unless, that is, you base hypotheses not
> on the null effect but on trivial effects...

This is an important problem, and can only be handled by 
decision-theoretic methods.  Are there any papers on this
in addition to mine in the First Purdue Symposium (1971)?

There is a general result here, but it is not what one
usually expects.  If the region where one should accept
the null is small compared to the precision of the
usual estimator, one can treat it as a point null; but
one should not fix a p value in advance - the p value
should be determined by the loss and the LOCAL prior.
See my paper with Sethuraman in 1965 for the large-sample
treatment of this.

If the region is much larger than the usual confidence
interval, just see if the usual estimate is in the region.

In between, detailed consideration of the prior assumptions
makes a difference.

> Will







Re: The meaning of the p value

2001-01-31 Thread Bruce Weaver

On 30 Jan 2001, Will Hopkins wrote:

--- 8< --- [snip]

> I haven't followed this thread closely, but I would like to state the
> only valid and useful interpretation of the p value that I know.  If
> you observe a positive effect, then p/2 is the probability that the
> true value of the effect is negative.  Equivalently, 1-p/2 is the
> probability that the true value is positive.
>
> The probability that the null hypothesis is true is exactly 0.  The
> probability that it is false is exactly 1.


Suppose you were conducting a test with someone who claimed to have ESP,
such that they were able to predict accurately which card would be turned
up next from a well-shuffled deck of cards.  The null hypothesis, I think, 
would be that the person does not have ESP.  Is this null false? 

And what about when one has a one-tailed alternative hypothesis, e.g., mu 
> 100?  In this case, the null covers a whole range of values (mu <= 
100).  Is this null false?  In such a case, one still uses the point null 
(mu = 100) for testing, because it is the most extreme case.  If you can 
reject the point null of mu = 100, you will certainly be able to reject the 
null if mu is actually some value less than 100.  But the point is, the 
null can be true.

With a two-tailed alternative, the point null may not be true, but as one
of the regulars in these newsgroups often points out, we don't know the
direction of the difference.  So again, it makes sense to use the point 
null for testing purposes.
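
A sketch of why the boundary point is the worst case for a one-sided
test; sigma, n, and the candidate true means are illustrative
assumptions:

    # For H0: mu <= 100 vs H1: mu > 100, the rejection probability of the
    # usual z test is largest at the boundary mu = 100, so testing the point
    # null controls the error rate over the whole composite null.
    from scipy.stats import norm

    sigma, n, alpha = 15.0, 25, 0.05       # assumed SD, sample size, level
    se = sigma / n ** 0.5
    crit = 100 + norm.ppf(1 - alpha) * se  # reject when xbar > crit

    for mu in (96, 98, 100):
        print(f"true mu = {mu}: P(reject) = {norm.sf((crit - mu) / se):.4f}")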


> Estimation is the name of the game.  Hypothesis testing belongs in
> another century--the 20th.  Unless, that is, you base hypotheses not
> on the null effect but on trivial effects...


Bob Frick has a paper with some interesting comments on this in the
context of experimental psychology.  In that context, he argues, models
that make "ordinal" predictions are more useful than ones that try to
estimate effect sizes, and certainly more generalizable.  (An ordinal
prediction is something like "performance will be impaired in condition B
relative to condition A."  Impairment might be indicated by slower
responding and more errors, for example.)

A lot of cognitive psychologists use reaction time as their primary DV. 
But note that they are NOT primarily interested in explaining all (or as
much as they can) of the variation in reaction time.  RT is just a tool
they use to make inferences about some underlying construct that really
interests them.  Usually, they are trying to test some theory which leads
them to expect slower responding in one condition relative to another, for
example--such as slower responding when distractors are present compared
to when only a target item appears.  The difference between these
conditions almost certainly will explain next to none of the overall
variation in RT, so eta-squared and omega-squared measures will not be
very impressive looking.  But that's fine, because the whole point is to
test the ordinal prediction of the theory--not to explain all of the
variation in RT.  If one were able to measure the underlying construct
directly, THEN it might make some sense to try estimating parameters.  But
with indirect measurements like RT, I think Frick's recommended approach
is a better one. 

There's my two cents.
-- 
Bruce Weaver
New e-mail: [EMAIL PROTECTED] (formerly [EMAIL PROTECTED]) 
Homepage:   http://www.angelfire.com/wv/bwhomedir/





Re: The meaning of the p value

2001-01-31 Thread Robert J. MacG. Dawson



Bruce Weaver wrote:

> Suppose you were conducting a test with someone who claimed to have
> ESP, such that they were able to predict accurately which card would
> be turned up next from a well-shuffled deck of cards.  The null
> hypothesis, I think, would be that the person does not have ESP.  Is
> this null false?

Technically, the null hypothesis is that

    P(card is predicted correctly) = 1/52

- it is a statement about parameter values. Thus any bias affecting
this probability, no matter how slight, would make Ho false - whether
or not the subject had ESP.

For instance, if the shuffling method tended to make a card slightly
less likely to come up twice in a row than one would expect, *even by
a few parts in a million*, and if the subject avoided such guesses,
then Ho would indeed be false.
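
A sketch of testing exactly that null with an exact binomial test; the
counts (7 hits in 260 guesses) are made up for illustration:

    # Exact binomial test of H0: P(correct guess) = 1/52.
    from scipy.stats import binomtest

    result = binomtest(k=7, n=260, p=1/52, alternative='greater')
    print(f"one-sided p value = {result.pvalue:.4f}")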

-Robert Dawson





Re: The meaning of the p value

2001-01-30 Thread Alan McLean

Will Hopkins wrote:
 

 
> I haven't followed this thread closely, but I would like to state the
> only valid and useful interpretation of the p value that I know.  If
> you observe a positive effect, then p/2 is the probability that the
> true value of the effect is negative.  Equivalently, 1-p/2 is the
> probability that the true value is positive.
>
> The probability that the null hypothesis is true is exactly 0.  The
> probability that it is false is exactly 1.
>
> Estimation is the name of the game.  Hypothesis testing belongs in
> another century--the 20th.  Unless, that is, you base hypotheses not
> on the null effect but on trivial effects...
 

With respect, Will, this is a very limited view of statistics in general
and hypothesis testing in particular. One of the features of this view
is that you think in terms of 'true values' rather than models. A null
hypothesis is not 'true' - it may or may not be 'valid' in the sense
that using it enables reasonable predictions.

The same comment can be made of any scientific theory. In what sense is
Relativity 'true'? But it enables reasonable predictions.

Estimation is obviously important - but hypothesis testing, properly
considered, is also essential.

Regards,
Alan


Alan McLean ([EMAIL PROTECTED])
Department of Econometrics and Business Statistics
Monash University, Caulfield Campus, Melbourne
Tel:  +61 03 9903 2102Fax: +61 03 9903 2007

