Re: The meaning of the p value
In article [EMAIL PROTECTED], Robert J. MacG. Dawson [EMAIL PROTECTED] wrote:

> Bruce Weaver wrote:
>> Suppose you were conducting a test with someone who claimed to have ESP, such that they were able to predict accurately which card would be turned up next from a well-shuffled deck of cards. The null hypothesis, I think, would be that the person does not have ESP. Is this null false?
>
> Technically, the null hypothesis is that P(card is predicted correctly) = 1/52 - it is a statement about parameter values. Thus any bias affecting this, no matter how slight, would make Ho false - whether the subject had ESP or not. For instance, if the shuffling method tended to make a card slightly less likely to come up twice in a row than one would expect, *even by a few parts in a million*, and if the subject avoided such guesses, then Ho would indeed be false.
>
> -Robert Dawson

This indicates a problem almost completely ignored by those using statistics: the hypothesis tested is almost never the hypothesis claimed to be tested. One cannot actually produce a random sample with a given probability distribution; at best, one can come close.

How much does this matter? The indications from my paper in the First Purdue Symposium are that if the effects are small compared to the standard deviation of the usual estimators, it does not make much difference; I believe that this is true in more generality than the question studied there. In the ESP problem above, detecting even a few parts in a thousand would require on the order of a million observations, so one can "get away" with it.

But this is not the case with fixing a p value. Most testing problems have the property that the appropriate procedure to be used corresponds to a p value for that problem AND THAT SAMPLE SIZE, but the p value to be used depends quite substantially on the sample size.

--
This address is for information only. I do not claim that these views are those of the Statistics Department or of Purdue University.
Herman Rubin, Dept. of Statistics, Purdue Univ., West Lafayette IN 47907-1399 [EMAIL PROTECTED] Phone: (765) 494-6054 FAX: (765) 494-0558

= Instructions for joining and leaving this list and remarks about the problem of INAPPROPRIATE MESSAGES are available at http://jse.stat.ncsu.edu/ =
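[Editor's note: a minimal sketch of the sample-size point in the ESP example above - how many card draws would be needed to detect a small bias away from the null value P(correct) = 1/52. It uses the standard normal-approximation power formula; the significance level (0.05, one-sided), power (80%), and bias values are illustrative choices, not figures from the post.]

```python
# Approximate sample size to detect a bias of size delta away from
# p0 = 1/52 in a one-sided binomial test (normal approximation),
# at alpha = 0.05 (z = 1.645) with 80% power (z = 0.8416).
from math import sqrt

def n_required(p0, delta, z_alpha=1.645, z_beta=0.8416):
    """n such that a one-sided test of p0 vs p0 + delta has ~80% power."""
    p1 = p0 + delta
    # n = [ z_a*sd(p0) + z_b*sd(p1) ]^2 / delta^2, with sd(p) = sqrt(p(1-p))
    return ((z_alpha * sqrt(p0 * (1 - p0))
             + z_beta * sqrt(p1 * (1 - p1))) / delta) ** 2

p0 = 1 / 52
for delta in (0.005, 0.001, 0.0005):
    print(f"bias {delta:+.4f}: n = {n_required(p0, delta):,.0f}")
```

Biases of a few parts in a thousand push the required n into the hundreds of thousands, which is consistent with Rubin's "order of a million observations" remark.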
Re: The meaning of the p value
On 2 Feb 2001 01:12:59 -0800, [EMAIL PROTECTED] (Will Hopkins) wrote:

> I've been involved in off-list discussion with Duncan Murdoch. At one stage there I was about to retire in disgrace. But sighs of relief... his objection is Bayesian.

Just to clarify, I don't think this is a valid summary of what I said. What I said offline was just a longer version of what I said online in [EMAIL PROTECTED].

Duncan Murdoch
Re: The meaning of the p value
In article [EMAIL PROTECTED], Will Hopkins [EMAIL PROTECTED] wrote:

> I accept that there are unusual cases where the null hypothesis has a finite probability of being true, but I still can't see the point in hypothesizing a null, not in biomedical disciplines, anyway. If only we could replace the p value with a probability that the true effect is negative (or has the opposite sign to the observed effect). The easiest way would be to insist on one-tailed tests for everything. Then the p value would mean exactly that. An example of two wrongs making a right.

If you want to say something about the probability that a statement about the state of nature is true, it is necessary to start with a prior distribution. There is no controversy about the use of Bayes Theorem to get posterior distributions. But this has nothing to do with p values, except that more extreme values of one generally go with more extreme values of the other in a given experimental situation. It is not true across situations.

--
This address is for information only. I do not claim that these views are those of the Statistics Department or of Purdue University.
Herman Rubin, Dept. of Statistics, Purdue Univ., West Lafayette IN 47907-1399
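[Editor's note: a minimal sketch of the Bayes-theorem route Rubin describes. With a normal prior on the effect and a normal likelihood, the posterior is again normal, and "the probability that the true effect is negative" can be read off directly. All numbers (observed effect, standard error, prior) are invented for illustration.]

```python
# Conjugate normal-normal update: prior N(prior_mean, prior_sd^2),
# observed estimate xbar with standard error se.  The posterior
# probability that the true effect is negative is a normal tail area.
from math import sqrt, erf

def norm_cdf(x):
    return 0.5 * (1 + erf(x / sqrt(2)))

def posterior_prob_negative(xbar, se, prior_mean, prior_sd):
    """P(effect < 0 | data) under a conjugate normal-normal model."""
    w = 1 / se**2 + 1 / prior_sd**2          # posterior precision
    post_mean = (xbar / se**2 + prior_mean / prior_sd**2) / w
    post_sd = sqrt(1 / w)
    return norm_cdf(-post_mean / post_sd)

# Observed effect +2.0 with standard error 1.0, sceptical prior centred at 0:
print(posterior_prob_negative(2.0, 1.0, prior_mean=0.0, prior_sd=1.0))
# As the prior becomes very diffuse, this approaches the one-tailed
# p value Phi(-2) -- the special case discussed later in the thread:
print(posterior_prob_negative(2.0, 1.0, prior_mean=0.0, prior_sd=1000.0))
```

The second call illustrates Rubin's point: the one-tailed p value coincides with this posterior probability only in the flat-prior limit; any real prior information moves the answer away from it.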
Re: The meaning of the p value
Herman Rubin wrote and I marked up:

> There is no way to use the present p-value by itself correctly with additional data.

I think the "with" is a typographical error and that "without" was intended. I comment only because I like it and plan to use a modified version of it as "the law". Something along the lines of:

There is *no way* to use a P value by itself correctly. It must be accompanied by additional data.

I already say this, but not as succinctly.
Re: The meaning of the p value
In article [EMAIL PROTECTED], Will Hopkins [EMAIL PROTECTED] wrote:

> I've been involved in off-list discussion with Duncan Murdoch. At one stage there I was about to retire in disgrace. But sighs of relief... his objection is Bayesian. OK. The p value is a device to put in a publication to communicate something about precision of an estimate of an effect, under the assumption of no prior knowledge of the magnitude of the true value of the effect.

The p value does not communicate anything about the precision of anything by itself.

> If we assume no prior knowledge of the true value, then my claim stands: the p value for a one-tailed test is the probability of an opposite true effect--any true effect opposite in sign or impact to that observed.

This is likewise false. For a translation parameter with a uniform prior it is correct, but only in this too often assumed, but also unreasonable, situation. The use of this prior as meaning "no prior knowledge" may lead to reasonable actions, but for deciding whether the new or the old is better, the p-value to use becomes 0.50.

> I can't see how a Bayesian perspective dilutes or invalidates this interpretation. The same Bayesian perspective would make you re-evaluate the p value under its conventional interpretation.

There is no way to use the present p-value by itself correctly with additional data.

> In other words, if you have some other reason for believing that the true value has the same sign as the observed value, reduce the p value in your mind. Or if you believe it has opposite sign, increase it.

With composite hypotheses, one cannot simply use Bayes factors.

> If we are stuck with p values, then I believe we should start showing one-tailed p values, along with 95% confidence limits for the effect.

The only reason p values are used as they are is that they have become religion.

--
This address is for information only. I do not claim that these views are those of the Statistics Department or of Purdue University.
Herman Rubin, Dept. of Statistics, Purdue Univ., West Lafayette IN 47907-1399
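[Editor's note: a Monte Carlo sketch of Rubin's "translation parameter with a uniform prior" case. When theta has a (near-)flat prior and x ~ N(theta, 1), the posterior probability that theta has the opposite sign to an observed x equals the one-tailed p value Phi(-x). The observed value, prior range, and matching tolerance are illustrative choices.]

```python
# Draw theta from a wide flat prior, draw x ~ N(theta, 1), condition
# on x falling near the observed value, and count how often theta has
# the opposite sign.  This fraction should match the one-tailed p.
import random
from math import sqrt, erf

random.seed(1)
x_obs, tol = 1.5, 0.05
hits = opposite = 0
for _ in range(2_000_000):
    theta = random.uniform(-20, 20)    # flat prior over a wide range
    x = random.gauss(theta, 1.0)
    if abs(x - x_obs) < tol:           # keep draws with data near x_obs
        hits += 1
        opposite += theta < 0
p_one_tailed = 0.5 * (1 + erf(-x_obs / sqrt(2)))   # Phi(-1.5)
print(opposite / hits, p_one_tailed)
```

The two printed numbers agree (up to simulation noise), which is exactly why the flat-prior case makes the one-tailed interpretation look right; with any informative prior, the conditioned fraction would no longer equal the p value.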
Re: The meaning of the p value
In article [EMAIL PROTECTED], Jerry Dallal [EMAIL PROTECTED] wrote:

> Herman Rubin wrote and I marked up:
>> There is no way to use the present p-value by itself correctly with additional data.
>
> I think the "with" is a typographical error and that "without" was intended. I comment only because I like it and plan to use a modified version of it as "the law".

There are two ways of reading my statement. By itself, your interpretation is quite correct. But my intention was to consider what happens if additional experiments are available, in which case I do not know a reasonable way to use the p value with that further information as a summary of the data yielding the p value.

--
This address is for information only. I do not claim that these views are those of the Statistics Department or of Purdue University.
Herman Rubin, Dept. of Statistics, Purdue Univ., West Lafayette IN 47907-1399
Re: The meaning of the p value
Will Hopkins wrote:

> I accept that there are unusual cases where the null hypothesis has a finite probability of being true, but I still can't see the point in hypothesizing a null, not in biomedical disciplines, anyway. If only we could replace the p value with a probability that the true effect is negative (or has the opposite sign to the observed effect). The easiest way would be to insist on one-tailed tests for everything. Then the p value would mean exactly that. An example of two wrongs making a right.

No, a one-tailed test doesn't work; it is still computed using the null value. To find what you want you need Bayesian techniques... but then (if your prior distribution is valid) you can answer the question you *really* wanted to answer - "what is the probability that the effect exists?" Or even "what is the distribution of the parameter value?"

-Robert Dawson
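[Editor's note: a small numerical sketch of Dawson's distinction. The p value is computed under the null; it is not "the probability that the effect exists". With a prior that puts a point mass on the null (here 50/50, and effect ~ N(0, 1) under the alternative), Bayes' theorem gives a posterior probability for H0 that can be far larger than the p value. All priors and numbers are illustrative.]

```python
# For an observed z statistic, compare the two-sided p value with the
# posterior probability of H0 under a 50/50 point-mass prior.
from math import sqrt, exp, pi, erf

def normal_pdf(x, sd):
    return exp(-0.5 * (x / sd) ** 2) / (sd * sqrt(2 * pi))

z = 2.0                                    # observed z statistic
p_two_sided = 1 + erf(-z / sqrt(2))        # 2 * Phi(-z)
like_h0 = normal_pdf(z, 1.0)               # density of z under H0
like_h1 = normal_pdf(z, sqrt(2.0))         # marginal under H1: var = 1 + 1
post_h0 = like_h0 / (like_h0 + like_h1)    # posterior with 50/50 prior odds
print(f"p = {p_two_sided:.3f}, posterior P(H0) = {post_h0:.3f}")
```

A "significant" z of 2 gives p < 0.05, yet under this prior the null retains a posterior probability of roughly a third: the p value and "probability the effect is absent" are different quantities.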
Re: The meaning of the p value
I've been involved in off-list discussion with Duncan Murdoch. At one stage there I was about to retire in disgrace. But sighs of relief... his objection is Bayesian. OK.

The p value is a device to put in a publication to communicate something about precision of an estimate of an effect, under the assumption of no prior knowledge of the magnitude of the true value of the effect. If we assume no prior knowledge of the true value, then my claim stands: the p value for a one-tailed test is the probability of an opposite true effect--any true effect opposite in sign or impact to that observed.

I can't see how a Bayesian perspective dilutes or invalidates this interpretation. The same Bayesian perspective would make you re-evaluate the p value under its conventional interpretation. In other words, if you have some other reason for believing that the true value has the same sign as the observed value, reduce the p value in your mind. Or if you believe it has opposite sign, increase it.

If we are stuck with p values, then I believe we should start showing one-tailed p values, along with 95% confidence limits for the effect. Both of these are far easier to understand than hypothesis testing and statistical significance. Put a note in the Methods saying something like: "The p values, which were all derived from one-tailed tests, represent the probability that the true value of the effect is opposite in sign (correlations; differences or changes in means) or impact (relative risks, odds ratios) to that observed."

Will
Re: The meaning of the p value
In article p04330120b69d3c8cd6ea@[139.80.121.126], Will Hopkins [EMAIL PROTECTED] wrote:

> At 4:17 PM -0600 30/1/01, Jay Warner wrote:
>> A technically correct conclusion is: The sample of 100 has a mu different than 100. There is a 0.08 probability (or 0.02, or 0.008) that this statement is false. Have I not said the same thing? As p gets small, we are more confident that the null hypothesis is not valid.
>
> I haven't followed this thread closely, but I would like to state the only valid and useful interpretation of the p value that I know. If you observe a positive effect, then p/2 is the probability that the true value of the effect is negative. Equivalently, 1-p/2 is the probability that the true value is positive.

This is true in the translation parameter case if one has a uniform prior. This is not always justifiable; one might think that there is a reasonable possibility that the null hypothesis is close to being correct. In that case, the statement is wrong.

> The probability that the null hypothesis is true is exactly 0. The probability that it is false is exactly 1.

I know of no "real" situation in which the null hypothesis, as stated in connection with the distribution of observations, could be correct.

> Estimation is the name of the game. Hypothesis testing belongs in another century--the 20th. Unless, that is, you base hypotheses not on the null effect but on trivial effects...
>
> Will

This is an important problem, and can only be handled by decision-theoretic methods. Are there any papers on this in addition to mine in the First Purdue Symposium (1971)? There is a general result here, but it is not what one usually expects. If the region where one should accept the null is small compared to the precision of the usual estimator, one can treat this as a point null, but should not fix a p value, but rather let the p value be determined by the loss and LOCAL prior. See my paper with Sethuraman in 1965 for the large-sample treatment of this.

If the region is much larger than the usual confidence interval, just see if the usual estimate is in the region. In between, detailed consideration of the prior assumptions makes a difference.

--
This address is for information only. I do not claim that these views are those of the Statistics Department or of Purdue University.
Herman Rubin, Dept. of Statistics, Purdue Univ., West Lafayette IN 47907-1399
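[Editor's note: a rough sketch of the rule of thumb Rubin outlines for a region of trivial effects [-d, d]: compare its half-width d with the standard error of the estimate. The function name and the numeric thresholds (5x and 0.2x the standard error) are hypothetical choices for illustration, not values from the post.]

```python
# Classify a "region of triviality" [-d, d] relative to the standard
# error of the estimator, following the three cases in the post:
# region >> CI: just check whether the estimate is inside it;
# region << estimator precision: behaves like a point null;
# in between: no shortcut, prior assumptions matter.
def trivial_effect_advice(estimate, se, d):
    if d > 5 * se:                       # region much wider than the CI
        inside = abs(estimate) < d
        return f"region is wide: estimate is {'trivial' if inside else 'non-trivial'}"
    if d < 0.2 * se:                     # region much narrower than precision
        return "region is narrow: treat as a point null"
    return "in between: prior assumptions matter; no shortcut"

print(trivial_effect_advice(estimate=0.3, se=0.1, d=1.0))
print(trivial_effect_advice(estimate=0.3, se=0.1, d=0.01))
print(trivial_effect_advice(estimate=0.3, se=0.1, d=0.1))
```

The middle case is the hard one: there, as Rubin says, the detailed prior (and loss) assumptions drive the answer, and no fixed p value does the job.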
Re: The meaning of the p value
On 30 Jan 2001, Will Hopkins wrote:

> I haven't followed this thread closely, but I would like to state the only valid and useful interpretation of the p value that I know. If you observe a positive effect, then p/2 is the probability that the true value of the effect is negative. Equivalently, 1-p/2 is the probability that the true value is positive. The probability that the null hypothesis is true is exactly 0. The probability that it is false is exactly 1.

Suppose you were conducting a test with someone who claimed to have ESP, such that they were able to predict accurately which card would be turned up next from a well-shuffled deck of cards. The null hypothesis, I think, would be that the person does not have ESP. Is this null false?

And what about when one has a one-tailed alternative hypothesis, e.g., mu > 100? In this case, the null covers a whole range of values (mu <= 100). Is this null false? In such a case, one still uses the point null (mu = 100) for testing, because it is the most extreme case. If you can reject the point null of mu = 100, you will certainly be able to reject the null if mu is actually some value less than 100. But the point is, the null can be true.

With a two-tailed alternative, the point null may not be true, but as one of the regulars in these newsgroups often points out, we don't know the direction of the difference. So again, it makes sense to use the point null for testing purposes.

> Estimation is the name of the game. Hypothesis testing belongs in another century--the 20th. Unless, that is, you base hypotheses not on the null effect but on trivial effects...

Bob Frick has a paper with some interesting comments on this in the context of experimental psychology. In that context, he argues, models that make "ordinal" predictions are more useful than ones that try to estimate effect sizes, and certainly more generalizable. (An ordinal prediction is something like: performance will be impaired in condition B relative to condition A. Impairment might be indicated by slower responding and more errors, for example.)

A lot of cognitive psychologists use reaction time (RT) as their primary dependent variable. But note that they are NOT primarily interested in explaining all (or as much as they can) of the variation in reaction time. RT is just a tool they use to make inferences about some underlying construct that really interests them. Usually, they are trying to test some theory which leads them to expect slower responding in one condition relative to another--such as slower responding when distractors are present compared to when only a target item appears. The difference between these conditions almost certainly will explain next to none of the overall variation in RT, so eta-squared and omega-squared measures will not be very impressive looking. But that's fine, because the whole point is to test the ordinal prediction of the theory--not to explain all of the variation in RT.

If one was able to measure the underlying construct directly, THEN it might make some sense to try estimating parameters. But with indirect measurements like RT, I think Frick's recommended approach is a better one. There's my two cents.

--
Bruce Weaver
New e-mail: [EMAIL PROTECTED] (formerly [EMAIL PROTECTED])
Homepage: http://www.angelfire.com/wv/bwhomedir/
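[Editor's note: a small numerical sketch of Weaver's point that the boundary value mu = 100 is the "most extreme" case of the composite null mu <= 100. For a one-sided z test, the p value P(Xbar >= observed | mu) is largest at the boundary, so rejecting there rejects every mu < 100 as well. The sample mean, sigma, and n below are invented for illustration.]

```python
# One-sided z test of H0: mu <= 100 vs H1: mu > 100.  Compute the
# p value of a fixed observed mean under several values of mu inside
# the null region; it is maximised at the boundary mu = 100.
from math import sqrt, erf

def norm_sf(x):
    """Upper-tail probability of the standard normal, 1 - Phi(x)."""
    return 0.5 * (1 - erf(x / sqrt(2)))

xbar, sigma, n = 102.0, 10.0, 50
se = sigma / sqrt(n)
p_values = {mu: norm_sf((xbar - mu) / se) for mu in (90, 95, 99, 100)}
for mu, p in p_values.items():
    print(f"mu = {mu}: p = {p:.4g}")
```

Because the p value is largest at mu = 100, controlling the error rate at the boundary controls it over the whole composite null, which is why the point null suffices for the one-sided test.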
Re: The meaning of the p value
Bruce Weaver wrote:

> Suppose you were conducting a test with someone who claimed to have ESP, such that they were able to predict accurately which card would be turned up next from a well-shuffled deck of cards. The null hypothesis, I think, would be that the person does not have ESP. Is this null false?

Technically, the null hypothesis is that P(card is predicted correctly) = 1/52 - it is a statement about parameter values. Thus any bias affecting this, no matter how slight, would make Ho false - whether the subject had ESP or not. For instance, if the shuffling method tended to make a card slightly less likely to come up twice in a row than one would expect, *even by a few parts in a million*, and if the subject avoided such guesses, then Ho would indeed be false.

-Robert Dawson
Re: The meaning of the p value
Will Hopkins wrote:

> I haven't followed this thread closely, but I would like to state the only valid and useful interpretation of the p value that I know. If you observe a positive effect, then p/2 is the probability that the true value of the effect is negative. Equivalently, 1-p/2 is the probability that the true value is positive. The probability that the null hypothesis is true is exactly 0. The probability that it is false is exactly 1. Estimation is the name of the game. Hypothesis testing belongs in another century--the 20th. Unless, that is, you base hypotheses not on the null effect but on trivial effects...

With respect, Will, this is a very limited view of statistics in general and hypothesis testing in particular. One of the features of this view is that you think in terms of 'true values' rather than models. A null hypothesis is not 'true' - it may or may not be 'valid' in the sense that using it enables reasonable predictions. The same comment can be made of any scientific theory. In what sense is Relativity 'true'? But it enables reasonable predictions.

Estimation is obviously important - but hypothesis testing, properly considered, is also essential.

Regards,
Alan

Alan McLean ([EMAIL PROTECTED])
Department of Econometrics and Business Statistics
Monash University, Caulfield Campus, Melbourne
Tel: +61 03 9903 2102  Fax: +61 03 9903 2007