Re: p value
Hi

On 2 Nov 2001, Donald Burrill wrote: On Fri, 2 Nov 2001, jim clark wrote:

I would hate to resurrect a debate from sometime in the past year, but the chi-squared is a non-directional (commonly referred to as two-tailed) test, although it is true that you only consider one end (tail) of the distribution.

Surely this depends on WHICH chi^2 test one is discussing. What you write is true (if, in my view, misleading) about SOME chi^2 tests, but not about ALL chi^2 tests.

Don is correct. I (perhaps wrongly) took "standard chi-square", which the original post referred to, as chi-square for a contingency table, the most common use in the psychological literature that I am familiar with. I'm not sure in what sense what I wrote was misleading (unless this IS going to resurrect the earlier debate).

Just as the upper end of the F distribution contains both tails of the t (it is t folded over),

This is not strictly true; and to the extent that it IS true, it is true only of the F distribution with 1 and k degrees of freedom, which can be argued to be a kind of folded version of the t distribution with k degrees of freedom. "Folded over" I consider misleading, because it suggests that you could see the shape of the distribution by taking a standard central t distribution and mirroring it about zero. But in fact the shape of the distribution changes, as well as the folding: values less than one are systematically shifted toward zero (and the shift is greater the further the value is from 1), while values greater than one are systematically shifted toward infinity (and the shift is greater the further the value is from 1). Thus the SHAPE of the F distribution (with 1 and k degrees of freedom) is distinctly different from the shape you'd get by merely creating a mirror image around zero.

My cryptic "both tails" was meant to refer to the probabilities and not to the details of the shape of the distribution, as Don mentioned.
And perhaps it was presumptuous not to say that this held only when the numerator df was 1, but this is a statistics newsgroup.

the chi^2 contains both ends of the z (normal) distribution (i.e., z is folded over).

And, correspondingly, this is true only for chi^2 with 1 degree of freedom, and subject to the same reservations about shape as those mentioned above with respect to the F distribution.

Same comments as above. Best wishes, Jim

James M. Clark                 (204) 786-9757
Department of Psychology       (204) 774-4134 Fax
University of Winnipeg         4L05D
Winnipeg, Manitoba R3B 2E9     [EMAIL PROTECTED]
CANADA                         http://www.uwinnipeg.ca/~clark

= Instructions for joining and leaving this list and remarks about the problem of INAPPROPRIATE MESSAGES are available at http://jse.stat.ncsu.edu/ =
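The "folded z" point for chi^2 with 1 df can be checked numerically: the two-tailed area of the standard normal beyond +/-z equals the single upper-tail area of chi-square(1) beyond z^2. A minimal sketch using only the Python standard library (the function names are mine, added for illustration):

```python
# A numerical check (not from the posts): both tails of z fold into the
# one upper tail of chi-square with 1 df, since chi-square(1) = Z^2.
from statistics import NormalDist

def two_tailed_z(z):
    """P(|Z| > z) for a standard normal Z."""
    return 2 * (1 - NormalDist().cdf(z))

def upper_tail_chisq1(x):
    """P(X > x) for X ~ chi-square(1), computed via X = Z^2."""
    return 2 * (1 - NormalDist().cdf(x ** 0.5))

z = 1.96
print(two_tailed_z(z))            # about .05
print(upper_tail_chisq1(z ** 2))  # the same area, now all in one tail
```

So a two-tailed z test at alpha = .05 and a chi-square(1) test with critical value 1.96^2 = 3.84 are the same test, which is the probability sense in which the folding claim holds.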
Re: p value
[EMAIL PROTECTED] (dennis roberts) wrote:

most software will compute p values (say for a typical two sample t test of means) by taking the obtained t test statistic ... making it both + and - ... finding the two end tail areas in the relevant t distribution ... and report that as p

for example ... what if we have output like:

          N   Mean  StDev  SE Mean
exp      20  30.80   5.20      1.2
cont     20  27.84   3.95     0.88

Difference = mu exp - mu cont
Estimate for difference: 2.95
95% CI for difference: (-0.01, 5.92)
T-Test of difference = 0 (vs not =): T-Value = 2.02  P-Value = 0.051  DF = 35

for 35 df ... minitab finds the areas beyond -2.02 and +2.02 ... adds them together ... and this value in the present case is .051

now, traditionally, we would retain the null with this p value ... and, we generally say that the p value means ... this is the probability of obtaining a result (like we got) IF the null were true

but, the result WE got was finding a mean difference in FAVOR of the exp group ... however, the p value does NOT mean that the probability of finding a difference IN FAVOR of the exp group ... if the null were true ... is .051 ... right? since the p value has been calculated based on BOTH ends of the t distribution ... it includes both extremes where the exp is better than the control ... AND where the cont is better than the exp

thus, would it be fair to say that ... it is NOT correct to say that the p value (as traditionally calculated) represents the probability of finding a result LIKE WE FOUND ... if the null were true? that p would be 1/2 of what is calculated

this brings up another point ... in the above case ... typically we would retain the null ... but, the p of finding the result LIKE WE DID ... if the null were true ... is only 1/2 of .051 ... less than the alpha of .05 that we have used ... thus ... what alpha are we really using when we do this?

this is just a query about my continuing concern of what useful information p values give us ...
and, if the p value provides NO (given the results we see) information as to the direction of the effect ... then, again ... all it suggests to us (as p gets smaller) is that the null is more likely not to be true ... given that it might not be true in either direction from the null ... how is this really helping us when we are interested in the treatment effect? [given that we have the direction of the results AND the p value ... nothing else]

I fail to see the problem. If the researcher has a priori expectations about the *direction* of the effect, he should use a one-sided significance test. That's what they are for, aren't they?

Chris
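Dennis's claim that the reported p pools both tails, so the directional probability is about half of it, can be checked with a small simulation under the null. This is a sketch of mine (assuming normal populations, equal n = 20, and the pooled-variance t as in the Minitab example), not anything from the thread:

```python
# Monte Carlo sketch (assumptions: normal data, equal n and variance):
# under the null, |t| >= 2.02 happens about 5% of the time, but
# t >= 2.02 in FAVOR of the exp group happens only about half as often.
import random
from statistics import mean, stdev

random.seed(1)

def t_stat(x, y):
    n = len(x)
    sp2 = (stdev(x) ** 2 + stdev(y) ** 2) / 2   # pooled variance (equal n)
    return (mean(x) - mean(y)) / (2 * sp2 / n) ** 0.5

t_obs, n, reps = 2.02, 20, 20000
both = favor = 0
for _ in range(reps):
    x = [random.gauss(0, 1) for _ in range(n)]  # "exp" drawn under the null
    y = [random.gauss(0, 1) for _ in range(n)]  # "cont" drawn under the null
    t = t_stat(x, y)
    both += abs(t) >= t_obs
    favor += t >= t_obs                         # extreme AND in the exp direction
print(both / reps, favor / reps)                # second fraction is roughly half the first
```

By the symmetry of the null distribution, the in-favor fraction converges to half the two-tailed one, which is exactly the halving Dennis describes.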
RE: p value
Dennis wrote: it is NOT correct to say that the p value (as traditionally calculated) represents the probability of finding a result LIKE WE FOUND ... if the null were true? that p would be ½ of what is calculated.

Jones and Tukey (A sensible formulation of the significance test, Psychological Methods, 2000, 5, 411-414) recently suggested that the p which should be reported is the area of the t distribution more positive or more negative (but not both) than the value of t obtained, just as Dennis suggests in his post.

~~~
Karl L. Wuensch, Department of Psychology, East Carolina University, Greenville NC 27858-4353
Voice: 252-328-4102  Fax: 252-328-6283
mailto:[EMAIL PROTECTED]
http://core.ecu.edu/psyc/wuenschk/klw.htm
RE: p value
At 05:06 PM 11/2/01 -0500, Wuensch, Karl L wrote:

Dennis wrote: it is NOT correct to say that the p value (as traditionally calculated) represents the probability of finding a result LIKE WE FOUND ... if the null were true? that p would be ½ of what is calculated.

Jones and Tukey (A sensible formulation of the significance test, Psychological Methods, 2000, 5, 411-414) recently suggested that the p which should be reported is the area of the t distribution more positive or more negative (but not both) than the value of t obtained, just as Dennis suggests in his post.

i would not disagree with this ... but, we have to realize that software (most i think) does NOT do it that way ...

if we did adopt the position of just reporting the p beyond the point you got ... either to the right side or left side but not both ... then, what will we use as the cut value ... .025??? or ... .05 as a 1 tail test? for rejecting the null? we certainly will have a problem continuing to say that we set alpha at .05 ... in the usual two tailed sense

==
dennis roberts, penn state university
educational psychology, 8148632401
http://roberts.ed.psu.edu/users/droberts/drober~1.htm
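The bookkeeping Dennis is worried about is just arithmetic, and worth making explicit: if we report the one-sided p (half the two-tailed value) but keep rejecting below .05, the two-tailed test we are implicitly running uses alpha = .10. A sketch using the numbers from the earlier Minitab output:

```python
# Arithmetic sketch of the alpha question: halving the reported p while
# keeping the .05 cut doubles the effective two-tailed alpha to .10.
p_two = 0.051             # two-tailed p from the Minitab output above
p_one = p_two / 2         # area beyond +2.02 only, the observed direction
print(round(p_one, 4))    # 0.0255
print(p_one < 0.05)       # True: now "significant" at the nominal .05
print(p_two < 2 * 0.05)   # True: i.e., a two-tailed test run at alpha = .10
```

Which is why the usual convention, if a one-sided p is reported, is to compare it against .025 rather than .05 when one wants to preserve the familiar two-tailed error rate.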
Re: p value
Dennis Roberts [EMAIL PROTECTED] wrote in message news:[EMAIL PROTECTED]...

let's say that you do a simple (well executed) 2 group study ... treatment/control ... and, are interested in the mean difference ... and find that a simple t test shows a p value (with mean in favor of treatment) of .009 ... while it generally seems to be held that such a p value would suggest that our null model is not likely to be correct (ie, some other alternative model might make more sense), does it say ANYthing more than that?

You could use it in conjunction with your sample/group sizes to get an idea of effect size. For example, if you got that p-value with group sizes of 40 that could be a very interesting result. However, if each group contained 100,000 subjects it may not be so interesting because the effect size will be so much smaller.

cheers
Michelle
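Michelle's point can be made concrete: for a two-sample t test with equal group sizes n, the standardized effect size can be recovered approximately as d = t * sqrt(2/n), so the same t value (and hence essentially the same p) implies a vanishing effect as n grows. A sketch of mine, where the t of 2.66 is an illustrative stand-in for a value giving p near .009:

```python
# Sketch (not from the post): d = t * sqrt(2/n) for equal groups of
# size n.  Holding t (hence p) fixed, the implied effect size shrinks
# as the groups grow.
import math

t = 2.66                              # illustrative t giving p near .009
for n in (40, 1000, 100000):
    d = t * math.sqrt(2 / n)
    print(f"n per group = {n:>6}:  d = {d:.4f}")
```

With 40 per group the implied d is around 0.6, a respectable effect; with 100,000 per group it is nearer 0.01, which is Michelle's "not so interesting" case.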
Re: p value
In article IOet7.11245$[EMAIL PROTECTED], Magenta [EMAIL PROTECTED] wrote:

Dennis Roberts wrote: let's say that you do a simple (well executed) 2 group study ... treatment/control ... and, are interested in the mean difference ... and find that a simple t test shows a p value (with mean in favor of treatment) of .009 ... while it generally seems to be held that such a p value would suggest that our null model is not likely to be correct (ie, some other alternative model might make more sense), does it say ANYthing more than that?

You could use it in conjunction with your sample/group sizes to get an idea of effect size. For example, if you got that p-value with group sizes of 40 that could be a very interesting result. However, if each group contained 100,000 subjects it may not be so interesting because the effect size will be so much smaller.

What should be done is to give the likelihood function, which contains the relevant information. One can carry out a simple calculation to show that the idea of a nearly constant p value is WRONG. Feel free to change the model and weights; the results will be somewhat similar, and this one is easy to calculate without using numerical methods.

Suppose that one wishes to test that the mean \mu of a distribution is 0. The importance of rejecting the hypothesis if it is true is one; the importance of accepting the hypothesis if it is false, and \mu lies in a set of area A, is A/(2\pi). Let the data be summarized by a normal vector with mean \mu and covariance matrix vI. Then it can be shown that the optimal procedure is to use a p value of v, assuming v < 1. If v >= 1, just reject.

--
This address is for information only. I do not claim that these views are those of the Statistics Department or of Purdue University.
Herman Rubin, Dept. of Statistics, Purdue Univ., West Lafayette IN 47907-1399
[EMAIL PROTECTED]  Phone: (765)494-6054  FAX: (765)494-0558
Re: p value
In article [EMAIL PROTECTED], Dennis Roberts [EMAIL PROTECTED] wrote:

let's say that you do a simple (well executed) 2 group study ... treatment/control ... and, are interested in the mean difference ... and find that a simple t test shows a p value (with mean in favor of treatment) of .009 ... while it generally seems to be held that such a p value would suggest that our null model is not likely to be correct (ie, some other alternative model might make more sense), does it say ANYthing more than that? specifically, does the p value in and of itself impute ANY information about the non null possibilities being in the direction favoring the treatment group? or, just that the null model is not very plausible ... bottom line: is there any value added information imparted from the p value other than a statement about the null?

Does it even state that? The posterior odds ratio for a symmetric prior on the mean of a normal random variable over that of the null is less than 4 for a p value of .05.

--
Herman Rubin, Dept. of Statistics, Purdue Univ., West Lafayette IN 47907-1399
[EMAIL PROTECTED]  Phone: (765)494-6054  FAX: (765)494-0558
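Rubin's "less than 4" can be checked numerically for the simplest symmetric case: a two-point prior at +/-mu on the mean of a single standard normal observation z = 1.96 (two-tailed p = .05). The averaged likelihood ratio against the null works out to exp(-mu^2/2) * cosh(z*mu), and even at the most favorable mu it stays under 4. This is my sketch of that calculation, not Rubin's own derivation:

```python
# Numerical check (my sketch, not Rubin's derivation): best posterior
# odds any symmetric two-point prior +/-mu on the mean can achieve
# against mu = 0 (with even prior odds), given one standard normal
# observation z = 1.96:
#   LR(mu) = [N(z; mu, 1) + N(z; -mu, 1)] / (2 * N(z; 0, 1))
#          = exp(-mu**2 / 2) * cosh(z * mu)
import math

z = 1.96
best = max(math.exp(-m * m / 2) * math.cosh(z * m)
           for m in (i / 1000 for i in range(5001)))  # mu on a grid over [0, 5]
print(best < 4)   # True: even the most favorable alternative gives odds under 4
```

So a "significant" p of .05 corresponds, at best, to odds of roughly 3.5 to 1 against the null under this prior, far weaker evidence than the 19-to-1 flavor that ".05" suggests to students.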
Re: p value
My opinion, FWIW: The answer to your question in a strict fashion, assuming the experiment is well designed, depends to a large extent on your a priori null hypothesis and how you performed the statistical test.

In this case, presuming that you used a two-sided p value and that you established 0.05 as your p value threshold for accepting/rejecting the null hypothesis, then the p value of 0.009 indicates that the null hypothesis was rejected. Using a two-sided p value, your a priori null hypothesis would be that there is no difference between the means in either direction. Thus this p value, strictly speaking, cannot tell you anything about the direction of the difference.

If you established an a priori null hypothesis that was direction specific, then you would have to use a one-sided p value to accept or reject that directional null hypothesis. The problem with this approach is that the difference may be contextually significant in the direction opposite the one that your hypothesis is based upon, and you may find yourself in a position of strictly having to accept the null hypothesis, since the one-sided p value may be > 0.05 in that case. It is for the latter issue that many folks will not use a one-sided p value in such situations, unless of course a difference in the opposite direction of the null hypothesis is of no consequence. For example, unless a new treatment is better than the current gold standard, you don't care. On the other hand, you may (or should) care if the new treatment is worse than the current gold standard.

The danger, in a strict experimental fashion, is to perform the analysis on the data, determine the direction of the difference, and then apply the appropriate one-sided test to the data. I have seen this done by others. This is a violation of the basic experimental process, since you already have performed the analysis and have defined a result-based null hypothesis. In my mind, bad form.
The null hypothesis should be defined before you know anything about the data.

--
Marc Schwartz
To Reply Remove NOSPAM in E-Mail Address

Dennis Roberts [EMAIL PROTECTED] wrote in message news:[EMAIL PROTECTED]...

let's say that you do a simple (well executed) 2 group study ... treatment/control ... and, are interested in the mean difference ... and find that a simple t test shows a p value (with mean in favor of treatment) of .009 ... while it generally seems to be held that such a p value would suggest that our null model is not likely to be correct (ie, some other alternative model might make more sense), does it say ANYthing more than that? specifically, does the p value in and of itself impute ANY information about the non null possibilities being in the direction favoring the treatment group? or, just that the null model is not very plausible ... bottom line: is there any value added information imparted from the p value other than a statement about the null?
Re: p-value of one-tailed test
Thanks for your response. All of you are really helpful. Erik
Re: p-value of one-tailed test
if you are talking about a t test for means ... most software would automatically give a two tailed p value ... unless you specify otherwise (which software usually will let you do) ... here is the typical example

Two-sample T for C1 vs C2

       N   Mean  StDev  SE Mean
C1    10  25.70   2.87     0.91
C2    10  27.50   3.66      1.2

Difference = mu C1 - mu C2
Estimate for difference: -1.80
95% CI for difference: (-4.90, 1.30)
T-Test of difference = 0 (vs not =): T-Value = -1.22  P-Value = 0.238

when ns are 10 for each ... df would be 18 for the two sample t (approximately) ...

[Minitab character dotplot of a t distribution with df = 18, axis running from about -3.0 to 4.5, omitted]

the p value of .238 is figured in the following way: from 0 ... go to the negative side to -1.22 ... and also from 0 to the right side to +1.22 ... and find the area BELOW -1.22 and ABOVE +1.22 ... this is the p value of .238 that gets printed out ... two tails ...

At 11:25 AM 4/4/01 -0500, auda wrote:

Hi, What is the p-value of a t-statistic significant (significance level shown by the software is p) in the wrong direction in a one-tailed test? Should we modify it to (1-p)? Or is it just p? Erik

_
dennis roberts, educational psychology, penn state university
208 cedar, AC 8148632401, mailto:[EMAIL PROTECTED]
http://roberts.ed.psu.edu/users/droberts/drober~1.htm
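On Erik's actual question: when the statistic lands in the wrong tail for a one-tailed test, the one-tailed p for the stated direction is 1 minus half the two-tailed p, not the two-tailed p and not its half. A sketch of mine, using a normal in place of the t with 18 df (the Python standard library has no t distribution, and the symmetry argument is identical):

```python
# Sketch (normal approximation standing in for t with df = 18): for an
# upper-tailed test H1: mu1 > mu2, the one-tailed p when t_obs lands in
# the WRONG (negative) tail equals 1 - (two-tailed p)/2, by symmetry.
from statistics import NormalDist

t_obs = -1.22                       # the wrong-direction value from the output above
cdf = NormalDist().cdf
p_two = 2 * (1 - cdf(abs(t_obs)))   # what most software reports by default
p_one = 1 - cdf(t_obs)              # upper-tail area for the stated direction
print(p_two, p_one)                 # p_one is well above .5, nowhere near significance
```

So if the software only shows the two-tailed p, the wrong-direction one-tailed p is 1 - p/2; if it shows a one-tailed p for the observed direction, the answer is 1 minus that value.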
p-value language (was: Re: p value quibble ... ala d burrill)
I've taken the liberty of copying this to the edstat list, and therefore have quoted the original posting in full, despite having (at the moment) a comment on only one part of it. -- DFB.

On Tue, 29 Aug 2000, Paul Dudgeon wrote:

Somewhat tangential to the discussion last week about p values, I'd be interested in any comments on the following: I find one of the hardest aspects of teaching statistical inference to students is the linguistic contortions that can arise in moving from a strict formal definition of what an obtained p value means in NHST to the kind of informal, but easier to write/read, descriptive interpretation that is typically given, say, in journal articles. There are numerous instances in the literature of where even the highly regarded (e.g., Cohen) have come unstuck in trying to express the meaning of p(Data | Ho = True) in more everyday English.

I thought that presenting students with a range of what are both acceptable/correct and unacceptable/incorrect interpretations might assist in making their understanding clearer (I have several of my own, but I'm sure they are by no means exhaustive of what is possible). So, I'd be grateful to know what list members think are: (a) unacceptable/incorrect, and (b) acceptable/correct ways of more informally describing (i) significant (i.e., say p < .05), and (ii) non-significant p values from an analysis like a t-test.

What I have in mind, for instance, is if we found p = .42, then "we have no strong evidence to reject the assumption that the mean scores of the two groups differ" is OK, but "the results demonstrate the two means are the same" is not OK, because this could be interpreted as implying that the obtained p = p(Ho = True | Data)

Well, not this so much as because the assertion "the two means are the same" could (should?)
be interpreted as implying that "the probability of a Type II error (against a minimal useful difference, aka MUD) is acceptably low" (when in fact "p = .42" does not of itself imply ANYTHING about pr{Type II error} or, equivalently, about power). etc.

To my mind, statements like "the results are (not) statistically significant at the .05 level" seem quite vacuous to most students and provide little insight into what is really going on.

I hope what I'm after is clear from the above. Thanks for any contributions (either public or private); if there's reasonable interest, I'll post a summary back to the list. Best wishes, Paul Dudgeon

-
Donald F. Burrill                       [EMAIL PROTECTED]
348 Hyde Hall, Plymouth State College,  [EMAIL PROTECTED]
MSC #29, Plymouth, NH 03264             603-535-2597
184 Nashua Road, Bedford, NH 03110      603-471-7128