Re: Interpreting p-value = .99
Jerry Dallal <[EMAIL PROTECTED]> wrote in sci.stat.edu:

> "Robert J. MacG. Dawson" wrote:
>
>>> But I don't see why either the advertiser or the consumer advocate
>>> would, or should, do a two-tailed test.
>>
>> The idea is that the "product" of these tests is a p-value to be used
>> in support of an argument. The evidence for the proposal is not made any
>> stronger by the tester's wish for a certain outcome; so the tester
>> should not artificially halve the reported p-value.
>>
>> Superficially, the idea of halving your p-values, doubling your chance
>> of reporting a "statistically significant" result in your favored
>> direction if there is really nothing there, and as a bonus, doing a
>> David-and-Uriah job ("And he wrote in the letter, saying, Set ye Uriah
>> in the forefront of the hottest battle, and retire ye from him, that he
>> may be smitten, and die.") on any possible finding in the other
>> direction, may seem attractive. A moment's thought should persuade us
>> that it is not ethical.
>>
>> -Robert Dawson
>
> I'm not sure I understand the argument,

Oh good -- I thought it was just me!

--
Stan Brown, Oak Road Systems, Cortland County, New York, USA
http://oakroadsystems.com
My reply address is correct as is. The courtesy of providing a correct
reply address is more important to me than time spent deleting spam.

=
Instructions for joining and leaving this list and remarks about
the problem of INAPPROPRIATE MESSAGES are available at
http://jse.stat.ncsu.edu/
=
Re: Interpreting p-value = .99
"Robert J. MacG. Dawson" wrote:
>
>> But I don't see why either the advertiser or the consumer advocate
>> would, or should, do a two-tailed test.
>
> The idea is that the "product" of these tests is a p-value to be used
> in support of an argument. The evidence for the proposal is not made any
> stronger by the tester's wish for a certain outcome; so the tester
> should not artificially halve the reported p-value.
>
> Superficially, the idea of halving your p-values, doubling your chance
> of reporting a "statistically significant" result in your favored
> direction if there is really nothing there, and as a bonus, doing a
> David-and-Uriah job ("And he wrote in the letter, saying, Set ye Uriah
> in the forefront of the hottest battle, and retire ye from him, that he
> may be smitten, and die.") on any possible finding in the other
> direction, may seem attractive. A moment's thought should persuade us
> that it is not ethical.
>
> -Robert Dawson

I'm not sure I understand the argument, but it may be irrelevant regardless. Most consumer protection laws are written to punish the "instance" and have nothing to do with "statistics" in general or means in particular. This protects you from my friend the shopkeeper who puts 1 lb in your 5 lb bag of sugar and 9.1 lbs in mine.
Re: Interpreting p-value = .99
Stan Brown wrote:
> I see why the quality controller would want to do a two-tailed test:
> the product should not be outside manufacturing parameters in either
> direction. (Presumably the QC person would be testing the pills
> themselves, not patients taking the pills.)

Actually, the quality controller's test is a slight misnomer here, because we aren't talking in this problem (as you more or less observed) about standard QC methodology. Standard QC doctrine, from what I hear, generally goes for repeatability, and "better than specified" is *not* good. ("So, how did you do in the QC Methods exam?" "My score was three sigmas above the class average... so the prof failed me.")

The question dealt with a situation, though, in which only one direction of deviation is bad. Thus, the test might legitimately be one-sided. The reason is that the alpha value represents the risk of unnecessarily stopping the production line, reprinting the labels, or whatever. You *don't* need to do this if the product works better than advertised, so a one-sided alpha really is the risk of doing it unnecessarily.

> But I don't see why either the advertiser or the consumer advocate
> would, or should, do a two-tailed test.

The idea is that the "product" of these tests is a p-value to be used in support of an argument. The evidence for the proposal is not made any stronger by the tester's wish for a certain outcome; so the tester should not artificially halve the reported p-value.

Superficially, the idea of halving your p-values, doubling your chance of reporting a "statistically significant" result in your favored direction if there is really nothing there, and as a bonus, doing a David-and-Uriah job ("And he wrote in the letter, saying, Set ye Uriah in the forefront of the hottest battle, and retire ye from him, that he may be smitten, and die.") on any possible finding in the other direction, may seem attractive. A moment's thought should persuade us that it is not ethical.

-Robert Dawson
Re: Interpreting p-value = .99
On Sat, 1 Dec 2001 08:20:45 -0500, [EMAIL PROTECTED] (Stan Brown) wrote:

> [cc'd to previous poster]
>
> Rich Ulrich <[EMAIL PROTECTED]> wrote in sci.stat.edu:
>> I think I could not blame students for floundering about on this one.
>>
>> On Thu, 29 Nov 2001 14:39:35 -0500, [EMAIL PROTECTED] (Stan Brown)
>> wrote:
>>> "The manufacturer of a patent medicine claims that it is 90%
>>> effective(*) in relieving an allergy for a period of 8 hours. In a
>>> sample of 200 people who had the allergy, the medicine provided
>>> relief for 170 people. Determine whether the manufacturer's claim
>>> was legitimate, to the 0.01 significance level."
>>
>> I have never asked that as a question in statistics, and
>> it does not have an automatic, idiomatic translation to what I ask.
>
> How would you have phrased the question, then? Though I took this
> one from a book, I'm always looking to improve the phrasing of
> questions I set in quizzes and exams.

[ snip, rest ]

"In a LATER sample of 200 ... relief for ONLY 170 people."

The query you give after that should not pretend to be ultimate. Are you willing to ask the students to contemplate that the new experiment could differ drastically from the original sample and its conditions?

"Is this result consistent with the manufacturer's claim?" -- you might notice that this sounds 'weasel-worded.' Well, extremely-weasel-worded *ought* to be fitting, for *proper* statistical claims from non-randomized trials. For the example: I would expect 15% of a grab-sample being treated for 'allergy' would actually have flu or a cold. Maybe the actual experiment was more sophisticated?

"What do you say about this result? (include a statistical test using a nominal alpha=.01)." Also, "Why do I include the word 'nominal' here?" Ans: It means 'tabled value' and it helps to emphasize that it is hard to frame a non-random trial as a test; the problem is not presented with any such framing.

Hope this seems reasonable.

--
Rich Ulrich, [EMAIL PROTECTED]
http://www.pitt.edu/~wpilib/index.html
Re: Interpreting p-value = .99
At 08:29 AM 12/1/01 -0500, Stan Brown wrote:
> How I would analyze this claim is that, when the advertiser says
> "90% of people will be helped", that means 90% or more. Surely if we
> did a large controlled study and found 93% were helped, we would not
> turn around and say the advertiser was wrong! But I think that's
> what would happen with a two-tailed test.
>
> Can you explain a bit further?

would the advertiser feel he/she was wrong if the 90% value was a little less too ... within some margin of error from 90? probably not

perhaps you want to say that the advertiser is claiming around 90% or MORE, or at LEAST 90% ...

again ... we are getting far too hung up in how some hypothesis is stated ... is not the more important matter ... what sort of impact is there? if that is the case ... testing a null ... ANY null ... is really not going to help you

you need to look at the SAMPLE data ... then ask yourself ... what sort of a real effect might there be if i got the sample results that i did?

if you then want to superimpose on this a question ... i wonder if 90 or more could have been the truth ... fine but that is an afterthought ... this does not call for a hypothesis test

_
dennis roberts, educational psychology, penn state university
208 cedar, AC 8148632401, mailto:[EMAIL PROTECTED]
http://roberts.ed.psu.edu/users/droberts/drober~1.htm
Re: Interpreting p-value = .99
[cc'd to previous poster]

Rich Ulrich <[EMAIL PROTECTED]> wrote in sci.stat.edu:
> I think I could not blame students for floundering about on this one.
>
> On Thu, 29 Nov 2001 14:39:35 -0500, [EMAIL PROTECTED] (Stan Brown)
> wrote:
>> "The manufacturer of a patent medicine claims that it is 90%
>> effective(*) in relieving an allergy for a period of 8 hours. In a
>> sample of 200 people who had the allergy, the medicine provided
>> relief for 170 people. Determine whether the manufacturer's claim
>> was legitimate, to the 0.01 significance level."
>
> I have never asked that as a question in statistics, and
> it does not have an automatic, idiomatic translation to what I ask.

How would you have phrased the question, then? Though I took this one from a book, I'm always looking to improve the phrasing of questions I set in quizzes and exams.

> I can expect that it means, "Use a 1% test." But, for what?
> That claim could NEVER, legitimately, have been *based*
> on these data. That is an idea that tries to intrude itself,
> to me, and makes it difficult to address the intended question.

Agreed! My idea, in reading that problem, was that the manufacturer claimed something for a product that has been on the market for some time, and some independent group, such as a newspaper or TV network, did a study to test the claim.

> - By the way, it also bothers me that "90% effective" is
> apparently translated as "effective for 90% of the people."
> I wondered if the asterisk was supposed to represent "[sic]".

The asterisk led to my note defining it as relieving symptoms for 90% of people who use it, and asking students to think whether the claim would also be true if it relieved symptoms for more than 90%. (I think the real-world answer is clearly Yes: If a product is claimed to help 90% of people and it actually helps 93%, we do not say the claim was false.)

--
Stan Brown, Oak Road Systems, Cortland County, New York, USA
http://oakroadsystems.com
My reply address is correct as is. The courtesy of providing a correct
reply address is more important to me than time spent deleting spam.
Re: Interpreting p-value = .99
[cc'd to previous poster; please follow up in newsgroup]

Robert J. MacG. Dawson <[EMAIL PROTECTED]> wrote in sci.stat.edu:
> Stan Brown wrote:
>> "The manufacturer of a patent medicine claims that it is 90%
>> effective(*) in relieving an allergy for a period of 8 hours. In a
>> sample of 200 people who had the allergy, the medicine provided
>> relief for 170 people. Determine whether the manufacturer's claim
>> was legitimate, to the 0.01 significance level."
>
> A hypothesis test is set up ahead of time so that it can only
> give a definite answer of one sort. In this case, we have (at least)
> three distinct possibilities.

I really like your presentation of the three possible tests as "advertiser's test", "consumer advocate's test", and "quality controller's test".

I see why the quality controller would want to do a two-tailed test: the product should not be outside manufacturing parameters in either direction. (Presumably the QC person would be testing the pills themselves, not patients taking the pills.)

But I don't see why either the advertiser or the consumer advocate would, or should, do a two-tailed test. Alan McLean seemed to agree that both would be one-tailed, if I understand him correctly.

> (1) The "consumer advocate's test": we want a definite result that
> makes the manufacturer look bad, so H0 is the manufacturer's
> claim, Ha is that the claim is wrong, and the p-value is to be used
> as an indication of reason to believe H0 wrong (if so). Using a
> one-sided test here is akin to saying "I want all my type I errors to be
> ones that make the manufacturer look bad". Ethical behaviour here is to
> do a two-sided test and report a result in either direction.

I don't get this. Why is that ethical behavior?

How I would analyze this claim is that, when the advertiser says "90% of people will be helped", that means 90% or more. Surely if we did a large controlled study and found 93% were helped, we would not turn around and say the advertiser was wrong! But I think that's what would happen with a two-tailed test.

Can you explain a bit further?

--
Stan Brown, Oak Road Systems, Cortland County, New York, USA
http://oakroadsystems.com
My reply address is correct as is. The courtesy of providing a correct
reply address is more important to me than time spent deleting spam.
Re: Interpreting p-value = .99
Alan McLean <[EMAIL PROTECTED]> wrote in sci.stat.edu:
> Stan, in practical terms, the conclusion 'fail to reject the null' is
> simply not true. You do in reality 'accept the null'. The catch is that
> this is, in the research situation, a tentative acceptance - you
> recognise that you may be wrong, so you carry forward the idea that the
> null may be 'true' but - on the sample evidence - probably is not.
>
> On the other hand, this should also be the case when you 'reject the
> null' - the rejection may be wrong, so the rejection is also tentative.
> The difference is that the null has this privileged position.

Thanks -- that makes some sense.

--
Stan Brown, Oak Road Systems, Cortland County, New York, USA
http://oakroadsystems.com
My reply address is correct as is. The courtesy of providing a correct
reply address is more important to me than time spent deleting spam.
Re: Interpreting p-value = .99
Hi

On Thu, 29 Nov 2001, Stan Brown wrote:
> But -- and in retrospect I should have seen it coming -- some
> students framed the hypotheses so that the alternative hypothesis
> was "the drug is effective as claimed." They had
> Ho: p <= .9; Ha: p > .9; p-value = .9908.

You might point out to students the possible irrationality of this framing of the question. By this reasoning, would it not be the case that the strongest evidence for the claim that p <= .9 would be to have 0 successes (the p-value would then be 1)? And that the worst case (i.e., the null would be rejected) would be for all cases to be successful? This is quite contrary, I expect, to what we would normally take to be negative and positive outcomes as far as the drug company is concerned. The most (only?) sensible interpretation of the claim p = .9 is that at least 90% (i.e., p >= .9) would be successes.

Best wishes
Jim

James M. Clark                        (204) 786-9757
Department of Psychology              (204) 774-4134 Fax
University of Winnipeg                4L05D
Winnipeg, Manitoba  R3B 2E9           [EMAIL PROTECTED]
CANADA                                http://www.uwinnipeg.ca/~clark
Re: Interpreting p-value = .99
I think I could not blame students for floundering about on this one.

On Thu, 29 Nov 2001 14:39:35 -0500, [EMAIL PROTECTED] (Stan Brown) wrote:
> On a quiz, I set the following problem to my statistics class:
>
> "The manufacturer of a patent medicine claims that it is 90%
> effective(*) in relieving an allergy for a period of 8 hours. In a
> sample of 200 people who had the allergy, the medicine provided
> relief for 170 people. Determine whether the manufacturer's claim
> was legitimate, to the 0.01 significance level."
>
> (The problem was adapted from Spiegel and Stevens, /Schaum's
> Outline: Statistics/, problem 10.6.)

[ snip, rest ]

"Determine whether the manufacturer's claim was legitimate, to the 0.01 significance level."

I have never asked that as a question in statistics, and it does not have an automatic, idiomatic translation to what I ask. I can expect that it means, "Use a 1% test." But, for what?

After I notice that the outcome was poorer than the claim, then I wonder if the test is, "Are these data consistent with the claim? or do they tend to disprove it?" That seems some distance from the tougher, philosophical question of whether, at the time it was made, the claim was legitimate. That claim could NEVER, legitimately, have been *based* on these data. That is an idea that tries to intrude itself, to me, and makes it difficult to address the intended question.

- By the way, it also bothers me that "90% effective" is apparently translated as "effective for 90% of the people." I wondered if the asterisk was supposed to represent "[sic]".

--
Rich Ulrich, [EMAIL PROTECTED]
http://www.pitt.edu/~wpilib/index.html
Re: Interpreting p-value = .99
Stan Brown wrote:
> On a quiz, I set the following problem to my statistics class:
>
> "The manufacturer of a patent medicine claims that it is 90%
> effective(*) in relieving an allergy for a period of 8 hours. In a
> sample of 200 people who had the allergy, the medicine provided
> relief for 170 people. Determine whether the manufacturer's claim
> was legitimate, to the 0.01 significance level."

A hypothesis test is set up ahead of time so that it can only give a definite answer of one sort. In this case, we have (at least) three distinct possibilities.

(0) The "advertiser's test": we want a definite result that makes the manufacturer look good, so Ha agrees with the manufacturer's claim. To do this the manufacturer's claim must be slightly restated as "our medicine is *more* than 90% effective"; as the exact 90% value has prior probability 0 this is not a problem. H0 is actually the original claim; and the hoped-for outcome is to reject it because the number of successes is too large. The manufacturer is not entitled to do a 1-tailed test just to shrink the reported p-value. Using a 1-tailed test is to say "I want all my Type I errors to be ones that let us get away with inflated claims." This is what the students did:

> But -- and in retrospect I should have seen it coming -- some
> students framed the hypotheses so that the alternative hypothesis
> was "the drug is effective as claimed." They had
> Ho: p <= .9; Ha: p > .9; p-value = .9908.

Ethical behaviour is to do a two-tailed test, and report/act on a rejection in either direction. It is not necessary to say "there is a difference but we don't know in which direction"; a two-tailed test can legitimately have three outcomes (reject low, reject high, not enough data).

(There is a potential new type of error in which we reject in the wrong tail; this *ought* to be called a Type III error were the name not already taken [as a somewhat misleading in-joke akin to the "Eleventh Commandment"] to mean "testing the wrong hypothesis" or something similar. It is easy to show that the probability of this is low enough to ignore if alpha is even moderately low, as the distance between tails is twice the distance from the mean to the tail.)

(1) The "consumer advocate's test": we want a definite result that makes the manufacturer look bad, so H0 is the manufacturer's claim, Ha is that the claim is wrong, and the p-value is to be used as an indication of reason to believe H0 wrong (if so). Using a one-sided test here is akin to saying "I want all my Type I errors to be ones that make the manufacturer look bad". Ethical behaviour here is to do a two-sided test and report a result in either direction.

(2) The "quality controller's test": H0 is the manufacturer's claim, Ha is that the claim is wrong, and the p-value is to be used to balance risks. Here, I think, a one-tailed test is legitimate.

I claim that the consumer advocate and the manufacturer *should* be doing the same test in situations 0 and 1. Both should be reporting a p-value of 0.0184, both should be interpreting it as "the medicine is less effective than claimed", and the manufacturer should take action by either improving the product or modifying the advertisements.

-Robert Dawson
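[Editor's note: Dawson's figures are easy to check with the usual normal-approximation test of a proportion. A minimal sketch in Python (not part of the original thread; variable names are mine), testing H0: p = 0.9 with n = 200 and x = 170:]

```python
from math import sqrt, erf

def norm_cdf(z):
    # Standard normal CDF via the error function
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))

n, x, p0 = 200, 170, 0.90
p_hat = x / n                            # 0.85
se = sqrt(p0 * (1 - p0) / n)             # standard error under H0
z = (p_hat - p0) / se                    # about -2.357

p_lower = norm_cdf(z)                    # one-tailed, Ha: p < .9 -> about .0092
p_two = 2 * min(p_lower, 1 - p_lower)    # two-tailed             -> about .0184
```

Both the advertiser and the consumer advocate start from the same z statistic; Dawson's point is that each should report the two-tailed .0184 rather than halve it in a favored direction.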
Re: Interpreting p-value = .99
Gus Gassmann <[EMAIL PROTECTED]> wrote in sci.stat.edu:
> Stan Brown wrote:
>> "The manufacturer of a patent medicine claims that it is 90%
>> effective(*) in relieving an allergy for a period of 8 hours. In a
>> sample of 200 people who had the allergy, the medicine provided
>> relief for 170 people. Determine whether the manufacturer's claim
>> was legitimate, to the 0.01 significance level."
>
>> But -- and in retrospect I should have seen it coming -- some
>> students framed the hypotheses so that the alternative hypothesis
>> was "the drug is effective as claimed." They had
>> Ho: p <= .9; Ha: p > .9; p-value = .9908.
>
> I don't understand where they get the .9908 from.

x = 170, n = 200, p' = .85, Ha: p > .9, alpha = .01
z = -2.357

On the TI-83, normalcdf(-2.357, 1E99) = .9908; i.e., 99.08% of the area is above z = -2.357.

--
Stan Brown, Oak Road Systems, Cortland County, New York, USA
http://oakroadsystems.com
My reply address is correct as is. The courtesy of providing a correct
reply address is more important to me than time spent deleting spam.
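[Editor's note: the TI-83 call can be reproduced in a few lines of Python (a sketch added for this archive, not part of the thread) to show where the students' .9908 comes from -- it is just the upper-tail area for the same z statistic whose lower tail is .0092:]

```python
from math import sqrt, erf

def normalcdf(lo, hi):
    # Area under the standard normal between lo and hi,
    # mimicking the TI-83 normalcdf(lo, hi) call
    Phi = lambda t: 0.5 * (1.0 + erf(t / sqrt(2.0)))
    return Phi(hi) - Phi(lo)

n, x, p0 = 200, 170, 0.90
z = (x / n - p0) / sqrt(p0 * (1 - p0) / n)   # about -2.357

upper_area = normalcdf(z, 1e99)              # about .9908, as reported
```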
Re: Interpreting p-value = .99
Gus, Stan's two alternatives were correct as stated - they were two one-sided tests, not a one-sided and a two-sided test.

Stan, in practical terms, the conclusion 'fail to reject the null' is simply not true. You do in reality 'accept the null'. The catch is that this is, in the research situation, a tentative acceptance - you recognise that you may be wrong, so you carry forward the idea that the null may be 'true' but - on the sample evidence - probably is not.

On the other hand, this should also be the case when you 'reject the null' - the rejection may be wrong, so the rejection is also tentative. The difference is that the null has this privileged position.

In areas like quality control, of course, it is quite clear that you decide, and act as if, the null is true or is not true.

Regards,
Alan

Gus Gassmann wrote:
> Stan Brown wrote:
>> On a quiz, I set the following problem to my statistics class:
>>
>> "The manufacturer of a patent medicine claims that it is 90%
>> effective(*) in relieving an allergy for a period of 8 hours. In a
>> sample of 200 people who had the allergy, the medicine provided
>> relief for 170 people. Determine whether the manufacturer's claim
>> was legitimate, to the 0.01 significance level."
>>
>> (The problem was adapted from Spiegel and Stevens, /Schaum's
>> Outline: Statistics/, problem 10.6.)
>>
>> I believe a one-tailed test, not a two-tailed test, is appropriate.
>> (It would be silly to test for "effectiveness differs from 90%" since
>> no one would object if the medicine helps more than 90% of
>> patients.)
>>
>> Framing the alternative hypothesis as "the manufacturer's claim is
>> not legitimate" gives
>>   Ho: p >= .9; Ha: p < .9; p-value = .0092
>> on a one-tailed z-test. Therefore we reject Ho and conclude that the
>> drug is less than 90% effective.
>>
>> But -- and in retrospect I should have seen it coming -- some
>> students framed the hypotheses so that the alternative hypothesis
>> was "the drug is effective as claimed." They had
>>   Ho: p <= .9; Ha: p > .9; p-value = .9908.
>
> I don't understand where they get the .9908 from. Whether you test a
> one- or a two-sided alternative, the test statistic is the same. So the
> p-value for the two-sided version of the test should be simply twice
> the p-value for the one-sided alternative, 0.0184. Hence the paradox
> you speak of is an illusion.
>
> Unfortunately for you, the two versions of the test lead to different
> conclusions. If the correct p-value is given, I would give full marks
> (perhaps, depending on how much the problem is worth overall,
> subtracting 1 out of 10 marks for the nonsensical form of Ha).

--
Alan McLean ([EMAIL PROTECTED])
Department of Econometrics and Business Statistics
Monash University, Caulfield Campus, Melbourne
Tel: +61 03 9903 2102    Fax: +61 03 9903 2007
Re: Interpreting p-value = .99
forget the statement of the null ... build a CI ... perhaps 99% (which would correspond to your .01 sig. test) ... let that help to determine if the claim seems reasonable or not

in this case ... p hat = .85 ... thus q hat = .15

standard error of a proportion (given SRS was done) is about

  SE(p hat) = sqrt((p hat * q hat) / n) = sqrt(.85 * .15 / 200) = about .025

approximate 99% CI would be about p hat +/- 2.58 * .025 = .85 +/- .06

CI would be about .79 to .91 ... so, IF you insist on a hypothesis test ... retain the null

personally, i would rather say that the pop. proportion might be between (about) .79 to .91 ... doesn't hold me to .9

problem here is that if you had opted for .05 ... you would have rejected ... (just barely)

At 02:39 PM 11/29/01 -0500, you wrote:
> On a quiz, I set the following problem to my statistics class:
>
> "The manufacturer of a patent medicine claims that it is 90%
> effective(*) in relieving an allergy for a period of 8 hours. In a
> sample of 200 people who had the allergy, the medicine provided
> relief for 170 people. Determine whether the manufacturer's claim
> was legitimate, to the 0.01 significance level."
>
> (The problem was adapted from Spiegel and Stevens, /Schaum's
> Outline: Statistics/, problem 10.6.)
>
> I believe a one-tailed test, not a two-tailed test, is appropriate.
> (It would be silly to test for "effectiveness differs from 90%" since
> no one would object if the medicine helps more than 90% of
> patients.)
>
> Framing the alternative hypothesis as "the manufacturer's claim is
> not legitimate" gives
>   Ho: p >= .9; Ha: p < .9; p-value = .0092
> on a one-tailed z-test. Therefore we reject Ho and conclude that the
> drug is less than 90% effective.
>
> But -- and in retrospect I should have seen it coming -- some
> students framed the hypotheses so that the alternative hypothesis
> was "the drug is effective as claimed." They had
>   Ho: p <= .9; Ha: p > .9; p-value = .9908.
>
> Now as I understand things it is not formally legitimate to accept
> the null hypothesis: we can only either reject it (and accept Ha) or
> fail to reject it (and draw no conclusion). What I would tell my
> class is this: the best we can say is that p = .9908 is a very
> strong statement that rejecting the null hypothesis would be a Type
> I error. But I'm not completely easy in my mind about that, when
> simply reversing the hypotheses gives p = .0092 and lets us conclude
> that the drug is not 90% effective.
>
> There seems to be a paradox: The very same data lead either to the
> conclusion "the drug is not effective as claimed" or to no
> conclusion. I could certainly tell my class: "if it makes sense in
> the particular situation, reverse the hypotheses and recompute the
> p-value." Am I being over-formal here, or am I being horribly stupid
> and missing some reason why it _would_ be legitimate to draw a
> conclusion from p=.9908?

_
dennis roberts, educational psychology, penn state university
208 cedar, AC 8148632401, mailto:[EMAIL PROTECTED]
http://roberts.ed.psu.edu/users/droberts/drober~1.htm
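[Editor's note: the CI arithmetic above checks out; a short Python sketch (added for this archive, not part of the thread) reproduces both intervals and the "just barely" remark about the .05 level:]

```python
from math import sqrt

n, x = 200, 170
p_hat = x / n                  # .85
q_hat = 1 - p_hat              # .15
se = sqrt(p_hat * q_hat / n)   # about .025

z99 = 2.576                    # critical value for a 99% CI
lo99, hi99 = p_hat - z99 * se, p_hat + z99 * se   # about (.785, .915)

z95 = 1.960                    # for comparison, a 95% CI
lo95, hi95 = p_hat - z95 * se, p_hat + z95 * se   # about (.801, .899)
```

The 99% interval contains .90 (so at alpha = .01 the null is retained), while the 95% interval just excludes it, which is why choosing alpha = .05 would have led to rejection "just barely".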
Re: Interpreting p-value = .99
Stan Brown wrote:
> On a quiz, I set the following problem to my statistics class:
>
> "The manufacturer of a patent medicine claims that it is 90%
> effective(*) in relieving an allergy for a period of 8 hours. In a
> sample of 200 people who had the allergy, the medicine provided
> relief for 170 people. Determine whether the manufacturer's claim
> was legitimate, to the 0.01 significance level."
>
> (The problem was adapted from Spiegel and Stevens, /Schaum's
> Outline: Statistics/, problem 10.6.)
>
> I believe a one-tailed test, not a two-tailed test, is appropriate.
> (It would be silly to test for "effectiveness differs from 90%" since
> no one would object if the medicine helps more than 90% of
> patients.)
>
> Framing the alternative hypothesis as "the manufacturer's claim is
> not legitimate" gives
>   Ho: p >= .9; Ha: p < .9; p-value = .0092
> on a one-tailed z-test. Therefore we reject Ho and conclude that the
> drug is less than 90% effective.
>
> But -- and in retrospect I should have seen it coming -- some
> students framed the hypotheses so that the alternative hypothesis
> was "the drug is effective as claimed." They had
>   Ho: p <= .9; Ha: p > .9; p-value = .9908.

I don't understand where they get the .9908 from. Whether you test a one- or a two-sided alternative, the test statistic is the same. So the p-value for the two-sided version of the test should be simply twice the p-value for the one-sided alternative, 0.0184. Hence the paradox you speak of is an illusion.

Unfortunately for you, the two versions of the test lead to different conclusions. If the correct p-value is given, I would give full marks (perhaps, depending on how much the problem is worth overall, subtracting 1 out of 10 marks for the nonsensical form of Ha).
Interpreting p-value = .99
On a quiz, I set the following problem to my statistics class:

"The manufacturer of a patent medicine claims that it is 90% effective(*) in relieving an allergy for a period of 8 hours. In a sample of 200 people who had the allergy, the medicine provided relief for 170 people. Determine whether the manufacturer's claim was legitimate, to the 0.01 significance level."

(The problem was adapted from Spiegel and Stevens, /Schaum's Outline: Statistics/, problem 10.6.)

I believe a one-tailed test, not a two-tailed test, is appropriate. (It would be silly to test for "effectiveness differs from 90%" since no one would object if the medicine helps more than 90% of patients.)

Framing the alternative hypothesis as "the manufacturer's claim is not legitimate" gives
  Ho: p >= .9; Ha: p < .9; p-value = .0092
on a one-tailed z-test. Therefore we reject Ho and conclude that the drug is less than 90% effective.

But -- and in retrospect I should have seen it coming -- some students framed the hypotheses so that the alternative hypothesis was "the drug is effective as claimed." They had
  Ho: p <= .9; Ha: p > .9; p-value = .9908.

Now as I understand things it is not formally legitimate to accept the null hypothesis: we can only either reject it (and accept Ha) or fail to reject it (and draw no conclusion). What I would tell my class is this: the best we can say is that p = .9908 is a very strong statement that rejecting the null hypothesis would be a Type I error. But I'm not completely easy in my mind about that, when simply reversing the hypotheses gives p = .0092 and lets us conclude that the drug is not 90% effective.

There seems to be a paradox: The very same data lead either to the conclusion "the drug is not effective as claimed" or to no conclusion. I could certainly tell my class: "if it makes sense in the particular situation, reverse the hypotheses and recompute the p-value." Am I being over-formal here, or am I being horribly stupid and missing some reason why it _would_ be legitimate to draw a conclusion from p=.9908?

--
Stan Brown, Oak Road Systems, Cortland County, New York, USA
http://oakroadsystems.com
My reply address is correct as is. The courtesy of providing a correct
reply address is more important to me than time spent deleting spam.
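[Editor's note: the .0092 in the thread comes from the normal approximation. Since the data are genuinely binomial (200 trials, success probability .9 under H0), the lower-tail probability can also be summed exactly; a minimal Python sketch (added for this archive, not part of the thread):]

```python
from math import comb

n, x, p0 = 200, 170, 0.90

# Exact one-tailed p-value for Ha: p < .9 -- the probability, under H0,
# of observing 170 or fewer successes out of 200
p_exact = sum(comb(n, k) * p0**k * (1 - p0)**(n - k) for k in range(x + 1))
```

By this exact count the one-tailed p-value appears to come out somewhat larger than the .0092 normal-approximation figure (the approximation used here applies no continuity correction, and the binomial is noticeably skewed at p = .9), so at a strict .01 level the exact test is borderline where the approximate test rejects. That gap is itself a useful classroom point.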