Re: questions on hypothesis
Herman Rubin wrote: > and until recently, > scientists believed that their models could be exactly right. But, as you wrote in another context -- 3 Oct 1998 08:07:23 -0500; Message-ID:6v57ib$[EMAIL PROTECTED] "Normality is rarely a tenable hypothesis. Its usefulness as a means of deriving procedures is that it is often the case, as in regression, that the resulting procedure is robust in the sense of having desirable properties without it, while nothing better can be done uniformly." - = Instructions for joining and leaving this list and remarks about the problem of INAPPROPRIATE MESSAGES are available at http://jse.stat.ncsu.edu/ =
Re: questions on hypothesis
In article <[EMAIL PROTECTED]>, Robert J. MacG. Dawson <[EMAIL PROTECTED]> wrote: >[EMAIL PROTECTED] wrote (in part): <> I'm saying that the entire concept of practical significance is not only <> subjective, but limited to the extent of current knowledge. You may <> regard a 0.01% effect at this point in time as a trivial and (virtually) <> artifactual byproduct of hypothesis testing. But if proper controls are <> in place, then to do so is tantamount to ignoring an effect that, on the <> balance of probabilities shouldn't be there if all things were equal. I <> think we need to be cautious in ascribing effects as having little <> practical significance and hence using this as an argument against <> hypothesis testing. > "Practical significance" is relevant if and only if there is some >"practice" involved - that is to say, if a real-world decision is going >to be based on the data. Such a decision _must_ be based on current >knowledge, for want of any other; but if the data are preserved, a >different decision can be based on them in the future if more is known >then. > (BTW: If a decision *is* to be made, a risk/benefit approach would seem >more appropriate. Yes, it probably involves subjective decisions; but >using fixed-level hypothesis testing to avoid that is a little like >saying "as I might not choose exactly the right size of screwdriver I >shall hit the screw with a hammer". If we do take the risks and >benefits into account in "choosing a p-value", we are not really doing a >classical hypothesis test, even though the calculations may coincide.) > However, if a real-world decision is *not* going to be made, there is >usually no need to fit the interpretation of marginal data into the >Procrustean bed of dichotomous interpretation (which is the >_raison_d'etre_ of the hypothesis test). Until there is overwhelming >data one way or the other, our knowledge of the situation is in shades >of gray, and representing it in black and white is a loss of >information. This does not seem to be the way that anything is presented in the scientific literature. From the standpoint of collecting information, p-values are of little, if any, value, as they contribute little to being able to compute, or even approximate, the likelihood function, which contains the information in the data. The use of p-values is a carryover from the mistaken "alchemy" period of statistics, and it has always been misinterpreted, even by the good ones. They tried for answers before the appropriate questions had been asked, and until recently, scientists believed that their models could be exactly right. -- This address is for information only. I do not claim that these views are those of the Statistics Department or of Purdue University. Herman Rubin, Dept. of Statistics, Purdue Univ., West Lafayette IN47907-1399 [EMAIL PROTECTED] Phone: (765)494-6054 FAX: (765)494-0558 = Instructions for joining and leaving this list and remarks about the problem of INAPPROPRIATE MESSAGES are available at http://jse.stat.ncsu.edu/ =
Re: questions on hypothesis
On Thu, 19 Oct 2000 [EMAIL PROTECTED] wrote: > In article <[EMAIL PROTECTED]>, > Peter Lewycky <[EMAIL PROTECTED]> wrote: > > I've often been called upon to do a t-test with 5 animals in one > > group and 4 animals in the other. The power is abysmally low and > > rarely do I get a p less than 0.05. One of the difficulties that > > medical researchers have is with the notion of power and concomitant > > sample size. I make it a point of calculating power especially > > where Ho has not been rejected. It gives the researcher some comfort > > in that his therapy may indeed be effective. All he needs for 0.8 > > power is 28,141 rats per group. > > This has got to be one of the funniest things I have read on a stats > newsgroup. I'm sure it's not really meant to be funny, Dunno why you'd be so certain of that. I've known Peter for a while, and certainly would not characterize him as lacking a sense of humour... > but the thought of truckloads upon truckloads of rats arriving to > satisfy power requirements puts a highly amusing spin on the whole > thing. :) > I am stifling an insane cackle because I know statistics is a serious > business but really It may, sometimes, be a serious business; but that's not to say that one should _take_ it seriously. -- DFB. -- Donald F. Burrill [EMAIL PROTECTED] 348 Hyde Hall, Plymouth State College, [EMAIL PROTECTED] MSC #29, Plymouth, NH 03264 (603) 535-2597 Department of Mathematics, Boston University [EMAIL PROTECTED] 111 Cummington Street, room 261, Boston, MA 02215 (617) 353-5288 184 Nashua Road, Bedford, NH 03110 (603) 471-7128 = Instructions for joining and leaving this list and remarks about the problem of INAPPROPRIATE MESSAGES are available at http://jse.stat.ncsu.edu/ =
Re: questions on hypothesis
Thom Baguley wrote: > > Robert J. MacG. Dawson wrote: > > > [EMAIL PROTECTED] wrote: > > > > > > In article <[EMAIL PROTECTED]>, > > > Jerry Dallal <[EMAIL PROTECTED]> wrote: > > > > > > > (1) statistical significance usually is unrelated to practical > > > > importance. > > > > > > I don't think so. I can think of many examples in which statistical > > > inference plays an invaluable role in practical applications and > > > instrumentation, or indeed any "practical" application of a theory etc. > > > Not just in science, but engineering, e.g. aircraft design, studying the > > > brain, electrical engineering. Certainly there are examples of > > > statistical nonsense, e.g. polls, but I wouldn't go so far as to say it > > > is usually like this. > > > > Chris: That's not what Jerry means. What he's saying is that if your > > sample size is large enough, a difference may be statistically > > significant (a term which has a very precise meaning, especially to the > > Apostles of the Holy 5%) but not large enough to be practically > > important. [A hypothetical very large sample might show, let us say, > > that a very expensive diet supplement reduced one's chances of a heart > > attack by 1/10 of 1%.] Alternatively, in an imperfectly-controlled > > study, it may show an effect that - whether large enough to be of > > interest or not - is too small to ascribe a cause to. [A moderately > > large study might show that some ethnic group has a 1% higher rate of > > heart attacks, with a margin of error of +- .2%. But we might have, for > > an effect of this size, no way of telling whether it's due to genes, > > diet, socioeconomic factors, recreational drugs, or whatever.] > > I'd add that I think Jerry meant "unrelated" in the sense of independent rather > than irrelevant (Jerry can correct me if I'm wrong). You can get important > significant effects, unimportant significant effects, important non-significant > effects and unimportant non-significant effects. > > For what it's worth, practical importance also depends on many factors other > than effect size. These include mutability, generalizability, cost, and so on. > > Thom Nothing to correct. You and Robert explained it fine. = Instructions for joining and leaving this list and remarks about the problem of INAPPROPRIATE MESSAGES are available at http://jse.stat.ncsu.edu/ =
Re: questions on hypothesis
[EMAIL PROTECTED] wrote: > > In article <[EMAIL PROTECTED]>, > Jerry Dallal <[EMAIL PROTECTED]> wrote: > > [EMAIL PROTECTED] wrote: > > > > > I > > > said before, I don't think this can be seen as a problem with > hypothesis > > > testing; but it is a matter for hypothesis *testers*. > > > > Nothing wrong with this, but it might be a good time to review the > > question that started this thread, namely, > > > > "What are the limitations of hypothesis testing using significance > > tests > > based on p-values?" > > Has the thread really wandered that much? > My argument is basically that the misuse of hypothesis testing, from > which most of the difficulties appear to arise, shouldn't be seen as a > *limitation* of hypothesis testing. It just doesn't seem logical. I read the question as saying, "There are hypotheses to be tested. What are the limitations of using significance tests based on P values to do this?" I stand by my response. = Instructions for joining and leaving this list and remarks about the problem of INAPPROPRIATE MESSAGES are available at http://jse.stat.ncsu.edu/ =
Re: questions on hypothesis
>This has got to be one of the funniest things I have read on a stats >newsgroup. I'm sure it's not really meant to be funny, but the thought >of truckloads upon truckloads of rats arriving to satisfy power >requirements puts a highly amusing spin on the whole thing. :) >I am stifling an insane cackle because I know statistics is a serious >business but really > >Cheers, >Chris whether you need 21,000 rats ... or 45 ... or anything below or above ... depends to a large extent on what kind of impact the treatment has ... sure, if the treatment or experimental condition has but a trivial (but real) effect ... then n has to be relatively large ... whether it be rats or humans ... but if the impact is rather large ... you don't need 21,000 EVER many eons ago ... i was doing some studies comparing Ss who had access to calculators and Ss who did not ... on the solution of statistical problems and, i measured things like #correct ... and time to completion ... and a ratio of the two which i called efficiency ... now, for the time measure ... the difference was so large ... i could have detected this difference with ns of 3 or 4 in each group ... NO problem ... so, it all depends = Instructions for joining and leaving this list and remarks about the problem of INAPPROPRIATE MESSAGES are available at http://jse.stat.ncsu.edu/ =
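A rough simulation makes the point concrete. This is only a sketch: the standardized effect size (d = 3) and the group size of 4 are assumed values chosen in the spirit of the calculator anecdote, not figures from that study.

```python
# Simulated power of a two-sample t-test when the true effect is huge:
# with d = 3 (a three-standard-deviation difference between groups),
# n = 4 per group detects it at p < .05 in almost every replication.
import numpy as np
from scipy.stats import ttest_ind

rng = np.random.default_rng(0)
d, n_per_group, reps = 3.0, 4, 10_000
hits = sum(
    ttest_ind(rng.normal(0.0, 1.0, n_per_group),
              rng.normal(d, 1.0, n_per_group)).pvalue < 0.05
    for _ in range(reps)
)
print(f"estimated power with n = 4 per group and d = 3: {hits / reps:.2f}")
```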
Re: questions on hypothesis
In article <[EMAIL PROTECTED]>, Peter Lewycky <[EMAIL PROTECTED]> wrote: > I've often been called upon to do a t-test with 5 animals in one group > and 4 animals in the other. The power is abysmally low and rarely do I > get a p less than 0.05. One of the difficulties that medical researchers > have is with the notion of power and concomitant sample size. I make it > a point of calculating power especially where Ho has not been rejected. > It gives the researcher some comfort in that his therapy may indeed be > effective. All he needs for 0.8 power is 28,141 rats per group. This has got to be one of the funniest things I have read on a stats newsgroup. I'm sure it's not really meant to be funny, but the thought of truckloads upon truckloads of rats arriving to satisfy power requirements puts a highly amusing spin on the whole thing. :) I am stifling an insane cackle because I know statistics is a serious business but really Cheers, Chris Sent via Deja.com http://www.deja.com/ Before you buy. = Instructions for joining and leaving this list and remarks about the problem of INAPPROPRIATE MESSAGES are available at http://jse.stat.ncsu.edu/ =
Re: questions on hypothesis
I've often been called upon to do a t-test with 5 animals in one group and 4 animals in the other. The power is abysmally low and rarely do I get a p less than 0.05. One of the difficulties that medical researchers have is with the notion of power and concomitant sample size. I make it a point of calculating power especially where Ho has not been rejected. It gives the researcher some comfort in that his therapy may indeed be effective. All he needs for 0.8 power is 28,141 rats per group. [EMAIL PROTECTED] wrote: > > In article <[EMAIL PROTECTED]>, > [EMAIL PROTECTED] (dennis roberts) wrote: > > > > > thus, the idea is that 5% and/or 1% were "chosen" due to the tables > that > > were available and not, some logical reasoning for these values? > > > > i don't see any logic to the notion that 5% and/or 1% ... have any > special > > nor simplification properties compared to say ... 9% or 3% > > > > given that it appears that these same values apply today ... that is, > we > > have been in a "stuck" mode for all these years ... is not very > comforting > > given that 5% and/or 1% were opted for because someone had worked out > these > > columns in a table > > I agree, and I think perhaps that although the original work focused on > the 5% and 1% levels for practical reasons, the tradition persists b/c > it provides a convenient criterion for journal editors in deciding > between 'important' and 'unimportant' findings. Consequently, to > increase the chances of being published, researchers sometimes resort to > terms like "highly significant" in referring to low p values, which is > really a quite nebulous statement (if not completely misleading- I shall > leave that determination to the experts). To me, it seems that less > emphasis on p values per se and more emphasis on power and effect size > would increase the general quality and replicability of published data. > > Chris > > Sent via Deja.com http://www.deja.com/ > Before you buy. = Instructions for joining and leaving this list and remarks about the problem of INAPPROPRIATE MESSAGES are available at http://jse.stat.ncsu.edu/ =
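For readers who want to reproduce this kind of arithmetic, here is a minimal sketch using statsmodels. The effect sizes (Cohen's d of 0.5 and 0.02) are assumptions chosen for illustration; the second is merely in the neighbourhood that demands tens of thousands of animals per group, not a reconstruction of the 28,141 figure.

```python
# Power of the actual 5-vs-4 design for a medium effect, and the per-group n
# needed for 0.80 power when the true effect is tiny (two-sided alpha = .05).
from statsmodels.stats.power import TTestIndPower

tt = TTestIndPower()

power_5v4 = tt.power(effect_size=0.5, nobs1=5, alpha=0.05, ratio=4 / 5)
print(f"power with n1 = 5, n2 = 4, d = 0.5: {power_5v4:.2f}")      # abysmally low

n_needed = tt.solve_power(effect_size=0.02, alpha=0.05, power=0.80, ratio=1.0)
print(f"n per group for 0.80 power at d = 0.02: {n_needed:,.0f}")  # tens of thousands
```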
Re: questions on hypothesis
In article <[EMAIL PROTECTED]>, Jerry Dallal <[EMAIL PROTECTED]> wrote: > [EMAIL PROTECTED] wrote: > > > I > > said before, I don't think this can be seen as a problem with hypothesis > > testing; but it is a matter for hypothesis *testers*. > > Nothing wrong with this, but it might be a good time to review the > question that started this thread, namely, > > "What are the limitations of hypothesis testing using significance > tests > based on p-values?" Has the thread really wandered that much? My argument is basically that the misuse of hypothesis testing, from which most of the difficulties appear to arise, shouldn't be seen as a *limitation* of hypothesis testing. It just doesn't seem logical. Chris Sent via Deja.com http://www.deja.com/ Before you buy. = Instructions for joining and leaving this list and remarks about the problem of INAPPROPRIATE MESSAGES are available at http://jse.stat.ncsu.edu/ =
Re: questions on hypothesis
In article <[EMAIL PROTECTED]>, Thom Baguley <[EMAIL PROTECTED]> wrote: >Robert J. MacG. Dawson wrote: >> [EMAIL PROTECTED] wrote: >> > In article <[EMAIL PROTECTED]>, >> > Jerry Dallal <[EMAIL PROTECTED]> wrote: >> > > (1) statistical significance usually is unrelated to practical >> > > importance. >> > I don't think so. I can think of many examples in which statistical >> > inference plays an invaluable role in practical applications and >> > instrumentation, or indeed any "practical" application of a theory etc. >> > Not just in science, but engineering, e.g. aircraft design, studying the >> > brain, electrical engineering. Certainly there are examples of >> > statistical nonsense, e.g. polls, but I wouldn't go so far as to say it >> > is usually like this. >> Chris: That's not what Jerry means. What he's saying is that if your >> sample size is large enough, a difference may be statistically >> significant (a term which has a very precise meaning, especially to the >> Apostles of the Holy 5%) but not large enough to be practically >> important. [A hypothetical very large sample might show, let us say, >> that a very expensive diet supplement reduced one's chances of a heart >> attack by 1/10 of 1%.] Alternatively, in an imperfectly-controlled >> study, it may show an effect that - whether large enough to be of >> interest or not - is too small to ascribe a cause to. [A moderately >> large study might show that some ethnic group has a 1% higher rate of >> heart attacks, with a margin of error of +- .2%. But we might have, for >> an effect of this size, no way of telling whether it's due to genes, >> diet, socioeconomic factors, recreational drugs, or whatever.] >I'd add that I think Jerry meant "unrelated" in the sense of independent rather >than irrelevant (Jerry can correct me if I'm wrong). You can get important >significant effects, unimportant significant effects, important non-significant >effects and unimportant non-significant effects. >For what it's worth, practical importance also depends on many factors other >than effect size. These include mutability, generalizability, cost, and so on. This is another reason for not doing something as bad as significance tests. It has been argued that there may be many possible situations for action based on observations, and that the observations need to be summarized so that subsequent investigators can incorporate the studies. But the significance level, or the p-value, does not provide this summary; the likelihood function does. Other than the rather ridiculous statement of exactly what is accomplished by p-values, of what use are they, except religion? -- This address is for information only. I do not claim that these views are those of the Statistics Department or of Purdue University. Herman Rubin, Dept. of Statistics, Purdue Univ., West Lafayette IN47907-1399 [EMAIL PROTECTED] Phone: (765)494-6054 FAX: (765)494-0558 = Instructions for joining and leaving this list and remarks about the problem of INAPPROPRIATE MESSAGES are available at http://jse.stat.ncsu.edu/ =
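A small worked example may help show what Rubin means by the likelihood function carrying information that a single p-value does not. The binomial model and the counts (62 successes in 100 trials) are invented purely for illustration.

```python
# One number (the p-value for H0: p = 0.5) versus the whole likelihood curve.
from scipy.stats import binom, binomtest

k, n = 62, 100
print("two-sided p-value for H0: p = 0.5:", round(binomtest(k, n, 0.5).pvalue, 4))

# Relative likelihood, scaled so its maximum (at the MLE p = k/n) equals 1.
mle = k / n
for p in (0.40, 0.50, 0.55, 0.62, 0.70):
    rel = binom.pmf(k, n, p) / binom.pmf(k, n, mle)
    print(f"relative likelihood at p = {p:.2f}: {rel:.4f}")
# The curve shows which parameter values the data support and how strongly;
# the single p-value does not.
```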
Re: questions on hypothesis
Thom Baguley <[EMAIL PROTECTED]> wrote: >> You can get important significant effects, unimportant significant >> effects, important non-significant effects and unimportant >> non-significant effects. Radford Neal wrote: >I'll go for three out of four of these. But "important non-significant >effects"? > >That would be like saying "I think the benefits of this drug are large >enough to be important, even though I'm not convinced that it has any >benefit at all". Richard M. Barton <[EMAIL PROTECTED]> wrote: > ***I disagree. It could indicate lack of power. If your alpha > level had been higher, or if you had more subjects, you might have > found statistically significant results. Yes, if you did an experiment using more subjects you MIGHT obtain convincing evidence that the drug really does have a benefit. Or you might not. This is no different from what you could have said even before you did the first experiment. This POSSIBILITY doesn't justify saying that you found an "important but non-significant" effect. If you're trying to say that the experiment produced some evidence of a benefit, and that this evidence is enough to persuade you to recommend use of the drug, even though it's still possible that there is no real benefit, then I think that p-values are too crude a tool for what you want to do. You need to use Bayesian decision theory. Radford Neal Radford M. Neal [EMAIL PROTECTED] Dept. of Statistics and Dept. of Computer Science [EMAIL PROTECTED] University of Toronto http://www.cs.utoronto.ca/~radford = Instructions for joining and leaving this list and remarks about the problem of INAPPROPRIATE MESSAGES are available at http://jse.stat.ncsu.edu/ =
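Neal does not spell the calculation out, so the following is only a hedged sketch of the kind of Bayesian decision analysis he is pointing at: a Beta-Binomial model with invented trial counts, a flat prior, and an invented per-patient cost. None of these numbers come from the thread.

```python
# Posterior probability that the drug helps at all, plus a toy expected-utility
# comparison -- the sort of output a use/don't-use decision could rest on.
import numpy as np
from scipy.stats import beta

rng = np.random.default_rng(1)
# Hypothetical data: 14/20 recover on the drug, 10/20 on control; Beta(1,1) priors.
drug_post = beta(1 + 14, 1 + 6)
control_post = beta(1 + 10, 1 + 10)

draws = 100_000
diff = (drug_post.rvs(draws, random_state=rng)
        - control_post.rvs(draws, random_state=rng))
print("P(drug better than control | data):", round(float(np.mean(diff > 0)), 3))

# Toy decision rule: each extra recovery is worth 1 unit, the drug costs 0.05/patient.
print("expected net gain per patient if used:", round(float(np.mean(diff)) - 0.05, 3))
```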
Re: questions on hypothesis
In article <[EMAIL PROTECTED]>, [EMAIL PROTECTED] (dennis roberts) wrote: > > thus, the idea is that 5% and/or 1% were "chosen" due to the tables that > were available and not, some logical reasoning for these values? > > i don't see any logic to the notion that 5% and/or 1% ... have any special > nor simplification properties compared to say ... 9% or 3% > > given that it appears that these same values apply today ... that is, we > have been in a "stuck" mode for all these years ... is not very comforting > given that 5% and/or 1% were opted for because someone had worked out these > columns in a table I agree, and I think perhaps that although the original work focused on the 5% and 1% levels for practical reasons, the tradition persists b/c it provides a convenient criterion for journal editors in deciding between 'important' and 'unimportant' findings. Consequently, to increase the chances of being published, researchers sometimes resort to terms like "highly significant" in referring to low p values, which is really a quite nebulous statement (if not completely misleading- I shall leave that determination to the experts). To me, it seems that less emphasis on p values per se and more emphasis on power and effect size would increase the general quality and replicability of published data. Chris Sent via Deja.com http://www.deja.com/ Before you buy. = Instructions for joining and leaving this list and remarks about the problem of INAPPROPRIATE MESSAGES are available at http://jse.stat.ncsu.edu/ =
Re: questions on hypothesis
In article <[EMAIL PROTECTED]>, Jerry Dallal <[EMAIL PROTECTED]> wrote: >Many posters to this thread have used the phrase "practical >significance". I find it only confuses things. Just so all of us >are >clear on what we're talking about, might we restrict ourselves to >the terms "statistical significance" and "practical importance"? As most people consider statistical significance to be a measure of importance, I think practical significance should be maintained. The bad term is "statistical significance". -- This address is for information only. I do not claim that these views are those of the Statistics Department or of Purdue University. Herman Rubin, Dept. of Statistics, Purdue Univ., West Lafayette IN47907-1399 [EMAIL PROTECTED] Phone: (765)494-6054 FAX: (765)494-0558 = Instructions for joining and leaving this list and remarks about the problem of INAPPROPRIATE MESSAGES are available at http://jse.stat.ncsu.edu/ =
Re: questions on hypothesis
In article <8sill5$gvf$[EMAIL PROTECTED]>, <[EMAIL PROTECTED]> wrote: >In article <[EMAIL PROTECTED]>, > [EMAIL PROTECTED] (Robert J. MacG. Dawson) wrote: >> Fair enough: but I would argue that the right question is rarely "if >> there were no effect whatsoever, and the following model applied, what >> is the probability that we would observe a value of the following >> statistic at least as great as what was observed?" and hence that a >> hypothesis test is rarely the right way to obtain the right answer. >> Hypothesis testing does what it sets out to do perfectly well - the >> only question, in most cases, is why one would want that done. >I agree with this. From what I gauge from your rephrasing of the >research question, there seems to be no reason why most research >questions could not be phrased in this manner. Rather, it seems that the >problems with hypothesis testing result from people misusing it. Like I >said before, I don't think this can be seen as a problem with hypothesis >testing; but it is a matter for hypothesis *testers*. I disagree. This may be the case for questions of philosophical belief, but not for action, and publishing an article, or even discussion with colleagues, is action. Robert Dawson is quite right; few who understand what hypothesis testing actually does would use it. Those who started out using it, more than two centuries ago, had the mistaken belief that the significance level was, if not the probability of the truth of the hypothesis, at least a good indication of that. The situation, however, is generally that the hypothesis, stated in terms of the distribution of the observations, is at least almost always false. So why should the probability that we would observe a value of the statistic at least as great as what was observed, from a model which we would not believe anyhow, even be of importance? This does not mean that we should not do hypothesis testing. The null hypothesis might well be the best useful approximation available, given the observations. A more accurate model need not be more useful. One must consider all the consequences. >> Fair enough... I do not argue with your support of proper controls. >> However, in the real world, insisting on this would be tantamount to >> ending experimental research in the social sciences and many >> disciplines within the life sciences. (You may draw your own >> conclusions as to the advisability of this >Certainly, one could argue that anyone who wants to test a hypothesis >needs to adhere to the same guidelines. The fact that this frequently >doesn't happen is, again, the fault of people, not principles. One quick >glance at the social psychology literature, for example, reveals a >history replete with low power, inadequate controls and spurious >conclusions based on doubtful stats. (I'm going to annoy somebody here, I >just know it.) One must also consider the consequences of the action in other states of nature. Starting out with classical statistics makes it much harder to consider the full problem. Hypothesis testing has become a religion. The belief that there must be something this simple is what is driving it. -- This address is for information only. I do not claim that these views are those of the Statistics Department or of Purdue University. Herman Rubin, Dept.
of Statistics, Purdue Univ., West Lafayette IN47907-1399 [EMAIL PROTECTED] Phone: (765)494-6054 FAX: (765)494-0558 = Instructions for joining and leaving this list and remarks about the problem of INAPPROPRIATE MESSAGES are available at http://jse.stat.ncsu.edu/ =
Re: questions on hypothesis
"Richard M. Barton" wrote: > > --- Radford Neal wrote: > In article <[EMAIL PROTECTED]>, > Thom Baguley <[EMAIL PROTECTED]> wrote: > > > You can get important significant effects, unimportant significant > > effects, important non-significant effects and unimportant > > non-significant effects. > > I'll go for three out of four of these. But "important non-significant > effects"? > > That would be like saying "I think the benefits of this drug are large > enough to be important, even though I'm not convinced that it has any > benefit at all". Roughly: and if you agree that "convinced" is stronger than "think" there is no contradiction here. My guess is that early in the development of new drugs this is often an accurate description of the researcher's attitude, and the correct response is to do more research. However, a better phrasing might be "I think the benefits of this drug _might_turn_out_to_be_ large enough to be important, even though I'm not _yet_ convinced that it has any benefit at all". In other words, a reasonable interval estimate for the effect size contains some values of interest and we need more data. (We do need some other evidence that these values are plausible, of course; we cannot go haring off after every conjecture we can't disprove!) -Robert Dawson = Instructions for joining and leaving this list and remarks about the problem of INAPPROPRIATE MESSAGES are available at http://jse.stat.ncsu.edu/ =
Re: questions on hypothesis
--- Radford Neal wrote: In article <[EMAIL PROTECTED]>, Thom Baguley <[EMAIL PROTECTED]> wrote: > You can get important significant effects, unimportant significant > effects, important non-significant effects and unimportant > non-significant effects. I'll go for three out of four of these. But "important non-significant effects"? That would be like saying "I think the benefits of this drug are large enough to be important, even though I'm not convinced that it has any benefit at all". ***I disagree. It could indicate lack of power. If your alpha level had been higher, or if you had more subjects, you might have found statistically significant results. Of course, real conclusions are not black-and-white. We might not be convinced that the drug has an effect, but the benefit if it does might be so large that we'll use it on the off-chance that it does have an effect. But if you're using the black-and-white language of "significant" versus "not significant", it makes no sense to say that an effect is "important but not significant". Radford Neal Radford M. Neal [EMAIL PROTECTED] Dept. of Statistics and Dept. of Computer Science [EMAIL PROTECTED] University of Toronto http://www.cs.utoronto.ca/~radford = Instructions for joining and leaving this list and remarks about the problem of INAPPROPRIATE MESSAGES are available at http://jse.stat.ncsu.edu/ = --- end of quote --- Richard Barton, Statistical Consultant Dartmouth College Peter Kiewit Computing Services 6224 Baker/Berry Hanover, NH 03755 (603)-646-0255 = Instructions for joining and leaving this list and remarks about the problem of INAPPROPRIATE MESSAGES are available at http://jse.stat.ncsu.edu/ =
Re: questions on hypothesis
In article <[EMAIL PROTECTED]>, Thom Baguley <[EMAIL PROTECTED]> wrote: > You can get important significant effects, unimportant significant > effects, important non-significant effects and unimportant > non-significant effects. I'll go for three out of four of these. But "important non-significant effects"? That would be like saying "I think the benefits of this drug are large enough to be important, even though I'm not convinced that it has any benefit at all". Of course, real conclusions are not black-and-white. We might not be convinced that the drug has an effect, but the benefit if it does might be so large that we'll use it on the off-chance that it does have an effect. But if you're using the black-and-white language of "significant" versus "not significant", it makes no sense to say that an effect is "important but not significant". Radford Neal Radford M. Neal [EMAIL PROTECTED] Dept. of Statistics and Dept. of Computer Science [EMAIL PROTECTED] University of Toronto http://www.cs.utoronto.ca/~radford = Instructions for joining and leaving this list and remarks about the problem of INAPPROPRIATE MESSAGES are available at http://jse.stat.ncsu.edu/ =
Re: questions on hypothesis
Robert J. MacG. Dawson wrote: > [EMAIL PROTECTED] wrote: > > > > In article <[EMAIL PROTECTED]>, > > Jerry Dallal <[EMAIL PROTECTED]> wrote: > > > > > (1) statistical significance usually is unrelated to practical > > > importance. > > > > I don't think so. I can think of many examples in which statistical > > inference plays an invaluable role in practical applications and > > instrumentation, or indeed any "practical" application of a theory etc. > > Not just in science, but engineering, e.g. aircraft design, studying the > > brain, electrical engineering. Certainly there are examples of > > statistical nonsense, e.g. polls, but I wouldn't go so far as to say it > > is usually like this. > > Chris: That's not what Jerry means. What he's saying is that if your > sample size is large enough, a difference may be statistically > significant (a term which has a very precise meaning, especially to the > Apostles of the Holy 5%) but not large enough to be practically > important. [A hypothetical very large sample might show, let us say, > that a very expensive diet supplement reduced one's chances of a heart > attack by 1/10 of 1%.] Alternatively, in an imperfectly-controlled > study, it may show an effect that - whether large enough to be of > interest or not - is too small to ascribe a cause to. [A moderately > large study might show that some ethnic group has a 1% higher rate of > heart attacks, with a margin of error of +- .2%. But we might have, for > an effect of this size, no way of telling whether it's due to genes, > diet, socioeconomic factors, recreational drugs, or whatever.] I'd add that I think Jerry meant "unrelated" in the sense of independent rather than irrelevant (Jerry can correct me if I'm wrong). You can get important significant effects, unimportant significant effects, important non-significant effects and unimportant non-significant effects. For what it's worth, practical importance also depends on many factors other than effect size. These include mutability, generalizability, cost, and so on. Thom = Instructions for joining and leaving this list and remarks about the problem of INAPPROPRIATE MESSAGES are available at http://jse.stat.ncsu.edu/ =
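The "significant but trivially small" case is easy to put numbers on. The rates and sample sizes below are invented; they only illustrate how a 0.1 percentage-point risk reduction becomes overwhelmingly "significant" once the trial is enormous.

```python
# Two-proportion z-test on a huge (hypothetical) trial: 2.0% vs 1.9% event rates.
from statsmodels.stats.proportion import proportions_ztest

n = 2_000_000                                     # subjects per arm
events = [int(0.020 * n), int(0.019 * n)]         # placebo arm, supplement arm
z, p = proportions_ztest(events, [n, n])
print(f"z = {z:.1f}, p = {p:.1e}")                # a tiny p-value...
print("absolute risk reduction:", 0.020 - 0.019)  # ...for a 1/10-of-1% effect
```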
Re: questions on hypothesis
[EMAIL PROTECTED] wrote: > I > said before, I don't think this can be seen as a problem with hypothesis > testing; but it is a matter for hypothesis *testers*. Nothing wrong with this, but it might be a good time to review the question that started this thread, namely, "What are the limitations of hypothesis testing using significance tests based on p-values?" = Instructions for joining and leaving this list and remarks about the problem of INAPPROPRIATE MESSAGES are available at http://jse.stat.ncsu.edu/ =
Re: questions on hypothesis
Many posters to this thread have used the phrase "practical significance". I find it only confuses things. Just so all of us are clear on what we're talking about, might we restrict ourselves to the terms "statistical significance" and "practical importance"? = Instructions for joining and leaving this list and remarks about the problem of INAPPROPRIATE MESSAGES are available at http://jse.stat.ncsu.edu/ =
Re: questions on hypothesis
[EMAIL PROTECTED] wrote (in part): > I'm saying that the entire concept of practical significance is not only > subjective, but limited to the extent of current knowledge. You may > regard a 0.01% effect at this point in time as a trivial and (virtually) > artifactual byproduct of hypothesis testing. But if proper controls are > in place, then to do so is tantamount to ignoring an effect that, on the > balance of probabilities shouldn't be there if all things were equal. I > think we need to be cautious in ascribing effects as having little > practical significance and hence using this as an argument against > hypothesis testing. "Practical significance" is relevant if and only if there is some "practice" involved - that is to say, if a real-world decision is going to be based on the data. Such a decision _must_ be based on current knowledge, for want of any other; but if the data are preserved, a different decision can be based on them in the future if more is known then. (BTW: If a decision *is* to be made, a risk/benefit approach would seem more appropriate. Yes, it probably involves subjective decisions; but using fixed-level hypothesis testing to avoid that is a little like saying "as I might not choose exactly the right size of screwdriver I shall hit the screw with a hammer". If we do take the risks and benefits into account in "choosing a p-value", we are not really doing a classical hypothesis test, even though the calculations may coincide.) However, if a real-world decision is *not* going to be made, there is usually no need to fit the interpretation of marginal data into the Procrustean bed of dichotomous interpretation (which is the _raison_d'etre_ of the hypothesis test). Until there is overwhelming data one way or the other, our knowledge of the situation is in shades of gray, and representing it in black and white is a loss of information. -Robert Dawson = Instructions for joining and leaving this list and remarks about the problem of INAPPROPRIATE MESSAGES are available at http://jse.stat.ncsu.edu/ =
Re: questions on hypothesis
At 05:38 PM 10/17/00 -0700, David Heiser wrote: >The 5% is a historical artifact, the result of statistics being invented >before electronic computers were invented. an artifact is some anomaly of the data ... but, how could 5% be considered an artifact DUE to the lack of electronic computers? >The work in the early 1900's was severely restricted by the fact that >computations of the cumulative probability distribution involved tedious >paper and pencil calculations, and later on the use of mechanical >calculators. Available tables only gave the values for 5% and in some cases >1%. thus, the idea is that 5% and/or 1% were "chosen" due to the tables that were available and not, some logical reasoning for these values? i don't see any logic to the notion that 5% and/or 1% ... have any special nor simplification properties compared to say ... 9% or 3% given that it appears that these same values apply today ... that is, we have been in a "stuck" mode for all these years ... is not very comforting given that 5% and/or 1% were opted for because someone had worked out these columns in a table = Instructions for joining and leaving this list and remarks about the problem of INAPPROPRIATE MESSAGES are available at http://jse.stat.ncsu.edu/ =
Re: questions on hypothesis
- Original Message - From: <[EMAIL PROTECTED]> To: <[EMAIL PROTECTED]> Sent: Monday, October 16, 2000 4:24 PM Subject: Re: questions on hypothesis > In article <[EMAIL PROTECTED]>, > > Chris: That's not what Jerry means. What he's saying is that if > > your sample size is large enough, a difference may be statistically > > significant (a term which has a very precise meaning, especially to > > the Apostles of the Holy 5%) but not large enough to be practically > > important. [A hypothetical very large sample might show, let us say, > > that a very expensive diet supplement reduced one's chances of a heart > > attack by 1/10 of 1%.] > > Firstly, I think we can thank publication pressures for the church of > the Holy 5%. I go with Keppel's approach in suspending judgement for mid > range significance levels (although we should do this for nonsignificant > results anyway as they are inherently indeterminate). - The 5% is a historical artifact, the result of statistics being invented before electronic computers were invented. The work in the early 1900's was severely restricted by the fact that computations of the cumulative probability distribution involved tedious paper and pencil calculations, and later on the use of mechanical calculators. Available tables only gave the values for 5% and in some cases 1%. R.A. Fisher in his publications consistently referred to values well below 1% as being "convincing". To illustrate the fundamental test methods, he had to rely on available tables and chose 5% in most of his examples. However, he did not consider 5% as being "scientifically convincing". DAH = Instructions for joining and leaving this list and remarks about the problem of INAPPROPRIATE MESSAGES are available at http://jse.stat.ncsu.edu/ =
Re: questions on hypothesis
In article <[EMAIL PROTECTED]>, [EMAIL PROTECTED] (Robert J. MacG. Dawson) wrote: > > > > Wrt your example, it seems that the decision you are making about > > practical importance is purely subjective. > > What exactly do you mean by this? Are you saying that _my_ > example is purely subjective but that others are not, or that the > entire concept of practical significance is subjective? And, if so, so > what? Does it then follow that it is more "scientific" to ignore it > entirely? I'm saying that the entire concept of practical significance is not only subjective, but limited to the extent of current knowledge. You may regard a 0.01% effect at this point in time as a trivial and (virtually) artifactual byproduct of hypothesis testing. But if proper controls are in place, then to do so is tantamount to ignoring an effect that, on the balance of probabilities, shouldn't be there if all things were equal. I think we need to be cautious in ascribing effects as having little practical significance and hence using this as an argument against hypothesis testing. > Fair enough: but I would argue that the right question is rarely "if > there were no effect whatsoever, and the following model applied, what > is the probability that we would observe a value of the following > statistic at least as great as what was observed?" and hence that a > hypothesis test is rarely the right way to obtain the right answer. > Hypothesis testing does what it sets out to do perfectly well - the > only question, in most cases, is why one would want that done. I agree with this. From what I gauge from your rephrasing of the research question, there seems to be no reason why most research questions could not be phrased in this manner. Rather, it seems that the problems with hypothesis testing result from people misusing it. Like I said before, I don't think this can be seen as a problem with hypothesis testing; but it is a matter for hypothesis *testers*. > Fair enough... I do not argue with your support of proper controls. > However, in the real world, insisting on this would be tantamount to > ending experimental research in the social sciences and many > disciplines within the life sciences. (You may draw your own > conclusions as to the advisability of this Certainly, one could argue that anyone who wants to test a hypothesis needs to adhere to the same guidelines. The fact that this frequently doesn't happen is, again, the fault of people, not principles. One quick glance at the social psychology literature, for example, reveals a history replete with low power, inadequate controls and spurious conclusions based on doubtful stats. (I'm going to annoy somebody here, I just know it.) > - I will venture an opinion that it ain't a-gonna happen, advisable or > no.) There are always more experimental variables than we can control > for, and there are often explanatory variables of interest that it > would be impossible (e.g., ethnic background - unless we can emulate the > aliens on the Monty Python episode who could turn people into > Scotsmen!) or unethical to randomize. The best that one can hope to > do in such situations is control for nuisance variables whose effects > are judged likely to be large, and accept that any small > effect is of unknowable origin. I fully agree, although I would amend unknowable origin to _presently_ unknowable origin.
And I think this really hits the core of the issue: small effects, no matter where they come from, often turn out to be big effects (or disappear entirely) when greater knowledge allows us to refine proper control conditions. I think that is a valuable asset of hypothesis testing. It demands stringent adherence by its users but it rewards vigilance. Chris Sent via Deja.com http://www.deja.com/ Before you buy. = Instructions for joining and leaving this list and remarks about the problem of INAPPROPRIATE MESSAGES are available at http://jse.stat.ncsu.edu/ =
Re: questions on hypothesis
In article <[EMAIL PROTECTED]>, dennis roberts <[EMAIL PROTECTED]> wrote: >At 10:06 PM 10/16/00 +, Peter Lewycky wrote: >>It happens all the time in medicine. If I can show a p value 0.05 or >>less the researchers are delighted. Whenever I can't produce a p of 0.05 >>or less they start looking for another statistician and will even >>withhold a paper from publication. >gee ... this is too bad ... someone has sold all these folks in medicine a >bill of goods ... >some possibilities are: >1. those in medicine are not really taking any statistics courses >2. those in medicine are not really reading statistical material very carefully >3. those in medicine have had a bad run of luck WHEN taking data analysis >courses It is not just in medicine, but you will find that most of those who have taken a statistical methods course will have this attitude. Furthermore, it is hard for most to get over this. Many never do. Another in this line is the belief that things should be normally distributed. Some investigators designed an IQ test with a report ceiling, because they did not have enough subjects to use the normal distribution to evaluate higher IQs. This IS what people in the applied fields have "learned" in their statistics courses. -- This address is for information only. I do not claim that these views are those of the Statistics Department or of Purdue University. Herman Rubin, Dept. of Statistics, Purdue Univ., West Lafayette IN47907-1399 [EMAIL PROTECTED] Phone: (765)494-6054 FAX: (765)494-0558 = Instructions for joining and leaving this list and remarks about the problem of INAPPROPRIATE MESSAGES are available at http://jse.stat.ncsu.edu/ =
Re: questions on hypothesis
[EMAIL PROTECTED] wrote: > > In article <[EMAIL PROTECTED]>, > > Chris: That's not what Jerry means. What he's saying is that if > > your sample size is large enough, a difference may be statistically > > significant (a term which has a very precise meaning, especially to > > the Apostles of the Holy 5%) but not large enough to be practically > > important. [A hypothetical very large sample might show, let us say, > > that a very expensive diet supplement reduced one's chances of a heart > > attack by 1/10 of 1%.] > > Firstly, I think we can thank publication pressures for the church of > the Holy 5%. I go with Keppel's approach in suspending judgement for mid > range significance levels (although we should do this for nonsignificant > results anyway as they are inherently indeterminate). > > Wrt your example, it seems that the decision you are making about > practical importance is purely subjective. What exactly do you mean by this? Are you saying that _my_ example is purely subjective but that others are not, or that the entire concept of practical significance is subjective? And, if so, so what? Does it then follow that it is more "scientific" to ignore it entirely? In any number of alternative > situations a .01% effect could have major implications, practical and > theoretical. It might or it might not. I was referring to a hypothetical situation in which it seemed reasonable to suppose that it didn't. I regard this as less a fundamental flaw with hypothesis > testing and more a question of experimental design and asking the right > questions to begin with. Fair enough: but I would argue that the right question is rarely "if there were no effect whatsoever, and the following model applied, what is the probability that we would observe a value of the following statistic at least as great as what was observed?" and hence that a hypothesis test is rarely the right way to obtain the right answer. Hypothesis testing does what it sets out to do perfectly well - the only question, in most cases, is why one would want that done. > >Alternatively, in an imperfectly-controlled > > study, it may show an effect that - whether large enough to be of > > interest or not - is too small to ascribe a cause to. [A moderately > > large study might show that some ethnic group has a 1% higher rate of > > heart attacks, with a margin of error of +- .2%. But we might have, for > > an effect of this size, no way of telling whether it's due to genes, > > diet, socioeconomic factors, recreational drugs, or whatever.] > > Surely the ambiguity of this outcome is the result of the lack of > experimental control. If the effects of genetics, diet etc. are not > appropriately controlled, it doesn't matter what sample size is > used - the outcome will always be equivocal. What it does suggest is > that, irrespective of sample size, we must be vigilant in controlling > for extraneous variables. Is it fair to consider this a flaw of > hypothesis testing? We can hardly blame the tools for not working > properly if they are not used correctly. Fair enough... I do not argue with your support of proper controls. However, in the real world, insisting on this would be tantamount to ending experimental research in the social sciences and many disciplines within the life sciences. (You may draw your own conclusions as to the advisability of this - I will venture an opinion that it ain't a-gonna happen, advisable or no.)
There are always more experimental variables than we can control for, and there are often explanatory variables of interest that it would be impossible (e.g., ethnic background - unless we can emulate the aliens on the Monty Python episode who could turn people into Scotsmen!) or unethical to randomize. The best that one can hope to do in such situations is control for nuisance variables whose effects are judged likely to be large, and accept that any small effect is of unknowable origin. -Robert Dawson = Instructions for joining and leaving this list and remarks about the problem of INAPPROPRIATE MESSAGES are available at http://jse.stat.ncsu.edu/ =
Re: questions on hypothesis
At 10:06 PM 10/16/00 +, Peter Lewycky wrote: >It happens all the time in medicine. If I can show a p value of 0.05 or >less the researchers are delighted. Whenever I can't produce a p of 0.05 >or less they start looking for another statistician and will even >withhold a paper from publication. gee ... this is too bad ... someone has sold all these folks in medicine a bill of goods ... some possibilities are: 1. those in medicine are not really taking any statistics courses 2. those in medicine are not really reading statistical material very carefully 3. those in medicine have had a bad run of luck WHEN taking data analysis courses so, if someone in medicine looks at a paper with findings, and ... the p value is ok ... REGARDLESS OF THE DESIGN of the study or the way the investigation was carried out ... then the findings are meaningful ... but if the p value is greater than that magical cutoff ... even if the study seems sound ... then it is not worthy of the time of day? == dennis roberts, penn state university educational psychology, 8148632401 http://roberts.ed.psu.edu/users/droberts/drober~1.htm = Instructions for joining and leaving this list and remarks about the problem of INAPPROPRIATE MESSAGES are available at http://jse.stat.ncsu.edu/ =
Re: questions on hypothesis
In article <[EMAIL PROTECTED]>, > Chris: That's not what Jerry means. What he's saying is that if > your sample size is large enough, a difference may be statistically > significant (a term which has a very precise meaning, especially to > the Apostles of the Holy 5%) but not large enough to be practically > important. [A hypothetical very large sample might show, let us say, > that a very expensive diet supplement reduced one's chances of a heart > attack by 1/10 of 1%.] Firstly, I think we can thank publication pressures for the church of the Holy 5%. I go with Keppel's approach in suspending judgement for mid range significance levels (although we should do this for nonsignificant results anyway as they are inherently indeterminate). Wrt your example, it seems that the decision you are making about practical importance is purely subjective. In any number of alternative situations a .01% effect could have major implications, practical and theoretical. I regard this as less a fundamental flaw with hypothesis testing and more a question of experimental design and asking the right questions to begin with. >Alternatively, in an imperfectly-controlled > study, it may show an effect that - whether large enough to be of > interest or not - is too small to ascribe a cause to. [A moderately > large study might show that some ethnic group has a 1% higher rate of > heart attacks, with a margin of error of +- .2%. But we might have, for > an effect of this size, no way of telling whether it's due to genes, > diet, socioeconomic factors, recreational drugs, or whatever.] Surely the ambiguity of this outcome is the result of the lack of experimental control. If the effects of genetics, diet etc. are not appropriately controlled, it doesn't matter what sample size is used - the outcome will always be equivocal. What it does suggest is that, irrespective of sample size, we must be vigilant in controlling for extraneous variables. Is it fair to consider this a flaw of hypothesis testing? We can hardly blame the tools for not working properly if they are not used correctly. Chris Sent via Deja.com http://www.deja.com/ Before you buy. = Instructions for joining and leaving this list and remarks about the problem of INAPPROPRIATE MESSAGES are available at http://jse.stat.ncsu.edu/ =
Re: questions on hypothesis
It happens all the time in medicine. If I can show a p value of 0.05 or less the researchers are delighted. Whenever I can't produce a p of 0.05 or less they start looking for another statistician and will even withhold a paper from publication. "Simon, Steve, PhD" wrote: > > In a post to EDSTAT-L, you wrote: > > >I believe you will find that most researchers in the sciences > >accept the p-value as religion. In the report of the recent > >British study on Type 2 diabetes, there was an effect which > >was stated as "unimportant" because the p-value was .052. > > Do you have a citation for this? It sounds like an excellent teaching > example. > > Steve Simon, [EMAIL PROTECTED], Standard Disclaimer. > STATS: STeve's Attempt to Teach Statistics. http://www.cmh.edu/stats > > = > Instructions for joining and leaving this list and remarks about > the problem of INAPPROPRIATE MESSAGES are available at > http://jse.stat.ncsu.edu/ > = = Instructions for joining and leaving this list and remarks about the problem of INAPPROPRIATE MESSAGES are available at http://jse.stat.ncsu.edu/ =
RE: questions on hypothesis
In a post to EDSTAT-L, you wrote: >I believe you will find that most researchers in the sciences >accept the p-value as religion. In the report of the recent >British study on Type 2 diabetes, there was an effect which >was stated as "unimportant" because the p-value was .052. Do you have a citation for this? It sounds like an excellent teaching example. Steve Simon, [EMAIL PROTECTED], Standard Disclaimer. STATS: STeve's Attempt to Teach Statistics. http://www.cmh.edu/stats = Instructions for joining and leaving this list and remarks about the problem of INAPPROPRIATE MESSAGES are available at http://jse.stat.ncsu.edu/ =
Re: questions on hypothesis
[EMAIL PROTECTED] wrote: > > In article <[EMAIL PROTECTED]>, > Jerry Dallal <[EMAIL PROTECTED]> wrote: > > > (1) statistical significance usually is unrelated to practical > > importance. > > I don't think so. I can think of many examples in which statistical > inference plays an invaluable role in practical applications and > instrumentation, or indeed any "practical" application of a theory etc. > Not just in science, but engineering, e.g. aircraft design, studying the > brain, electrical engineering. Certainly there are examples of > statistical nonsense, e.g. polls, but I wouldn't go so far as to say it > is usually like this. Chris: That's not what Jerry means. What he's saying is that if your sample size is large enough, a difference may be statistically significant (a term which has a very precise meaning, especially to the Apostles of the Holy 5%) but not large enough to be practically important. [A hypothetical very large sample might show, let us say, that a very expensive diet supplement reduced one's chances of a heart attack by 1/10 of 1%.] Alternatively, in an imperfectly-controlled study, it may show an effect that - whether large enough to be of interest or not - is too small to ascribe a cause to. [A moderately large study might show that some ethnic group has a 1% higher rate of heart attacks, with a margin of error of +- .2%. But we might have, for an effect of this size, no way of telling whether it's due to genes, diet, socioeconomic factors, recreational drugs, or whatever.] > I *would* argue that without some method to determine the likelihood of > a difference b/w two conditions you have no chance of determining > practical importance at all. > > > (2) absence of evidence is not evidence of absence > > Everyone who has done elementary statistics is aware of this edict. But > what if your power is very high and/or you have very large N? I have > always found it surprising that we can't turn it around and develop a > probability that two groups are the same. In a frequentist philosophy, we are not allowed to do this, because the nature of the two populations has not been randomized in any well-defined way, so the concept of "probability" does not apply. The Bayesian approach, which permits probabilities to be assigned to statements about parameters, *does* allow us to answer such questions. However, it depends, in general, on the "prior distribution" of the parameters that you select. In many cases, this makes it hard to make definitive statements (though if you have a lot of data it may well be that all plausible priors produce similar posterior distributions). However, here - with continuous parameters - the probability that the parameters of two disjoint groups are _the_same_ is easy to compute - it's 0. Like the probability of two people being exactly the same height. If you want to ask, in a Bayesian framework, for the probability that two population parameters are equal to within some specified tolerance, go right ahead. Alternatively, within a frequentist framework, you can test the hypothesis that the absolute value of the difference is less than some specified level. -Robert Dawson = Instructions for joining and leaving this list and remarks about the problem of INAPPROPRIATE MESSAGES are available at http://jse.stat.ncsu.edu/ =
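Dawson's closing frequentist option (test whether the absolute difference is smaller than some tolerance) is usually carried out as two one-sided tests. The sketch below uses statsmodels' ttost_ind with invented data and an invented tolerance of +/-0.5; treat it as an illustration, not a recipe from the post.

```python
# Equivalence (TOST) test: is the mean difference between two groups inside +/-0.5?
import numpy as np
from statsmodels.stats.weightstats import ttost_ind

rng = np.random.default_rng(3)
g1 = rng.normal(0.0, 1.0, 200)
g2 = rng.normal(0.1, 1.0, 200)          # nearly the same population

p_equiv, lower, upper = ttost_ind(g1, g2, -0.5, 0.5)
print(f"TOST p-value for 'difference within +/-0.5': {p_equiv:.4f}")
# A small p-value here is evidence of equivalence within the tolerance -- a very
# different statement from merely failing to reject a zero difference.
```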
Re: questions on hypothesis
On Sat, 14 Oct 2000 01:56:32 GMT, [EMAIL PROTECTED] wrote: < snip > > > (2) absence of evidence is not evidence of absence > > Everyone who has done elementary statistics is aware of this edict. But > what if your power is very high and/or you have very large N? I have > always found it surprising that we can't turn it around and develop a > probability that two groups are the same. Power or beta is surely > correlated with the certainty of this approach. > Chris, What you get when you "turn it around" is a set of confidence limits. The range of the limits may be arbitrarily narrow, as the N gets arbitrarily large. "Bioequivalence" is a live issue for the (U.S.) Food and Drug Administration. Is a generic version of a drug "the same" as the patented version? Back in the 1970s ( I think I have this straight), it was enough to have a "suitably powerful study" and fail to show that it is different. What was Officially acceptable was revised in the 1980s to use Confidence limits; and I think what ought to comprise acceptable studies is under discussion again, right now. (I say "officially" because it is my impression that actual decisions were made by committees, and were not held to that standard.) But look at how large an N it takes to show that 3% mortality for a treatment is different from 5%, or from 4% - just as the marginal test, never mind having POWER. -- Rich Ulrich, [EMAIL PROTECTED] http://www.pitt.edu/~wpilib/index.html = Instructions for joining and leaving this list and remarks about the problem of INAPPROPRIATE MESSAGES are available at http://jse.stat.ncsu.edu/ =
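To put numbers on Ulrich's closing remark, here is a back-of-the-envelope sample-size calculation using the usual normal-approximation formula for comparing two proportions (a sketch: exact binomial methods would give somewhat different counts, and the 80% power figure is only a conventional choice).

    from math import sqrt, ceil
    from scipy.stats import norm

    def n_per_group(p1, p2, alpha=0.05, power=0.80):
        """Approximate n per group for a two-sided test of p1 vs p2."""
        z_a = norm.ppf(1 - alpha / 2)
        z_b = norm.ppf(power)
        p_bar = (p1 + p2) / 2
        num = (z_a * sqrt(2 * p_bar * (1 - p_bar))
               + z_b * sqrt(p1 * (1 - p1) + p2 * (1 - p2)))
        return ceil((num / abs(p1 - p2)) ** 2)

    print(n_per_group(0.03, 0.05))   # 3% vs 5% mortality: on the order of 1,500 per arm
    print(n_per_group(0.03, 0.04))   # 3% vs 4%: several times larger again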
Re: questions on hypothesis
- Original Message - From: Ting Ting <[EMAIL PROTECTED]> To: <[EMAIL PROTECTED]> Sent: Friday, October 13, 2000 10:57 PM Subject: Re: questions on hypothesis > > > > A good example of a simple situation for which exact P values are > > unavailable is the Behrens-Fisher problem (testing the equality of > > normal means from normal populations with unequal variances). Some > > might say we have approximate solutions that are good enough. I see this as an imprecise statement of a hypothesis. From set theory, I can see several different logical constructs, each of which would arrive at a different probability distribution, and consequently different p values. It boils down to just what is the hypothesis on the generator of the data. Is it a statement of logical equality or the value of a difference function? Does sample "A" come from process "a" and sample "B" come from process "b", or do both samples come from process "c"? The problem is simplified when process "a" and process "b" are known. When process "a" and "b" are not known, we have that Fisher problem of defining a set of all "a" parameter values <= to a given p1 value and defining a set of all "b" parameter values <= to a given p2 value. When the processes are one parameter processes, everything is straightforward. (Fisher in his book-set very nicely used one parameter distributions to illustrate his ideas.) However, for a two-parameter process, the Behrens-Fisher problem states an equality (intersection) of mean values and a disjoint of variance values, which cannot be analytically combined (given the normal distribution function) in terms of a single p value. Consequently, one finds in the textbooks all the different approaches to establish a "c" process, for which tests can be constructed to determine if "A" and "B" come from the process "c" or not. The hypothesis being tested is then based on process "c", not on the original idea. DAH = Instructions for joining and leaving this list and remarks about the problem of INAPPROPRIATE MESSAGES are available at http://jse.stat.ncsu.edu/ =
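For readers who want to see what the usual textbook "process c" workaround looks like in practice, the sketch below runs Welch's approximate test on made-up data (the numbers and seed are arbitrary; the point is only that the reported p-value rests on an estimated, approximate degrees-of-freedom correction, which is exactly the compromise DAH describes).

    import numpy as np
    from scipy.stats import ttest_ind

    rng = np.random.default_rng(0)
    a = rng.normal(loc=10.0, scale=1.0, size=25)   # process "a": small variance
    b = rng.normal(loc=10.0, scale=5.0, size=40)   # process "b": large variance

    t_stat, p_val = ttest_ind(a, b, equal_var=False)  # Welch: no equal-variance assumption
    print(f"Welch t = {t_stat:.2f}, approximate p = {p_val:.3f}")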
Re: questions on hypothesis
On Sat, 14 Oct 2000 [EMAIL PROTECTED] wrote, inter alia: > I *would* argue that without some method to determine the likelihood of > a difference b/w two conditions you have no chance of determining > practical importance at all. But hypothesis testing procedures do not establish any such likelihood. What they may establish is the likelihood of observing data like these, IF the null hypothesis be true. That is not "the likelihood of a difference between two [or more] conditions". > But what if your power is very high and/or you have very large N? I > have always found it surprising that we can't turn it around and > develop a probability that two groups are the same. Power or beta is > surely correlated with the certainty of this approach. Again, we cannot "determine a probability that two [or more] groups are the same". What we can do is determine the probability (beta) that we could NOT reject the null hypothesis, IF the true state of affairs be a specified degree of departure from the null hypothesis [of, presumably, no difference]. (Or, if you prefer, the probability (power) that we COULD reject the null, given that degree of departure from it.) -- DFB. -- Donald F. Burrill[EMAIL PROTECTED] 348 Hyde Hall, Plymouth State College, [EMAIL PROTECTED] MSC #29, Plymouth, NH 03264 (603) 535-2597 Department of Mathematics, Boston University[EMAIL PROTECTED] 111 Cummington Street, room 261, Boston, MA 02215 (617) 353-5288 184 Nashua Road, Bedford, NH 03110 (603) 471-7128 = Instructions for joining and leaving this list and remarks about the problem of INAPPROPRIATE MESSAGES are available at http://jse.stat.ncsu.edu/ =
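Burrill's distinction can be made concrete with a small calculation. The "power" below is defined only relative to a specified alternative (here an assumed true difference of 0.5 standard deviations with 30 observations per group, both figures invented for the example); change the assumed difference and the answer changes with it.

    from math import sqrt
    from scipy.stats import t, nct

    n, delta, alpha = 30, 0.5, 0.05        # per-group n, assumed effect in SD units
    df = 2 * n - 2
    ncp = delta * sqrt(n / 2)              # noncentrality under that alternative
    t_crit = t.ppf(1 - alpha / 2, df)      # two-sided critical value

    power = nct.sf(t_crit, df, ncp) + nct.cdf(-t_crit, df, ncp)
    print(f"power = {power:.2f}, beta = {1 - power:.2f}")

There is no single "probability that the two groups are the same" to read off here; beta is conditional on the departure from the null that one chooses to specify.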
Re: questions on hypothesis
In article <[EMAIL PROTECTED]>, San <[EMAIL PROTECTED]> wrote: >Would there be some cases which the p-value are so difficult to find >that it's nearly impossible? Is this a kind of limitation to the >hypothesis testing using p-value? Is there any substitute for the >p-value? >Thx for ur reply. This is often the case, but I do not believe it is what is being referred to. One can have very low p-values and very little importance, and high p-values and great importance. When it comes to deciding what action to take, the p-value without other information may even be misleading. For example, suppose there are two treatments for a disease. One is significant at a p-value of .001, and the other gives a "nonsignificant" p-value of .2. From the data, I might very well prefer the one which is not significant. >Jerry Dallal wrote: >> I wrote: >> > (1) statistical significance usually is unrelated to practice >> > importance. >> I meant to type "practical importance". -- This address is for information only. I do not claim that these views are those of the Statistics Department or of Purdue University. Herman Rubin, Dept. of Statistics, Purdue Univ., West Lafayette IN47907-1399 [EMAIL PROTECTED] Phone: (765)494-6054 FAX: (765)494-0558 = Instructions for joining and leaving this list and remarks about the problem of INAPPROPRIATE MESSAGES are available at http://jse.stat.ncsu.edu/ =
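Rubin's two-treatment example is easy to mock up (all counts below are invented): a precisely measured sliver of benefit comes out "significant" at roughly the .001 level, while a much larger benefit from a small trial does not reach .05, and a decision-maker might still reasonably prefer the second treatment.

    from math import sqrt
    from scipy.stats import norm

    def summarize(name, events_t, n_t, events_c, n_c):
        p_t, p_c = events_t / n_t, events_c / n_c
        diff = p_t - p_c
        se = sqrt(p_t * (1 - p_t) / n_t + p_c * (1 - p_c) / n_c)
        p_value = 2 * norm.sf(abs(diff / se))
        print(f"{name}: estimated benefit = {diff:+.1%}, p = {p_value:.3g}")

    # cures under treatment / n, cures under control / n  (hypothetical trials)
    summarize("Treatment A (huge trial) ", 10_450, 100_000, 10_000, 100_000)
    summarize("Treatment B (small trial)",     15,      40,     10,     40)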
Re: questions on hypothesis
In article <8s8egf$n5f$[EMAIL PROTECTED]>, <[EMAIL PROTECTED]> wrote: >In article <[EMAIL PROTECTED]>, > Jerry Dallal <[EMAIL PROTECTED]> wrote: >> (1) statistical significance usually is unrelated to practice >> importance. >I don't think so. I can think of many examples in which statistical >inference plays an invaluable role in practical applications and >instrumentation, or indeed any "practical" application of a theory etc. >Not just in science, but engineering, e.g aircraft design, studying the >brain, electrical enginerring. Certainly there are examples of >statistical nonsense, e.g. polls, but i wouldn't go so far as to say it >is usually like this. >I *would* argue that without some method to determine the likelihood of >a difference b/w two conditions you have no chance of determining >practical importance at all. >> (2) absence of evidence is not evidence of absence >Everyone who has done elementary statistics is aware of this edict. But >what if your power is very high and/or you have very large N? I have >always found it surprising that we can't turn it around and develop a >probability that two groups are the same. Power or beta is surely >correlated with the certainty of this approach. I believe you will find that most researchers in the sciences accept the p-value as religion. In the report of the recent British study on Type 2 diabetes, there was an effect which was stated as "unimportant" because the p-value was .052. The likelihood function contains all the information in the data for the purpose of making a decision. Without having extraneous information, like the sample size and which test is being performed, and more, the p-value cannot be obtained with any amount of work. And one needs even more to get the likelihood function from the p-value. -- This address is for information only. I do not claim that these views are those of the Statistics Department or of Purdue University. Herman Rubin, Dept. of Statistics, Purdue Univ., West Lafayette IN47907-1399 [EMAIL PROTECTED] Phone: (765)494-6054 FAX: (765)494-0558 = Instructions for joining and leaving this list and remarks about the problem of INAPPROPRIATE MESSAGES are available at http://jse.stat.ncsu.edu/ =
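One way to see Rubin's point about information (a sketch, using the simplest possible model: a normal mean with known standard deviation): two experiments rigged to give exactly the same p-value carry very different likelihood functions for the mean, so the p-value alone cannot tell you what the data actually support.

    import numpy as np
    from scipy.stats import norm

    sigma = 1.0
    mu_grid = np.linspace(-0.5, 1.5, 401)

    for n in (10, 1000):
        xbar = 2 * sigma / np.sqrt(n)        # chosen so that z = 2 in both experiments
        p_value = 2 * norm.sf(2.0)           # identical two-sided p-value
        # Likelihood of mu given xbar: N(xbar; mu, sigma/sqrt(n)), up to a constant
        like = norm.pdf(xbar, loc=mu_grid, scale=sigma / np.sqrt(n))
        peak = mu_grid[np.argmax(like)]
        width = np.ptp(mu_grid[like >= like.max() / 8])   # rough 1/8-likelihood interval
        print(f"n = {n:4d}: p = {p_value:.3f}, likelihood peaks near {peak:.2f}, "
              f"width ~ {width:.2f}")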
Re: questions on hypothesis
In article <[EMAIL PROTECTED]>, Ting Ting <[EMAIL PROTECTED]> wrote: >> A good example of a simple situation for which exact P values are >> unavailable is the Behrens-Fisher problem (testing the equality of >> normal means from normal populations with unequal variances). Some >> might say we have approximate solutions that are good enough. >would u pls give some more detail examples about this? >thx I can give you simple randomized procedures with easily computable exact p-values, and which only lose in degrees of freedom compared to known variances. Also, Linnik has shown the existence of non-randomized procedures which can do this. -- This address is for information only. I do not claim that these views are those of the Statistics Department or of Purdue University. Herman Rubin, Dept. of Statistics, Purdue Univ., West Lafayette IN47907-1399 [EMAIL PROTECTED] Phone: (765)494-6054 FAX: (765)494-0558 = Instructions for joining and leaving this list and remarks about the problem of INAPPROPRIATE MESSAGES are available at http://jse.stat.ncsu.edu/ =
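Rubin does not spell the procedures out, and the sketch below is not necessarily what he has in mind, but one classical construction in this spirit pairs the observations at random and t-tests the paired differences: under normality the differences are i.i.d. normal whatever the two variances are, so the resulting p-value is exact, at the price of degrees of freedom (only min(m, n) - 1) and of depending on the auxiliary randomization.

    import numpy as np
    from scipy.stats import ttest_1samp

    def paired_difference_test(x, y, rng):
        m = min(len(x), len(y))
        d = rng.permutation(x)[:m] - rng.permutation(y)[:m]
        return ttest_1samp(d, popmean=0.0)    # exact t with m - 1 degrees of freedom

    rng = np.random.default_rng(1)
    x = rng.normal(0.0, 1.0, size=20)
    y = rng.normal(0.0, 4.0, size=35)         # same mean, very different variance
    print(paired_difference_test(x, y, rng))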
Re: questions on hypothesis
Gene Gallagher wrote: > Can someone recommend a good book on the history of statistics, > especially one focusing on Fisher's accomplishments. Fisher's > contributions and prickly personality are dealt with tangentially > in Provine's wonderful biography of Sewall Wright. > Surely, Fisher has merited one or more scholarly biographies of > his own. See Box, Joan Fisher (1978), R. A. FISHER: THE LIFE OF A SCIENTIST. New York: John Wiley. Joan Fisher Box is Sir Ronald Aylmer Fisher's daughter. --- Donald B. Macnaughton MatStat Research Consulting Inc [EMAIL PROTECTED] Toronto, Canada --- = Instructions for joining and leaving this list and remarks about the problem of INAPPROPRIATE MESSAGES are available at http://jse.stat.ncsu.edu/ =
Re: questions on hypothesis
> > A good example of a simple situation for which exact P values are > unavailable is the Behrens-Fisher problem (testing the equality of > normal means from normal populations with unequal variances). Some > might say we have approximate solutions that are good enough. > Would you please give some more detailed examples about this? Thanks. = Instructions for joining and leaving this list and remarks about the problem of INAPPROPRIATE MESSAGES are available at http://jse.stat.ncsu.edu/ =
Re: questions on hypothesis
In article <[EMAIL PROTECTED]>, Jerry Dallal <[EMAIL PROTECTED]> wrote: > (1) statistical significance usually is unrelated to practice > importance. I don't think so. I can think of many examples in which statistical inference plays an invaluable role in practical applications and instrumentation, or indeed any "practical" application of a theory etc. Not just in science, but engineering, e.g. aircraft design, studying the brain, electrical engineering. Certainly there are examples of statistical nonsense, e.g. polls, but I wouldn't go so far as to say it is usually like this. I *would* argue that without some method to determine the likelihood of a difference between two conditions you have no chance of determining practical importance at all. > (2) absence of evidence is not evidence of absence Everyone who has done elementary statistics is aware of this edict. But what if your power is very high and/or you have very large N? I have always found it surprising that we can't turn it around and develop a probability that two groups are the same. Power or beta is surely correlated with the certainty of this approach. Chris Sent via Deja.com http://www.deja.com/ Before you buy. = Instructions for joining and leaving this list and remarks about the problem of INAPPROPRIATE MESSAGES are available at http://jse.stat.ncsu.edu/ =
Re: questions on hypothesis
> As to Observational studies -- > > http://www.cnr.colostate.edu/~anderson/thompson1.html > > This is a short article and long bibliography. The title is direct: > "326 Articles/Books Questioning the Indiscriminate Use of > Statistical Hypothesis Tests in Observational Studies" > (Compiled by William L. Thompson) This bibliography has many articles apparently discussing Fisher's views on p values. Can someone recommend a good book on the history of statistics, especially one focusing on Fisher's accomplishments? Fisher's contributions and prickly personality are dealt with tangentially in Provine's wonderful biography of Sewall Wright. Surely, Fisher has merited one or more scholarly biographies of his own. -- Eugene D. Gallagher ECOS, UMASS/Boston Sent via Deja.com http://www.deja.com/ Before you buy. = Instructions for joining and leaving this list and remarks about the problem of INAPPROPRIATE MESSAGES are available at http://jse.stat.ncsu.edu/ =
Re: questions on hypothesis
San wrote: > > Would there be some cases which the p-value are so difficult to find > that it's nearly impossible? I'm tempted to say "not under a randomization model" but, yes, there are many problems for which P values are not readily available. Perhaps P values are unavailable for *most* problems--it's just that we're so good at figuring out new uses for the cases we can solve! A good example of a simple situation for which exact P values are unavailable is the Behrens-Fisher problem (testing the equality of normal means from normal populations with unequal variances). Some might say we have approximate solutions that are good enough. > Is this a kind of limitation to the > hypothesis testing using p-value? Yes. Stepwise procedures (regression, in particular) are good examples. > Is there any substitute for the > p-value? Many. You could start with likelihood procedures, Bayes methods, and decision theory. = Instructions for joining and leaving this list and remarks about the problem of INAPPROPRIATE MESSAGES are available at http://jse.stat.ncsu.edu/ =
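Dallal's stepwise example can be checked directly by simulation (a sketch; the sample size and the number of candidate predictors are arbitrary): select the single most correlated of 20 pure-noise predictors and look at its nominal p-value.

    import numpy as np
    from scipy.stats import pearsonr

    rng = np.random.default_rng(2)
    n_obs, n_pred, n_sims = 50, 20, 2000
    false_hits = 0
    for _ in range(n_sims):
        y = rng.normal(size=n_obs)
        X = rng.normal(size=(n_obs, n_pred))      # every predictor is pure noise
        pvals = []
        for j in range(n_pred):
            _, p = pearsonr(X[:, j], y)
            pvals.append(p)
        false_hits += min(pvals) < 0.05
    print(f"'best' predictor significant in {false_hits / n_sims:.0%} of datasets")
    # With 20 independent looks the expected rate is about 1 - 0.95**20, or 64%.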
Re: questions on hypothesis
On Thu, 12 Oct 2000, dennis roberts wrote in part: > one nice full issue of a journal about this general topic of > hull hypothesis testing ... Dealing with problems in naval architecture, one presumes? -- Don. -- Donald F. Burrill[EMAIL PROTECTED] 348 Hyde Hall, Plymouth State College, [EMAIL PROTECTED] MSC #29, Plymouth, NH 03264 (603) 535-2597 Department of Mathematics, Boston University[EMAIL PROTECTED] 111 Cummington Street, room 261, Boston, MA 02215 (617) 353-5288 184 Nashua Road, Bedford, NH 03110 (603) 471-7128 = Instructions for joining and leaving this list and remarks about the problem of INAPPROPRIATE MESSAGES are available at http://jse.stat.ncsu.edu/ =
Re: questions on hypothesis
Would there be some cases in which the p-value is so difficult to find that it's nearly impossible? Is this a kind of limitation of hypothesis testing using p-values? Is there any substitute for the p-value? Thanks for your reply. Jerry Dallal wrote: > > I wrote: > > > (1) statistical significance usually is unrelated to practice > > importance. > > I meant to type "practical importance". = Instructions for joining and leaving this list and remarks about the problem of INAPPROPRIATE MESSAGES are available at http://jse.stat.ncsu.edu/ =
Re: questions on hypothesis
one nice full issue of a journal about this general topic of hull hypothesis testing that i came across recently is: Research in the Schools, Vol 5, Number 2, Fall 1998 ... you could contact jim mclean at ... jmclean@ etsu.edu ... and inquire about obtaining a copy we are in the process of considering uploading these article files to a website ... but, the details have to be worked out ... this issue hits almost every salient issue with respect to this topic and, provides (along with the url that had 326 that rich ulrich sent) ... lots of good references on this topic At 01:42 PM 10/12/00 +, Jerry Dallal wrote: >I wrote: > > > (1) statistical significance usually is unrelated to practice > > importance. > >I meant to type "practical importance". > > >= >Instructions for joining and leaving this list and remarks about >the problem of INAPPROPRIATE MESSAGES are available at > http://jse.stat.ncsu.edu/ >= = Instructions for joining and leaving this list and remarks about the problem of INAPPROPRIATE MESSAGES are available at http://jse.stat.ncsu.edu/ =
Re: questions on hypothesis
I wrote: > (1) statistical significance usually is unrelated to practice > importance. I meant to type "practical importance". = Instructions for joining and leaving this list and remarks about the problem of INAPPROPRIATE MESSAGES are available at http://jse.stat.ncsu.edu/ =
Re: questions on hypothesis
< also posted to sci.stat.math, sci.stat.consult where separate versions of the same question were posted. > On Wed, 11 Oct 2000 23:25:05 +0800, San <[EMAIL PROTECTED]> wrote: > What are the limitations of hypothesis testing using significance tests > based on p-values? > > Can someone suggest me where I can find some reference book related to > the topics above? > thank you As to Observational studies -- http://www.cnr.colostate.edu/~anderson/thompson1.html This is a short article and long bibliography. The title is direct: "326 Articles/Books Questioning the Indiscriminate Use of Statistical Hypothesis Tests in Observational Studies" (Compiled by William L. Thompson) -- Rich Ulrich, [EMAIL PROTECTED] http://www.pitt.edu/~wpilib/index.html = Instructions for joining and leaving this list and remarks about the problem of INAPPROPRIATE MESSAGES are available at http://jse.stat.ncsu.edu/ =
questions on hypothesis
What are the limitations of hypothesis testing using significance tests based on p-values? Can someone suggest where I can find a reference book related to the topics above? Thank you. = Instructions for joining and leaving this list and remarks about the problem of INAPPROPRIATE MESSAGES are available at http://jse.stat.ncsu.edu/ =
Re: questions on hypothesis
San wrote: > > What are the limitations of hypothesis testing using significance tests > based on p-values? > (1) statistical significance usually is unrelated to practice importance. (2) absence of evidence is not evidence of absence http://www.bmj.com/cgi/content/full/311/7003/485 = Instructions for joining and leaving this list and remarks about the problem of INAPPROPRIATE MESSAGES are available at http://jse.stat.ncsu.edu/ =
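A closing sketch of point (2) (all numbers invented): with a real but modest effect and a small sample, most studies fail to reach p < .05, and each individual "non-significant" result is absence of evidence rather than evidence of absence.

    import numpy as np
    from scipy.stats import ttest_ind

    rng = np.random.default_rng(3)
    n, true_effect, sims = 20, 0.3, 2000       # 20 per group, true 0.3 SD difference
    missed = 0
    for _ in range(sims):
        a = rng.normal(0.0, 1.0, n)
        b = rng.normal(true_effect, 1.0, n)
        _, p = ttest_ind(a, b)
        missed += p >= 0.05
    print(f"real effect missed (p >= .05) in {missed / sims:.0%} of simulated studies")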