Re: One-tailed, two-tailed
In article [EMAIL PROTECTED], Stan Brown [EMAIL PROTECTED] writes:

> I think I've got some sort of mental block on the following point. Can someone explain this to me, plainly and simply, please?
>
> Let me start with a sample problem, NOT created by me:
>
> [The student is led to enter two sets of unpaired figures into Excel. They represent miles per gallon with gasoline A and gasoline B. ... The question is whether there is a difference in gasoline mileage. The student is led to a two-sample F test for homoscedasticity; p=0.1886 so the samples are treated as homoscedastic. Now the problem says: ]
>
> Now the main t-test ... Two Sample Assuming Equal Variances. ... Use two-tail results (since '=/=' in Ha). ... What is the P-val for the t-test? [Answer: p=.0002885]
>
> What's your conclusion about the difference in gas mileage? [Answer: At significance level 5%, previously selected, there is a difference between them.]
>
> Now we come to the part I'm having conceptual trouble with:
>
> Have you proven that one gas gives better mileage than the other? If so, which one is better?
>
> Now obviously if the two are different then one is better, and if one is better it's probably B since B had the higher sample mean. But are we in fact justified in jumping from a two-tailed test (=/=) to a one-tailed result (< or >)? Here we have a tiny p-value, and in fact a one-tailed test gives a p-value of 0.0001443.

The significance value associated with the one-tailed test will always be half the significance value associated with the two-tailed test, so your cautious strategy will make the same decisions as doing just a two-tailed test.

I tend to think of significance tests as rulebooks designed to ensure long-term false alarm rates if they are followed. The two-tailed test consists of limits at +/- X set to ensure that if the true difference is zero, you get a false alarm (see a value outside +/- X) with probability according to the significance level.

If the true state of affairs is that the true difference is (e.g.) 13.0, then you are correct if you declare that the difference is > 0, and we are implicitly ignoring errors you make by declaring that the difference is = 0. You can only make an error by declaring that the difference is < 0. But when the true situation is that the difference is zero, you are also making an error if you say that the difference is < 0. In fact, it is easier to make the error in this situation, because you are more likely to see a negative statistic when the true value is 0 than when the true value is 13. So the error rate from jumping the wrong way when there is a true difference is less than the error rate from jumping any way when there is no true difference, and you are justified in stating the direction of the difference.

Note that this doesn't generalise very far - see any number of write-ups about testing for differences after ANOVA has dismissed the hypothesis that everything is equal to everything else.

-- 
A. G. McDowell

= Instructions for joining and leaving this list and remarks about the problem of INAPPROPRIATE MESSAGES are available at http://jse.stat.ncsu.edu/ =
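[Editorial illustration, not part of the original post.] The error-rate argument above can be sketched numerically. The snippet below assumes a z-test with a known standard error (a normal approximation rather than the t-test from the exercise), and the true difference of 2 standard errors is a hypothetical value chosen only for illustration. It computes how often a two-tailed rule followed by "declare the sign you observed" would announce each direction.

```python
from statistics import NormalDist


def direction_rates(true_diff, std_err, alpha=0.05):
    """Two-tailed z-test at level alpha, then declare the observed sign.

    Returns (P(declare positive), P(declare negative)) when the test
    statistic is Normal(true_diff / std_err, 1). Assumes a known
    standard error, which is a simplification of the t-test.
    """
    z = NormalDist()
    crit = z.inv_cdf(1 - alpha / 2)      # the "+/- X" limit
    shift = true_diff / std_err          # mean of the test statistic
    p_declare_pos = 1 - z.cdf(crit - shift)
    p_declare_neg = z.cdf(-crit - shift)
    return p_declare_pos, p_declare_neg


# No true difference: any declaration is a false alarm.
null_pos, null_neg = direction_rates(0.0, 1.0)
false_alarm = null_pos + null_neg

# Hypothetical true difference of +2 standard errors:
# only a "negative" declaration is a wrong-direction error.
alt_pos, alt_neg = direction_rates(2.0, 1.0)

print("false alarm rate under no difference:", false_alarm)
print("wrong-direction rate under a true positive difference:", alt_neg)
```

Under these assumptions the wrong-direction rate never exceeds half the two-tailed significance level, which is the sense in which stating the direction after a two-tailed rejection keeps the long-run error rate under control.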
Which one fit better??
I plotted a histogram density of my data and a smoothed version of it using a normal kernel function. I also tried to plot on top of these the estimated PDFs (Laplacian and Generalised Gaussian), fitted by the maximum likelihood method. Graphically, it seems that the Laplacian will fit the histogram density better, while the Generalised Gaussian will fit the smoothed version (i.e. the kernel density estimate) better.

What can I justifiably conclude from this? Which one fits better? Or do I need to perform some goodness-of-fit test before I can say which estimated PDF fits better?

Thanks.
CCC
Re: One-tailed, two-tailed
A. G. McDowell [EMAIL PROTECTED] wrote in sci.stat.edu:

> The significance value associated with the one-tailed test will always be half the significance value associated with the two-tailed test,

For means, yes. Not for proportions, I think. (I wasn't asking about a proportion in my original query.)

> If the true state of affairs is that the true difference is (e.g.) 13.0, then you are correct if you declare that the difference is > 0, and we are implicitly ignoring errors you make by declaring that the difference is = 0. You can only make an error by declaring that the difference is < 0. But when the true situation is that the difference is zero, you are also making an error if you say that the difference is < 0. In fact, it is easier to make the error in this situation, because you are more likely to see a negative statistic when the true value is 0 than when the true value is 13. So the error rate from jumping the wrong way when there is a true difference is less than the error rate from jumping any way when there is no true difference, and you are justified in stating the direction of the difference.

I really like this way of explaining it; thanks.

-- 
Stan Brown, Oak Road Systems, Cortland County, New York, USA
http://oakroadsystems.com/
Don't move, or I'll fill you full of [... pause ...] little yellow bolts of light. -- Farscape, first episode
Re: One-tailed, two-tailed
On Sun, 30 Dec 2001, Stan Brown wrote in part:

> A. G. McDowell [EMAIL PROTECTED] wrote:
>> The significance value associated with the one-tailed test will always be half the significance value associated with the two-tailed test,
>
> For means, yes. Not for proportions, I think.

Oh? Why not? Is there something about proportions that militates against assigning 1/2 alpha to each tail of the sampling distribution?

[snip, the rest]

-- DFB.
Donald F. Burrill [EMAIL PROTECTED]
184 Nashua Road, Bedford, NH 03110 603-471-7128
Re: Missing data cell problem
In trying to clear out my e-mail inbox, I came across this post, for which there seemed not to have been any responses.

On Fri, 2 Feb 2001, Caroline Brown wrote:

> I have an analysis problem, which I am researching solutions to, and David Howell of UVM suggested I mail the query to you. My problem is how to deal with a two-way repeated measures design, in which one cell could not be measured:
>
>         A1    A2    A3
>   B1    ok    ok    ok
>   B2    -     ok    ok
>   B3    ok    ok    ok
>   B4    ok    ok    ok
>
> There is a good theoretical reason for this absence, as levels of factor A are set sizes, and A1 is one item. Factor B is cueing to spatial location, and in the 1-item set size there are no other items competing for 'encoding' resources (thus there can be no INVALID cue). If you know of any texts or papers on this issue, or have any thoughts as to its solution, I would be most grateful.

One approach is to estimate the cell mean in the A1B2 cell, under the constraint that it not contribute to the AxB interaction; and then carry out the usual 2-way ANOVA (but with one fewer d.f. for interaction).

If we use the following two contrasts, one for main effects in A and one for main effects in B, their product represents a contrast involving the 12 cells. Set that contrast equal to zero (so it doesn't contribute to the interaction SS). (All other interaction contrasts orthogonal to this one will not involve the missing cell.)

For A: 2A1 - A2 - A3.
For B: -B1 + 3B2 - B3 - B4.

Product contrast:

  -2A1B1 + A2B1 + A3B1
  + 6A1B2 - 3A2B2 - 3A3B2
  - 2A1B3 + A2B3 + A3B3
  - 2A1B4 + A2B4 + A3B4 = 0,

whence

  A1B2 = (2A1B1 - A2B1 - A3B1 + 3A2B2 + 3A3B2 + 2A1B3 - A2B3 - A3B3 + 2A1B4 - A2B4 - A3B4)/6

(where 2A1B3 = twice the cell mean in the (A1,B3) cell, etc.)

You now have cell means for each cell and can carry out the usual ANOVA. Because the estimated value of A1B2 infects your A1 average and your B2 average, the row and column effects (sources A and B in the ANOVA) are not, strictly speaking, independent; although the A2:A3 contrast is independent of contrasts involving only B1, B3, B4.

Hope this helps (if belatedly!).

-- DFB.
Donald F. Burrill [EMAIL PROTECTED]
184 Nashua Road, Bedford, NH 03110 603-471-7128
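[Editorial illustration, not part of the original post.] The missing-cell estimate described above can be sketched in a few lines. The contrast coefficients are the ones given in the post; the function name and the cell means in the usage example are hypothetical, chosen to be purely additive (no interaction) so that the result is easy to check by hand.

```python
# Contrast coefficients from the post:
#   A: 2A1 - A2 - A3      B: -B1 + 3B2 - B3 - B4
A_COEF = {"A1": 2, "A2": -1, "A3": -1}
B_COEF = {"B1": -1, "B2": 3, "B3": -1, "B4": -1}


def estimate_missing_cell(cell_means, missing=("A1", "B2")):
    """Solve 'product contrast = 0' for the one missing cell mean.

    cell_means maps (A level, B level) -> observed cell mean and
    must contain every cell except the missing one.
    """
    partial = 0.0
    for a, coef_a in A_COEF.items():
        for b, coef_b in B_COEF.items():
            if (a, b) != missing:
                partial += coef_a * coef_b * cell_means[(a, b)]
    # Coefficient of the missing cell in the product contrast (6 for A1B2).
    coef_missing = A_COEF[missing[0]] * B_COEF[missing[1]]
    return -partial / coef_missing


# Hypothetical, purely additive cell means: mean(Ai, Bj) = a_i + b_j.
a_effect = {"A1": 1.0, "A2": 2.0, "A3": 4.0}
b_effect = {"B1": 10.0, "B2": 20.0, "B3": 30.0, "B4": 50.0}
observed = {
    (a, b): a_effect[a] + b_effect[b]
    for a in a_effect
    for b in b_effect
    if (a, b) != ("A1", "B2")
}

estimate = estimate_missing_cell(observed)
print("estimated A1B2 cell mean:", estimate)
```

With additive data like this, the estimate reproduces the value the additive pattern implies for the empty cell (here 1.0 + 20.0), which is what "not contributing to the interaction" amounts to. The remaining ANOVA, with one fewer interaction d.f., is not shown.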
Re: One-tailed, two-tailed
Hi Stan,

This is sent to both you and edstat.

> Have you proven that one gas gives better mileage than the other? If so, which one is better?

There are two points. The first is that you have not 'proved' anything - except in the most casual interpretation of 'proof'. What you have done is provide an answer, in which you can be very confident, to the question posed. So the first amendment is to something like:

Can you reasonably conclude that one gas gives better mileage than the other? If so, which one is better?

Second, the question is confusingly - sloppily - posed. It appears to be two questions. The first leads to a two-tailed test - does one gas give better mileage than the other? This is the question that is answered. The second question leads to a one-tailed test, which is the one you are trying to answer, I gather as an extra to the original question.

As soon as you try to answer both questions simultaneously you run into logical problems. You *have* to be very clear from the start which of the two you are interested in. In this case, do you only want to know (in the sense of 'conclude with some confidence') if:

* one gas is better than the other (so you will do a two-sided test); or
* gas B is better than gas A (so you will do a one-sided test).

(You can also pose the question whether gas A is better than gas B, but the sample evidence is obviously against this.)

This is one of the bits that causes students most problems - identifying the question being asked! It also seems to be a problem with many researchers, yet it is fundamental to research.

Happy New Year,
Alan

Stan Brown wrote:

> I think I've got some sort of mental block on the following point. Can someone explain this to me, plainly and simply, please?
>
> Let me start with a sample problem, NOT created by me:
>
> [The student is led to enter two sets of unpaired figures into Excel. They represent miles per gallon with gasoline A and gasoline B. I won't give the actual figures, but here's a summary:
>
> A: mean = 21.9727, variance = 0.4722, n = 11
> B: mean = 22.9571, variance = 0.2165, n = 14
>
> The question is whether there is a difference in gasoline mileage. The student is led to a two-sample F test for homoscedasticity; p=0.1886 so the samples are treated as homoscedastic. Now the problem says: ]
>
> Now the main t-test ... Two Sample Assuming Equal Variances. ... Use two-tail results (since '=/=' in Ha). ... What is the P-val for the t-test? [Answer: p=.0002885]
>
> What's your conclusion about the difference in gas mileage? [Answer: At significance level 5%, previously selected, there is a difference between them.]
>
> Now we come to the part I'm having conceptual trouble with:
>
> Have you proven that one gas gives better mileage than the other? If so, which one is better?
>
> Now obviously if the two are different then one is better, and if one is better it's probably B since B had the higher sample mean. But are we in fact justified in jumping from a two-tailed test (=/=) to a one-tailed result (< or >)? Here we have a tiny p-value, and in fact a one-tailed test gives a p-value of 0.0001443. But something seems a little smarmy about first setting out to discover whether there is a difference -- just a difference, unequal means -- then computing a two-tailed test and deciding to announce a one-tailed result.
>
> Am I being over-scrupulous here? Am I not even asking the right question? Thanks for any enlightenment.
>
> (If you send me an e-mail copy of a public follow-up, please let me know that it's a copy so I know to reply publicly.)
>
> --
> Stan Brown, Oak Road Systems, Cortland County, New York, USA
> http://oakroadsystems.com/
> My theory was a perfectly good one. The facts were misleading. -- /The Lady Vanishes/ (1938)

-- 
Alan McLean ([EMAIL PROTECTED])
Department of Econometrics and Business Statistics
Monash University, Caulfield Campus, Melbourne
Tel: +61 03 9903 2102  Fax: +61 03 9903 2007
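[Editorial illustration, not part of the original thread.] The summary figures quoted in Stan Brown's post are enough to recompute the pooled-variance t statistic and both p-values. The sketch below uses only the Python standard library; the p-value comes from a simple Simpson's-rule integration of the t density, which is an assumption of this sketch and not how Excel computes it. Because the summary figures are rounded, the result should agree with the quoted p-values only approximately.

```python
import math


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))


def t_two_tailed_p(t, df, steps=2000):
    """Two-tailed p-value: 1 - 2 * integral of the density over [0, |t|].

    Uses composite Simpson's rule; steps must be even.
    """
    upper = abs(t)
    h = upper / steps
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 1.0 - 2.0 * (total * h / 3.0)


# Summary statistics as quoted in the post.
mean_a, var_a, n_a = 21.9727, 0.4722, 11
mean_b, var_b, n_b = 22.9571, 0.2165, 14

df = n_a + n_b - 2
pooled_var = ((n_a - 1) * var_a + (n_b - 1) * var_b) / df
std_err = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
t_stat = (mean_b - mean_a) / std_err

p_two = t_two_tailed_p(t_stat, df)
p_one = p_two / 2  # valid because the t density is symmetric about zero

print("t =", round(t_stat, 4), "with", df, "d.f.")
print("two-tailed p =", p_two)
print("one-tailed p (B > A) =", p_one)
```

The halving in the last step holds here because the t distribution is symmetric and the observed difference lies in the direction of the one-sided alternative (B greater than A).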
Re: One-tailed, two-tailed
Rich Ulrich [EMAIL PROTECTED] wrote in sci.stat.edu:

> [ posted and e-mailed.]

Ditto.

> On Sat, 29 Dec 2001 16:46:10 -0500, [EMAIL PROTECTED] (Stan Brown) wrote:
>> Now we come to the part I'm having conceptual trouble with:
>>
>> Have you proven that one gas gives better mileage than the other? If so, which one is better?
>>
>> Now obviously if the two are different then one is better, and if one is better it's probably B since B had the higher sample mean.
>
> I want to raise an eyebrow at this earlier statement.

Hmm... Which earlier statement do you mean? If two means are different then one of them _must_ be larger than the other; that's how real numbers work. Can you explain your raised eyebrow a bit more specifically? Or is it just the word "proven", about which I comment below?

> We should not overlook the chance to teach our budding statisticians: *Always* pay attention to the distinction between random trials or careful controls, on the one hand; and grab-samples on the other. [Maybe your teacher asked the question that way, in order to lead up to that in class?]

No; this was in a book of homework problems, which is pretty standard at the junior college where I teach. Specifically, it was a lengthy exercise in using Excel to do the sort of statistical tests the students normally do on a TI83.

> The numbers do not *prove* that one gas gives better mileage; the mileage was, indeed, better for one gas than another -- for reasons yet to be discussed. Different cars? drivers? routes?

All good points for discussion. But I wouldn't focus too much on that off-the-cuff word "prove". (I'm not being defensive since I didn't write the exercise. :-) My students did understand that nothing is ever proved; that there's still a p-value chance of getting the sample results you got even if you did perfect random selection and the null hypothesis is true. Maybe I'm being UNDER-scrupulous here, but I think it a pardonable bit of sloppy language.

>> But are we in fact justified in jumping from a two-tailed test (=/=) to a one-tailed result (< or >)? Here we have a tiny p-value, and in fact a one-tailed test gives a p-value of 0.0001443. But something seems a little smarmy about first setting out to discover whether there is a difference -- just a difference, unequal means -- then computing a two-tailed test and deciding to announce a one-tailed result.
>
> Another small issue. Why did the .00014 appear?

I added that for purposes of posting; the original exercise didn't have the students do a one-tailed test at all. It's just half the two-tailed p-value, as I'm sure you recognize.

> In clinical trials, we observe the difference and then we do attribute it to one end. But it is not the convention to report the one-tailed p-level, after the fact. I think there are editors who would object to that, but that is a guess. Also, for various reasons, our smallest p-level for reporting is usually 0.001.

Well, these two p-values are smaller than that: you're talking significance level of 0.1% and these were 0.014% or 0.029%.

But my question was not about reporting a smaller p-value; it was about first establishing a two-tailed difference and then moving from that to declaring which side the difference lies on. I think A.G. McDowell has disposed of that, however.

-- 
Stan Brown, Oak Road Systems, Cortland County, New York, USA
http://oakroadsystems.com/
My theory was a perfectly good one. The facts were misleading. -- /The Lady Vanishes/ (1938)
Re: One-tailed, two-tailed
[ posted and e-mailed.]

On Sat, 29 Dec 2001 16:46:10 -0500, [EMAIL PROTECTED] (Stan Brown) wrote:

[... ]

> [The student is led to enter two sets of unpaired figures into Excel. They represent miles per gallon with gasoline A and gasoline B. I won't give the actual figures, but here's a summary:
>
> A: mean = 21.9727, variance = 0.4722, n = 11
> B: mean = 22.9571, variance = 0.2165, n = 14
>
> The question is whether there is a difference in gasoline mileage.

[snip, about variances.]

> What's your conclusion about the difference in gas mileage? [Answer: At significance level 5%, previously selected, there is a difference between them.]
>
> Now we come to the part I'm having conceptual trouble with:
>
> Have you proven that one gas gives better mileage than the other? If so, which one is better?
>
> Now obviously if the two are different then one is better, and if one is better it's probably B since B had the higher sample mean.

I want to raise an eyebrow at this earlier statement.

We should not overlook the chance to teach our budding statisticians: *Always* pay attention to the distinction between random trials or careful controls, on the one hand; and grab-samples on the other. [Maybe your teacher asked the question that way, in order to lead up to that in class?]

The numbers do not *prove* that one gas gives better mileage; the mileage was, indeed, better for one gas than another -- for reasons yet to be discussed. Different cars? drivers? routes?

> But are we in fact justified in jumping from a two-tailed test (=/=) to a one-tailed result (< or >)? Here we have a tiny p-value, and in fact a one-tailed test gives a p-value of 0.0001443. But something seems a little smarmy about first setting out to discover whether there is a difference -- just a difference, unequal means -- then computing a two-tailed test and deciding to announce a one-tailed result.

Another small issue. Why did the .00014 appear?

In clinical trials, we observe the difference and then we do attribute it to one end. But it is not the convention to report the one-tailed p-level, after the fact. I think there are editors who would object to that, but that is a guess. Also, for various reasons, our smallest p-level for reporting is usually 0.001.

> Am I being over-scrupulous here? Am I not even asking the right question? Thanks for any enlightenment.
>
> (If you send me an e-mail copy of a public follow-up, please let me know that it's a copy so I know to reply publicly.)

-- 
Rich Ulrich, [EMAIL PROTECTED]
http://www.pitt.edu/~wpilib/index.html