Re: [tips] curious statistical reasoning
Wow. In an era when repeated failures to replicate “sensational” psychological effects are all over the news, it is astonishing that any editor would have accepted so sloppy an argument (whether or not the authors can cite articles from the 1960s and ’70s that used it as well). The solution to high Type II error rates is decidedly not to raise Type I error rates. The solution is to raise power by raising the sample size. Although it is true that the conventional alpha level of .05 is entirely arbitrary, in an era when thousands of psychological studies are published every year (rather than the mere dozens that were published annually back when Fisher first proposed it), the conventional Type I error rate should probably be tightened, not loosened (and the required sample sizes would have to go up for all but the largest effects). The article should have been rejected until the authors could demonstrate the same effect with an increased sample size. As the old saying goes, extraordinary claims require extraordinary evidence.

Chris
.....
Christopher D. Green
Department of Psychology
York University
Toronto, ON M3J 1P3
Canada
chri...@yorku.ca
http://www.yorku.ca/christo

On Dec 11, 2014, at 2:18 PM, Ken Steele <steel...@appstate.edu> wrote:

A colleague sent me a link to an article:
https://www.insidehighered.com/news/2014/12/10/study-finds-gender-perception-affects-evaluations

I took a look at the original article and found this curious footnote. Quoting footnote 4 from the study:

"While we acknowledge that a significance level of .05 is conventional in social science and higher education research, we side with Skipper, Guenther, and Nass (1967), Labovitz (1968), and Lai (1973) in pointing out the arbitrary nature of conventional significance levels. Considering our study design, we have used a significance level of .10 for some tests where: 1) the results support the hypothesis and we are consequently more willing to reject the null hypothesis of no difference; 2) our hypothesis is strongly supported theoretically and by empirical results in other studies that use lower significance levels; 3) our small n may be obscuring large differences; and 4) the gravity of an increased risk of Type I error is diminished in light of the benefit of decreasing the risk of a Type II error (Labovitz, 1968; Lai, 1973)."

Ken

Kenneth M. Steele, Ph.D.
steel...@appstate.edu
Professor
Department of Psychology
Appalachian State University
Boone, NC 28608 USA
http://www.psych.appstate.edu
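[Editor's note: Chris's point that tightening alpha drives up required sample sizes can be made concrete with a standard power calculation. A minimal sketch using the normal approximation to the two-sample t test; the function name and the d = 0.5 example are illustrative assumptions, not from the thread or the study.]

```python
from statistics import NormalDist  # stdlib inverse-normal CDF; no SciPy needed

def n_per_group(d, alpha=0.05, power=0.80):
    """Approximate n per group for a two-sided, two-sample test of a
    standardized mean difference d (normal approximation to the t test)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    return 2 * ((z_alpha + z_beta) / d) ** 2

# A "medium" effect (d = 0.5) at 80% power: loosening alpha to .10 buys
# little, while tightening it raises the required n substantially.
for a in (0.10, 0.05, 0.005):
    print(a, round(n_per_group(0.5, alpha=a)))  # roughly 49, 63, and 106 per group
```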
Re: [tips] curious statistical reasoning
The other way to increase effect size would be to improve experimental control (procedure). That would be consistent with this being basically a pilot study.

Paul Brandon
Emeritus Professor of Psychology
Minnesota State University, Mankato
pkbra...@hickorytech.net

On Dec 12, 2014, at 8:02 AM, Christopher Green <chri...@yorku.ca> wrote:
...
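[Editor's note: Paul's point in numbers: for a fixed raw difference, tighter experimental control shrinks the error SD, which inflates the standardized effect size and hence the power of any given n. A toy illustration; the numbers are invented for the example.]

```python
def cohens_d(mean_diff, sd):
    """Standardized effect size: raw mean difference scaled by within-group SD."""
    return mean_diff / sd

# Same raw difference of 2 points; halving the error SD through better
# procedure doubles the standardized effect.
print(cohens_d(2.0, 8.0))  # 0.25
print(cohens_d(2.0, 4.0))  # 0.5
```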
RE: [tips] curious statistical reasoning
Hi

Seems like they could have gotten to the same point (perhaps) by using a directional hypothesis, given points 1 & 2? Unless the .10 is directional and the non-directional p is .20? Point 3 does not make a lot of sense to me, given that p is sensitive to n. Point 4 might be an appropriate consideration given the consequences of the two possible errors; not enough info here.

Take care
Jim

Jim Clark
Professor & Chair of Psychology
204-786-9757
4L41A

-----Original Message-----
From: Ken Steele [mailto:steel...@appstate.edu]
Sent: Thursday, December 11, 2014 1:19 PM
To: Teaching in the Psychological Sciences (TIPS)
Subject: [tips] curious statistical reasoning
...
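[Editor's note: Jim's directional-test observation can be shown directly: for a symmetric test statistic, a two-tailed test at alpha = .10 rejects in exactly the cases a one-tailed test at alpha = .05 would, provided the effect lies in the predicted direction. A sketch using a z statistic; the z = 1.8 value is an invented example.]

```python
from statistics import NormalDist

def z_test_p(z, directional=False):
    """p value for a z statistic: one-tailed if directional, else two-tailed.
    For a symmetric null distribution the two-tailed p is exactly twice the
    one-tailed p when the result is in the predicted direction."""
    tail = NormalDist().cdf(-abs(z))
    return tail if directional else 2 * tail

z = 1.8
print(z_test_p(z, directional=True))   # ~.036: clears .05 one-tailed
print(z_test_p(z, directional=False))  # ~.072: misses .05 two-tailed, clears .10
```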
RE: [tips] curious statistical reasoning
I think the main point is that this was basically designed to be a small pilot study, so why even publish it?

It is interesting that they decided to go with Welch's t (not assuming equal variances) for all of the calculations, no matter what the variances were. With respect to Jim's inquiry, the probabilities seem to have been non-directional. In the case of the overall student rating index, a regular t test assuming equal variances would have produced a significant (p < .05) result (ignoring the fact that they did 26 t tests). Also, since they did 26 t test comparisons (of which only three were significant at .05 and another three at .10), the Bonferroni correction would actually call for a more stringent alpha of .0019, instead of inflating it further to .10.

On number three, I like how they said that they used a .10 significance level on some tests. I hope I am not being too cynical in believing that the tests on which they used a .10 significance level corresponded entirely with the ones where p was greater than .05 but less than .10.

As to point four, there is a way to decrease the probability of making a Type II error without increasing the probability of making a Type I error: increase your sample size. Which brings me back to the first point. This was correctly conceived of as a pilot study, so why stretch the stats and rush it to print?

Rick

Dr. Rick Froman
Professor of Psychology
Box 3519
x7295
rfro...@jbu.edu
http://bit.ly/DrFroman

-----Original Message-----
From: Jim Clark [mailto:j.cl...@uwinnipeg.ca]
Sent: Thursday, December 11, 2014 4:18 PM
To: Teaching in the Psychological Sciences (TIPS)
Subject: RE: [tips] curious statistical reasoning
...
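[Editor's note: Rick's Bonferroni figure is simple arithmetic on the counts given in the thread (26 tests); the raw data are not available, so the second calculation assumes, purely for illustration, independent tests with every null true.]

```python
m = 26                 # number of t tests reported in the study
family_alpha = 0.05    # desired familywise Type I error rate
per_test_alpha = family_alpha / m
print(round(per_test_alpha, 4))  # 0.0019, matching Rick's figure

# Worst-case illustration: if all 26 tests were independent and every null
# were true, running each at alpha = .10 would give at least one false
# positive most of the time.
p_any_false_positive = 1 - (1 - 0.10) ** m
print(round(p_any_false_positive, 2))  # about 0.94
```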
RE: [tips] curious statistical reasoning
In my last paragraph, I meant to say that there is a way of decreasing the probability of making a Type II error without increasing the probability of making a Type I error: increase the sample size.

Dr. Rick Froman
Professor of Psychology
Box 3519
x7295
rfro...@jbu.edu
http://bit.ly/DrFroman

-----Original Message-----
From: Rick Froman [mailto:rfro...@jbu.edu]
Sent: Thursday, December 11, 2014 5:00 PM
To: Teaching in the Psychological Sciences (TIPS)
Subject: RE: [tips] curious statistical reasoning
...
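[Editor's note: the corrected claim can be demonstrated directly: holding alpha at .05 fixes the Type I error rate, while the Type II rate (1 − power) falls as n grows. A normal-approximation sketch; d = 0.5 is an assumed effect size, not a figure from the study.]

```python
from statistics import NormalDist

def power_two_sample(d, n, alpha=0.05):
    """Approximate power of a two-sided two-sample test (normal approximation)
    for standardized effect d with n per group. Alpha stays fixed no matter
    what n is; only the Type II error rate changes."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    ncp = d * (n / 2) ** 0.5          # noncentrality: d * sqrt(n/2)
    return 1 - NormalDist().cdf(z_alpha - ncp)

# Power climbs (Type II error falls) with n, at a constant alpha of .05.
for n in (20, 50, 100, 200):
    print(n, round(power_two_sample(0.5, n), 2))
```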