It is, perhaps, more apt to call tests of outliers "tests of outright liars".
"Lies, damned lies, and tests of outliers"

Ravi.
____________________________________________________________________

Ravi Varadhan, Ph.D.
Assistant Professor,
Division of Geriatric Medicine and Gerontology
School of Medicine
Johns Hopkins University

Ph. (410) 502-2619
email: rvarad...@jhmi.edu

----- Original Message -----
From: Bert Gunter <gunter.ber...@gene.com>
Date: Tuesday, November 30, 2010 4:22 pm
Subject: Re: [R] Outlier statistics question
To: Jahan <jahan.mohiud...@gmail.com>
Cc: r-help@r-project.org

> (Apologies to all. I am weak and could not resist)
>
> On Tue, Nov 30, 2010 at 12:15 PM, Jahan <jahan.mohiud...@gmail.com> wrote:
> > I have a statistical question.
> > The data sets I am working with are right-skewed so I have been
> > plotting the log transformations of my data. I am using a Grubbs Test
> > to detect outliers in the data, but I get different outcomes depending
> > on whether I run the test on the original data or the log(data).
>
> Of course!
>
> > Here
> > is one of the problematic sets:
> >
> > fgf2p50=c(1.563,2.161,2.529,2.726,2.442,5.047)
> > stripchart(fgf2p50,vertical=TRUE)
> > #This next step requires you have the 'outliers' package
> > library(outliers)
> > grubbs.test(fgf2p50)
> > #the output says p<0.05 so 5.047 is an outlier
> > #Next, I run the test on the log(data)
> > log10=c(0.194,0.335,0.403,0.436,0.388,0.703)
> > grubbs.test(log10)
> > #output is that p>0.05 so we reject that there is an outlier.
> >
> > The question is, which outlier test do I accept?
>
> Neither.
>
> (IMHO) Outlier tests are one of statistics's _bad ideas._ The Grubbs
> test is ca. 1970. There are many better approaches these days --
> consult your local statistician -- all of which will depend on
> answering the question, "What is the question you are trying to
> answer?"
>
> -- Bert
>
> --
> Bert Gunter
> Genentech Nonclinical Biostatistics

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
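
For anyone wondering why the two runs in the quoted question disagree, here is a minimal base-R sketch that computes the Grubbs statistic by hand on both scales, without the 'outliers' package. The helper names grubbs_G and grubbs_crit are hypothetical, and the critical value uses the textbook one-sided formula at the 5% level, which is an assumption and not necessarily what grubbs.test() computes internally.

```r
# Grubbs statistic for the single most extreme point:
# G = max|x - mean(x)| / sd(x)
grubbs_G <- function(x) max(abs(x - mean(x))) / sd(x)

# Textbook one-sided critical value at level alpha
# (an assumption; not taken from the 'outliers' package source)
grubbs_crit <- function(n, alpha = 0.05) {
  t2 <- qt(alpha / n, df = n - 2, lower.tail = FALSE)^2
  (n - 1) / sqrt(n) * sqrt(t2 / (n - 2 + t2))
}

# Data from the original question
fgf2p50 <- c(1.563, 2.161, 2.529, 2.726, 2.442, 5.047)

raw_G <- grubbs_G(fgf2p50)          # statistic on the original scale
log_G <- grubbs_G(log10(fgf2p50))   # statistic on the log10 scale
crit  <- grubbs_crit(length(fgf2p50))

print(c(raw = raw_G, log10 = log_G, critical = crit))
```

On these six values the statistic is about 1.92 on the raw scale and about 1.76 on the log10 scale, against a critical value near 1.82. The log transform pulls in the right tail, so 5.047 sits fewer standard deviations from the mean. The test assumes normality on whichever scale it is run, so the verdict depends on that choice of scale, not on the data alone.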