Re: [R] Outlier statistics question

Nordlund, Dan (DSHS/RDA) Tue, 30 Nov 2010 13:06:39 -0800

> -----Original Message-----
> From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Jahan
> Sent: Tuesday, November 30, 2010 12:16 PM
> To: r-help@r-project.org
> Subject: [R] Outlier statistics question
> 
> I have a statistical question.
> The data sets I am working with are right-skewed so I have been
> plotting the log transformations of my data.  I am using a Grubbs Test
> to detect outliers in the data, but I get different outcomes depending
> on whether I run the test on the original data or the log(data).  Here
> is one of the problematic sets:
> 
> fgf2p50=c(1.563,2.161,2.529,2.726,2.442,5.047)
> stripchart(fgf2p50,vertical=TRUE)
> #This next step requires you have the 'outliers' package
> library(outliers)
> grubbs.test(fgf2p50)
> #the output says p<0.05, so 5.047 is flagged as an outlier
> #Next, I run the test on the log10-transformed data
> log_fgf2p50=log10(fgf2p50)
> grubbs.test(log_fgf2p50)
> #the output says p>0.05, so the test does not flag an outlier
> 
> The question is, which outlier test do I accept?
>


You may not want to "accept" either result.  What do YOU mean by an outlier, and
why is it important for you to detect "outliers" and handle them differently?
Maybe you should model the data so that the model correctly predicts or explains
the so-called outlier.  So, what are you trying to do?
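
To see why the two runs can disagree, here is a rough base-R sketch of the
Grubbs statistic on both scales.  It assumes the usual one-sided form of the
test with a Bonferroni-style p-value; grubbs_sketch is a made-up helper for
illustration, not a function from the 'outliers' package, and its numbers may
not match grubbs.test exactly.

```r
# Hypothetical helper (not part of the 'outliers' package): Grubbs statistic
# for the most extreme value, with an approximate one-sided p-value.
grubbs_sketch <- function(x) {
  n <- length(x)
  # Largest absolute deviation from the mean, in units of the sample sd
  g <- max(abs(x - mean(x))) / sd(x)
  # Map G to a t statistic with n - 2 degrees of freedom
  t_stat <- sqrt(n * (n - 2) * g^2 / ((n - 1)^2 - n * g^2))
  # Bonferroni-style bound on the p-value, capped at 1
  p <- min(1, n * pt(t_stat, df = n - 2, lower.tail = FALSE))
  c(G = g, p = p)
}

fgf2p50 <- c(1.563, 2.161, 2.529, 2.726, 2.442, 5.047)
raw    <- grubbs_sketch(fgf2p50)          # original scale
logged <- grubbs_sketch(log10(fgf2p50))   # log10 scale
print(rbind(raw = raw, log10 = logged))
```

On the raw scale 5.047 sits roughly 1.9 standard deviations above the mean;
after the log transform it sits roughly 1.8 above.  With only six
observations that small shift is enough to move the p-value across 0.05, so
the answer depends on which scale you think the data are approximately normal
on, not on the test itself.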

Dan

Daniel J. Nordlund
Washington State Department of Social and Health Services
Planning, Performance, and Accountability
Research and Data Analysis Division
Olympia, WA 98504-5204


______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
