On Thu, 23 Sep 2004 [EMAIL PROTECTED] wrote:

> Hi,
> this is both a statistical and an R question...
> what would be the best way / test to detect an outlier value among a series of 10 to 30
> values? For instance, if we have the following dataset: 10,11,12,15,20,22,25,30,500
> I'd like to have a way to identify the last value as an outlier (only one direction).
> One way would be to calculate abs(mean - median) and, if elevated (to what extent?),
> delete the extreme value and then redo... but is it valid to do so with so few data? Is
> (trimmed mean - mean) more efficient? If so, what would be the maximal
> tolerable value to use as a threshold? (I guess it will be experiment dependent...)
> Tests for skewness will probably require a larger dataset?
> Any suggestions are very welcome!
> Thanks for your help
> Philippe Guardiola, MD
>
> ______________________________________________
> [EMAIL PROTECTED] mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html

You may want to read Davies and Gather, The identification of multiple outliers, JASA 88 (1993), 782-801.

The simplest recommendation is to nominate as outliers all points whose distance from the median is larger than c*mad(data). Choices of c depending on n are given in the above paper.

This is somewhat better founded theoretically than the boxplot method recommended by Gabor G., but it is based on the assumption that the distribution of the non-outliers is close to normal and, in particular, not strongly skewed (the boxplot method seems to be a bit more robust against skewness).

Christian

***********************************************************************
Christian Hennig
Fachbereich Mathematik-SPST/ZMS, Universitaet Hamburg
[EMAIL PROTECTED], http://www.math.uni-hamburg.de/home/hennig/
#######################################################################
I recommend www.boag-online.de
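
A note on the c*mad(data) rule above: the following is only a minimal sketch in Python of how it could be applied to the dataset from the question, not code from the paper or from this thread. The function names mad and mad_outliers are made up for illustration. The scaling constant 1.4826 matches the default of R's mad(), which makes the MAD comparable to a standard deviation under normality. The default c = 3.0 is an arbitrary placeholder and is NOT one of the n-dependent values tabulated by Davies and Gather; take c from the paper for real use.

```python
from statistics import median


def mad(values, constant=1.4826):
    """Scaled median absolute deviation.

    The default constant is the one R's mad() uses, so that the result
    estimates the standard deviation when the data are roughly normal.
    """
    centre = median(values)
    return constant * median([abs(v - centre) for v in values])


def mad_outliers(values, c=3.0):
    """Return the points lying further than c * mad(values) from the median.

    c = 3.0 is a placeholder assumption; Davies and Gather (1993) give
    choices of c that depend on the sample size n.
    """
    centre = median(values)
    scale = mad(values)
    # Note: if more than half of the values are identical, scale is 0 and
    # every point that differs from the median is nominated.
    return [v for v in values if abs(v - centre) > c * scale]


data = [10, 11, 12, 15, 20, 22, 25, 30, 500]
flagged = mad_outliers(data)
print(flagged)
```

For this dataset the median is 20 and the unscaled MAD is 8, so with the placeholder c only the value 500 exceeds the threshold. As said above, the rule relies on the non-outliers being roughly normal and not strongly skewed.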