On 09-04-2013, at 13:12, Lorna <lor...@essex.ac.uk> wrote:

> Hi Everyone,
>
> I have a very long list of data-points (+2300) and I know from my histogram
> that there are outliers which are affecting my mean.
>
> I was wondering if anyone on here knows a way I can quickly get R to
> calculate and remove data which is 3 standard deviations from the mean? I am
> hoping this will tidy my data and give me a repeatable method of tidying for
> future data collection.
>
> Please if you do post code, make it as user friendly as possible! I am not a
> very good programmer, I can load my data into R and do basic stats on it
> however I haven't tried much else...

# some test data + standard deviation of same
testdata <- rnorm(100,0,5)
sd.td <- sd(testdata)

# threshold (set to 3.0 for your specific situation)
alpha <- 1.5

# determine which items fall within bounds and select them
pidx <- (testdata<mean(testdata)+alpha*sd.td) & (testdata>mean(testdata)-alpha*sd.td)
testdata[pidx]

Berend

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
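
Since the question asks for a repeatable method, the same filter could be wrapped in a small function. This is only a sketch: the name remove_outliers and its argument k are made up for illustration and are not part of base R. It assumes a plain numeric vector and, as an extra assumption, drops missing values (NA) along with the outliers.

```r
# Hypothetical helper (name and arguments are illustrative, not base R):
# keep only the values strictly within k standard deviations of the mean.
remove_outliers <- function(x, k = 3) {
  m <- mean(x, na.rm = TRUE)   # mean, ignoring missing values
  s <- sd(x, na.rm = TRUE)     # standard deviation, ignoring missing values
  keep <- !is.na(x) & abs(x - m) < k * s
  x[keep]
}

# Example: 100 random values plus one deliberately extreme point
set.seed(1)
testdata <- c(rnorm(100, 0, 5), 100)
cleaned <- remove_outliers(testdata, k = 3)
length(testdata) - length(cleaned)   # how many points were dropped
```

Calling it with k = 3 corresponds to the three-standard-deviation rule in the question; the same call can be reused on each new data set.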