On 1/25/2008 11:39 AM, Karin Lagesen wrote:
> I have a data frame containing columns which are factors. I use this
> to make boxplots for the data, with one box per factor. I would now
> like to get at the data in the data frame which corresponds to the
> outliers. I have so far found the $out, which gives "the values of any
> data points which lie beyond the extremes of the whiskers", but I
> haven't found anything which will let me get at the indices in the
> original data frame for these outliers.
>
> I think there might be a chance that I could simply compare the values
> I am plotting from my data frame with the values for the whiskers and
> use that as a criterion, but I am uncertain of how to do this without
> doing it manually. The factor I am plotting against contains 17
> levels, and I'd thus like to see if there is a somewhat more general
> solution available.
>
> Thanks for your help!
>
> Karin

You can use the %in% operator (equivalent to is.element()) to see which
data values in your data frame match an outlier value. Then use which()
to return the indices of the TRUE elements. For example:

    set.seed(245)

    df <- data.frame(GRP = rep(LETTERS[1:4], each=25), Y = rchisq(100, 2))

    mybp <- boxplot(Y ~ GRP, data=df)

    which(df$Y %in% mybp$out)
    [1]  8 12 47 66 88 93

    mybp$out
    [1] 5.919915 9.135578 5.723714 8.758584 8.502147 4.920513

    df$Y[which(df$Y %in% mybp$out)]
    [1] 5.919915 9.135578 5.723714 8.758584 8.502147 4.920513

See ?is.element and ?which.

-- 
Chuck Cleland, Ph.D.
NDRI, Inc.
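
One caveat with matching on values alone: if the same Y value occurs in two groups and is an outlier in only one of them, %in% will flag both rows. With continuous data this is unlikely, but with rounded or discrete data it can happen. A minimal sketch of a per-group variant follows, assuming the same df as above and the default whisker range of 1.5. The names is_out and out_rows are only illustrative. It relies on boxplot.stats(), which is what boxplot() uses internally to compute $out.

```r
set.seed(245)
df <- data.frame(GRP = rep(LETTERS[1:4], each = 25), Y = rchisq(100, 2))

# Flag outliers group by group, so a value is only compared with the
# outliers of its own group (default coef = 1.5, as in boxplot()).
is_out <- logical(nrow(df))
for (g in unique(as.character(df$GRP))) {
  rows <- which(df$GRP == g)
  is_out[rows] <- df$Y[rows] %in% boxplot.stats(df$Y[rows])$out
}

out_rows <- which(is_out)   # row indices into df
df[out_rows, ]              # outlying rows, with their group labels
```

If boxplot() was called with a non-default range argument, the same value would need to be passed to boxplot.stats() as coef for the two to agree.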