I can't quite seem to solve a problem subsetting a data frame. Here's a reproducible example.
Given a data frame:

    dat <- data.frame(fac = rep(c("a", "b"), each = 100),
                      value = c(rnorm(130), rep(NA, 70)),
                      other = rnorm(200))

What I want is a new data frame (with the same columns as dat) excluding the top 5% of "value" separately by "a" and "b". For example, this produces the results I'm after in an array:

    sub <- tapply(dat$value, dat$fac,
                  function(x) x[x < quantile(x, probs = 0.95, na.rm = TRUE)])

My difficulty is putting them into a data frame along with the other columns "fac" and "other". Note that quantile will return different length vectors due to different numbers of NAs for a and b.

There's something I'm just not seeing - can you help?

Many thanks.

David Carslaw

-----
Institute for Transport Studies
University of Leeds
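
One possible way to keep everything in a data frame is to compute the group cutoff on the same row layout as dat and then index the rows directly. The sketch below uses ave() for this. It is only an illustration: the set.seed() call and the name cutoff are my own additions, and it assumes that rows with an NA in "value" should be dropped from the result.

```r
set.seed(1)  # assumption: fixed seed only so the sketch is repeatable

dat <- data.frame(fac = rep(c("a", "b"), each = 100),
                  value = c(rnorm(130), rep(NA, 70)),
                  other = rnorm(200))

# 95th percentile of "value" within each level of "fac", recycled by ave()
# so that the result has one entry per row of dat
cutoff <- ave(dat$value, dat$fac,
              FUN = function(x) quantile(x, probs = 0.95, na.rm = TRUE))

# Keep rows strictly below their own group's cutoff; which() discards
# the rows where the comparison is NA (i.e. where "value" is missing)
sub <- dat[which(dat$value < cutoff), ]
```

Because the cutoff vector lines up with the rows of dat, the unequal number of NAs in "a" and "b" does not matter. If the NA rows should be kept instead, the condition could be changed to is.na(dat$value) | dat$value < cutoff.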