I can't quite seem to solve a problem subsetting a data frame.  Here's a
reproducible example. 

Given a data frame:

dat <- data.frame(fac = rep(c("a", "b"), each = 100),
                  value = c(rnorm(130), rep(NA, 70)),
                  other = rnorm(200))

What I want is a new data frame (with the same columns as dat) excluding the
top 5% of "value" separately by "a" and "b". For example, this produces the
results I'm after in an array:

sub <- tapply(dat$value, dat$fac,
              function(x) x[x < quantile(x, probs = 0.95, na.rm = TRUE)])

My difficulty is putting them into a data frame along with the other columns
"fac" and "other". Note that the subsetting returns vectors of different
lengths for a and b, because the two groups have different numbers of NAs.
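
To make the length mismatch concrete, here is a quick check. It is only a
sketch: set.seed() and the names dat_demo, sub_demo, lens and n_na are not part
of my real code and are added purely for illustration.

```r
# Illustrative check only: the seed and the *_demo names are assumptions
# added for this sketch, not part of the original problem.
set.seed(1)
dat_demo <- data.frame(fac = rep(c("a", "b"), each = 100),
                       value = c(rnorm(130), rep(NA, 70)),
                       other = rnorm(200))

# Same per-group subsetting as above
sub_demo <- tapply(dat_demo$value, dat_demo$fac,
                   function(x) x[x < quantile(x, probs = 0.95, na.rm = TRUE)])

# Number of elements returned for each group
lens <- sapply(sub_demo, length)
print(lens)

# Number of NAs carried through by the logical subsetting in each group
n_na <- sapply(sub_demo, function(x) sum(is.na(x)))
print(n_na)
```

The two elements come back with different lengths, and the "b" element still
contains its NAs, so the result cannot simply be bound back onto "fac" and
"other" column-wise.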

There's something I'm just not seeing - can you help?

Many thanks.

David Carslaw

-----
Institute for Transport Studies
University of Leeds

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help