Hi All,
Is it possible to use the subset() function to select rows of a data frame
based on multiple values of a single variable?
My actual data set is much bigger, so I would like to illustrate with the
following dataset:
> df = data.frame(x = c('a','b','c','d','e','f','g','h','a','a','b','b'), y = 1:12)
I would like to select all rows where x is 'a' or 'b'.
> subset(df, x == c('a','b')) # this command did not return all rows where x is equal to a or b
   x  y
1  a  1
2  b  2
9  a  9
12 b 12
> df[df$x %in% c('a','b'),] # subsetting using subscripts returned all rows
   x  y
1  a  1
2  b  2
9  a  9
10 a 10
11 b 11
12 b 12
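For comparison, here is a rough sketch of the same %in% test passed to subset() instead of the subscript. I am assuming that subset() accepts any logical expression over the columns, so the condition should carry over unchanged. The names `wanted` and `res` are mine, for illustration only, and I have not confirmed that this is the recommended form.

```r
# Rebuild the example data frame from above.
df <- data.frame(x = c('a','b','c','d','e','f','g','h','a','a','b','b'), y = 1:12)

# Values of x to keep (hypothetical name, not part of my real data).
wanted <- c('a', 'b')

# %in% tests each element of x for membership in `wanted`.
res <- subset(df, x %in% wanted)
print(res)
```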
I suspect there is a problem with the subset() syntax I have used, but I
couldn't figure out what it is. Any insights from members would be highly
appreciated. Thanks in advance.
Regards,
S.N.V. Krishna