Greetings Everyone -
I have a data frame "x" that looks like this:
v1 v2
1 A
1 B
1 B
2 B
2 W
2 W
3 D
3 D
3 Z
What I would like to do is create a new data frame, "y", that has one row
for each unique value of v1, and returns the corresponding mode of v2. If I
were to run it on the above data frame, it should therefore return:
v1 v2
1 B
2 W
3 D
I've been using the following code:
library(prettyR)  # provides Mode()
x <- data.frame(v1 = c(1,1,1,2,2,2,3,3,3),
                v2 = c("A","B","B","B","W","W","D","D","Z"))
y <- aggregate.data.frame(x, by = list(x$v1), FUN = "Mode")
which relies on the Mode function from package prettyR. The above code
works for me.
My problem arises when I run this on my real data set. It produces many
warnings, because v2 has multiple modes for many values of v1. The data
set is also rather large (~700,000 rows), and I'm wondering if there is
a faster way to get R to process these data.
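To make the tie problem concrete, here is a minimal base-R sketch of the result I am after. It assumes that when a group has several modes, taking the first one in sorted order is acceptable; the helper name first_mode is made up for this example and is not part of prettyR. I have not timed it on the full data.

```r
# Hypothetical helper: most frequent value of a vector.
# Assumption: ties are resolved by taking the first value in sorted order,
# because table() sorts its names and which.max() returns the first maximum.
first_mode <- function(v) {
  counts <- table(v)
  names(counts)[which.max(counts)]
}

x <- data.frame(v1 = c(1,1,1,2,2,2,3,3,3),
                v2 = c("A","B","B","B","W","W","D","D","Z"),
                stringsAsFactors = FALSE)

# One row per unique v1, holding the mode of v2 within that group
y <- aggregate(v2 ~ v1, data = x, FUN = first_mode)
```

On the example data this gives B, W and D for v1 = 1, 2 and 3. A different tie rule (for instance, returning NA when there is a tie) would only require changing first_mode.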
Thank you for your help and consideration,
Gabriel Yospin
Center for Ecology and Evolutionary Biology
University of Oregon