Hello,

I am relatively new to R, and I am trying to select the last
observation within a group, where the group is defined by two
variables. One of the variables is a date.

In the example below, C3 varies within C2, which varies within C1. I
need to select the last observation in C3 for each of the 4 groups
(C1*C2): 1x, 1y, 2x, and 2y. In my real dataset, C2 is a date (mm/dd/yy).

C1      C2      C3
1       x       1
1       x       2
1       y       1
1       y       2
2       x       1
2       x       2
2       y       1
2       y       2
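
For anyone who wants to reproduce this, the table above can be rebuilt as a data frame along these lines. This is only a sketch: the name mydata matches the code further down, and C2 is kept as a plain character column here rather than a real date.

```r
# Rebuild the example table (C2 is a character stand-in for the real date column).
mydata <- data.frame(
  C1 = rep(1:2, each = 4),
  C2 = rep(rep(c("x", "y"), each = 2), times = 2),
  C3 = rep(1:2, times = 4),
  stringsAsFactors = FALSE
)
print(mydata)
```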

I have found code (from UCLA R FAQs and this list's archives) for  
selecting the last observation when a group is defined by ONE variable  
(e.g., C1):

last <- by(mydata, mydata$C1, tail, n = 1)   # last row within each C1 group
lastd <- do.call("rbind", as.list(last))     # stack the per-group rows

I could not get the by function to accept two variables in its INDICES
argument:
last <- by(mydata, mydata$C1 mydata$C2, tail, n = 1)   # THIS DOESN'T WORK
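
For comparison, the help page for by() describes INDICES as a factor or a list of factors, so wrapping the two columns in list() may be what is needed. A minimal sketch on the example data above (mydata is rebuilt here so the snippet runs on its own, and it assumes the rows are already sorted so that the last row of each group is the one wanted):

```r
# Example data from the table above (C2 is a character stand-in for the date).
mydata <- data.frame(
  C1 = rep(1:2, each = 4),
  C2 = rep(rep(c("x", "y"), each = 2), times = 2),
  C3 = rep(1:2, times = 4),
  stringsAsFactors = FALSE
)

# Pass both grouping variables as a list; tail(n = 1) keeps the last row per group.
last <- by(mydata, list(mydata$C1, mydata$C2), tail, n = 1)

# Stack the per-group rows back into one data frame.
lastd <- do.call("rbind", as.list(last))
print(lastd)
```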

I tried creating a new variable C1*C2, but I think this is risky since
it may not be unique depending on my values of C1 and C2 (I have a
very large dataset).
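
One way around the uniqueness worry, sketched here under the assumption that the data are already sorted within each group, is to avoid a hand-built product altogether. duplicated() with fromLast = TRUE can flag the last row of each C1/C2 pair directly, and interaction() builds a combined factor from pairs of levels, so two different pairs cannot share a key. The names keep and key are just illustrative.

```r
# Example data from the table above (C2 is a character stand-in for the date).
mydata <- data.frame(
  C1 = rep(1:2, each = 4),
  C2 = rep(rep(c("x", "y"), each = 2), times = 2),
  C3 = rep(1:2, times = 4),
  stringsAsFactors = FALSE
)

# A row is kept when no later row has the same (C1, C2) pair.
keep <- !duplicated(mydata[, c("C1", "C2")], fromLast = TRUE)
lastd <- mydata[keep, ]
print(lastd)

# Alternative grouping key: one factor level per observed (C1, C2) pair.
key <- interaction(mydata$C1, mydata$C2, drop = TRUE)
print(nlevels(key))
```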

Thank you for the help,



______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.