Hello, I am relatively new to R, and I am trying to select the last observation within a group, where the group is defined by two variables. One of the variables is a date.
In the example below, C3 varies within C2, which varies within C1. I need to select the last observation of C3 in each of the 4 groups (C1*C2): 1x, 1y, 2x, and 2y. In my real dataset, C2 is a date (mm/dd/yy).

C1  C2  C3
1   x   1
1   x   2
1   y   1
1   y   2
2   x   1
2   x   2
2   y   1
2   y   2

I have found code (from the UCLA R FAQs and this list's archives) for selecting the last observation when a group is defined by ONE variable (e.g., C1):

    last <- by(mydata, mydata$C1, tail, n=1)
    lastd <- do.call("rbind", as.list(last))

The by function does not seem to allow two variables in the INDICES argument:

    last <- by(mydata, mydata$C1 mydata$C2, tail, n=1)    # THIS DOESN'T WORK

I tried creating a new variable C1*C2, but I think this is risky, since it may not be unique depending on my values of C1 and C2 (I have a very large dataset).

Thank you for the help,
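
To make the goal concrete, here is a minimal sketch of the result I am after, not a definitive solution. It assumes the toy data above is stored in a data frame named mydata and that sorting by C1, C2, C3 puts the wanted row last within each group. It relies on by() accepting a list of factors as INDICES, and it also shows an alternative using duplicated() with fromLast = TRUE. The names lastd and lastd2 are only illustrative.

```r
# Toy data mirroring the example table (in the real data C2 would be a date)
mydata <- data.frame(
  C1 = c(1, 1, 1, 1, 2, 2, 2, 2),
  C2 = c("x", "x", "y", "y", "x", "x", "y", "y"),
  C3 = c(1, 2, 1, 2, 1, 2, 1, 2)
)

# Sort so that the last row of each C1/C2 group is the one to keep.
# Assumption: for real dates stored as mm/dd/yy text, C2 would first need
# converting, e.g. as.Date(C2, format = "%m/%d/%y"), to sort chronologically.
mydata <- mydata[order(mydata$C1, mydata$C2, mydata$C3), ]

# Option 1: pass both grouping variables to by() as a list
last  <- by(mydata, list(mydata$C1, mydata$C2), tail, n = 1)
lastd <- do.call("rbind", as.list(last))

# Option 2: keep the rows that are the last occurrence of each C1/C2 pair
lastd2 <- mydata[!duplicated(mydata[, c("C1", "C2")], fromLast = TRUE), ]

print(lastd)
print(lastd2)
```

Both options should return one row per C1/C2 combination. Option 2 avoids building a combined key by hand, so the uniqueness worry about a pasted C1*C2 variable does not come up.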