[R] Help with apply and split...

2008-05-02 Thread Mike H. Ryu
I'm trying to drop all rows except for the ones with the most recent year. So I split the data frame by NPERMNO and keep just the last record of all groups.

    datg=t(sapply(split(datgic, datgic$NPERMNO, drop=TRUE), function(x){return( x[nrow(x),] )}))

I get something like this...

GVKEY NPE

Re: [R] Help with apply and split...

2008-05-02 Thread jim holtman
What you are seeing are the row numbers of the original locations. If datgic is really a data frame, this is probably what you want using lapply and do.call:

    > x <- data.frame(a=letters, b=sample(1:4, 26, TRUE))
    > y <- lapply(split(x, x$b), tail, 1)
    > do.call(rbind, y)
      a b
    1 z 1
    2 y 2
    3 w 3
    4 x
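As a rough sketch of how the lapply and do.call approach above might be applied to the original question: the data frame below is a toy stand-in for datgic, with the column names taken from the first post and the extra rows and years made up purely for illustration. It also assumes the rows are not already sorted by year, so it orders them first; if the real data is already sorted within each NPERMNO, that step could be skipped.

```r
# Toy stand-in for datgic. Column names follow the original post;
# the duplicated rows and their years are invented for illustration.
datgic <- data.frame(
  GVKEY   = c(12994, 12994, 19049, 19049, 16739),
  NPERMNO = c(10001, 10001, 10002, 10002, 10009),
  GIC     = c(55102010, 55102010, 40101015, 40101015, 40101010),
  year    = c(2007, 2006, 2006, 2007, 1999)
)

# Assumption: rows may be unsorted, so order by year within NPERMNO
# to make the last row of each group the most recent one.
datgic <- datgic[order(datgic$NPERMNO, datgic$year), ]

# Keep the last row of each group, then stack the one-row data frames.
last_rows <- lapply(split(datgic, datgic$NPERMNO, drop = TRUE), tail, 1)
datg <- do.call(rbind, last_rows)

print(datg)
```

Here datg remains a data frame with one row per NPERMNO, whereas t(sapply(...)) on one-row data frames simplifies to a matrix of list elements, which is harder to work with afterwards.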

Re: [R] Help with apply and split...

2008-05-02 Thread Richard . Cotton
> datg=t(sapply(split(datgic, datgic$NPERMNO, drop=TRUE), function(x){return(
> x[nrow(x),] )}))
>
> I get something like this...
>
>       GVKEY NPERMNO      GIC year
> 10001 12994   10001 55102010 2007
> 10002 19049   10002 40101015 2007
> 10009 16739   10009 40101010 1999
>
> Has this