On Sep 26, 2012, at 11:23 AM, Rui Barradas wrote:

> Hello,
>
> If I understand it correctly, something like this will get you what you want.
>
>
> d <- Sys.Date() + 1:4
> d2 <- sample(d, 2)
> dat <- data.frame(id = 1:6, date = c(d, d2), value = rnorm(6))
>
> aggregate(dat, by = list(dat$date), FUN = tail, 1)

If these are sorted by date, then the oldest date would come first and you would want:

aggregate(dat, by = list(dat$date), FUN = head, 1)

--
David.

>
> Hope this helps,
>
> Rui Barradas
> On 26-09-2012 16:19, wwreith wrote:
>> I have several thousand rows of shipment data imported into R as a data
>> frame, with two columns of particular interest: col 1 is the entry date, and
>> col 2 is the tracking number (colname is REQ.NR). Tracking numbers should be
>> unique but on occasion aren't because they get entered more than once. This
>> creates two or more rows with the same tracking number but different
>> dates. I wrote a for loop that will keep the row with the oldest date but it
>> is extremely slow.
>>
>> Any suggestions of how I should write this so that it is faster?
>>
>> # Creates a vector of the unique tracking numbers #
>> u<-na.omit(unique(Para.5C$REQ.NR))
>>
>> # Create Data Frame to rbind unique rows to #
>> Para.5C.final<-data.frame()
>>
>> # For each value in u subset Para.5C find the min date and rbind it to
>> # Para.5C.final #
>> for(i in 1:length(u))
>> {
>> x<-subset(Para.5C,Para.5C$REQ.NR==u[i])
>> Para.5C.final<-rbind(Para.5C.final,x[which(x[,1]==min(x[,1])),])
>> }
>>

--
David Winsemius, MD
Alameda, CA, USA

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
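
For the original question (keep the oldest row per tracking number without a loop), one possible approach is to sort and then drop repeats with duplicated(). The following is only a minimal sketch on made-up data: the column name ENTRY.DATE and the toy values are assumptions, not taken from the poster's data frame. Unlike the loop, which keeps every row tied for the minimum date, this keeps exactly one row per REQ.NR.

```r
# Toy stand-in for Para.5C (hypothetical data): column 1 is the entry date,
# REQ.NR is the tracking number; "B2" was entered twice.
Para.5C <- data.frame(
  ENTRY.DATE = as.Date(c("2012-09-03", "2012-09-01", "2012-09-02", "2012-09-05")),
  REQ.NR     = c("A1", "B2", "B2", "C3"),
  stringsAsFactors = FALSE
)

# Drop rows with a missing tracking number, as na.omit() does in the loop.
keep <- Para.5C[!is.na(Para.5C$REQ.NR), ]

# Sort by tracking number, then by entry date, so the oldest row of each
# tracking number comes first within its group.
keep <- keep[order(keep$REQ.NR, keep[[1]]), ]

# duplicated() flags every repeat after the first occurrence, so negating it
# keeps one row (the oldest) per tracking number.
Para.5C.final <- keep[!duplicated(keep$REQ.NR), ]
```

Because this sorts once and makes a single vectorized pass, it avoids growing a data frame with rbind() inside a loop, which is usually what makes the loop version slow.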