Or use ave() to compute the within-group ranks (reversed, so max has rank 1) and select the elements whose ranks are 1: f2 <- function (DATA) { stopifnot(is.data.frame(DATA), all(c("distance", "id", "month") %in% names(DATA))) revRanks <- ave(DATA[["distance"]], DATA[["id"]], DATA[["month"]], FUN = function(x) rank(-x, ties = "first")) DATA[revRanks == 1, ] }
Bill Dunlap Spotfire, TIBCO Software wdunlap tibco.com > -----Original Message----- > From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On > Behalf > Of Peter Alspach > Sent: Thursday, August 23, 2012 4:37 PM > To: rjb; r-help@r-project.org > Subject: Re: [R] Extracting data from dataframe with tied rows > > Tena koe John > > One way: > > johnData <- data.frame(id=rep(LETTERS[1:5],20), distance=rnorm(1:100, mean = > 100), > bearing=sample(1:360,100,replace=T), month=sample(1:12,100,replace=T)) > johnAgg <- aggregate(johnData[,'distance'], johnData[,c('id','month')], max) > names(johnAgg)[3] <- 'distance' > merge(johnAgg, johnData) > > HTH .... > > Peter Alspach > > -----Original Message----- > From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On > Behalf > Of rjb > Sent: Friday, 24 August 2012 9:19 a.m. > To: r-help@r-project.org > Subject: [R] Extracting data from dataframe with tied rows > > Hi R help, > > I'm a fairly experienced R user but this manipulation has me stumped, please > help: > > DATA > id<-rep(LETTERS[1:5],20) > distance<-rnorm(1:100, mean = 100) > bearing<-sample(1:360,100,replace=T) > month<-sample(1:12,100,replace=T) > > I have a dataset with records of individuals (id) , each with a distance > (distance) & direction (bearing) recorded for each month (month). > I want to find the largest distance per individual per month, which is easy > with /tapply/ or /melt/cast (reshape)/, > head(DATA_m<-melt(DATA,id=c("id","month"))) > cast(DATA_m,id+month~.,max) > OR > na.omit(melt(tapply(distance,list(id,month),max))) > > *BUT THE CATCH IS* , > I also want the the *corresponding* bearing for that maximum distance per > month. I've tried the steps above plus using which.max() and loops, but > can't solve the problem. The real dataset is about 6000 rows. > > I'm guessing the answer is in finding the row number from the original DATA > but I can't figure how to do that with tapply or melt. > > Any suggestions would be greatly appreciated. > > John Burnside > > > > > > -- > View this message in context: > http://r.789695.n4.nabble.com/Extracting-data-from- > dataframe-with-tied-rows-tp4641140.html > Sent from the R help mailing list archive at Nabble.com. > > ______________________________________________ > R-help@r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > > The contents of this e-mail are confidential and may be subject to legal > privilege. > If you are not the intended recipient you must not use, disseminate, > distribute or > reproduce all or any part of this e-mail or attachments. If you have > received this > e-mail in error, please notify the sender and delete all material pertaining > to this > e-mail. Any opinion or views expressed in this e-mail are those of the > individual > sender and may not represent those of The New Zealand Institute for Plant and > Food Research Limited. > > ______________________________________________ > R-help@r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. ______________________________________________ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.