Here's another fairly straightforward solution, using the plyr package:
# variables as defined in the OP
DF <- data.frame(id, month, distance, bearing)
library(plyr)
# for each id/month group, keep the maximum distance and its bearing
DF1 <- ddply(DF, .(id, month), summarize,
             maxdist = max(distance),
             maxbearing = bearing[which.max(distance)])
Peter Ehlers
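For readers without plyr, the same "pick the row at which.max() within each group" idea can be sketched in base R. This is only an illustration under stated assumptions: the sample data below is made up to mirror the columns in the original post, and the names groups and DF2 are hypothetical.

```r
# Hypothetical sample data with the same columns as in the original post
set.seed(1)
DF <- data.frame(id = rep(LETTERS[1:5], 20),
                 month = sample(1:12, 100, replace = TRUE),
                 distance = rnorm(100, mean = 100),
                 bearing = sample(1:360, 100, replace = TRUE))

# Split the rows by id and month, dropping combinations that never occur
groups <- split(DF, list(DF$id, DF$month), drop = TRUE)

# In each group keep the single row with the largest distance
# (which.max() returns the first maximum, so ties yield one row)
DF2 <- do.call(rbind, lapply(groups, function(g) g[which.max(g$distance), ]))
rownames(DF2) <- NULL
```

Because whole rows are kept, the bearing travels along with the maximum distance and no second lookup is needed.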
On 2012-08-24 09:54, William Dunlap wrote:
Or use ave() to compute the within-group ranks (reversed, so the max has
rank 1) and select the elements whose rank is 1:
f2 <- function(DATA)
{
    stopifnot(is.data.frame(DATA),
              all(c("distance", "id", "month") %in% names(DATA)))
    # rank within each id/month group; negating makes the largest distance rank 1
    revRanks <- ave(DATA[["distance"]], DATA[["id"]], DATA[["month"]],
                    FUN = function(x) rank(-x, ties.method = "first"))
    DATA[revRanks == 1, ]
}
Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com
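To make the reversed-rank step concrete, here is a tiny sketch on made-up vectors (x and g are hypothetical, not from the thread). It shows how rank() on the negated values gives the maximum rank 1, how ties.method = "first" breaks ties by position, and how ave() applies this within groups.

```r
# Hypothetical toy values; positions 2 and 3 are tied for the maximum
x <- c(3, 7, 7, 1)

# Negating the values makes the largest one rank 1;
# "first" gives the earlier of the two tied 7s the lower rank
r <- rank(-x, ties.method = "first")

# The same ranking computed separately within each group
g <- c("a", "a", "b", "b")
rg <- ave(x, g, FUN = function(v) rank(-v, ties.method = "first"))

# Elements that hold the maximum of their own group
x[rg == 1]
```

Since exactly one element per group receives rank 1, this selection returns one row per group even when the maximum is tied.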
-----Original Message-----
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Peter Alspach
Sent: Thursday, August 23, 2012 4:37 PM
To: rjb; r-help@r-project.org
Subject: Re: [R] Extracting data from dataframe with tied rows
Tena koe John
One way:
johnData <- data.frame(id=rep(LETTERS[1:5],20), distance=rnorm(100, mean = 100),
    bearing=sample(1:360,100,replace=TRUE), month=sample(1:12,100,replace=TRUE))
johnAgg <- aggregate(johnData[,'distance'], johnData[,c('id','month')], max)
names(johnAgg)[3] <- 'distance'
merge(johnAgg, johnData)
HTH ....
Peter Alspach
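One thing to be aware of with the aggregate()/merge() route: if two rows in a group share the maximum distance, merge() returns both. A small sketch on made-up data (toy, agg, both and one are hypothetical names) shows this, along with one possible way to keep a single row per group.

```r
# Hypothetical toy data: id "A" has two rows tied for the maximum in month 1
toy <- data.frame(id = c("A", "A", "B"),
                  month = c(1, 1, 1),
                  distance = c(10, 10, 5),
                  bearing = c(90, 180, 270))

# Maximum distance per id/month, as in the aggregate() step above
agg <- aggregate(toy["distance"], toy[c("id", "month")], max)

# Merging on the shared columns keeps every row that attains the maximum
both <- merge(agg, toy)

# One option if only one row per group is wanted: keep the first match
one <- both[!duplicated(both[c("id", "month")]), ]
```

Whether ties should be kept or collapsed depends on what the analysis needs; continuous distances like those from rnorm() will rarely tie in practice.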
-----Original Message-----
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of rjb
Sent: Friday, 24 August 2012 9:19 a.m.
To: r-help@r-project.org
Subject: [R] Extracting data from dataframe with tied rows
Hi R help,
I'm a fairly experienced R user, but this manipulation has me stumped.
Please help:
DATA
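id<-rep(LETTERS[1:5],20)
distance<-rnorm(100, mean = 100)
bearing<-sample(1:360,100,replace=TRUE)
month<-sample(1:12,100,replace=TRUE)
# assemble the data frame used in the calls below
DATA<-data.frame(id, distance, bearing, month)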
I have a dataset with records of individuals (id), each with a distance
(distance) and direction (bearing) recorded for each month (month).
I want to find the largest distance per individual per month, which is easy
with tapply or melt/cast (reshape):
head(DATA_m<-melt(DATA,id=c("id","month")))
cast(DATA_m,id+month~.,max)
OR
na.omit(melt(tapply(distance,list(id,month),max)))
BUT THE CATCH IS:
I also want the corresponding bearing for that maximum distance per
month. I've tried the steps above plus using which.max() and loops, but
can't solve the problem. The real dataset is about 6000 rows.
I'm guessing the answer is in finding the row number from the original DATA,
but I can't figure out how to do that with tapply or melt.
Any suggestions would be greatly appreciated.
John Burnside
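The row-number idea in the question can be made to work with tapply() by running it over the row indices rather than over the distances. What follows is a rough sketch, not a method taken from any reply in this thread; the seed and the names rows and result are assumptions made for illustration.

```r
# Sample data built the same way as in the question (seed added for repeatability)
set.seed(42)
DATA <- data.frame(id = rep(LETTERS[1:5], 20),
                   distance = rnorm(100, mean = 100),
                   bearing = sample(1:360, 100, replace = TRUE),
                   month = sample(1:12, 100, replace = TRUE))

# For each id/month cell, tapply() receives that cell's row numbers and
# returns the row number holding the largest distance
rows <- tapply(seq_len(nrow(DATA)), list(DATA$id, DATA$month),
               function(i) i[which.max(DATA$distance[i])])

# Cells with no observations come back as NA; drop them
rows <- rows[!is.na(rows)]

# Index the original data frame to get distance and bearing together
result <- DATA[rows, ]
```

Indexing by row number keeps every column of the original record, so the bearing that belongs to each maximum distance comes along automatically.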
______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.