Perfect, except for one little bit... topN.2 is missing one comma... It should read as follows

topN.2 <- function(data,n=5) data[order(data[,3], decreasing=T),][1:n,]

Thank you very much.

cn
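For the archive, here is a minimal sketch of the corrected function run on the sample data quoted below. It assumes the data is held in a data frame with percentOld as the third column, as Petr presumed; the name `census` is only an illustration and does not appear in the thread. The trailing comma matters because indexing a data frame with a single index selects columns, so `[1:n]` asks for columns 1 to 5 of a three-column data frame, while `[1:n,]` takes the first n rows.

```r
# Sample data from the thread, built directly as a data frame.
# (`census` is a hypothetical name used only for this sketch.)
census <- data.frame(
  state = c(rep("Illinois", 10), rep("Wisconsin", 10)),
  county = c("Adams", "Brown", "Bureau", "Cass", "Champaign",
             "Christian", "Coles", "De Witt", "Douglas", "Edgar",
             "Adams", "Ashland", "Barron", "Bayfield", "Buffalo",
             "Burnett", "Chippewa", "Clark", "Columbia", "Crawford"),
  percentOld = c(17.554849, 16.826594, 18.196593, 17.139242, 8.743823,
                 17.862746, 13.747967, 16.626302, 15.258940, 18.984435,
                 19.347022, 17.814436, 16.903067, 17.632781, 16.659305,
                 20.337817, 14.293354, 17.252820, 15.647179, 16.825596),
  stringsAsFactors = FALSE
)

# Corrected function: order rows by the third column (descending),
# then keep the first n rows. Note the comma in [1:n, ].
topN.2 <- function(data, n = 5) data[order(data[, 3], decreasing = TRUE), ][1:n, ]

# One data frame per state, each holding its top five counties.
top5 <- lapply(split(census, census$state), topN.2)

# Show county and value for each state.
lapply(top5, function(d) d[, c("county", "percentOld")])
```

If a state could have fewer than n counties, `[1:n, ]` would pad the result with NA rows; replacing it with `head(..., n)` is one way to avoid that.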
________________________________

From: Petr PIKAL [mailto:[EMAIL PROTECTED]
Sent: Mon 9/3/2007 3:51 AM
To: Cory Nissen
Cc: r-help@stat.math.ethz.ch
Subject: RE: [R] by group problem

Hi

now I understand better what you want

topN.2 <- function(data,n=5) data[order(data[,3], decreasing=T),][1:n]
# I presume data is data frame with 3 columns and the third is percent

lapply(split(data,data$state), topN.2)

Regards
Petr
[EMAIL PROTECTED]

"Cory Nissen" <[EMAIL PROTECTED]> wrote on 31.08.2007 17:21:01:

> That didn't work for me...
>
> Here's some data to help with a solution.
>
> data <- NULL
> data$state <- c(rep("Illinois", 10), rep("Wisconsin", 10))
> data$county <- c("Adams", "Brown", "Bureau", "Cass", "Champaign",
>   "Christian", "Coles", "De Witt", "Douglas", "Edgar",
>   "Adams", "Ashland", "Barron", "Bayfield", "Buffalo",
>   "Burnett", "Chippewa", "Clark", "Columbia", "Crawford")
> data$percentOld <- c(17.554849, 16.826594, 18.196593, 17.139242, 8.743823,
>   17.862746, 13.747967, 16.626302, 15.258940, 18.984435,
>   19.347022, 17.814436, 16.903067, 17.632781, 16.659305,
>   20.337817, 14.293354, 17.252820, 15.647179, 16.825596)
>
> return something like this...
> $Illinois
> "Edgar"
> 18.984435
> "Bureau"
> 18.196593
> ...
> $Wisconsin
> "Burnett"
> 20.33782
> "Adams"
> 19.34702
> ...
>
> My Solution gives...
> topN <- function(column, n=5)
> {
>   column <- sort(column, decreasing=T)
>   return(column[1:n])
> }
> tapply(data$percentOld, data$state, topN)
>
> $Illinois
> [1] 18.98444 18.19659 17.86275 17.55485 17.13924
> $Wisconsin
> [1] 20.33782 19.34702 17.81444 17.63278 17.25282
>
> I get an error with this try...
> aggregate(data$percentOld, list(data$state, data$county), topN)
>
> Error in aggregate.data.frame(as.data.frame(x), ...) :
> 'FUN' must always return a scalar
>
> Thanks
>
> cn
>
> From: Petr PIKAL [mailto:[EMAIL PROTECTED]
> Sent: Fri 8/31/2007 8:15 AM
> To: Cory Nissen
> Cc: r-help@stat.math.ethz.ch
> Subject: Re: [R] by group problem
>
> Hi
>
> > I am working with census data. My columns of interest are...
> >
> > PercentOld - the percentage of people in each county that are over 65
> > County - the county in each state
> > State - the state in the US
> >
> > There are about 3100 rows, with each row corresponding to a county
> > within a state.
> >
> > I want to return the top five "PercentOld" by state. But I want the
> > County and the Value.
> >
> > I tried this...
> >
> > topN <- function(column, n=5)
> > {
> >   column <- sort(column, decreasing=T)
> >   return(column[1:n])
> > }
> > top5PerState <- tapply(data$percentOld, data$STATE, topN)
>
> Try
>
> aggregate(data$PercentOld, list(data$State, data$County), topN)
>
> Regards
> Petr
>
> > But this only returns the value for "percentOld" per state, I also want
> > the corresponding County.
> >
> > I think I'm close, but I just can't get it...
> >
> > Thanks
> >
> > cn
> >
> > [[alternative HTML version deleted]]
> >
> > ______________________________________________
> > R-help@stat.math.ethz.ch mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.