Hi,
You could also check ?data.table, as it could be faster.

#Speed comparison
set.seed(498)
oilT <- data.frame(YEAR=rep(rep(1800:2012,50),100),
                   state=rep(rep(state.abb,each=213),100),
                   value=sample(2000:80000,1065000,replace=TRUE),
                   stringsAsFactors=FALSE)
system.time(res1 <- oilT[as.logical(with(oilT,ave(value,list(YEAR),FUN=function(x) x%in% max(x)))),])
#   user  system elapsed
#  0.532   0.008   0.540
dim(res1) #as some years have duplicated maximums
#[1] 220   3
res1[duplicated(res1[,1])|duplicated(res1[,1],fromLast=TRUE),]

library(data.table)
dt1 <- data.table(oilT,key='YEAR')
system.time(res2 <- dt1[dt1[,value %in% max(value),'YEAR']$V1])
#   user  system elapsed
#  0.060   0.000   0.062
res1 <- res1[order(res1$YEAR),]
row.names(res1) <- 1:nrow(res1)
identical(res1,as.data.frame(res2))
#[1] TRUE

A.K.

On Thursday, October 17, 2013 1:35 PM, arun <smartpink...@yahoo.com> wrote:

Hi,
You may try:
unlist(lapply(seq_len(nrow(oil)),function(i) oil[i,-1][which.max(oil[i,-1])]))
#   CA    ND
#40000 60000

#or
library(reshape2)
datM <- melt(oil,id.var="YEAR")
datM[as.logical(with(datM,ave(value,list(YEAR),FUN= function(x) x%in% max(x)))),]
#  YEAR variable value
#3 2011       CA 40000
#8 2012       ND 60000

A.K.

On Thursday, October 17, 2013 12:50 PM, Tim Umbach <tim.umb...@hufw.de> wrote:

Hi there,

another beginner's question, I'm afraid. Basically I want to select the maximum of values that correspond to different variables. I have a table of oil production that looks somewhat like this:

oil <- data.frame( YEAR = c(2011, 2012),
                   TX = c(20000, 30000),
                   CA = c(40000, 25000),
                   AL = c(20000, 21000),
                   ND = c(21000,60000))

Now I want to find out which state produced the most oil in a given year. I tried this:

attach(oil)
last_year = oil[ c(YEAR == 2012), ]
max(last_year)

Which works, but it doesn't give me the corresponding values (i.e. it just gives me the maximum output, not what state it's from).
So I tried this:

oil[c(oil == max(last_year)),]

and this:

oil[c(last_year == max(last_year)),]

and this:

oil[which.max(last_year),]

and this:

last_year[max(last_year),]

None of them work, but they don't give error messages either; the output is just "NA". The problem is, in my eyes, that I'm comparing the values of different variables with each other. Because if I change the structure of the data frame (which I can't do with the real data, at least not without doing it by hand with a huge dataset), it looks like this and works perfectly:

oil2 <- data.frame (
  names = c('YEAR', 'TX', 'CA', 'AL', 'ND'),
  oil_2011 = c(2011, 20000, 40000, 20000, 21000),
  oil_2012 = c(2012, 30000, 25000, 21000, 60000)
)
attach(oil2)
oil2[c(oil_2012 == max(oil_2012)),]

Any help is much appreciated.

Thanks,

Tim Umbach

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
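
For reference, a minimal base-R sketch of the original question (which state has the largest value in each year), working directly on the wide `oil` data frame from the first message without reshaping it. The helper name `top_state_per_year` and its `id_col` argument are hypothetical and do not appear in the thread. Unlike the `x %in% max(x)` approaches above, which keep every tied row, this sketch reports only the first column that reaches the row maximum.

```r
# Data frame as given in the original question.
oil <- data.frame(
  YEAR = c(2011, 2012),
  TX = c(20000, 30000),
  CA = c(40000, 25000),
  AL = c(20000, 21000),
  ND = c(21000, 60000)
)

# Hypothetical helper (not from the thread): for each row, find the
# column holding the row maximum. which.max() returns the first maximum,
# so ties resolve to the leftmost column.
top_state_per_year <- function(df, id_col = "YEAR") {
  # Keep only the value columns and work on a numeric matrix.
  vals <- as.matrix(df[, setdiff(names(df), id_col), drop = FALSE])
  # Column index of the maximum in each row.
  idx <- unname(apply(vals, 1, which.max))
  data.frame(
    YEAR = df[[id_col]],
    state = colnames(vals)[idx],
    # Matrix indexing with (row, column) pairs picks one value per row.
    value = vals[cbind(seq_len(nrow(vals)), idx)],
    stringsAsFactors = FALSE
  )
}

res <- top_state_per_year(oil)
print(res)
```

On the example data this picks CA for 2011 and ND for 2012, which agrees with the results posted earlier in the thread.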