The sorting should have been by Lake, psd and vol (not what I had) so it should be revised to:
DFo <- DF[order(DF$Lake, DF$psd, DF$vol), ]
aggregate(DFo[c("Length", "vol")], DFo[c("Lake", "psd")], tail, 1)

This is the same as before except DF$psd is used in place of DF$Length
in the first line.

On Mon, Dec 22, 2008 at 9:14 PM, Gabor Grothendieck
<ggrothendi...@gmail.com> wrote:
> Just sort the data first and then apply any of the solutions but with
> tail(x, 1) instead of max, e.g.
>
> DFo <- DF[order(DF$Lake, DF$Length, DF$vol), ]
> aggregate(DFo[c("Length", "vol")], DFo[c("Lake", "psd")], tail, 1)
>
> On Mon, Dec 22, 2008 at 8:15 PM, Ranney, Steven
> <steven.ran...@montana.edu> wrote:
>> Thank you all for your help. I appreciate the assistance. I'm thinking I
>> should have been more specific in my original question.
>>
>> Unless I'm mistaken, all of the suggestions so far have been for maximum
>> vol and maximum Length by Lake and psd. I'm trying to extract the max vol
>> by Lake and psd along with the corresponding value of Length. So, instead
>> of maximum vol and maximum Length, I'd like to find the max vol and the
>> Length associated with that value.
>>
>> Sorry for any confusion,
>>
>> SR
>>
>> Steven H. Ranney
>> Graduate Research Assistant (Ph.D)
>> USGS Montana Cooperative Fishery Research Unit
>> Montana State University
>> P.O. Box 173460
>> Bozeman, MT 59717-3460
>>
>> phone: (406) 994-6643
>> fax: (406) 994-7479
>>
>> http://studentweb.montana.edu/steven.ranney
>> ________________________________
>>
>> From: Gabor Grothendieck [mailto:ggrothendi...@gmail.com]
>> Sent: Mon 12/22/2008 5:15 PM
>> To: Ranney, Steven
>> Cc: r-help@r-project.org
>> Subject: Re: [R] Summary information by groups programming assitance
>>
>> Here are two solutions assuming DF is your data frame:
>>
>> # 1. aggregate is in the base of R
>>
>> aggregate(DF[c("Length", "vol")], DF[c("Lake", "psd")], max)
>>
>> or the following which is the same except it labels psd as Category:
>>
>> aggregate(DF[c("Length", "vol")], with(DF, list(Lake = Lake, Category = psd)), max)
>>
>> # 2. sqldf. The sqldf package allows specification using SQL notation:
>>
>> library(sqldf)
>> sqldf("select Lake, psd as Category, max(Length), max(vol) from DF group by Lake, psd")
>>
>> There are many other good solutions too using various packages which
>> have already been mentioned on this thread.
>>
>> On Mon, Dec 22, 2008 at 4:51 PM, Ranney, Steven
>> <steven.ran...@montana.edu> wrote:
>>> All -
>>>
>>> I have data that looks like
>>>
>>>          psd Species  Lake Length Weight St.weight        Wr      Wr.1    vol
>>> 432 substock     SMB Clear    150  41.00      0.01  95.12438  95.10118 0.0105
>>> 433 substock     SMB Clear    152  39.00      0.01  86.72916  86.70692 0.0105
>>> 434 substock     SMB Clear    152  40.00      3.11  88.95298  82.03689 3.2655
>>> 435 substock     SMB Clear    159  48.00      0.04  92.42095  92.34393 0.0420
>>> 436 substock     SMB Clear    159  48.00      0.01  92.42095  92.40170 0.0105
>>> 437 substock     SMB Clear    165  47.00      0.03  80.38023  80.32892 0.0315
>>> 438 substock     SMB Clear    171  62.00      0.21  94.58105  94.26070 0.2205
>>> 439 substock     SMB Clear    178  70.00      0.01  93.91912  93.90571 0.0105
>>> 440 substock     SMB Clear    179  76.00      1.38 100.15760  98.33895 1.4490
>>> 441      S-Q     SMB Clear    180  75.00      0.01  97.09330  97.08035 0.0105
>>> 442      S-Q     SMB Clear    180  92.00      0.02 119.10111 119.07522 0.0210
>>> ...
>>> [truncated]
>>>
>>> where psd and lake are categorical variables, with five and four
>>> categories, respectively. I'd like to find the maximum vol and the
>>> lengths associated with each maximum vol by each category by each lake.
>>> In other words, I'd like to have a data frame that looks something like
>>>
>>> Lake     Category Length vol
>>> Clear    substock 152    3.2655
>>> Clear    S-Q      266    11.73
>>> Clear    Q-P      330    14.89
>>> ...
>>> Pickerel substock 170    3.4965
>>> Pickerel S-Q      248    10.69
>>> Pickerel Q-P      335    25.62
>>> Pickerel P-M      415    32.62
>>> Pickerel M-T      442    17.25
>>>
>>> In order to originally get this, I used
>>>
>>> with(smb[Lake=="Clear",], tapply(vol, list(Length, psd),max))
>>> with(smb[Lake=="Enemy.Swim",], tapply(vol, list(Length, psd),max))
>>> with(smb[Lake=="Pickerel",], tapply(vol, list(Length, psd),max))
>>> with(smb[Lake=="Roy",], tapply(vol, list(Length, psd),max))
>>>
>>> and pulled the values I needed out by hand and put them into a .csv.
>>> Unfortunately, I've got a number of other data sets upon which I'll need
>>> to do the same analysis. Finding a programmable alternative would
>>> provide a much easier (and likely less error prone) method to achieve
>>> the same results. Ideally, the "Length" and "vol" data would be in a
>>> data frame such that I could then analyze with nls.
>>>
>>> Does anyone have any thoughts as to how I might accomplish this?
>>>
>>> Thanks in advance,
>>>
>>> Steven Ranney
>>>
>>> ______________________________________________
>>> R-help@r-project.org mailing list
>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>>> and provide commented, minimal, self-contained, reproducible code.
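
To check the revised sort on a small case, here is a minimal sketch using a
few of the rows shown in the original post. Only the Lake, psd, Length and vol
columns are kept and the data frame is rebuilt by hand, so treat it as an
illustration rather than the full smb data. Sorting by vol within each
Lake/psd group means tail(x, 1) takes Length and vol from the same row, the
one with the largest vol.

```r
# Hand-copied subset of the rows shown in the original post
# (rows 432, 433, 434, 440, 441, 442; only the columns needed here).
DF <- data.frame(
  psd    = c("substock", "substock", "substock", "substock", "S-Q", "S-Q"),
  Lake   = "Clear",
  Length = c(150, 152, 152, 179, 180, 180),
  vol    = c(0.0105, 0.0105, 3.2655, 1.4490, 0.0105, 0.0210),
  stringsAsFactors = FALSE
)

# Sort so that the largest vol is the last row within each Lake/psd group.
DFo <- DF[order(DF$Lake, DF$psd, DF$vol), ]

# tail(x, 1) returns the last value of each column within each group, so
# Length is taken from the same row as the maximum vol.
res <- aggregate(DFo[c("Length", "vol")], DFo[c("Lake", "psd")], tail, 1)
print(res)
```

For the rows used here, the substock group should come back with Length 152
and vol 3.2655, which matches the first row of the desired output in the
original post. If two rows tie on the maximum vol, the one that ends up last
after sorting is the one returned.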