Sort the data so that, within each Lake and psd group, the row with the
largest vol comes last, then apply any of the solutions below with
tail(x, 1) in place of max, e.g.

DFo <- DF[order(DF$Lake, DF$psd, DF$vol), ]
aggregate(DFo[c("Length", "vol")], DFo[c("Lake", "psd")], tail, 1)

On Mon, Dec 22, 2008 at 8:15 PM, Ranney, Steven
<steven.ran...@montana.edu> wrote:
> Thank you all for your help. I appreciate the assistance. I'm thinking I
> should have been more specific in my original question.
>
> Unless I'm mistaken, all of the suggestions so far have been for maximum vol
> and maximum Length by Lake and psd. I'm trying to extract the max vol by
> Lake and psd along with the corresponding value of Length. So, instead of
> maximum vol and maximum Length, I'd like to find the max vol and the Length
> associated with that value.
>
> Sorry for any confusion,
>
> SR
>
> Steven H. Ranney
> Graduate Research Assistant (Ph.D)
> USGS Montana Cooperative Fishery Research Unit
> Montana State University
> P.O. Box 173460
> Bozeman, MT 59717-3460
>
> phone: (406) 994-6643
> fax: (406) 994-7479
>
> http://studentweb.montana.edu/steven.ranney
> ________________________________
>
> From: Gabor Grothendieck [mailto:ggrothendi...@gmail.com]
> Sent: Mon 12/22/2008 5:15 PM
> To: Ranney, Steven
> Cc: r-help@r-project.org
> Subject: Re: [R] Summary information by groups programming assitance
>
>
> Here are two solutions assuming DF is your data frame:
>
> # 1. aggregate is in the base of R
>
> aggregate(DF[c("Length", "vol")], DF[c("Lake", "psd")], max)
>
> or the following which is the same except it labels psd as Category:
>
> aggregate(DF[c("Length", "vol")], with(DF, list(Lake = Lake, Category = psd)), max)
>
>
> # 2. sqldf. The sqldf package allows specification using SQL notation:
>
> library(sqldf)
> sqldf("select Lake, psd as Category, max(Length), max(vol) from DF group by Lake, psd")
>
> There are many other good solutions too using various packages which
> have already been mentioned on this thread.
>
> On Mon, Dec 22, 2008 at 4:51 PM, Ranney, Steven
> <steven.ran...@montana.edu> wrote:
>> All -
>>
>> I have data that looks like
>>
>>          psd Species  Lake Length Weight St.weight        Wr      Wr.1    vol
>> 432 substock     SMB Clear    150  41.00      0.01  95.12438  95.10118 0.0105
>> 433 substock     SMB Clear    152  39.00      0.01  86.72916  86.70692 0.0105
>> 434 substock     SMB Clear    152  40.00      3.11  88.95298  82.03689 3.2655
>> 435 substock     SMB Clear    159  48.00      0.04  92.42095  92.34393 0.0420
>> 436 substock     SMB Clear    159  48.00      0.01  92.42095  92.40170 0.0105
>> 437 substock     SMB Clear    165  47.00      0.03  80.38023  80.32892 0.0315
>> 438 substock     SMB Clear    171  62.00      0.21  94.58105  94.26070 0.2205
>> 439 substock     SMB Clear    178  70.00      0.01  93.91912  93.90571 0.0105
>> 440 substock     SMB Clear    179  76.00      1.38 100.15760  98.33895 1.4490
>> 441      S-Q     SMB Clear    180  75.00      0.01  97.09330  97.08035 0.0105
>> 442      S-Q     SMB Clear    180  92.00      0.02 119.10111 119.07522 0.0210
>> ...
>> [truncated]
>>
>> where psd and lake are categorical variables, with five and four
>> categories, respectively. I'd like to find the maximum vol and the
>> lengths associated with each maximum vol by each category by each lake.
>> In other words, I'd like to have a data frame that looks something like
>>
>> Lake      Category  Length  vol
>> Clear     substock  152     3.2655
>> Clear     S-Q       266     11.73
>> Clear     Q-P       330     14.89
>> ...
>> Pickerel  substock  170     3.4965
>> Pickerel  S-Q       248     10.69
>> Pickerel  Q-P       335     25.62
>> Pickerel  P-M       415     32.62
>> Pickerel  M-T       442     17.25
>>
>>
>> In order to originally get this, I used
>>
>> with(smb[Lake=="Clear",], tapply(vol, list(Length, psd),max))
>> with(smb[Lake=="Enemy.Swim",], tapply(vol, list(Length, psd),max))
>> with(smb[Lake=="Pickerel",], tapply(vol, list(Length, psd),max))
>> with(smb[Lake=="Roy",], tapply(vol, list(Length, psd),max))
>>
>> and pulled the values I needed out by hand and put them into a .csv.
>> Unfortunately, I've got a number of other data sets upon which I'll need
>> to do the same analysis. Finding a programmable alternative would
>> provide a much easier (and likely less error prone) method to achieve
>> the same results. Ideally, the "Length" and "vol" data would be in a
>> data frame such that I could then analyze with nls.
>>
>> Does anyone have any thoughts as to how I might accomplish this?
>>
>> Thanks in advance,
>>
>> Steven Ranney
>>
>> ______________________________________________
>> R-help@r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
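
For anyone who wants to check the sort-then-tail idea without the full
data set, here is a minimal sketch. The toy DF is an assumption: it holds
only six of the Clear Lake rows quoted above and only the columns that
matter here. It also runs the plain max version for contrast.

```r
# Toy data (assumed subset): rows 432, 433, 434, 440, 441 and 442 from the
# quoted sample, keeping only psd, Lake, Length and vol.
DF <- data.frame(
  psd    = c("substock", "substock", "substock", "substock", "S-Q", "S-Q"),
  Lake   = "Clear",
  Length = c(150, 152, 152, 179, 180, 180),
  vol    = c(0.0105, 0.0105, 3.2655, 1.4490, 0.0105, 0.0210)
)

# Plain max is taken column by column, so Length and vol can come from
# different rows of the same group.
res_max <- aggregate(DF[c("Length", "vol")], DF[c("Lake", "psd")], max)

# Sort so the row with the largest vol is last within each Lake/psd group.
DFo <- DF[order(DF$Lake, DF$psd, DF$vol), ]

# tail(x, 1) takes the last element of each column within each group, so
# Length and vol now both come from the max-vol row.
res <- aggregate(DFo[c("Length", "vol")], DFo[c("Lake", "psd")], tail, 1)

print(res_max)
print(res)
```

For the substock group, res pairs vol 3.2655 with Length 152, which matches
the first row of the desired output in the original question, whereas
res_max reports Length 179 next to vol 3.2655, a combination that is not a
row of the data. If several rows in a group tie for the largest vol, this
approach keeps only one of them. The result is an ordinary data frame, so
Length and vol can be passed on to later modelling steps.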