Thanks, Tom. I had made a mistake reading your example, and that was my problem.
Cool answer, works great. Thank you very much.

Matthew

On 5/24/2016 4:23 PM, Tom Wright wrote:
> Don't see that as being a big problem. If your data grows then dplyr
> supports connections to external databases. Alternatively, if you just
> want a mean, most databases can do that directly in SQL.
>
> On Tue, May 24, 2016 at 4:17 PM, Matthew
> <mccorm...@molbio.mgh.harvard.edu> wrote:
>
>     Thank you very much, Tom.
>     This gets me thinking in the right direction.
>     One thing I should have mentioned is that the data frame will
>     have a little over 40,000 rows.
>
>     On 5/24/2016 4:08 PM, Tom Wright wrote:
>>     Using dplyr
>>
>>     library(dplyr)
>>     x <- data.frame(Length = c(321, 350, 340, 180, 198),
>>                     ID = c(rep('A234', 3), 'B123', 'B225'))
>>     x %>% group_by(ID) %>% summarise(m = mean(Length))
>>
>>     On Tue, May 24, 2016 at 3:46 PM, Matthew
>>     <mccorm...@molbio.mgh.harvard.edu> wrote:
>>
>>         I have a data frame with 10 columns.
>>         In the last column is an alphanumeric identifier.
>>         For most rows, this alphanumeric identifier is unique to the
>>         file; however, some of these alphanumeric identifiers occur
>>         in duplicate, triplicate or more. When they occur more than
>>         once (a duplicate, triplicate or quadruplicate; let's call
>>         them multiplicates), they are in consecutive rows.
>>
>>         In column 7 there is an integer (it may or may not be
>>         unique; it does not matter).
>>
>>         I want to identify each set of multiple entries
>>         (multiplicates) occurring in column 10 and then, for each
>>         multiplicate, calculate the mean of the integers in column 7.
>>
>>         As an example, I will show just two columns:
>>
>>         Length  Identifier
>>         321     A234
>>         350     A234
>>         340     A234
>>         180     B123
>>         198     B225
>>
>>         What I want to do (in the above example) is collapse all the
>>         A234's and report the mean to get this:
>>
>>         Length  Identifier
>>         337     A234
>>         180     B123
>>         198     B225
>>
>>         Matthew
>>
>>         ______________________________________________
>>         R-help@r-project.org mailing list -- To UNSUBSCRIBE and
>>         more, see https://stat.ethz.ch/mailman/listinfo/r-help
>>         PLEASE do read the posting guide
>>         http://www.R-project.org/posting-guide.html
>>         and provide commented, minimal, self-contained, reproducible
>>         code.
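
For readers without dplyr installed, the same collapse can probably be done in base R. The following is only a sketch under assumptions: the data frame name `x` and the column names `Length` and `Identifier` are taken from the two-column example in the question, not from the real 10-column file, and `aggregate()` is swapped in here in place of the dplyr pipeline suggested in the thread.

```r
# Example data copied from the question (column names are illustrative).
x <- data.frame(
  Length     = c(321, 350, 340, 180, 198),
  Identifier = c("A234", "A234", "A234", "B123", "B225")
)

# Identifiers that occur more than once (the "multiplicates").
multi <- names(which(table(x$Identifier) > 1))

# Mean Length per identifier; identifiers that occur once keep their value.
means <- aggregate(Length ~ Identifier, data = x, FUN = mean)

print(multi)
print(means)
```

On the five example rows this should give 337 for A234 and leave B123 and B225 at 180 and 198. Because `aggregate()` groups by value, it does not rely on the multiplicates being in consecutive rows.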