Thank you very much, Dan.
These work great. Two more great answers to my question.
Matthew
On 5/24/2016 4:15 PM, Nordlund, Dan (DSHS/RDA) wrote:
You have several options.
1. You could use the aggregate function. If your data frame is called DF, you
could do something like
with(DF, aggregate(Length, list(Identifier), mean))
2. You could use the dplyr package like this
library(dplyr)
summarize(group_by(DF, Identifier), mean(Length))
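As a minimal sketch of option 1, assuming the data frame is named DF and holds only the two example columns from the original question (Length and Identifier), the call can be tried on the five example rows. The formula interface shown second is an alternative spelling of the same aggregate call that keeps the original column names; the names res1 and res are illustrative only.

```r
# Example data copied from the original question (two of the ten columns).
DF <- data.frame(
  Length = c(321, 350, 340, 180, 198),
  Identifier = c("A234", "A234", "A234", "B123", "B225"),
  stringsAsFactors = FALSE
)

# Option 1 as posted: the result columns are named Group.1 and x.
res1 <- with(DF, aggregate(Length, list(Identifier), mean))

# Formula interface: same means, but the columns are named
# Identifier and Length.
res <- aggregate(Length ~ Identifier, data = DF, FUN = mean)
print(res)
```

On the example rows both calls give a mean of 337 for A234 and leave B123 and B225 at 180 and 198, which matches the output asked for in the question.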
Hope this is helpful,
Dan
Daniel Nordlund, PhD
Research and Data Analysis Division
Services & Enterprise Support Administration
Washington State Department of Social and Health Services
-----Original Message-----
From: R-help [mailto:r-help-boun...@r-project.org] On Behalf Of Matthew
Sent: Tuesday, May 24, 2016 12:47 PM
To: r-help@r-project.org
Subject: [R] identify duplicate entries in data frame and calculate mean
I have a data frame with 10 columns.
In the last column is an alphanumeric identifier.
For most rows, this alphanumeric identifier is unique to the file; however,
some of these alphanumeric identifiers occur in duplicate, triplicate or more.
When an identifier occurs more than once, the duplicates, triplicates or
quadruplicates (let's call them multiplicates) are always in consecutive rows.
In column 7 there is an integer (it may or may not be unique; that does not
matter).
I want to identify each set of multiple entries (multiplicates) occurring in
column 10 and then, for each multiplicate, calculate the mean of the integers
in column 7.
As an example, I will show just two columns:
Length Identifier
321 A234
350 A234
340 A234
180 B123
198 B225
What I want to do (in the above example) is collapse all the A234's and report
the mean to get this:
Length Identifier
337 A234
180 B123
198 B225
Matthew
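The two parts of the request above (finding the identifiers that occur more than once, then averaging Length within each identifier) can be sketched in base R as follows. This is one possible approach under the assumption that the data frame is named DF and has the columns Length and Identifier from the example; the names counts, multi, means and collapsed are illustrative only.

```r
# Example data copied from the question (two of the ten columns).
DF <- data.frame(
  Length = c(321, 350, 340, 180, 198),
  Identifier = c("A234", "A234", "A234", "B123", "B225"),
  stringsAsFactors = FALSE
)

# Count how often each identifier occurs.
counts <- table(DF$Identifier)

# Identifiers that occur more than once (the "multiplicates").
multi <- names(counts[counts > 1])

# Mean Length per identifier; identifiers that occur once keep their value.
means <- tapply(DF$Length, DF$Identifier, mean)

# Collapsed result in the same column order as the example.
collapsed <- data.frame(
  Length = as.vector(means),
  Identifier = names(means),
  stringsAsFactors = FALSE
)
print(multi)
print(collapsed)
```

Neither step relies on the multiplicates being in consecutive rows, so the same code would also work on unsorted data.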
______________________________________________
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.