Re: [R] Using apply to get group means
That is precisely the reason for the existence of the ave function. Using Wickham's example:

    x1 <- rep(c("A", "B", "C"), 3)
    x2 <- c(rep(1, 3), rep(2, 3), 1, 2, 1)
    x3 <- c(1, 2, 3, 4, 5, 6, 2, 6, 4)
    df <- data.frame(x1, x2, x3)
    df$grpx3 <- ave(df$x3, list(x1, x2))
    df

      x1 x2 x3 grpx3
    1  A  1  1   1.5
    2  B  1  2   2.0
    3  C  1  3   3.5
    4  A  2  4   4.0
    5  B  2  5   5.5
    6  C  2  6   6.0
    7  A  1  2   1.5
    8  B  2  6   5.5
    9  C  1  4   3.5

Note that the default function is mean() but other functions could be specified.

-- 
David Winsemius

On Mar 31, 2009, at 12:09 PM, Alan Cohen wrote:

> I'm trying to calculate mean by several variables and then put this
> back into the original data set as a new variable. [...]
> However, I'd love to be able to do this with "apply" rather than a
> for-loop. Or is there a built-in function?

David Winsemius, MD
Heritage Laboratories
West Hartford, CT

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
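A small sketch of the remark about other functions, reusing the same example data. The column names grp_mean and grp_max are made up for illustration. Since the signature is ave(x, ..., FUN = mean), the function has to be passed by name; an unnamed function would be absorbed into the grouping arguments.

```r
x1 <- rep(c("A", "B", "C"), 3)
x2 <- c(rep(1, 3), rep(2, 3), 1, 2, 1)
x3 <- c(1, 2, 3, 4, 5, 6, 2, 6, 4)
df <- data.frame(x1, x2, x3)

# Group means; the grouping columns are taken from the data frame itself
# rather than from same-named vectors in the workspace.
df$grp_mean <- ave(df$x3, df$x1, df$x2)

# Any other summary goes in through the named FUN argument.
df$grp_max <- ave(df$x3, df$x1, df$x2, FUN = max)

df
```

Because ave() returns a vector as long as its input, in the original row order, the result can be assigned straight back as a new column.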
Re: [R] Using apply to get group means
Sorry, there was a mistake in the previous mail: I forgot the second variable in the by argument (both for aggregate and for merge). The corrected version:

    x1 <- rep(c("A", "B", "C"), 3)
    x2 <- c(rep(1, 3), rep(2, 3), 1, 2, 1)
    x3 <- c(1, 2, 3, 4, 5, 6, 2, 6, 4)
    # with data.frame(), x1 is converted directly to a factor
    # (the default behaviour before R 4.0.0)
    x <- data.frame(x1, x2, x3)
    x3means <- aggregate(x$x3, by = list(x$x1, x$x2), FUN = "mean")
    merge(x, x3means, by.x = c("x1", "x2"), by.y = c("Group.1", "Group.2"))

Ciao,
domenico

Domenico Vistocco wrote:

> A different solution (using aggregate for the table of means and merge
> for adding it to the dataframe): [...]

Alan Cohen wrote:

> I'm trying to calculate mean by several variables and then put this
> back into the original data set as a new variable. [...]
> However, I'd love to be able to do this with "apply" rather than a
> for-loop. Or is there a built-in function?
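For comparison, a sketch of the same aggregate-then-merge idea using the formula interface of aggregate(), which keeps the original column names so that merge() can join on them directly. The names x3.mean and merged are chosen here for illustration. One caveat: with its default sort = TRUE, merge() orders the result by the key columns, so the rows do not come back in their original order.

```r
x1 <- rep(c("A", "B", "C"), 3)
x2 <- c(rep(1, 3), rep(2, 3), 1, 2, 1)
x3 <- c(1, 2, 3, 4, 5, 6, 2, 6, 4)
x <- data.frame(x1, x2, x3)

# One row per (x1, x2) combination; the summary column is still called x3.
x3means <- aggregate(x3 ~ x1 + x2, data = x, FUN = mean)

# Rename the summary so it does not collide with the raw x3 column.
names(x3means)[names(x3means) == "x3"] <- "x3.mean"

# Join the group means back onto the individual rows.
merged <- merge(x, x3means, by = c("x1", "x2"))
merged
```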
Re: [R] Using apply to get group means
A different solution (using aggregate for the table of means and merge for adding it to the dataframe):

    x1 <- rep(c("A", "B", "C"), 3)
    x2 <- c(rep(1, 3), rep(2, 3), 1, 2, 1)
    x3 <- c(1, 2, 3, 4, 5, 6, 2, 6, 4)
    # here using data.frame the x1 variable is directly converted to factor
    x <- data.frame(x1, x2, x3)
    x3means <- aggregate(x$x3, by = list(x$x1), FUN = "mean")
    merge(x, x3means, by.x = "x1", by.y = "Group.1")

Ciao,
domenico

Alan Cohen wrote:

> I'm trying to calculate mean by several variables and then put this
> back into the original data set as a new variable. [...]
> However, I'd love to be able to do this with "apply" rather than a
> for-loop. Or is there a built-in function?
Re: [R] Using apply to get group means
On Tue, Mar 31, 2009 at 11:31 AM, baptiste auguie wrote:

> Not exactly the output you asked for, but perhaps you can consider,
>
>     library(doBy)
>     summaryBy(x3~x2+x1,data=x,FUN=mean)
>
>       x2 x1 x3.mean
>     1  1  A     1.5
>     2  1  B     2.0
>     3  1  C     3.5
>     4  2  A     4.0
>     5  2  B     5.5
>     6  2  C     6.0
>
> the plyr package also provides similar functionality, as do the ?by,
> ?ave, and ?tapply base functions.

In plyr it would look like:

    library(plyr)

    x1 <- rep(c("A", "B", "C"), 3)
    x2 <- c(rep(1, 3), rep(2, 3), 1, 2, 1)
    x3 <- c(1, 2, 3, 4, 5, 6, 2, 6, 4)
    df <- data.frame(x1, x2, x3)

    ddply(df, .(x1, x2), transform, x3.mean = mean(x3))

Note how I created the data frame - only use cbind if you want a matrix (i.e. all the columns have the same type)

Hadley

-- 
http://had.co.nz/
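To make the cbind point concrete, here is a short base-R sketch (the object names m and df are illustrative). A matrix holds a single type, so binding a character vector to numeric ones coerces everything to character, which is where the later as.numeric() round trip in the original question comes from.

```r
x1 <- rep(c("A", "B", "C"), 3)
x2 <- c(rep(1, 3), rep(2, 3), 1, 2, 1)
x3 <- c(1, 2, 3, 4, 5, 6, 2, 6, 4)

# cbind() builds a matrix; x2 and x3 are coerced to character alongside x1.
m <- cbind(x1, x2, x3)
is.character(m)

# data.frame() keeps each column's own type, so x2 and x3 stay numeric.
df <- data.frame(x1, x2, x3)
str(df)

# No conversion is needed before computing on the numeric column.
mean(df$x3[df$x1 == "A" & df$x2 == 1])
```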
Re: [R] Using apply to get group means
Not exactly the output you asked for, but perhaps you can consider,

    library(doBy)
    summaryBy(x3~x2+x1,data=x,FUN=mean)

      x2 x1 x3.mean
    1  1  A     1.5
    2  1  B     2.0
    3  1  C     3.5
    4  2  A     4.0
    5  2  B     5.5
    6  2  C     6.0

the plyr package also provides similar functionality, as do the ?by, ?ave, and ?tapply base functions.

HTH,
baptiste

On 31 Mar 2009, at 17:09, Alan Cohen wrote:

> I'm trying to calculate mean by several variables and then put this
> back into the original data set as a new variable. [...]
> However, I'd love to be able to do this with "apply" rather than a
> for-loop. Or is there a built-in function?

_
Baptiste Auguié
School of Physics
University of Exeter
Stocker Road, Exeter, Devon, EX4 4QL, UK
Phone: +44 1392 264187
http://newton.ex.ac.uk/research/emag
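Of the base functions mentioned, tapply() gives a table of means much like the summaryBy() output, and that table can be mapped back onto the individual rows. A minimal sketch, assuming the same example data; the names means, idx and x3.mean are illustrative:

```r
x1 <- rep(c("A", "B", "C"), 3)
x2 <- c(rep(1, 3), rep(2, 3), 1, 2, 1)
x3 <- c(1, 2, 3, 4, 5, 6, 2, 6, 4)
df <- data.frame(x1, x2, x3)

# Matrix of group means: one row per x1 level, one column per x2 level.
means <- tapply(df$x3, list(df$x1, df$x2), mean)
means

# Look up each observation's group mean by (row name, column name).
idx <- cbind(as.character(df$x1), as.character(df$x2))
df$x3.mean <- means[idx]
df
```

The two-column character matrix idx is matched against the dimnames of means, so each row of df picks out the cell for its own group.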
[R] Using apply to get group means
Hi all,

I'm trying to improve my R skills and make my programming more efficient and succinct. I can solve the following question, but wonder if there's a better way to do it:

I'm trying to calculate mean by several variables and then put this back into the original data set as a new variable. For example, if I were measuring weight, I might want to have each individual's weight, and also the group mean by, say, race, sex, and geographic region. The following code works:

    x1 <- rep(c("A", "B", "C"), 3)
    x2 <- c(rep(1, 3), rep(2, 3), 1, 2, 1)
    x3 <- c(1, 2, 3, 4, 5, 6, 2, 6, 4)
    x <- as.data.frame(cbind(x1, x2, x3))
    x3.mean <- rep(0, nrow(x))
    for (i in 1:nrow(x)) {
      x3.mean[i] <- mean(as.numeric(x[, 3][x[, 1] == x[, 1][i] & x[, 2] == x[, 2][i]]))
    }
    cbind(x, x3.mean)

      x1 x2 x3 x3.mean
    1  A  1  1     1.5
    2  B  1  2     2.0
    3  C  1  3     3.5
    4  A  2  4     4.0
    5  B  2  5     5.5
    6  C  2  6     6.0
    7  A  1  2     1.5
    8  B  2  6     5.5
    9  C  1  4     3.5

However, I'd love to be able to do this with "apply" rather than a for-loop. Or is there a built-in function? Any suggestions?

Also, any way to avoid the hassles with having to convert to a data frame and then again to numeric when one variable is character?

Cheers,
Alan Cohen