Re: [R] Using apply to get group means

2009-03-31 Thread David Winsemius


That is precisely the reason for the existence of the ave function.  
Using Wickham's example:


> x1 <- rep(c("A", "B", "C"), 3)
> x2 <- c(rep(1, 3), rep(2, 3), 1, 2, 1)
> x3 <- c(1, 2, 3, 4, 5, 6, 2, 6, 4)
> df <- data.frame(x1, x2, x3)
> df$grpx3 <- ave(df$x3, list(x1,x2))
> df
  x1 x2 x3 grpx3
1  A  1  1   1.5
2  B  1  2   2.0
3  C  1  3   3.5
4  A  2  4   4.0
5  B  2  5   5.5
6  C  2  6   6.0
7  A  1  2   1.5
8  B  2  6   5.5
9  C  1  4   3.5

Note that the default function is mean(), but another function can be
supplied through the FUN argument.
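
As a minimal sketch of that last point (the column names grpmax and grpn are made up for illustration), a different summary can be requested through ave()'s FUN argument. FUN has to be passed by name, since unnamed arguments after the first are treated as grouping variables:

```r
x1 <- rep(c("A", "B", "C"), 3)
x2 <- c(rep(1, 3), rep(2, 3), 1, 2, 1)
x3 <- c(1, 2, 3, 4, 5, 6, 2, 6, 4)
df <- data.frame(x1, x2, x3)

# Group maximum of x3 within each (x1, x2) combination, one value per row
df$grpmax <- ave(df$x3, df$x1, df$x2, FUN = max)

# Group size, using the same mechanism with length()
df$grpn <- ave(df$x3, df$x1, df$x2, FUN = length)

df
```

With the data above, grpmax comes out as 2 2 4 4 6 6 2 6 4 and grpn as 2 1 2 1 2 1 2 2 2.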



--
David Winsemius


David Winsemius, MD
Heritage Laboratories
West Hartford, CT

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Using apply to get group means

2009-03-31 Thread Domenico Vistocco

Sorry, there was a mistake in the previous mail:

Domenico Vistocco wrote:
A different solution (using aggregate for the table of means and merge
for adding it to the data frame):


x1<-rep(c("A","B","C"),3)
x2<-c(rep(1,3),rep(2,3),1,2,1)
x3<-c(1,2,3,4,5,6,2,6,4)
# using data.frame, x1 is converted directly to a factor (the default before R 4.0.0)
x<-data.frame(x1,x2,x3)


x3means <- aggregate(x$x3, by=list(x$x1), FUN="mean")
merge(x, x3means, by.x="x1", by.y="Group.1")

# I forgot the second variable in the by argument (for both aggregate and merge):

x3means <- aggregate(x$x3, by=list(x$x1, x$x2), FUN="mean")
merge(x, x3means, by.x=c("x1","x2"), by.y=c("Group.1", "Group.2"))
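
A possible variation on the two-variable version, sketched under the assumption that a named mean column is wanted: aggregate()'s formula interface keeps the original variable names, so merge() can join on x1 and x2 directly. The column name x3.mean and the object name out are chosen here for illustration. Note that merge() returns the rows sorted by the key columns rather than in their original order.

```r
x1 <- rep(c("A", "B", "C"), 3)
x2 <- c(rep(1, 3), rep(2, 3), 1, 2, 1)
x3 <- c(1, 2, 3, 4, 5, 6, 2, 6, 4)
x <- data.frame(x1, x2, x3)

# One row per (x1, x2) combination, with the group mean stored in column x3
x3means <- aggregate(x3 ~ x1 + x2, data = x, FUN = mean)

# Rename the mean column so it does not clash with x3 in the original data
names(x3means)[names(x3means) == "x3"] <- "x3.mean"

# Join the group means back onto the individual rows
out <- merge(x, x3means, by = c("x1", "x2"))
out
```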




Ciao,
domenico




Re: [R] Using apply to get group means

2009-03-31 Thread Domenico Vistocco
A different solution (using aggregate for the table of means and merge
for adding it to the data frame):


x1<-rep(c("A","B","C"),3)
x2<-c(rep(1,3),rep(2,3),1,2,1)
x3<-c(1,2,3,4,5,6,2,6,4)
# using data.frame, x1 is converted directly to a factor (the default before R 4.0.0)
x<-data.frame(x1,x2,x3)


x3means <- aggregate(x$x3, by=list(x$x1), FUN="mean")
merge(x, x3means, by.x="x1", by.y="Group.1")


Ciao,
domenico




Re: [R] Using apply to get group means

2009-03-31 Thread hadley wickham
On Tue, Mar 31, 2009 at 11:31 AM, baptiste auguie wrote:
> Not exactly the output you asked for, but perhaps you can consider,
>
> > library(doBy)
> > summaryBy(x3~x2+x1,data=x,FUN=mean)
>   x2 x1 x3.mean
> 1  1  A     1.5
> 2  1  B     2.0
> 3  1  C     3.5
> 4  2  A     4.0
> 5  2  B     5.5
> 6  2  C     6.0
>
> the plyr package also provides similar functionality, as do the ?by, ?ave,
> and ?tapply base functions.

In plyr it would look like:

library(plyr)

x1 <- rep(c("A", "B", "C"), 3)
x2 <- c(rep(1, 3), rep(2, 3), 1, 2, 1)
x3 <- c(1, 2, 3, 4, 5, 6, 2, 6, 4)
df <- data.frame(x1, x2, x3)

ddply(df, .(x1, x2), transform, x3.mean = mean(x3))

Note how I created the data frame - only use cbind if you want a
matrix (i.e. all the columns have the same type)
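
To illustrate the point about cbind, here is a small sketch (the object names m and d are arbitrary). cbind on a character vector and a numeric vector yields a character matrix, whereas data.frame keeps each column's own type, so no as.numeric() is needed afterwards:

```r
x1 <- rep(c("A", "B", "C"), 3)
x3 <- c(1, 2, 3, 4, 5, 6, 2, 6, 4)

m <- cbind(x1, x3)        # matrix: every element is coerced to character
d <- data.frame(x1, x3)   # data frame: x3 stays numeric

typeof(m)         # "character"
is.numeric(d$x3)  # TRUE
mean(d$x3)        # works directly, no conversion needed
```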

Hadley


-- 
http://had.co.nz/



Re: [R] Using apply to get group means

2009-03-31 Thread baptiste auguie

Not exactly the output you asked for, but perhaps you can consider,

> library(doBy)
> summaryBy(x3~x2+x1,data=x,FUN=mean)
  x2 x1 x3.mean
1  1  A     1.5
2  1  B     2.0
3  1  C     3.5
4  2  A     4.0
5  2  B     5.5
6  2  C     6.0



the plyr package also provides similar functionality, as do the ?by, ?ave,
and ?tapply base functions.
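
For the base-R route, one possible sketch with tapply: it builds a table of means with one cell per group, which can then be looked up row by row to fill a new column. The names means and x3.mean are chosen for illustration, and the lookup step is just one way of doing it; ave does the same job in a single call.

```r
x1 <- rep(c("A", "B", "C"), 3)
x2 <- c(rep(1, 3), rep(2, 3), 1, 2, 1)
x3 <- c(1, 2, 3, 4, 5, 6, 2, 6, 4)
df <- data.frame(x1, x2, x3)

# Matrix of group means: rows are levels of x1, columns are levels of x2
means <- tapply(df$x3, list(df$x1, df$x2), mean)

# Look up each row's group mean by matching (x1, x2) against the dimnames
df$x3.mean <- means[cbind(as.character(df$x1), as.character(df$x2))]

df
```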


HTH,

baptiste






Baptiste Auguié

School of Physics
University of Exeter
Stocker Road,
Exeter, Devon,
EX4 4QL, UK

Phone: +44 1392 264187

http://newton.ex.ac.uk/research/emag



[R] Using apply to get group means

2009-03-31 Thread Alan Cohen
Hi all,

I'm trying to improve my R skills and make my programming more efficient and 
succinct.  I can solve the following question, but wonder if there's a better 
way to do it:

I'm trying to calculate mean by several variables and then put this back into 
the original data set as a new variable.  For example, if I were measuring 
weight, I might want to have each individual's weight, and also the group mean 
by, say, race, sex, and geographic region.  The following code works:

> x1<-rep(c("A","B","C"),3)
> x2<-c(rep(1,3),rep(2,3),1,2,1)
> x3<-c(1,2,3,4,5,6,2,6,4)
> x<-as.data.frame(cbind(x1,x2,x3))
> x3.mean<-rep(0,nrow(x))
> for (i in 1:nrow(x)){
+   x3.mean[i]<-mean(as.numeric(x[,3][x[,1]==x[,1][i]&x[,2]==x[,2][i]]))
+   }  
> cbind(x,x3.mean)
  x1 x2 x3 x3.mean
1  A  1  1     1.5
2  B  1  2     2.0
3  C  1  3     3.5
4  A  2  4     4.0
5  B  2  5     5.5
6  C  2  6     6.0
7  A  1  2     1.5
8  B  2  6     5.5
9  C  1  4     3.5

However, I'd love to be able to do this with "apply" rather than a for-loop.  
Or is there a built-in function? Any suggestions?

Also, any way to avoid the hassles with having to convert to a data frame and 
then again to numeric when one variable is character?

Cheers,
Alan Cohen
