Re: [R] multivariate version of aggregate
Yes, I had a look at that function. From the documentation, however, it did not become clear to me how to split the data frame into subsets of rows based on an index argument. Something like:

    testframe <- data.frame(a = rnorm(100), b = rnorm(100))
    indices <- rep(c(1, 2), each = 50)
    results <- ddply(.data = testframe, INDICES = indices, .fun = function(x) corr(x[,1], x[,2]))

where the last command would yield the correlations between columns 1 and 2 for the first 50 and for the last 50 values.

Any ideas?

Jannis

In reply to: Greg Snow (27.06.2013 21:43)

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Re: [R] multivariate version of aggregate
Hello,

You can solve your problem using only base R, with no need for an external package. The two instructions below are two ways of doing the same thing.

    sapply(split(testframe, indices), function(x) cor(x[, 1], x[, 2]))
    as.vector(by(testframe, indices, function(x) cor(x[, 1], x[, 2])))

Hope this helps,

Rui Barradas

In reply to: Jannis (28-06-2013 09:31)
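The two base R calls above can be run end to end with the test data from the question. The following is only a minimal sketch: the seed value and the names `res_split`, `res_by` and `res_direct` are assumptions added for illustration and do not come from the thread.

```r
# Test data as described in the question; the seed is an arbitrary choice
set.seed(1)
testframe <- data.frame(a = rnorm(100), b = rnorm(100))
indices <- rep(c(1, 2), each = 50)

# split() cuts the data frame into one piece per index value;
# sapply() then computes the correlation of columns 1 and 2 within each piece
res_split <- sapply(split(testframe, indices), function(x) cor(x[, 1], x[, 2]))

# by() performs the split and the function call in a single step
res_by <- as.vector(by(testframe, indices, function(x) cor(x[, 1], x[, 2])))

# Direct computation on the first 50 rows, as a cross-check
res_direct <- cor(testframe$a[1:50], testframe$b[1:50])
```

Both approaches should return one correlation per group; `res_split` additionally carries the index values as names, which `as.vector()` drops from the `by()` result.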
Re: [R] multivariate version of aggregate
Thanks a lot to everybody who responded! My solution now looks similar to Rui's and David's suggestions.

Jannis

In reply to: Rui Barradas (28.06.2013 11:00)
Re: [R] multivariate version of aggregate
Hi,

    set.seed(45)
    testframe <- data.frame(a = rnorm(100), b = rnorm(100))
    indices <- rep(c(1, 2), each = 50)
    library(plyr)
    ddply(testframe, .(indices), summarize, Cor1 = cor(a, b))
    #   indices        Cor1
    # 1       1 0.002770524
    # 2       2 -0.10173

A.K.

In reply to: Jannis (Friday, June 28, 2013 4:31 AM)
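For comparison, a result of the same shape (one row per index value, with a `Cor1` column) can also be built without plyr. This is a sketch under assumptions, not part of A.K.'s answer: the names `pieces` and `result` are made up here, and no output is shown because it was not verified against the numbers above.

```r
# Same test data as in the plyr example above
set.seed(45)
testframe <- data.frame(a = rnorm(100), b = rnorm(100))
indices <- rep(c(1, 2), each = 50)

# One sub-data-frame per index value
pieces <- split(testframe, indices)

# Assemble a data frame with the group label and the within-group correlation
result <- data.frame(
  indices = as.numeric(names(pieces)),
  Cor1 = vapply(pieces, function(d) cor(d$a, d$b), numeric(1)),
  row.names = NULL
)
```

Using `vapply()` with `numeric(1)` makes the function fail loudly if a group ever returns something other than a single number.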
[R] multivariate version of aggregate
Dear List members,

I am seeking a multivariate version of aggregate. I want to compute, for example, the correlation between subsets of two vectors. In aggregate, I can only supply one vector with indices for subsets. Is there a ready-made function for this, or do I need to program my own?

Cheers
Jannis
Re: [R] multivariate version of aggregate
You can pass a matrix to by()

    set.seed(42)
    dat <- data.frame(x = runif(50) * 20, y = runif(50) * 20, g = rep(LETTERS[1:2], each = 25))
    as.vector(by(dat[, 1:2], dat$g, function(x) cor(x)[1, 2]))
    [1] -0.05643063  0.16465040

David L Carlson
Associate Professor of Anthropology
Texas A&M University
College Station, TX 77840-4352

In reply to: Jannis (Thursday, June 27, 2013 12:27 PM)
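The reason by() works here while aggregate() does not is that aggregate() applies FUN to each column separately, whereas by() hands FUN the whole sub-data-frame for each group. A small sketch to make that visible, reusing `dat` from above; the names `agg` and `by_res` are assumptions for illustration only.

```r
set.seed(42)
dat <- data.frame(x = runif(50) * 20, y = runif(50) * 20,
                  g = rep(LETTERS[1:2], each = 25))

# aggregate(): FUN receives a single column vector per group,
# so it never sees x and y together
agg <- aggregate(dat[, 1:2], by = list(g = dat$g),
                 FUN = function(v) is.data.frame(v))

# by(): FUN receives a data frame holding all selected columns of the group
by_res <- as.vector(by(dat[, 1:2], dat$g, is.data.frame))
```

Here `agg$x` and `agg$y` are FALSE for both groups, while `by_res` is TRUE for both, which is why a two-column statistic such as cor() needs by() or split().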
Re: [R] multivariate version of aggregate
Look at the plyr package, probably the ddply function in that package. You can write your own function to do whatever you want on the pieces of the split-apart object. Correlation between a specified pair of columns would be simple.

--
Gregory (Greg) L. Snow Ph.D.
538...@gmail.com

In reply to: Jannis (Thu, Jun 27, 2013 at 11:26 AM)
Re: [R] multivariate version of aggregate
Hello,

Or use ?sapply.

    sapply(split(dat[1:2], dat[3]), function(x) cor(x[1], x[2]))

Hope this helps,

Rui Barradas

In reply to: David Carlson (27-06-2013 20:22)
Re: [R] multivariate version of aggregate
Hi,

Maybe this also helps:

    library(data.table)
    dt1 <- data.table(dat)
    dt1[, cor(x, y), by = g]
    #    g          V1
    # 1: A -0.05643063
    # 2: B  0.16465040
    dt1[, cor(x, y), by = g]$V1
    # [1] -0.05643063  0.16465040

A.K.

In reply to: David Carlson (Thursday, June 27, 2013 3:22 PM)