[R] merging with aggregating
Dear List, I have two data.frame of the following form: A: n V1 V2 1 12 0 2 10 8 3 3 8 4 8 4 6 7 3 7 12 0 8 1 0 9 18 0 10 1 0 13 2 0 B: n V1 V2 1 0 2 2 0 3 3 1 9 4 12 8 5 2 9 6 2 9 8 2 0 10 4 1 11 7 1 12 0 1 Now I want to merge those frame to one data.frame with summing up the columns V1 and V2 but not the column n. So the result in this example would be: AB: n V1 V2 1 12 2 2 10 11 3 4 17 4 20 12 5 2 9 6 9 12 7 12 0 8 3 0 9 18 0 10 5 1 11 7 1 12 0 1 13 2 0 So Columns V1 and V2 are the sum of A und B while n has its old value. Notice that there are different rows in n of A and B. I don't have a clue how to start here. Any hint is welcome. Thanks Dubravko Dolic Munich Germany __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] merging with aggregating
On Tue, 2005-12-06 at 14:22 +0100, Dubravko Dolic wrote: Dear List, I have two data.frame of the following form: A: n V1 V2 1 12 0 2 10 8 3 3 8 4 8 4 6 7 3 7 12 0 8 1 0 9 18 0 10 1 0 13 2 0 B: n V1 V2 1 0 2 2 0 3 3 1 9 4 12 8 5 2 9 6 2 9 8 2 0 10 4 1 11 7 1 12 0 1 Now I want to merge those frame to one data.frame with summing up the columns V1 and V2 but not the column n. So the result in this example would be: AB: n V1 V2 1 12 2 2 10 11 3 4 17 4 20 12 5 2 9 6 9 12 7 12 0 8 3 0 9 18 0 10 5 1 11 7 1 12 0 1 13 2 0 So Columns V1 and V2 are the sum of A und B while n has its old value. Notice that there are different rows in n of A and B. I don't have a clue how to start here. Any hint is welcome. Thanks There might be a somewhat easier way, but here is one approach: # Use merge() to join A and B on 'n' # Set all = TRUE to include non-matched rows C - merge(A, B, by = n, all = TRUE) C n V1.x V2.x V1.y V2.y 1 1 12002 2 2 10803 3 33819 4 484 128 5 5 NA NA29 6 67329 7 7 120 NA NA 8 81020 9 9 180 NA NA 10 101041 11 11 NA NA71 12 12 NA NA01 13 1320 NA NA # Now get the rowSums() for the V1/V2 column pairs # and create a new dataframe from the # results AB - data.frame(n = C$n, V1 = rowSums(C[, c(2, 4)], na.rm = TRUE), V2 = rowSums(C[, c(3, 5)], na.rm = TRUE)) AB n V1 V2 1 1 12 2 2 2 10 11 3 3 4 17 4 4 20 12 5 5 2 9 6 6 9 12 7 7 12 0 8 8 3 0 9 9 18 0 10 10 5 1 11 11 7 1 12 12 0 1 13 13 2 0 See ?merge and ?rowSums for more information. HTH, Marc Schwartz __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] merging with aggregating
Hi all, the moment you hit the 'send' button you know the answer... I approached a solution similar to this one given by Marc. But maybe there is a better one? Even because this operation is done in a for-loop during which R gets new data from a database. So I sum up 16 data.frames eventually. Dubro -Ursprüngliche Nachricht- Von: Marc Schwartz [mailto:[EMAIL PROTECTED] Gesendet: Dienstag, 6. Dezember 2005 15:11 An: Dubravko Dolic Cc: r-help@stat.math.ethz.ch Betreff: Re: [R] merging with aggregating On Tue, 2005-12-06 at 14:22 +0100, Dubravko Dolic wrote: Dear List, I have two data.frame of the following form: A: n V1 V2 1 12 0 2 10 8 3 3 8 4 8 4 6 7 3 7 12 0 8 1 0 9 18 0 10 1 0 13 2 0 B: n V1 V2 1 0 2 2 0 3 3 1 9 4 12 8 5 2 9 6 2 9 8 2 0 10 4 1 11 7 1 12 0 1 Now I want to merge those frame to one data.frame with summing up the columns V1 and V2 but not the column n. So the result in this example would be: AB: n V1 V2 1 12 2 2 10 11 3 4 17 4 20 12 5 2 9 6 9 12 7 12 0 8 3 0 9 18 0 10 5 1 11 7 1 12 0 1 13 2 0 So Columns V1 and V2 are the sum of A und B while n has its old value. Notice that there are different rows in n of A and B. I don't have a clue how to start here. Any hint is welcome. Thanks There might be a somewhat easier way, but here is one approach: # Use merge() to join A and B on 'n' # Set all = TRUE to include non-matched rows C - merge(A, B, by = n, all = TRUE) C n V1.x V2.x V1.y V2.y 1 1 12002 2 2 10803 3 33819 4 484 128 5 5 NA NA29 6 67329 7 7 120 NA NA 8 81020 9 9 180 NA NA 10 101041 11 11 NA NA71 12 12 NA NA01 13 1320 NA NA # Now get the rowSums() for the V1/V2 column pairs # and create a new dataframe from the # results AB - data.frame(n = C$n, V1 = rowSums(C[, c(2, 4)], na.rm = TRUE), V2 = rowSums(C[, c(3, 5)], na.rm = TRUE)) AB n V1 V2 1 1 12 2 2 2 10 11 3 3 4 17 4 4 20 12 5 5 2 9 6 6 9 12 7 7 12 0 8 8 3 0 9 9 18 0 10 10 5 1 11 11 7 1 12 12 0 1 13 13 2 0 See ?merge and ?rowSums for more information. HTH, Marc Schwartz __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] merging with aggregating
m1 - cbind( n=c(1,2,3,4,6,7,8,9,10,13), v1=c(12,10,3,8,7,12,1,18,1,2), v2=c(0,8,8,4,3,0,0,0,0,0) ) m2 - cbind( n=c(1,2,3,4,5,6,8,10,11,12), v1=c(0,0,1,12,2,2,2,4,7,0), v2=c(2,3,9,8,9,9,0,1,1,1) ) m.all - merge(m1, m2, by=n, all=T) n v1.x v2.x v1.y v2.y 1 1 12002 2 2 10803 3 33819 4 484 128 5 5 NA NA29 6 67329 7 7 120 NA NA 8 81020 9 9 180 NA NA 10 101041 11 11 NA NA71 12 12 NA NA01 13 1320 NA NA Then depending on how many such columns there are, you have a number of ways of aggregating this dataset. One such way is cbind( n=m.all[ , n], v1=rowSums( m.all[ , grep( ^v1, colnames(m.all) ) ], na.rm=T ), v2=rowSums( m.all[ , grep( ^v2, colnames(m.all) )], na.rm=T ) ) n v1 v2 1 1 12 2 2 2 10 11 3 3 4 17 4 4 20 12 5 5 2 9 6 6 9 12 7 7 12 0 8 8 3 0 9 9 18 0 10 10 5 1 11 11 7 1 12 12 0 1 13 13 2 0 Regards, Adai On Tue, 2005-12-06 at 14:22 +0100, Dubravko Dolic wrote: Dear List, I have two data.frame of the following form: A: n V1 V2 1 12 0 2 10 8 3 3 8 4 8 4 6 7 3 7 12 0 8 1 0 9 18 0 10 1 0 13 2 0 B: n V1 V2 1 0 2 2 0 3 3 1 9 4 12 8 5 2 9 6 2 9 8 2 0 10 4 1 11 7 1 12 0 1 Now I want to merge those frame to one data.frame with summing up the columns V1 and V2 but not the column n. So the result in this example would be: AB: n V1 V2 1 12 2 2 10 11 3 4 17 4 20 12 5 2 9 6 9 12 7 12 0 8 3 0 9 18 0 10 5 1 11 7 1 12 0 1 13 2 0 So Columns V1 and V2 are the sum of A und B while n has its old value. Notice that there are different rows in n of A and B. I don't have a clue how to start here. Any hint is welcome. Thanks Dubravko Dolic Munich Germany __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] merging with aggregating
On Tue, 2005-12-06 at 15:19 +0100, Dubravko Dolic wrote: Hi all, the moment you hit the 'send' button you know the answer... I approached a solution similar to this one given by Marc. But maybe there is a better one? Even because this operation is done in a for-loop during which R gets new data from a database. So I sum up 16 data.frames eventually. Dubro SNIP OKso here is one possible approach to a more generic solution: # Preallocate a list with 16 elements DF.List - replicate(16, list(numeric(0))) DF.List looks like: head(DF.List) [[1]] numeric(0) [[2]] numeric(0) [[3]] numeric(0) [[4]] numeric(0) ... # Do your loop here, placing the actual results # of your queries into DF.List[[i]]. I am just using # random samples here for the example. # NOTE: I am making the assumption in this example # that each resultant DF will have the same structure. for (i in 1:16) { DF.List[[i]] - data.frame(n = sample(20, 10), V1 = sample(20, 10), V2 = sample(0:10, 10)) } # Now rbind() the data frames together DF.All - do.call(rbind, DF.List) # Now do use aggregate() to get the sums of V1 and V2 # by 'n'. DF.Sums - aggregate(DF.All[, c(V1, V2)], list(n = DF.All$n), sum) DF.Sums n V1 V2 1 1 161 65 2 2 86 67 3 3 72 28 4 4 59 31 5 5 101 48 6 6 68 41 7 7 75 34 8 8 73 30 9 9 59 26 10 10 80 16 11 11 127 44 12 12 111 78 13 13 111 38 14 14 69 28 15 15 71 26 16 16 90 51 17 17 50 36 18 18 48 41 19 19 92 38 20 20 71 22 Does that get closer to what you need? HTH, Marc Schwartz __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] merging with aggregating
Here's a solution that uses aggregate(), as suggested in the subject of this thread. m1 - cbind( n=c(1,2,3,4,6,7,8,9,10,13), v1=c(12,10,3,8,7,12,1,18,1,2), v2=c(0,8,8,4,3,0,0,0,0,0) ) m2 - cbind( n=c(1,2,3,4,5,6,8,10,11,12), v1=c(0,0,1,12,2,2,2,4,7,0), v2=c(2,3,9,8,9,9,0,1,1,1) ) tt - as.data.frame(rbind(m1,m2)) aggregate(list(v1=tt$v1,v2=tt$v2),by=list(n=tt$n),sum) n v1 v2 1 1 12 2 2 2 10 11 3 3 4 17 4 4 20 12 5 5 2 9 6 6 9 12 7 7 12 0 8 8 3 0 9 9 18 0 10 10 5 1 11 11 7 1 12 12 0 1 13 13 2 0 Cheers, Pierre Adaikalavan Ramasamy offered the following remark on 12/06/05 04:40... m1 - cbind( n=c(1,2,3,4,6,7,8,9,10,13), v1=c(12,10,3,8,7,12,1,18,1,2), v2=c(0,8,8,4,3,0,0,0,0,0) ) m2 - cbind( n=c(1,2,3,4,5,6,8,10,11,12), v1=c(0,0,1,12,2,2,2,4,7,0), v2=c(2,3,9,8,9,9,0,1,1,1) ) m.all - merge(m1, m2, by=n, all=T) n v1.x v2.x v1.y v2.y 1 1 12002 2 2 10803 3 33819 4 484 128 5 5 NA NA29 6 67329 7 7 120 NA NA 8 81020 9 9 180 NA NA 10 101041 11 11 NA NA71 12 12 NA NA01 13 1320 NA NA Then depending on how many such columns there are, you have a number of ways of aggregating this dataset. One such way is cbind( n=m.all[ , n], v1=rowSums( m.all[ , grep( ^v1, colnames(m.all) ) ], na.rm=T ), v2=rowSums( m.all[ , grep( ^v2, colnames(m.all) )], na.rm=T ) ) n v1 v2 1 1 12 2 2 2 10 11 3 3 4 17 4 4 20 12 5 5 2 9 6 6 9 12 7 7 12 0 8 8 3 0 9 9 18 0 10 10 5 1 11 11 7 1 12 12 0 1 13 13 2 0 Regards, Adai On Tue, 2005-12-06 at 14:22 +0100, Dubravko Dolic wrote: Dear List, I have two data.frame of the following form: A: n V1 V2 1 12 0 2 10 8 3 3 8 4 8 4 6 7 3 7 12 0 8 1 0 9 18 0 10 1 0 13 2 0 B: n V1 V2 1 0 2 2 0 3 3 1 9 4 12 8 5 2 9 6 2 9 8 2 0 10 4 1 11 7 1 12 0 1 Now I want to merge those frame to one data.frame with summing up the columns V1 and V2 but not the column n. So the result in this example would be: AB: n V1 V2 1 12 2 2 10 11 3 4 17 4 20 12 5 2 9 6 9 12 7 12 0 8 3 0 9 18 0 10 5 1 11 7 1 12 0 1 13 2 0 So Columns V1 and V2 are the sum of A und B while n has its old value. Notice that there are different rows in n of A and B. I don't have a clue how to start here. Any hint is welcome. Thanks Dubravko Dolic Munich Germany __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html -- - Pierre Kleiber, Ph.D Email: [EMAIL PROTECTED] Fishery BiologistTel: 808 983-5399 / (hm)808 737-7544 NOAA Fisheries Service - Honolulu LaboratoryFax: 808 983-2902 2570 Dole St., Honolulu, HI 96822-2396 - God could have told Moses about galaxies and mitochondria and all. But behold... It was good enough for government work. __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html