[R] merging with aggregating

2005-12-06 Thread Dubravko Dolic
Dear List,

I have two data.frame of the following form:

A:

n  V1 V2
1  12  0 
2  10  8
3   3  8 
4   8  4
6   7  3  
7  12  0 
8   1  0 
9  18  0 
10  1  0
13  2  0

B:

n  V1 V2
1   0  2
2   0  3
3   1  9
4  12  8 
5   2  9
6   2  9
8   2  0
10  4  1
11  7  1
12  0  1


Now I want to merge those frame to one data.frame with summing up the
columns V1 and V2 but not the column n. So the result in this example
would be:

AB:

n  V1 V2
1  12  2
2  10 11 
3   4 17
4  20 12
5   2  9
6   9 12
7  12  0
8   3  0
9  18  0
10  5  1
11  7  1
12  0  1
13  2  0 


So Columns V1 and V2 are the sum of A und B while n has its old value.
Notice that there are different rows in n of A and B.

I don't have a clue how to start here. Any hint is welcome.

Thanks

Dubravko Dolic
Munich
Germany

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] merging with aggregating

2005-12-06 Thread Marc Schwartz
On Tue, 2005-12-06 at 14:22 +0100, Dubravko Dolic wrote:
 Dear List,
 
 I have two data.frame of the following form:
 
 A:
 
 n  V1 V2
 1  12  0 
 2  10  8
 3   3  8 
 4   8  4
 6   7  3  
 7  12  0 
 8   1  0 
 9  18  0 
 10  1  0
 13  2  0
 
 B:
 
 n  V1 V2
 1   0  2
 2   0  3
 3   1  9
 4  12  8 
 5   2  9
 6   2  9
 8   2  0
 10  4  1
 11  7  1
 12  0  1
 
 
 Now I want to merge those frame to one data.frame with summing up the
 columns V1 and V2 but not the column n. So the result in this example
 would be:
 
 AB:
 
 n  V1 V2
 1  12  2
 2  10 11 
 3   4 17
 4  20 12
 5   2  9
 6   9 12
 7  12  0
 8   3  0
 9  18  0
 10  5  1
 11  7  1
 12  0  1
 13  2  0 
 
 
 So Columns V1 and V2 are the sum of A und B while n has its old value.
 Notice that there are different rows in n of A and B.
 
 I don't have a clue how to start here. Any hint is welcome.
 
 Thanks

There might be a somewhat easier way, but here is one approach:

# Use merge() to join A and B on 'n'
# Set all = TRUE to include non-matched rows

 C - merge(A, B, by = n, all = TRUE)

 C
n V1.x V2.x V1.y V2.y
1   1   12002
2   2   10803
3   33819
4   484   128
5   5   NA   NA29
6   67329
7   7   120   NA   NA
8   81020
9   9   180   NA   NA
10 101041
11 11   NA   NA71
12 12   NA   NA01
13 1320   NA   NA


# Now get the rowSums() for the V1/V2 column pairs
# and create a new dataframe from the 
# results

 AB - data.frame(n = C$n, 
   V1 = rowSums(C[, c(2, 4)], na.rm = TRUE), 
   V2 = rowSums(C[, c(3, 5)], na.rm = TRUE))


 AB
n V1 V2
1   1 12  2
2   2 10 11
3   3  4 17
4   4 20 12
5   5  2  9
6   6  9 12
7   7 12  0
8   8  3  0
9   9 18  0
10 10  5  1
11 11  7  1
12 12  0  1
13 13  2  0


See ?merge and ?rowSums for more information.

HTH,

Marc Schwartz

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] merging with aggregating

2005-12-06 Thread Dubravko Dolic
Hi all,

the moment you hit the 'send' button you know the answer...

I approached a solution similar to this one given by Marc. But maybe there is a 
better one? Even because this operation is done in a for-loop during which R 
gets new data from a database. So I sum up 16 data.frames eventually.

Dubro


-Ursprüngliche Nachricht-
Von: Marc Schwartz [mailto:[EMAIL PROTECTED] 
Gesendet: Dienstag, 6. Dezember 2005 15:11
An: Dubravko Dolic
Cc: r-help@stat.math.ethz.ch
Betreff: Re: [R] merging with aggregating

On Tue, 2005-12-06 at 14:22 +0100, Dubravko Dolic wrote:
 Dear List,
 
 I have two data.frame of the following form:
 
 A:
 
 n  V1 V2
 1  12  0 
 2  10  8
 3   3  8 
 4   8  4
 6   7  3  
 7  12  0 
 8   1  0 
 9  18  0 
 10  1  0
 13  2  0
 
 B:
 
 n  V1 V2
 1   0  2
 2   0  3
 3   1  9
 4  12  8 
 5   2  9
 6   2  9
 8   2  0
 10  4  1
 11  7  1
 12  0  1
 
 
 Now I want to merge those frame to one data.frame with summing up the
 columns V1 and V2 but not the column n. So the result in this example
 would be:
 
 AB:
 
 n  V1 V2
 1  12  2
 2  10 11 
 3   4 17
 4  20 12
 5   2  9
 6   9 12
 7  12  0
 8   3  0
 9  18  0
 10  5  1
 11  7  1
 12  0  1
 13  2  0 
 
 
 So Columns V1 and V2 are the sum of A und B while n has its old value.
 Notice that there are different rows in n of A and B.
 
 I don't have a clue how to start here. Any hint is welcome.
 
 Thanks

There might be a somewhat easier way, but here is one approach:

# Use merge() to join A and B on 'n'
# Set all = TRUE to include non-matched rows

 C - merge(A, B, by = n, all = TRUE)

 C
n V1.x V2.x V1.y V2.y
1   1   12002
2   2   10803
3   33819
4   484   128
5   5   NA   NA29
6   67329
7   7   120   NA   NA
8   81020
9   9   180   NA   NA
10 101041
11 11   NA   NA71
12 12   NA   NA01
13 1320   NA   NA


# Now get the rowSums() for the V1/V2 column pairs
# and create a new dataframe from the 
# results

 AB - data.frame(n = C$n, 
   V1 = rowSums(C[, c(2, 4)], na.rm = TRUE), 
   V2 = rowSums(C[, c(3, 5)], na.rm = TRUE))


 AB
n V1 V2
1   1 12  2
2   2 10 11
3   3  4 17
4   4 20 12
5   5  2  9
6   6  9 12
7   7 12  0
8   8  3  0
9   9 18  0
10 10  5  1
11 11  7  1
12 12  0  1
13 13  2  0


See ?merge and ?rowSums for more information.

HTH,

Marc Schwartz

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] merging with aggregating

2005-12-06 Thread Adaikalavan Ramasamy

m1 - cbind(  n=c(1,2,3,4,6,7,8,9,10,13), v1=c(12,10,3,8,7,12,1,18,1,2),
 v2=c(0,8,8,4,3,0,0,0,0,0) )

m2 - cbind(  n=c(1,2,3,4,5,6,8,10,11,12), v1=c(0,0,1,12,2,2,2,4,7,0),
 v2=c(2,3,9,8,9,9,0,1,1,1) )

m.all - merge(m1, m2, by=n, all=T)

n v1.x v2.x v1.y v2.y
1   1   12002
2   2   10803
3   33819
4   484   128
5   5   NA   NA29
6   67329
7   7   120   NA   NA
8   81020
9   9   180   NA   NA
10 101041
11 11   NA   NA71
12 12   NA   NA01
13 1320   NA   NA

Then depending on how many such columns there are, you have a number of
ways of aggregating this dataset. One such way is

cbind( n=m.all[ , n], 
  v1=rowSums( m.all[ , grep( ^v1, colnames(m.all) )  ], na.rm=T ),
  v2=rowSums( m.all[ , grep( ^v2, colnames(m.all) )], na.rm=T ) )

n v1 v2
1   1 12  2
2   2 10 11
3   3  4 17
4   4 20 12
5   5  2  9
6   6  9 12
7   7 12  0
8   8  3  0
9   9 18  0
10 10  5  1
11 11  7  1
12 12  0  1
13 13  2  0

Regards, Adai


On Tue, 2005-12-06 at 14:22 +0100, Dubravko Dolic wrote:
 Dear List,
 
 I have two data.frame of the following form:
 
 A:
 
 n  V1 V2
 1  12  0 
 2  10  8
 3   3  8 
 4   8  4
 6   7  3  
 7  12  0 
 8   1  0 
 9  18  0 
 10  1  0
 13  2  0
 
 B:
 
 n  V1 V2
 1   0  2
 2   0  3
 3   1  9
 4  12  8 
 5   2  9
 6   2  9
 8   2  0
 10  4  1
 11  7  1
 12  0  1
 
 
 Now I want to merge those frame to one data.frame with summing up the
 columns V1 and V2 but not the column n. So the result in this example
 would be:
 
 AB:
 
 n  V1 V2
 1  12  2
 2  10 11 
 3   4 17
 4  20 12
 5   2  9
 6   9 12
 7  12  0
 8   3  0
 9  18  0
 10  5  1
 11  7  1
 12  0  1
 13  2  0 
 
 
 So Columns V1 and V2 are the sum of A und B while n has its old value.
 Notice that there are different rows in n of A and B.
 
 I don't have a clue how to start here. Any hint is welcome.
 
 Thanks
 
 Dubravko Dolic
 Munich
 Germany
 
 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] merging with aggregating

2005-12-06 Thread Marc Schwartz (via MN)
On Tue, 2005-12-06 at 15:19 +0100, Dubravko Dolic wrote:
 Hi all,
 
 the moment you hit the 'send' button you know the answer...
 
 I approached a solution similar to this one given by Marc. But maybe
 there is a better one? Even because this operation is done in a
 for-loop during which R gets new data from a database. So I sum up 16
 data.frames eventually.
 
 Dubro

SNIP


OKso here is one possible approach to a more generic solution:


# Preallocate a list with 16 elements

DF.List - replicate(16, list(numeric(0)))


DF.List looks like:

 head(DF.List)
[[1]]
numeric(0)

[[2]]
numeric(0)

[[3]]
numeric(0)

[[4]]
numeric(0)

...


# Do your loop here, placing the actual results
# of your queries into DF.List[[i]]. I am just using
# random samples here for the example.
# NOTE: I am making the assumption in this example
# that each resultant DF will have the same structure.

for (i in 1:16)
{
  DF.List[[i]] - data.frame(n = sample(20, 10),
 V1 = sample(20, 10),
 V2 = sample(0:10, 10))
}


# Now rbind() the data frames together
DF.All - do.call(rbind, DF.List)

# Now do use aggregate() to get the sums of V1 and V2
# by 'n'.
DF.Sums - aggregate(DF.All[, c(V1, V2)], list(n = DF.All$n), sum)


 DF.Sums
n  V1 V2
1   1 161 65
2   2  86 67
3   3  72 28
4   4  59 31
5   5 101 48
6   6  68 41
7   7  75 34
8   8  73 30
9   9  59 26
10 10  80 16
11 11 127 44
12 12 111 78
13 13 111 38
14 14  69 28
15 15  71 26
16 16  90 51
17 17  50 36
18 18  48 41
19 19  92 38
20 20  71 22


Does that get closer to what you need?

HTH,

Marc Schwartz

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] merging with aggregating

2005-12-06 Thread Pierre Kleiber
Here's a solution that uses aggregate(), as suggested in the subject of this 
thread.

  m1 - cbind(  n=c(1,2,3,4,6,7,8,9,10,13), v1=c(12,10,3,8,7,12,1,18,1,2),
   v2=c(0,8,8,4,3,0,0,0,0,0) )
 
  m2 - cbind(  n=c(1,2,3,4,5,6,8,10,11,12), v1=c(0,0,1,12,2,2,2,4,7,0),
   v2=c(2,3,9,8,9,9,0,1,1,1) )
  tt - as.data.frame(rbind(m1,m2))
  aggregate(list(v1=tt$v1,v2=tt$v2),by=list(n=tt$n),sum)
 n v1 v2
1   1 12  2
2   2 10 11
3   3  4 17
4   4 20 12
5   5  2  9
6   6  9 12
7   7 12  0
8   8  3  0
9   9 18  0
10 10  5  1
11 11  7  1
12 12  0  1
13 13  2  0

Cheers, Pierre


Adaikalavan Ramasamy offered the following remark on 12/06/05 04:40...
 m1 - cbind(  n=c(1,2,3,4,6,7,8,9,10,13), v1=c(12,10,3,8,7,12,1,18,1,2),
  v2=c(0,8,8,4,3,0,0,0,0,0) )
 
 m2 - cbind(  n=c(1,2,3,4,5,6,8,10,11,12), v1=c(0,0,1,12,2,2,2,4,7,0),
  v2=c(2,3,9,8,9,9,0,1,1,1) )
 
 m.all - merge(m1, m2, by=n, all=T)
 
   n v1.x v2.x v1.y v2.y
   1   1   12002
   2   2   10803
   3   33819
   4   484   128
   5   5   NA   NA29
   6   67329
   7   7   120   NA   NA
   8   81020
   9   9   180   NA   NA
   10 101041
   11 11   NA   NA71
   12 12   NA   NA01
   13 1320   NA   NA
 
 Then depending on how many such columns there are, you have a number of
 ways of aggregating this dataset. One such way is
 
 cbind( n=m.all[ , n], 
   v1=rowSums( m.all[ , grep( ^v1, colnames(m.all) )  ], na.rm=T ),
   v2=rowSums( m.all[ , grep( ^v2, colnames(m.all) )], na.rm=T ) )
 
   n v1 v2
   1   1 12  2
   2   2 10 11
   3   3  4 17
   4   4 20 12
   5   5  2  9
   6   6  9 12
   7   7 12  0
   8   8  3  0
   9   9 18  0
   10 10  5  1
   11 11  7  1
   12 12  0  1
   13 13  2  0
 
 Regards, Adai
 
 
 On Tue, 2005-12-06 at 14:22 +0100, Dubravko Dolic wrote:
 
Dear List,

I have two data.frame of the following form:

A:

n  V1 V2
1  12  0 
2  10  8
3   3  8 
4   8  4
6   7  3  
7  12  0 
8   1  0 
9  18  0 
10  1  0
13  2  0

B:

n  V1 V2
1   0  2
2   0  3
3   1  9
4  12  8 
5   2  9
6   2  9
8   2  0
10  4  1
11  7  1
12  0  1


Now I want to merge those frame to one data.frame with summing up the
columns V1 and V2 but not the column n. So the result in this example
would be:

AB:

n  V1 V2
1  12  2
2  10 11 
3   4 17
4  20 12
5   2  9
6   9 12
7  12  0
8   3  0
9  18  0
10  5  1
11  7  1
12  0  1
13  2  0 


So Columns V1 and V2 are the sum of A und B while n has its old value.
Notice that there are different rows in n of A and B.

I don't have a clue how to start here. Any hint is welcome.

Thanks

Dubravko Dolic
Munich
Germany

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html

 
 
 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
 

-- 
-
Pierre Kleiber, Ph.D   Email: [EMAIL PROTECTED]
Fishery BiologistTel: 808 983-5399 / (hm)808 737-7544
NOAA Fisheries Service - Honolulu LaboratoryFax: 808 983-2902
2570 Dole St., Honolulu, HI 96822-2396
-
  God could have told Moses about galaxies and mitochondria and
   all.  But behold... It was good enough for government work.

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html