[R] sum a particular column by group
Dear all,

I have a table like this:

    eds
      R.ID Region Gender  Agegr  Time nvisits
    1    1      A      F 60--64  1:00       1
    2    2      O      F 55--59  1:20       1
    3    3      O      F 55--59  3:45       3
    4    4      S      M 60--64  1:10       3
    5    5      W      F 55--59 12:30       1
    6    6      W      M 60--64  8:00       2

I got a bootstrap sample using the following code:

    r <- sample(eds[,1], replace=TRUE)
    r
    [1] 2 4 3 2 6 4
    beds <- eds[r,]
    beds
        R.ID Region Gender  Agegr Time nvisits
    2      2      O      F 55--59 1:20       1
    4      4      S      M 60--64 1:10       3
    3      3      O      F 55--59 3:45       3
    2.1    2      O      F 55--59 1:20       1
    6      6      W      M 60--64 8:00       2
    4.1    4      S      M 60--64 1:10       3

I want to sum the last column by columns 2, 3, and 4, including a 0 for groups that have no rows. I tried the following:

#1: only gets the frequency, not the sum of the last column.

    table <- as.data.frame(with(beds, table(beds[,2], beds[,3], beds[,4])))
    table
       Var1 Var2   Var3 Freq
    1     A    F 55--59    0
    2     O    F 55--59    3
    3     S    F 55--59    0
    4     W    F 55--59    0
    5     A    M 55--59    0
    6     O    M 55--59    0
    7     S    M 55--59    0
    8     W    M 55--59    0
    9     A    F 60--64    0
    10    O    F 60--64    0
    11    S    F 60--64    0
    12    W    F 60--64    0
    13    A    M 60--64    0
    14    O    M 60--64    0
    15    S    M 60--64    2
    16    W    M 60--64    1

#2: only gets the sum of the last column, but misses the groups with 0 counts.

    aggregate(beds[,6], list(beds[,2], beds[,3], beds[,4]), sum)
      Group.1 Group.2 Group.3 x
    1       O       F  55--59 5
    2       S       M  60--64 6
    3       W       M  60--64 2

In conclusion, the following is what I want:

       Var1 Var2   Var3 Freq
    1     A    F 55--59    0
    2     O    F 55--59    5
    3     S    F 55--59    0
    4     W    F 55--59    0
    5     A    M 55--59    0
    6     O    M 55--59    0
    7     S    M 55--59    0
    8     W    M 55--59    0
    9     A    F 60--64    0
    10    O    F 60--64    0
    11    S    F 60--64    0
    12    W    F 60--64    0
    13    A    M 60--64    0
    14    O    M 60--64    0
    15    S    M 60--64    6
    16    W    M 60--64    2

Does anyone know code to do this, or can anyone give a hint?

Thank you in advance.

Betty

[[alternative HTML version deleted]]
__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Re: [R] sum a particular column by group
Thanks for your help. Finally, I got it.

From: Dennis Murphy [mailto:djmu...@gmail.com]
Sent: Friday, February 05, 2010 12:20 PM
To: Fang (Betty) Yang
Cc: r-help@r-project.org
Subject: Re: [R] sum a particular column by group

Hi:

This is not an elegant solution by any means, but it gets what you want. Using the data frame from your bootstrap sample:

    # All combinations of the three factors
    xx <- with(beds, expand.grid(Region = levels(Region),
                                 Gender = levels(Gender),
                                 Agegr = levels(Agegr)))
    dim(xx)
    [1] 12  3   # differs from the 16, but bootstrapping probably explains it...

    # One way to get a summary (there are others...)
    library(plyr)
    yy <- ddply(beds, .(Region, Gender, Agegr), summarise, Nvisits = sum(nvisits))
    res <- merge(xx, yy, all.x = TRUE)
    res <- within(res, Nvisits[is.na(Nvisits)] <- 0)
    res
       Region Gender  Agegr Nvisits
    1       O      F 55--59       5
    2       O      F 60--64       0
    3       O      M 55--59       0
    4       O      M 60--64       0
    5       S      F 55--59       0
    6       S      F 60--64       0
    7       S      M 55--59       0
    8       S      M 60--64       6
    9       W      F 55--59       0
    10      W      F 60--64       0
    11      W      M 55--59       0
    12      W      M 60--64       2

HTH,
Dennis

On Fri, Feb 5, 2010 at 9:20 AM, Fang (Betty) Yang fang.y...@ualberta.ca wrote:
> Dear all, I have a table like this: [...]
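For comparison with the merge-based approach above, here is a minimal base R sketch using xtabs(), which sums a column over every combination of factor levels and returns 0 for empty combinations. The data frame below is a hand-typed reconstruction of the bootstrap sample from the thread, and the explicit factor levels (including Region "A") are an assumption made so that all 16 groups appear; the column name Nvisits follows the reply above.

```r
# Hypothetical reconstruction of the bootstrap sample 'beds' from the thread.
# Factor levels are set explicitly so that groups absent from the sample
# (for example Region "A") still appear in the result.
beds <- data.frame(
  Region  = factor(c("O", "S", "O", "O", "W", "S"),
                   levels = c("A", "O", "S", "W")),
  Gender  = factor(c("F", "M", "F", "F", "M", "M"), levels = c("F", "M")),
  Agegr   = factor(c("55--59", "60--64", "55--59", "55--59", "60--64", "60--64")),
  nvisits = c(1, 3, 3, 1, 2, 3)
)

# xtabs() sums the left-hand side over every combination of factor levels;
# combinations with no rows get 0.
res <- as.data.frame(xtabs(nvisits ~ Region + Gender + Agegr, data = beds),
                     responseName = "Nvisits")
res
```

This only fills in empty groups for levels the factors actually carry, so if a level has been dropped (for example by re-typing the data or calling droplevels()), it has to be restored with factor(..., levels = ...) first.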
[R] keep empty subsets using aggregate
Dear all,

I am struggling with a small problem. When I use aggregate, the empty subsets are removed. I need each empty subset to be 0. Any suggestions will be appreciated.

Code:

    edref = aggregate(rep(1, times=dim(eds)[1]), list(eds[,11], eds[,7], eds[,27]), sum)

Thanks in advance,

Betty
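Since the call above sums a vector of ones, it is effectively counting rows per group, so one possible sketch is to count with table() instead, which keeps empty combinations as 0. The data frame and its column names g1, g2, g3 below are hypothetical stand-ins for columns 11, 7, and 27 of eds, whose contents are not shown in the post.

```r
# Hypothetical stand-in for three grouping columns of 'eds'.
eds <- data.frame(
  g1 = factor(c("a", "a", "b"), levels = c("a", "b")),
  g2 = factor(c("x", "y", "x"), levels = c("x", "y")),
  g3 = factor(c("lo", "lo", "lo"), levels = c("lo", "hi"))
)

# table() counts rows in every combination of factor levels,
# including combinations that never occur (count 0).
edref <- as.data.frame(table(eds$g1, eds$g2, eds$g3))
edref
```

The result has one row per combination of levels, with the counts in the Freq column. Empty groups only appear for levels that the grouping factors actually carry.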
[R] problems with read.csv
Dear all,

I'd like to ask for help with R code to get the same results as the following Splus code:

    indata <- importData("/home/data_new.csv")
    indata[1:5,4]
    [1] 0930 1601 1006 1032 1020

I tried the following R code:

    indata <- read.csv("/home/data_new.csv")
    indata[1:5,4]
    [1]  930 1601 1006 1032 1020

I'd like the first one to be 0930, too.

Thanks in advance,

Betty
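The leading zero disappears because read.csv() converts the column to a number. A minimal sketch of one way around this is to read that column as character via the colClasses argument. The temporary file and the column names id and time are assumptions for illustration, since the layout of data_new.csv is not shown in the post.

```r
# Write a small hypothetical CSV to a temporary file; the column names are
# assumptions, since the layout of data_new.csv is not shown in the post.
path <- tempfile(fileext = ".csv")
writeLines(c("id,time", "1,0930", "2,1601", "3,1006"), path)

# Reading the column as character keeps the leading zero.
indata <- read.csv(path, colClasses = c(time = "character"))
indata$time

# If the values have already been read as integers, they can be re-padded
# to four digits for display.
padded <- sprintf("%04d", 930L)
padded
```

Reading as character is the safer of the two when the field is really a code or a clock time rather than a quantity, because no information is lost on input.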
[R] extract day or month as in Splus
Dear all,

I am writing to ask for help finding R code that does the same thing as the following Splus code:

    dates <- c("02/27/1992", "02/27/1992", "01/14/1992", "02/28/1992", "02/01/1992")
    timeDate(as.character(dates), in.format="%m/%d/%Y", "%a")
    [1] Thu Thu Tue Fri Sat

Could anyone give me some R code to get the same results as above (extracting the day of the week from dates), please?

Thanks in advance!

Betty
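One possible base R equivalent, sketched here rather than taken from the thread, is to parse the strings with as.Date() and then format them with "%a". The abbreviated names depend on the session locale, so they come out as Thu, Tue, and so on only in an English locale; the numeric weekday from as.POSIXlt() does not depend on locale.

```r
dates <- c("02/27/1992", "02/27/1992", "01/14/1992", "02/28/1992", "02/01/1992")

# Parse the month/day/year strings into Date objects.
d <- as.Date(dates, format = "%m/%d/%Y")

# Abbreviated weekday names (locale-dependent).
day_abbr <- format(d, "%a")
day_abbr

# Month can be extracted the same way, here as a two-digit string.
month_num <- format(d, "%m")
month_num

# Locale-independent weekday number: 0 = Sunday, ..., 6 = Saturday.
wday_num <- as.POSIXlt(d)$wday
wday_num
```

weekdays(d, abbreviate = TRUE) and months(d) are alternatives that return the same kind of locale-dependent names.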