Converting to factors does not get all combinations; aggregate() still returns only the ones that occur in the data. Merging the means with the full grid of categories does:

> v3mean <- aggregate(V3~V1+V2, dat, mean)
> cats <- with(dat, expand.grid(V1=unique(V1), V2=unique(V2)))
> merge(cats, v3mean, all=TRUE)
   V1 V2        V3
1   C  0 0.5000000
2   C  1        NA
3   G  0 1.0000000
4   G  1        NA
5   I  0 0.3333333
6   I  1 0.4285714
7   O  0 1.0000000
8   O  1 0.0000000
9   R  0 0.0000000
10  R  1 0.6666667
11  T  0 0.8333333
12  T  1 0.5000000
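For what it's worth, here is a rough sketch of another base-R route to the same full set of combinations. It is not from the thread, just one possibility: tapply() builds the complete V1 x V2 table and leaves empty cells as NA, and as.data.frame() on that table puts it back into long form. The names lev, cell_means and all_cells are made up for illustration, and dat is the OP's data rebuilt so the snippet runs on its own.

```r
# The OP's data, rebuilt from the dput() output in the original post
lev <- c("C", "G", "I", "O", "R", "T")
dat <- data.frame(
  V1 = factor(lev[c(1, 4, 5, 3, 3, 5, 6, 6, 4, 3, 5, 6, 5, 5, 4, 4, 6, 2,
                    3, 4, 3, 3, 2, 5, 3, 6, 3, 3, 6, 3, 6, 1, 6, 5, 2, 2)],
              levels = lev),
  V2 = c(0, 0, 0, 1, 1, 1, 1, 0, 1, 0, 1, 0, 0, 1, 1, 1, 0, 0,
         1, 1, 1, 0, 0, 0, 1, 0, 0, 1, 0, 1, 0, 0, 1, 0, 0, 0),
  V3 = c(1, 1, 0, 1, 0, 1, 0, 1, 0, 0, 1, 1, 0, 0, 0, 0, 1, 1,
         1, 0, 0, 1, 1, 0, 0, 0, 0, 1, 1, 0, 1, 0, 1, 0, 1, 1)
)

# Mean of V3 in every V1 x V2 cell; cells with no observations stay NA
cell_means <- with(dat, tapply(V3, list(V1 = V1, V2 = V2), mean))

# Back to long form: one row per combination.
# Note that V2 comes back as a factor here, not an integer.
all_cells <- as.data.frame(as.table(cell_means), responseName = "V3")
all_cells
```

This gives one row for each of the 12 combinations, with NA for C/1 and G/1, the same cells as the merge() result but ordered with V1 varying fastest.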
But the OP's dat1 contains only 6 observations.

----------------------------------------------
David L Carlson
Associate Professor of Anthropology
Texas A&M University
College Station, TX 77843-4352

> -----Original Message-----
> From: r-help-boun...@r-project.org [mailto:r-help-bounces@r-project.org] On Behalf Of Sarah Goslee
> Sent: Thursday, December 06, 2012 2:04 PM
> To: Christofer Bogaso
> Cc: r-help
> Subject: Re: [R] Can somebody help me with following data manipulation?
>
> If I understand what you want correctly, aggregate() should do it.
>
> > aggregate(V3 ~ V1 + V2, "mean", data=dat)
>    V1 V2        V3
> 1   C  0 0.5000000
> 2   G  0 1.0000000
> 3   I  0 0.3333333
> 4   O  0 1.0000000
> 5   R  0 0.0000000
> 6   T  0 0.8333333
> 7   I  1 0.4285714
> 8   O  1 0.0000000
> 9   R  1 0.6666667
> 10  T  1 0.5000000
>
> That returns the combinations that actually exist.
>
> If you convert V1 and V2 to factors, thus setting the possible levels,
> all combinations will be returned:
> > dat$V1 <- factor(dat$V1)
> > dat$V2 <- factor(dat$V2)
> > aggregate(V3 ~ V1 + V2, "mean", data=dat)
>    V1 V2        V3
> 1   C  0 0.5000000
> 2   G  0 1.0000000
> 3   I  0 0.3333333
> 4   O  0 1.0000000
> 5   R  0 0.0000000
> 6   T  0 0.8333333
> 7   I  1 0.4285714
> 8   O  1 0.0000000
> 9   R  1 0.6666667
> 10  T  1 0.5000000
>
> Sarah
>
> On Thu, Dec 6, 2012 at 2:35 PM, Christofer Bogaso
> <bogaso.christo...@gmail.com> wrote:
> > Dear all, let say I have following data:
> >
> > dat <- structure(list(V1 = structure(c(1L, 4L, 5L, 3L, 3L, 5L, 6L, 6L,
> > 4L, 3L, 5L, 6L, 5L, 5L, 4L, 4L, 6L, 2L, 3L, 4L, 3L, 3L, 2L, 5L,
> > 3L, 6L, 3L, 3L, 6L, 3L, 6L, 1L, 6L, 5L, 2L, 2L), .Label = c("C",
> > "G", "I", "O", "R", "T"), class = "factor"), V2 = c(0L, 0L, 0L,
> > 1L, 1L, 1L, 1L, 0L, 1L, 0L, 1L, 0L, 0L, 1L, 1L, 1L, 0L, 0L, 1L,
> > 1L, 1L, 0L, 0L, 0L, 1L, 0L, 0L, 1L, 0L, 1L, 0L, 0L, 1L, 0L, 0L,
> > 0L), V3 = c(1L, 1L, 0L, 1L, 0L, 1L, 0L, 1L, 0L, 0L, 1L, 1L, 0L,
> > 0L, 0L, 0L, 1L, 1L, 1L, 0L, 0L, 1L, 1L, 0L, 0L, 0L, 0L, 1L, 1L,
> > 0L, 1L, 0L, 1L, 0L, 1L, 1L)), .Names = c("V1", "V2", "V3"), class =
> > "data.frame", row.names = c(NA,
> > -36L))
> >
> > Now I want to get following kind of data frame out of that:
> >
> > dat1 <- structure(list(V1 = structure(c(3L, 3L, 1L, 1L, 2L, 2L), .Label =
> > c("C",
> > "G", "I"), class = "factor"), V2 = c(0L, 1L, 0L, 1L, 0L, 1L),
> > V3 = c(0.333333333, 0.428571429, 0.5, NA, 1, NA)), .Names = c("V1",
> > "V2", "V3"), class = "data.frame", row.names = c(NA, -6L))
> >
> > Basically in 'dat1', the 3rd column is coming from: for 'V1 = I' & 'V2 = 0'
> > what is the percentage of '1' for "V3" and so on.....
> >
> > Is there any R function to achieve that directly?
> >
> > Thanks and regards,
> >
>
> ______________________________________________
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.