Re: [R] Group by in R
On Mon, Apr 13, 2009 at 8:56 AM, Nick Angelou wrote: > >> data > X1 X2 X3 X4 > 1 1 2 2 1 > 2 1 1 2 2 > 3 1 1 2 2 > 4 2 2 1 2 > 5 1 1 2 2 > 6 2 2 1 2 > 7 1 1 2 1 > 8 2 2 1 2 > 9 1 2 1 1 > 10 1 1 2 2 > > sqldf("select X1, X2, X3, X4, count(*) CNT from data group by X1, X2, X3, X4 > ORDER BY X4, X1, X2, X3") > > X1 X2 X3 X4 CNT > 1 1 1 2 1 1 > 2 1 2 1 1 1 > 3 1 2 2 1 1 > 4 1 1 2 2 4 > 5 2 2 1 2 3 > > The counts are fine, though it's not exactly what I need. I need a kind of > contingency table: > > | levels of X4 | > --- > unique triplets of X1:X3 | 1 | 2 | > > - > 1 1 1 | 0 0 > 1 1 2 | 1 4 > 1 2 1 | 1 0 > 1 2 2 | 1 0 > 2 1 1 | 0 0 > 2 1 2 | 0 0 > 2 2 1 | 0 3 > 2 2 2 | 0 0 > > > So the final result should be a table structure like: > > > 0 0 > 1 4 > 1 0 > 1 0 > 0 0 > 0 0 > 0 3 > 0 0 > I propose this way to get the numbers you want. I create a new variable to represent the values of the three then make a table: md <- matrix(c(1,2,2,1,1,1,2,2,1,1,2,2,2,2,1,2,1,1,2,2,2,2,1,2,1,1,2,1,2,2,1,2,1,2,1,1,1,1,2,2),ncol=4) dat <- as.data.frame(md) names(dat)<- c("x1","x2","x3","x4") newvar <- factor(paste(dat$x1,dat$x2,dat$x3,sep="-")) table(newvar, dat$x4) Behold: > table(newvar, dat$x4) newvar 1 2 1-1-1 1 0 1-2-1 1 0 1-2-2 1 3 2-1-1 1 0 2-1-2 1 0 2-2-1 1 0 2-2-2 0 1 -- Paul E. Johnson Professor, Political Science 1541 Lilac Lane, Room 504 University of Kansas __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Group by in R
Assuming DF is your data frame try this: ftable(DF) In SQL you can get close with: sqldf("select X1, X2, X3, sum(X4 == 1) `X4=1`, sum(X4 == 2) `X4=2` from DF group by X1, X2, X3 order by X1, X2, X3") On Mon, Apr 13, 2009 at 9:56 AM, Nick Angelou wrote: > > > Gabor Grothendieck wrote: >> >> SQL has the order by clause. >> > > Gabor, thanks for the suggestion. I thought about this but ORDER BY cannot > create the tabular structure that I need. Here is more detail about my > setting: > > f1, f2, f3 have unique triplets (each repeating a different number of > times). Each of these triplets falls into one of the two categories of f4. > Here is a sample: > >> data > X1 X2 X3 X4 > 1 1 2 2 1 > 2 1 1 2 2 > 3 1 1 2 2 > 4 2 2 1 2 > 5 1 1 2 2 > 6 2 2 1 2 > 7 1 1 2 1 > 8 2 2 1 2 > 9 1 2 1 1 > 10 1 1 2 2 > > sqldf("select X1, X2, X3, X4, count(*) CNT from data group by X1, X2, X3, X4 > ORDER BY X4, X1, X2, X3") > > X1 X2 X3 X4 CNT > 1 1 1 2 1 1 > 2 1 2 1 1 1 > 3 1 2 2 1 1 > 4 1 1 2 2 4 > 5 2 2 1 2 3 > > The counts are fine, though it's not exactly what I need. I need a kind of > contingency table: > > | levels of X4 | > --- > unique triplets of X1:X3 | 1 | 2 | > > - > 1 1 1 | 0 0 > 1 1 2 | 1 4 > 1 2 1 | 1 0 > 1 2 2 | 1 0 > 2 1 1 | 0 0 > 2 1 2 | 0 0 > 2 2 1 | 0 3 > 2 2 2 | 0 0 > > > So the final result should be a table structure like: > > > 0 0 > 1 4 > 1 0 > 1 0 > 0 0 > 0 0 > 0 3 > 0 0 > > I guess I could probably do this in SQL with a combination of OUTER JOINs > but I thought > that R might have a more elegant solution based on "factor" and "table". > > Thanks, > Nick > -- > View this message in context: > http://www.nabble.com/Group-by-in-R-tp23020587p23022717.html > Sent from the R help mailing list archive at Nabble.com. > > __ > R-help@r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Group by in R
Gabor Grothendieck wrote: > > SQL has the order by clause. > Gabor, thanks for the suggestion. I thought about this but ORDER BY cannot create the tabular structure that I need. Here is more detail about my setting: f1, f2, f3 have unique triplets (each repeating a different number of times). Each of these triplets falls into one of the two categories of f4. Here is a sample: > data X1 X2 X3 X4 1 1 2 2 1 2 1 1 2 2 3 1 1 2 2 4 2 2 1 2 5 1 1 2 2 6 2 2 1 2 7 1 1 2 1 8 2 2 1 2 9 1 2 1 1 10 1 1 2 2 sqldf("select X1, X2, X3, X4, count(*) CNT from data group by X1, X2, X3, X4 ORDER BY X4, X1, X2, X3") X1 X2 X3 X4 CNT 1 1 1 2 1 1 2 1 2 1 1 1 3 1 2 2 1 1 4 1 1 2 2 4 5 2 2 1 2 3 The counts are fine, though it's not exactly what I need. I need a kind of contingency table: | levels of X4 | --- unique triplets of X1:X3 | 1 | 2 | - 1 1 1 | 0 0 1 1 2 | 1 4 1 2 1 | 1 0 1 2 2 | 1 0 2 1 1 | 0 0 2 1 2 | 0 0 2 2 1 | 0 3 2 2 2 | 0 0 So the final result should be a table structure like: 0 0 1 4 1 0 1 0 0 0 0 0 0 3 0 0 I guess I could probably do this in SQL with a combination of OUTER JOINs but I thought that R might have a more elegant solution based on "factor" and "table". Thanks, Nick -- View this message in context: http://www.nabble.com/Group-by-in-R-tp23020587p23022717.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Group by in R
On Apr 13, 2009, at 7:26 AM, Nick Angelou wrote: Thanks a lot, guys. Gabor's and Mike's suggestion worked. Duncan's did not do exactly what I expected (I guess it's the "paste" in Mike's that makes "table" work as I needed it). One more question - is there a convenient way to order the group by results as follows: As rows: the unique combinations of factors f1, f2, f3, as columns the unique values of f4. The counts are basically the same as of the GROUP BY statement (or the paste and table combination suggested by Mike). Only the way the result is structured is different. Tables are actually specific instances of matrices, so if you want to transpose the default arrangement of columns and rows in a table named tab1, try: ?t t(tab1) David Winsemius, MD Heritage Laboratories West Hartford, CT __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Group by in R
SQL has the order by clause. On Mon, Apr 13, 2009 at 7:26 AM, Nick Angelou wrote: > > Thanks a lot, guys. Gabor's and Mike's suggestion worked. Duncan's did not do > exactly what I expected (I guess it's the "paste" in Mike's that makes > "table" work as I needed it). > > One more question - is there a convenient way to order the group by results > as follows: > > As rows: the unique combinations of factors f1, f2, f3, as columns the > unique values of f4. The counts are basically the same as of the GROUP BY > statement (or the paste and table combination suggested by Mike). Only the > way the result is structured is different. > > Thanks, > Nick > > > Nick Angelou wrote: >> >> Hi, >> >> I have the following table data: >> >> f1, f2, f3, f4. >> >> I want to compute the counts of unique combinations of f1-f4. In SQL I >> would just write: >> >> SELECT COUNT(*) FROM GROUP BY f1, f2, ..,f4. >> >> How to do this in R? >> >> Thanks, >> >> Nick >> > > -- > View this message in context: > http://www.nabble.com/Group-by-in-R-tp23020587p23020963.html > Sent from the R help mailing list archive at Nabble.com. > > __ > R-help@r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Group by in R
Thanks a lot, guys. Gabor's and Mike's suggestion worked. Duncan's did not do exactly what I expected (I guess it's the "paste" in Mike's that makes "table" work as I needed it). One more question - is there a convenient way to order the group by results as follows: As rows: the unique combinations of factors f1, f2, f3, as columns the unique values of f4. The counts are basically the same as of the GROUP BY statement (or the paste and table combination suggested by Mike). Only the way the result is structured is different. Thanks, Nick Nick Angelou wrote: > > Hi, > > I have the following table data: > > f1, f2, f3, f4. > > I want to compute the counts of unique combinations of f1-f4. In SQL I > would just write: > > SELECT COUNT(*) FROM GROUP BY f1, f2, ..,f4. > > How to do this in R? > > Thanks, > > Nick > -- View this message in context: http://www.nabble.com/Group-by-in-R-tp23020587p23020963.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Group by in R
Mike Lawrence wrote: One way: g= paste(f1,f2,f3,f4) table(g) I'd go for g <- interaction(f1,f2,f3,f4, drop=TRUE) table(g) which is essentially the same thing. On Mon, Apr 13, 2009 at 7:33 AM, Nick Angelou wrote: Hi, I have the following table data: f1, f2, f3, f4. I want to compute the counts of unique combinations of f1-f4. In SQL I would just write: SELECT COUNT(*) FROM GROUP BY f1, f2, ..,f4. How to do this in R? Thanks, Nick -- View this message in context: http://www.nabble.com/Group-by-in-R-tp23020587p23020587.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- O__ Peter Dalgaard Øster Farimagsgade 5, Entr.B c/ /'_ --- Dept. of Biostatistics PO Box 2099, 1014 Cph. K (*) \(*) -- University of Copenhagen Denmark Ph: (+45) 35327918 ~~ - (p.dalga...@biostat.ku.dk) FAX: (+45) 35327907 __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Group by in R
One way: g= paste(f1,f2,f3,f4) table(g) On Mon, Apr 13, 2009 at 7:33 AM, Nick Angelou wrote: > > Hi, > > I have the following table data: > > f1, f2, f3, f4. > > I want to compute the counts of unique combinations of f1-f4. In SQL I would > just write: > > SELECT COUNT(*) FROM GROUP BY f1, f2, ..,f4. > > How to do this in R? > > Thanks, > > Nick > -- > View this message in context: > http://www.nabble.com/Group-by-in-R-tp23020587p23020587.html > Sent from the R help mailing list archive at Nabble.com. > > __ > R-help@r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > -- Mike Lawrence Graduate Student Department of Psychology Dalhousie University Looking to arrange a meeting? Check my public calendar: http://tinyurl.com/mikes-public-calendar ~ Certainty is folly... I think. ~ __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Group by in R
You can use SQL commands directly on R data frames with the R sqldf package: See home page: http://sqldf.googlecode.com On Mon, Apr 13, 2009 at 6:33 AM, Nick Angelou wrote: > > Hi, > > I have the following table data: > > f1, f2, f3, f4. > > I want to compute the counts of unique combinations of f1-f4. In SQL I would > just write: > > SELECT COUNT(*) FROM GROUP BY f1, f2, ..,f4. > > How to do this in R? > > Thanks, > > Nick > -- > View this message in context: > http://www.nabble.com/Group-by-in-R-tp23020587p23020587.html > Sent from the R help mailing list archive at Nabble.com. > > __ > R-help@r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Group by in R
Nick Angelou wrote: Hi, I have the following table data: f1, f2, f3, f4. I want to compute the counts of unique combinations of f1-f4. In SQL I would just write: SELECT COUNT(*) FROM GROUP BY f1, f2, ..,f4. How to do this in R? table(f1,f2,f3,f4) will give you the counts. Other statistics can by computed by group using the by() function. Duncan Murdoch __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.