Re: [R] categorical data
Hello,

thank you very much, it works super!

Christian
Re: [R] categorical data
Hi

On 10 Aug 2006 at 9:19, Christian Oswald wrote:

> Hello,
>
> that's what I need, a list sorted first by year and then by
> category. But I get an error message:
>
> > df
>       df     cate b    c
>  [1,] "2006" "a1" "1"  "1"
>  [2,] "2006" "a2" "2"  "2"
>  [3,] "2005" "a1" "3"  "3"
>  [4,] "2004" "a3" "1"  "1"
>  [5,] "2004" "a2" "2"  "2"
>  [6,] "2005" "a1" "3"  "3"
>  [7,] "2003" "a2" "11" "11"
>  [8,] "2003" "a1" "2"  "2"
>  [9,] "2006" "a2" "3"  "3"

This is not a data frame but a character matrix; try str(df). It was
probably constructed with cbind(...); try to use data.frame() instead.
Alternatively, you can try as.data.frame(df), but then you need to
convert the resulting factors back to numeric; see ?as.character and
?as.numeric.

HTH
Petr

> > res <- aggregate(df[, c(3, 4)], list(df$year, df$cate), sum)
> Fehler in as.vector(x, mode) : Argument hat ungültigen 'mode'
>
> (Error in as.vector(x, mode) : argument has invalid 'mode')
>
> I tested the mode and got "character". Can someone explain
> what that means?
>
> Christian

Petr Pikal
[EMAIL PROTECTED]

______________________________________________
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
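
The difference between a character matrix and a data frame can be reproduced with a few lines. The following is only a sketch: the vectors `year`, `cate` and `b` and the names `m`, `d1`, `d2` are made up for illustration, and it assumes the matrix was built with cbind() as suggested above.

```r
# Hypothetical input vectors (not the poster's actual data).
year <- c(2006, 2006, 2005, 2004)
cate <- c("a1", "a2", "a1", "a3")
b    <- c(1, 2, 3, 1)

# cbind() on mixed numeric/character vectors coerces everything to character.
m <- cbind(year, cate, b)
is.matrix(m)   # TRUE
mode(m)        # "character"

# Option 1: build a data frame directly, so each column keeps its own type.
d1 <- data.frame(year, cate, b)

# Option 2: convert the matrix, then turn the text columns back into numbers.
# as.numeric(as.character(x)) works whether x came out as factor or character.
d2 <- as.data.frame(m)
d2$year <- as.numeric(as.character(d2$year))
d2$b    <- as.numeric(as.character(d2$b))

# With numeric columns, aggregate() can sum within each year/category group.
res <- aggregate(d2["b"], by = list(Year = d2$year, Cat = d2$cate), FUN = sum)
```

Going through as.character() first matters when the column is a factor: calling as.numeric() directly on a factor returns the internal level codes, not the printed values.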
Re: [R] categorical data
Hello,

that's what I need, a list sorted first by year and then by category.
But I get an error message:

> df
      df     cate b    c
 [1,] "2006" "a1" "1"  "1"
 [2,] "2006" "a2" "2"  "2"
 [3,] "2005" "a1" "3"  "3"
 [4,] "2004" "a3" "1"  "1"
 [5,] "2004" "a2" "2"  "2"
 [6,] "2005" "a1" "3"  "3"
 [7,] "2003" "a2" "11" "11"
 [8,] "2003" "a1" "2"  "2"
 [9,] "2006" "a2" "3"  "3"
> res <- aggregate(df[, c(3, 4)], list(df$year, df$cate), sum)
Fehler in as.vector(x, mode) : Argument hat ungültigen 'mode'

(Error in as.vector(x, mode) : argument has invalid 'mode')

I tested the mode and got "character". Can someone explain what that
means?

Christian
Re: [R] categorical data
On Wed, 2006-08-09 at 18:07 +0200, Christian Oswald wrote:

> Dear List,
>
> I need a grouped list with two sorts of categorical data. I have a
> data.frame like this:
>
>   year cat.   b   c
> 1 2006   a1 125 212
> 2 2006   a2 256 212
> 3 2005   a1  14  12
> 4 2004   a3 565 123
> 5 2004   a2 156 789
> 6 2005   a1   1 456
> 7 2003   a2 786 123
> 8 2003   a1 421 569
> 9 2002   a2 425 245
>
> I need a list with the sum of b and c for every year and every cat
> (a1, a2 or a3) in this year. I have used the tapply function to build
> the sum for every year or every cat. How can I combine the two
> grouping values?

Christian,

Is this what you want (using DF as your data.frame)?

> aggregate(DF[, c("b", "c")],
            by = list(Year = DF$year, Cat = DF$cat.),
            sum)
  Year Cat   b   c
1 2003  a1 421 569
2 2005  a1  15 468
3 2006  a1 125 212
4 2002  a2 425 245
5 2003  a2 786 123
6 2004  a2 156 789
7 2006  a2 256 212
8 2004  a3 565 123

You can also reorder the results by Year and Cat:

> DF.result <- aggregate(DF[, c("b", "c")],
                         by = list(Year = DF$year, Cat = DF$cat.),
                         sum)

> DF.result[order(DF.result$Year, DF.result$Cat), ]
  Year Cat   b   c
4 2002  a2 425 245
1 2003  a1 421 569
5 2003  a2 786 123
6 2004  a2 156 789
8 2004  a3 565 123
2 2005  a1  15 468
3 2006  a1 125 212
7 2006  a2 256 212

Note that tapply() can only handle one 'X' vector at a time, whereas
aggregate() can handle multiple 'X' columns in one call. For example:

> tapply(DF$b, list(DF$year, DF$cat.), sum)
      a1  a2  a3
2002  NA 425  NA
2003 421 786  NA
2004  NA 156 565
2005  15  NA  NA
2006 125 256  NA

will give you the sum of 'b' for each combination of Year and Cat
within the 2d table, but I suspect this is not the output format you
want. You also get NAs in the cells for combinations that are not
present in your data.

HTH,

Marc Schwartz
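
For anyone who wants to rerun the answer, here is a self-contained sketch. The data frame is typed in from the table in the question. The formula interface of aggregate() used here is an alternative spelling that does not appear in the thread, and the name `res` is illustrative only.

```r
# Data retyped from the question; column names follow the original post.
DF <- data.frame(
  year = c(2006, 2006, 2005, 2004, 2004, 2005, 2003, 2003, 2002),
  cat. = c("a1", "a2", "a1", "a3", "a2", "a1", "a2", "a1", "a2"),
  b    = c(125, 256, 14, 565, 156, 1, 786, 421, 425),
  c    = c(212, 212, 12, 123, 789, 456, 123, 569, 245)
)

# Sum b and c within every (year, cat.) combination that occurs in the data.
# cbind(b, c) on the left-hand side aggregates both columns in one call.
res <- aggregate(cbind(b, c) ~ year + cat., data = DF, FUN = sum)

# Sort by year first, then by category within each year.
res <- res[order(res$year, res$cat.), ]
print(res)
```

Unlike the tapply() table, this long format only contains rows for combinations that actually occur, so no NA cells appear.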
[R] categorical data
Dear List,

I need a grouped list with two sorts of categorical data. I have a
data.frame like this:

  year cat.   b   c
1 2006   a1 125 212
2 2006   a2 256 212
3 2005   a1  14  12
4 2004   a3 565 123
5 2004   a2 156 789
6 2005   a1   1 456
7 2003   a2 786 123
8 2003   a1 421 569
9 2002   a2 425 245

I need a list with the sum of b and c for every year and every cat
(a1, a2 or a3) in this year. I have used the tapply function to build
the sum for every year or every cat. How can I combine the two grouping
values?

Thanks,

Christian
Re: [R] Categorical data (Monte Carlo exact p-value)
On 26 Mar 2003 at 22:25, Jorge Magalhães wrote:

> Can I perform categorical data analysis in R, such as an odds ratio
> test for stratified 2x2 contingency tables? I do that in the
> statistical package StatXact, but I would like to perform the same
> test in the R environment.
>
> Thanks very much
>
> Jorge Magalhães

Does ?mantelhaen.test do what you want? (It is in package ctest.)

Kjetil Halvorsen

______________________________________________
[EMAIL PROTECTED] mailing list
https://www.stat.math.ethz.ch/mailman/listinfo/r-help
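
As a rough illustration of the suggestion, the sketch below runs mantelhaen.test() on a made-up 2 x 2 x 3 table. The counts and dimension names are hypothetical. In current R versions the function is in the stats package, which is loaded by default. Its `exact = TRUE` option requests the exact conditional test for 2 x 2 x K tables; this is an exact computation, not a Monte Carlo approximation.

```r
# Hypothetical counts: exposure x outcome within three strata.
# array() fills column by column, so each group of four numbers is one stratum.
tab <- array(
  c(10, 5, 4, 12,
     8, 6, 5, 10,
     7, 3, 2,  9),
  dim = c(2, 2, 3),
  dimnames = list(Exposure = c("yes", "no"),
                  Outcome  = c("case", "control"),
                  Stratum  = c("s1", "s2", "s3"))
)

# Asymptotic Mantel-Haenszel test of a common odds ratio equal to 1.
mh <- mantelhaen.test(tab)

# Exact conditional test, given the strata margins.
ex <- mantelhaen.test(tab, exact = TRUE)

mh$estimate   # Mantel-Haenszel estimate of the common odds ratio
ex$estimate   # conditional maximum likelihood estimate
ex$p.value
```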
Re: [R] Categorical data (Monte Carlo exact p-value)
A search in the standard R documentation for "Fisher's exact test"
reveals a command "fisher.test", and a search for "Chi-square" reveals
a "chisq.test".

Spencer Graves

Jorge Magalhães wrote:

> Can I perform categorical data analysis in R, such as an odds ratio
> test for stratified 2x2 contingency tables? I do that in the
> statistical package StatXact, but I would like to perform the same
> test in the R environment.
>
> Thanks very much
>
> Jorge Magalhães
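
The two functions named above can be tried on a single unstratified 2 x 2 table as follows. This is a minimal sketch with hypothetical counts. It assumes the "Monte Carlo" in the subject line refers to simulated p-values, which chisq.test() offers through `simulate.p.value`; fisher.test() has the same argument, but it only takes effect for tables larger than 2 x 2.

```r
# Hypothetical 2 x 2 table of counts.
tab2 <- matrix(c(10, 5, 4, 12), nrow = 2,
               dimnames = list(Exposure = c("yes", "no"),
                               Outcome  = c("case", "control")))

# Fisher's exact test; for 2 x 2 tables it also reports the conditional
# maximum likelihood estimate of the odds ratio with a confidence interval.
ft <- fisher.test(tab2)
ft$estimate
ft$p.value

# Chi-squared test with a Monte Carlo p-value based on B simulated tables.
set.seed(1)   # makes the simulated p-value reproducible
ct <- chisq.test(tab2, simulate.p.value = TRUE, B = 10000)
ct$p.value
```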
[R] Categorical data (Monte Carlo exact p-value)
Can I perform categorical data analysis in R, such as an odds ratio
test for stratified 2x2 contingency tables? I do that in the
statistical package StatXact, but I would like to perform the same test
in the R environment.

Thanks very much

Jorge Magalhães