Re: [R] Pearson chi-square test

2011-09-27 Thread Meyners, Michael
I suspect that the chi-square test might not be appropriate, as you have 
constraints (the marginal counts for A are the same in both contingency 
tables). I further suspect that there is no test readily available for that, 
but I might be wrong. Maybe randomization tests could help here, but it would 
require a bit of thinking AND programming to accomplish that. chisq.test might 
give you an approximate solution, but I can't say how good it will be (and it 
might also depend on the data, btw).
Best, Michael
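
A note on the randomization idea mentioned above: one possible sketch, valid 
only under a strong assumption, is to treat B and C as exchangeable within each 
observation under the null hypothesis, swap them at random, and recompute a 
distance between the two tables each time. The helper names tab and stat, the 
choice of statistic, and the number of permutations are illustrative choices 
and do not come from this thread; whether the exchangeability assumption is 
reasonable depends on the actual study design.

```r
# Data from the original post
set.seed(1)
A <- cut(runif(100), c(0.0, 0.35, 0.50, 0.65, 1.00), labels = FALSE)
B <- cut(runif(100), c(0.0, 0.25, 0.40, 0.75, 1.00), labels = FALSE)
C <- cut(runif(100), c(0.0, 0.25, 0.50, 0.80, 1.00), labels = FALSE)

# Fix the factor levels so every table has the same 4 x 4 shape
# (hypothetical helper, not part of the original code).
lev <- 1:4
tab <- function(a, b) table(factor(a, lev), factor(b, lev))

# Distance between two tables; cells empty in both tables contribute 0.
stat <- function(x, y) {
  s <- x + y
  sum(((x - y)^2 / s)[s > 0])
}

observed <- stat(tab(A, B), tab(A, C))

# ASSUMPTION: under H0, B and C are exchangeable within each observation,
# so swapping them for a random subset of observations leaves the
# distribution of the statistic unchanged.
n_perm <- 999
perm <- replicate(n_perm, {
  swap   <- runif(length(A)) < 0.5
  B_star <- ifelse(swap, C, B)
  C_star <- ifelse(swap, B, C)
  stat(tab(A, B_star), tab(A, C_star))
})

# Randomization p-value (the observed value counts as one permutation)
p_value <- (1 + sum(perm >= observed)) / (n_perm + 1)
p_value
```

Because A is kept fixed for every observation, the constraint that both tables 
share the same A margin is respected in each permuted pair of tables.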


From: Michael Haenlein
Sent: Tuesday, September 27, 2011 17:05
To: r-help@r-project.org
Cc: Meyners, Michael
Subject: RE: [R] Pearson chi-square test

Dear Michael,
 
Thanks very much for your answers!
 
The purpose of my analysis is to test whether the contingency table x is 
different from the contingency table y.
Or, to put it differently, whether there is a significant difference between 
the joint distribution A&B and A&C.
 
Based on your answer I'm wondering whether chisq.test is really the best way 
to do this?
Or is there perhaps a different function or package I should use altogether?
 
Thanks,
 
Michael
 
 
 
-----Original Message-----
From: Meyners, Michael 
Sent: Tuesday, September 27, 2011 17:00
To: Michael Haenlein; r-help@r-project.org
Subject: RE: [R] Pearson chi-square test
 
Just for completeness: the manual calculation you'd want is most likely
 
sum((x-y)^2  / (x+y))
 
(that's one you can find on the Wikipedia link you provided). To get the same 
from chisq.test, try something like 
 
chisq.test(data.frame(x,y)[,c(3,6)])
 
(there are surely smarter ways, but at least it works here). Note that 
something like 
 
chisq.test(as.vector(x), as.vector(y)) 
 
will give a different test, i.e. one based on a contingency table of 
as.vector(x) crossed with as.vector(y).
M. 
 
> -----Original Message-----
> From: r-help-boun...@r-project.org [mailto:r-help-bounces@r-
> project.org] On Behalf Of Meyners, Michael
> Sent: Tuesday, September 27, 2011 13:28
> To: Michael Haenlein; r-help@r-project.org
> Subject: Re: [R] Pearson chi-square test
> 
> Not sure what you want to test here with two matrices, but reading the
> manual helps here as well:
> 
> y   a vector; ignored if x is a matrix.
> 
> x and y are matrices in your example, so it comes as no surprise that
> you get different results. On top of that, your manual calculation is
> not correct if you want to test whether two samples come from the same
> distribution (so don't be surprised if R still gives a different
> value...).
> 
> HTH, Michael
> 
> > -----Original Message-----
> > From: r-help-boun...@r-project.org [mailto:r-help-bounces@r-
> > project.org] On Behalf Of Michael Haenlein
> > Sent: Tuesday, September 27, 2011 12:45
> > To: r-help@r-project.org
> > Subject: [R] Pearson chi-square test
> >
> > Dear all,
> >
> > I have some trouble understanding the chisq.test function.
> > Take the following example:
> >
> > set.seed(1)
> > A <- cut(runif(100),c(0.0, 0.35, 0.50, 0.65, 1.00), labels=FALSE)
> > B <- cut(runif(100),c(0.0, 0.25, 0.40, 0.75, 1.00), labels=FALSE)
> > C <- cut(runif(100),c(0.0, 0.25, 0.50, 0.80, 1.00), labels=FALSE)
> > x <- table(A,B)
> > y <- table(A,C)
> >
> > When I calculate the test statistic by hand I get a value of
> > approximately
> > 75.9:
> > http://en.wikipedia.org/wiki/Pearson's_chi-
> > square_test#Calculating_the_test-statistic
> > sum((x-y)^2/y)
> >
> > But when I do chisq.test(x,y) I get a value of 12.2 while
> > chisq.test(y,x)
> > gives a value of 10.3.
> >
> > I understand that I must be doing something wrong here, but I'm not
> > sure
> > what.
> >
> > Thanks,
> >
> > Michael

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Pearson chi-square test

2011-09-27 Thread Michael Haenlein
Dear Michael,

Thanks very much for your answers!

The purpose of my analysis is to test whether the contingency table x is
different from the contingency table y.
Or, to put it differently, whether there is a significant difference between
the joint distribution A&B and A&C.

Based on your answer I'm wondering whether chisq.test is really the best way
to do this?
Or is there perhaps a different function or package I should use altogether?

Thanks,

Michael



Re: [R] Pearson chi-square test

2011-09-27 Thread Meyners, Michael
Just for completeness: the manual calculation you'd want is most likely

sum((x-y)^2  / (x+y))

(that's one you can find on the Wikipedia link you provided). To get the same 
from chisq.test, try something like 

chisq.test(data.frame(x,y)[,c(3,6)])

(there are surely smarter ways, but at least it works here). Note that 
something like 

chisq.test(as.vector(x), as.vector(y)) 

will give a different test, i.e. one based on a contingency table of 
as.vector(x) crossed with as.vector(y).
M. 
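
A short check of the equivalence above, as a sketch on the data from the 
original post: when both tables have the same total, the expected count in each 
cell under homogeneity is (x+y)/2, and the Pearson statistic summed over both 
tables reduces to sum((x-y)^2/(x+y)). The names counts, manual and res are 
chosen here for illustration, and cbind is used instead of the data.frame 
indexing only because it makes the two columns of counts explicit.

```r
# Data from the original post
set.seed(1)
A <- cut(runif(100), c(0.0, 0.35, 0.50, 0.65, 1.00), labels = FALSE)
B <- cut(runif(100), c(0.0, 0.25, 0.40, 0.75, 1.00), labels = FALSE)
C <- cut(runif(100), c(0.0, 0.25, 0.50, 0.80, 1.00), labels = FALSE)
x <- table(A, B)
y <- table(A, C)

# Manual statistic suggested in the message above
manual <- sum((x - y)^2 / (x + y))

# One row per cell of the 4 x 4 tables, one column per table
counts <- cbind(x = as.vector(x), y = as.vector(y))

# Small expected counts may trigger an approximation warning
res <- suppressWarnings(chisq.test(counts))

# The two statistics should agree up to floating-point error
all.equal(unname(res$statistic), manual)
```

The equality relies on both tables having the same total (100 here). With 
unequal totals chisq.test still performs the homogeneity test on the two 
columns, but the shortcut formula no longer applies.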




Re: [R] Pearson chi-square test

2011-09-27 Thread Meyners, Michael
Not sure what you want to test here with two matrices, but reading the manual 
helps here as well: 

y   a vector; ignored if x is a matrix.

x and y are matrices in your example, so it comes as no surprise that you get 
different results. On top of that, your manual calculation is not correct if 
you want to test whether two samples come from the same distribution (so don't 
be surprised if R still gives a different value...).

HTH, Michael
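
To illustrate the point from the manual quoted above, here is a small sketch on 
the data from the original post. Since x is a two-way table (and hence a 
matrix), the second argument is dropped, so chisq.test(x, y) should be the 
ordinary independence test of A and B within x, and chisq.test(y, x) the 
independence test of A and C within y. The object names with_y, without_y and 
swapped are illustrative only.

```r
# Data from the original post
set.seed(1)
A <- cut(runif(100), c(0.0, 0.35, 0.50, 0.65, 1.00), labels = FALSE)
B <- cut(runif(100), c(0.0, 0.25, 0.40, 0.75, 1.00), labels = FALSE)
C <- cut(runif(100), c(0.0, 0.25, 0.50, 0.80, 1.00), labels = FALSE)
x <- table(A, B)
y <- table(A, C)

# x is a matrix, so the second argument is ignored:
# both calls test independence of A and B within x only.
with_y    <- suppressWarnings(chisq.test(x, y))
without_y <- suppressWarnings(chisq.test(x))
all.equal(with_y$statistic, without_y$statistic)

# Swapping the arguments simply tests independence of A and C within y.
swapped <- suppressWarnings(chisq.test(y, x))
all.equal(swapped$statistic, suppressWarnings(chisq.test(y))$statistic)
```

Neither call compares x with y, which is why the two results differ from each 
other and from the hand calculation.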




[R] Pearson chi-square test

2011-09-27 Thread Michael Haenlein
Dear all,

I have some trouble understanding the chisq.test function.
Take the following example:

set.seed(1)
A <- cut(runif(100),c(0.0, 0.35, 0.50, 0.65, 1.00), labels=FALSE)
B <- cut(runif(100),c(0.0, 0.25, 0.40, 0.75, 1.00), labels=FALSE)
C <- cut(runif(100),c(0.0, 0.25, 0.50, 0.80, 1.00), labels=FALSE)
x <- table(A,B)
y <- table(A,C)

When I calculate the test statistic by hand, following
http://en.wikipedia.org/wiki/Pearson's_chi-square_test#Calculating_the_test-statistic
I get a value of approximately 75.9:
sum((x-y)^2/y)

But when I do chisq.test(x,y) I get a value of 12.2 while chisq.test(y,x)
gives a value of 10.3.

I understand that I must be doing something wrong here, but I'm not sure
what.

Thanks,

Michael
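
A note on the hand calculation in this post, offered as a sketch rather than a 
definitive diagnosis: sum((x-y)^2/y) has the form of a goodness-of-fit 
statistic in which y plays the role of known expected counts, not of a second 
sample. Under that reading it should match a chisq.test call on the vector of 
counts from x with probabilities taken from y. The names manual and gof are 
illustrative.

```r
# Data from the original post
set.seed(1)
A <- cut(runif(100), c(0.0, 0.35, 0.50, 0.65, 1.00), labels = FALSE)
B <- cut(runif(100), c(0.0, 0.25, 0.40, 0.75, 1.00), labels = FALSE)
C <- cut(runif(100), c(0.0, 0.25, 0.50, 0.80, 1.00), labels = FALSE)
x <- table(A, B)
y <- table(A, C)

# Hand calculation from the post
manual <- sum((x - y)^2 / y)

# Goodness-of-fit test of the counts in x against the cell
# proportions of y, which are treated as fixed and known.
gof <- suppressWarnings(
  chisq.test(as.vector(x), p = as.vector(y) / sum(y))
)

# The two statistics should agree up to floating-point error
all.equal(unname(gof$statistic), manual)
```

Treating y as fixed ignores its sampling variability, which is one reason this 
statistic is not the right one for comparing two observed tables.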
