[R] chisq.test: decreasing p-value

2009-03-11 Thread soeren . vogel
A Likert scale may have produced counts of answers per category.
According to theory, I may expect equal frequencies across the
categories. A statistical test should show whether that equality holds
in my sample.

When I apply a chi-square test with an increasing number of replicates
(simulate.p.value) to a fixed sample, the p-value decreases
dramatically (it looks as if it converges to zero).


(1) Why?
(2) (If this test is wrong), then which test can check what I want to  
check, that is: are the two distributions of frequencies (observed and  
expected) in principle the same?

(3) By the way, how to deal with low frequency cells?

r <- c(10, 100, 500, 1000, 2000, 5000)
v <- c(35, 40, 45, 45, 40, 35)
sapply(list(r), function (x) { chisq.test(v, p=c(rep.int(40, 6)),
rescale.p=T, simulate.p.value=T, B=x)$p.value })


Thank you, Sören


--
Sören Vogel, PhD-Student, Eawag, Dept. SIAM
http://www.eawag.ch, http://sozmod.eawag.ch

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] chisq.test: decreasing p-value

2009-03-11 Thread Peter Dalgaard
soeren.vo...@eawag.ch wrote:
> A Likert scale may have produced counts of answers per category.
> According to theory, I may expect equal frequencies across the
> categories. A statistical test should show whether that equality holds
> in my sample.
>
> When I apply a chi-square test with an increasing number of replicates
> (simulate.p.value) to a fixed sample, the p-value decreases
> dramatically (it looks as if it converges to zero).
>
> (1) Why?
> (2) (If this test is wrong), then which test can check what I want to
> check, that is: are the two distributions of frequencies (observed and
> expected) in principle the same?
> (3) By the way, how to deal with low frequency cells?
>
> r <- c(10, 100, 500, 1000, 2000, 5000)
> v <- c(35, 40, 45, 45, 40, 35)
> sapply(list(r), function (x) { chisq.test(v, p=c(rep.int(40, 6)),
> rescale.p=T, simulate.p.value=T, B=x)$p.value })

This is a combination of user error and an infelicity in chisq.test.

You are sapply'ing over a list with one element, so the function is
called once with x set to the whole vector r, and essentially you are
doing

chisq.test(v, p=c(rep.int(40, 6)),
 rescale.p=T, simulate.p.value=T, B=r)$p.value

Now B is supposed to be a single integer, so the above cannot be
expected to do anything sensible, but you might have hoped for an error
message. Instead, it seems that only r[1] replications are run and the
resulting count is then divided by the whole vector r+1:

> chisq.test(v, p=c(rep.int(40, 6)), rescale.p=T, simulate.p.value=T,
+ B=r)$p.value
[1] 0.636363636 0.069306931 0.013972056 0.006993007 0.003498251 0.001399720

> 7/(r+1)
[1] 0.636363636 0.069306931 0.013972056 0.006993007 0.003498251 0.001399720
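
For reference, a Monte Carlo p-value of this kind is commonly computed as
(1 + number of simulated statistics at least as large as the observed one)
/ (B + 1), which would explain why a vector-valued B produces a vector of
quotients with a common numerator. The following is only a minimal sketch
of that idea, assuming multinomial sampling under the null hypothesis;
mc_gof_pvalue is a hypothetical helper written for illustration, not part
of R and not the actual code inside chisq.test. Unlike the call above, it
refuses a B that is not a single number.

```r
# Hypothetical helper (not part of base R): Monte Carlo goodness-of-fit
# p-value for observed counts x against null probabilities p.
mc_gof_pvalue <- function(x, p, B) {
  stopifnot(length(B) == 1, B >= 1)         # reject a vector-valued B up front
  p <- p / sum(p)                           # same effect as rescale.p = TRUE
  n <- sum(x)
  expected <- n * p
  observed_stat <- sum((x - expected)^2 / expected)
  sims <- rmultinom(B, size = n, prob = p)  # one simulated table per column
  sim_stats <- colSums((sims - expected)^2 / expected)
  (1 + sum(sim_stats >= observed_stat)) / (B + 1)
}

set.seed(1)                                 # arbitrary seed, for repeatability
v <- c(35, 40, 45, 45, 40, 35)
p_mc <- mc_gof_pvalue(v, rep(40, 6), B = 2000)
p_mc
```

With a single B, the result should sit close to the asymptotic p-value
for these counts rather than drifting toward zero.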

What you really wanted was

> sapply(r, function (x) { chisq.test(v, p=c(rep.int(40, 6)),
+ rescale.p=T, simulate.p.value=T, B=x)$p.value })
[1] 0.9090909 0.8118812 0.7964072 0.7672328 0.8025987 0.7932414
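
A slightly more defensive variant of the corrected call, offered as a
sketch rather than as the one right way to do it: vapply insists on
exactly one numeric result per element of r, and set.seed (the seed value
is arbitrary) makes the simulated values repeatable. The asymptotic
p-value is computed alongside for comparison. The names p_sim and p_asym
are illustrative.

```r
r <- c(10, 100, 500, 1000, 2000, 5000)
v <- c(35, 40, 45, 45, 40, 35)

# list(r) has a single element (the whole vector); r itself has six.
length(list(r))
length(r)

set.seed(42)  # arbitrary seed so the simulation can be repeated
p_sim <- vapply(r, function(b) {
  chisq.test(v, p = rep(1/6, 6), simulate.p.value = TRUE, B = b)$p.value
}, numeric(1))
names(p_sim) <- r

# Asymptotic chi-square p-value for the same null hypothesis
p_asym <- chisq.test(v, p = rep(1/6, 6))$p.value

p_sim
p_asym
```

The simulated values fluctuate for small B and settle near the
asymptotic value as B grows.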





-- 
   O__   Peter Dalgaard Øster Farimagsgade 5, Entr.B
  c/ /'_ --- Dept. of Biostatistics PO Box 2099, 1014 Cph. K
 (*) \(*) -- University of Copenhagen   Denmark  Ph:  (+45) 35327918
~~ - (p.dalga...@biostat.ku.dk)  FAX: (+45) 35327907



Re: [R] chisq.test: decreasing p-value

2009-03-11 Thread David Winsemius


On Mar 11, 2009, at 6:36 AM, soeren.vo...@eawag.ch wrote:

> A Likert scale may have produced counts of answers per category.
> According to theory, I may expect equal frequencies across the
> categories. A statistical test should show whether that equality holds
> in my sample.
>
> When I apply a chi-square test with an increasing number of replicates
> (simulate.p.value) to a fixed sample, the p-value decreases
> dramatically (it looks as if it converges to zero).
>
> (1) Why?


With low numbers of repetitions the test has low power, i.e., it may
give you the wrong answer to the question: are those two vectors from
the same distribution? As you increase the number of repetitions, the
simulated value approaches the truth.
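
As a side note on what the number of replicates does: B governs the
Monte Carlo precision of the simulated p-value and leaves the data
unchanged. A rough sketch, assuming the simulated p-value behaves like a
binomial proportion with standard error of about sqrt(p * (1 - p) / B),
and using the asymptotic p-value as a stand-in for the true p; the names
p_asym and mc_se are illustrative.

```r
v <- c(35, 40, 45, 45, 40, 35)

# Asymptotic p-value for equal category probabilities (stand-in for true p)
p_asym <- chisq.test(v, p = rep(1/6, 6))$p.value

# Approximate Monte Carlo standard error for each number of replicates
B <- c(10, 100, 500, 1000, 2000, 5000)
mc_se <- sqrt(p_asym * (1 - p_asym) / B)

data.frame(B = B, approx_se = round(mc_se, 4))
```

Under this approximation the error shrinks with the square root of B, so
large B tightens the estimate around one value and does not push it
toward zero.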


> (2) (If this test is wrong), then which test can check what I want
> to check, that is: are the two distributions of frequencies
> (observed and expected) in principle the same?


In principle they are not the same. Do you want a test that tells  
you they are?




David Winsemius, MD
Heritage Laboratories
West Hartford, CT



Re: [R] chisq.test: decreasing p-value

2009-03-11 Thread David Winsemius
Thanks to Peter Dalgaard for the correct answer. I misinterpreted what  
R was returning.





Re: [R] chisq.test: decreasing p-value

2009-03-11 Thread soeren . vogel
Thanks to Peter, David, and Michael! After correcting the coding
error, the p-values converge to a particular value, not necessarily
zero. The whole story is this: 634 respondents in 6 different areas
marked their answers on a 7-step Likert scale (very bad, bad, ..., very
good; later recoded to 5 scale levels). The statistical question now is
whether the distribution of answers (number of goods, bads, etc.) in
each area differs from the mean answer distribution calculated by
summing up all goods, bads, etc. In any case, an omnibus chi-square
test would not answer my question, and because of spurious
significances I'd rather go back to my chi-square book ;-) (for the
interested, see http://sozmod.eawag.ch/files/file.Robj for the entire
table).


Thanks for your help

Sören
