> On 16 Jul 2015, at 15:13 , Ivan Calandra <ivan.calan...@univ-reims.fr> wrote:
> 
> Dear useRs,
> 
> I am running a wilcox.test() on two subsets of a dataset and get exactly the
> same results although the raw data are different in the subsets.
> 
> mydata <- structure(list(cat1 = structure(c(2L, 2L, 2L, 2L, 2L, 2L, 1L, 1L,
> 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L), .Label = c("high", "low"), class =
> "factor"), cat2 = structure(c(1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L,
> 1L, 2L, 1L, 2L, 1L, 2L), .Label = c("large", "small"), class = "factor"),
> var1 = c(2.012743, 1.51272, 1.328453, 1.2609935, 1.617757, 1.8175455,
> 1.890035, 2.3652205, 1.295888, 1.5985145, 1.081813, 1.856733, 2.366358,
> 2.27421, 1.727023, 2.230433, 5.272843, 3.7626355), var2 = c(0.00196,
> 0.0066545, 0.006188, 0.0058985, 0.004453, 0.005468, 0.003773, 0.004742,
> 0.007525, 0.0081235, 0.004611, 0.0050475, 0.006643, 0.0097335, 0.009213,
> 0.0049525, 0.006243, 0.006021)), .Names = c("cat1", "cat2", "var1", "var2"),
> row.names = c(NA, 18L), class = "data.frame")
> 
> #p-values are identical but W different for the first variable
> wilcox.test(var1~cat1, data=mydata[mydata$cat2=="large",])
> wilcox.test(var1~cat1, data=mydata[mydata$cat2=="small",])
> 
> #both p-values and W are identical for the second variable
> wilcox.test(var2~cat1, data=mydata[mydata$cat2=="large",])
> wilcox.test(var2~cat1, data=mydata[mydata$cat2=="small",])
> 
> Did I do something wrong or does it just have something to do with my
> dataset? Or is it just a coincidence?

Coincidence, mostly, I think: You have

> table(mydata[mydata$cat2=="small","cat1"])

high  low 
   4    5 
> table(mydata[mydata$cat2=="large","cat1"])

high  low 
   4    5 

and all of your response variables' values are distinct. In both cases, the
null distribution of the rank sum W is that of

(sum(sample(1:9,4))-sum(1:4))

which is a distribution on 0:20, symmetric around 10. Hence only 11 different
two-sided p-values are possible (W and 20-W give the same one, which is how
the two var1 tests can differ in W yet agree in p), so it is not particularly
odd that you may get the same one twice.

> 
> Thank you in advance for your help,
> Ivan
> 
> -- 
> Ivan Calandra, ATER
> University of Reims Champagne-Ardenne
> GEGENAA - EA 3795
> CREA - 2 esplanade Roland Garros
> 51100 Reims, France
> +33(0)3 26 77 36 89
> ivan.calan...@univ-reims.fr
> https://www.researchgate.net/profile/Ivan_Calandra
> 
> ______________________________________________
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

-- 
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Email: pd....@cbs.dk  Priv: pda...@gmail.com
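
P.S. The counting argument in the reply can be checked by brute force. The following is only a sketch, assuming group sizes of 4 and 5 and no ties, as in the posted data. The names rank_sets, null_counts, p_two_sided and p_all are made up for illustration. It enumerates every way the 4 "high" observations can occupy ranks 1 to 9, computes W for each, and counts how many distinct two-sided exact p-values can occur.

```r
# Every assignment of 4 of the ranks 1..9 to the "high" group
# (all equally likely under the null when there are no ties).
rank_sets <- combn(9, 4)               # 4 x 126 matrix, one column per assignment
W <- colSums(rank_sets) - sum(1:4)     # W on the scale reported by wilcox.test()

# Null distribution of W on 0:20
null_counts <- table(factor(W, levels = 0:20))
stopifnot(sum(null_counts) == choose(9, 4))

# Exact two-sided p-value for an observed statistic w:
# twice the smaller tail probability, capped at 1.
p_two_sided <- function(w) {
  lower <- mean(W <= w)
  upper <- mean(W >= w)
  min(1, 2 * min(lower, upper))
}

p_all <- sapply(0:20, p_two_sided)
names(p_all) <- 0:20

print(null_counts)
print(round(p_all, 4))
print(length(unique(p_all)))           # number of distinct attainable p-values
```

With so few attainable p-values, two independent subsets landing on the same one (or on mirror-image values of W) is unremarkable. Calling wilcox.test() on the posted subsets should reproduce entries of p_all, as long as the exact test is used, which is the default for small samples without ties.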