On Sep 20, 2012, at 02:43 , Thomas Lumley wrote:

> On Thu, Sep 20, 2012 at 5:46 AM, Mohamed Radhouane Aniba
> <arad...@gmail.com> wrote:
>> Hello All,
>>
>> I am writing to ask your opinion on how to interpret this case. I have two
>> vectors "a" and "b" that I am trying to compare.
>>
>> The wilcoxon test is giving me a pvalue of 5.139217e-303 of a over b with
>> the alternative "greater". Now if I make a summary on each of them I have
>> the following
>>
>>> summary(a)
>>      Min.   1st Qu.    Median      Mean   3rd Qu.      Max.
>> 0.0000000 0.0001411 0.0002381 0.0002671 0.0003623 0.0012910
>>> summary(c)
>>      Min.   1st Qu.    Median      Mean   3rd Qu.      Max.
>> 0.0000000 0.0000000 0.0000000 0.0004947 0.0002972 1.0000000
>>
>> The mean ratio is then around 0.5399031 which naively goes in opposite
>> direction of the wilcoxon test ( I was expecting to find a ratio
>> 1)
>>
>
> There's nothing conceptually strange about the Wilcoxon test showing a
> difference in the opposite direction to the difference in means. It's
> probably easiest to think about this in terms of the Mann-Whitney
> version of the same test, which is based on the proportion of pairs of
> one observation from each group where the `a' observation is higher.
> Your 'c' vector has a lot more zeros, so a randomly chosen observation
> from 'c' is likely to be smaller than one from 'a', but the non-zero
> observations seem to be larger, so the mean of 'c' is higher.
>
> The Wilcoxon test probably isn't very useful in a setting like this,
> since its results really make sense only under 'stochastic ordering',
> where the shift is in the same direction across the whole
> distribution.
>
> -thomas

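The pairwise (Mann-Whitney) reading described in the quoted message can be checked directly in R. What follows is only a rough sketch on made-up data, not the original poster's vectors: the names a and b, the sample sizes, and the ranges are all assumptions chosen so that one sample is mostly zeros with a few large values. It compares the proportion of pairs in which the 'a' observation is higher with the W statistic from wilcox.test, and then looks at the ratio of means.

```r
# Hypothetical data (an assumption, not the poster's vectors):
# 'a' is small but always positive, 'b' is mostly zero with a few large values.
set.seed(1)
a <- runif(1000, min = 0.0001, max = 0.0005)
b <- c(rep(0, 700), runif(300, min = 0.001, max = 0.01))

# Proportion of (a, b) pairs in which the 'a' observation is the larger one.
prop_a_higher <- mean(outer(a, b, ">"))

# W from wilcox.test counts those same pairs (a tie would count as 1/2).
w <- unname(wilcox.test(a, b, alternative = "greater")$statistic)

prop_a_higher                 # 0.7: 'a' is higher in most pairs
w / (length(a) * length(b))   # the same proportion, recovered from W
mean(a) / mean(b)             # below 1: the means point the other way
```

With data shaped like this, the test can strongly favour 'a' while mean(a) is smaller than mean(b), which is the pattern in the summaries above.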
I was sure I had seen a definition where X was "larger than" Y if
P(X>Y) > P(Y>X), but that's obviously not the normal definition.
Anyway, it is worth emphasizing that that is what the Wilcoxon test
tests for, not whether the means differ, nor whether the medians do.
As a counterexample of the latter, try

  x <- rep(0:1, c(60,40))
  y <- rep(0:1, c(80,20))
  wilcox.test(x,y)
  median(x)
  median(y)

(and the "location shift" reference in wilcox.test output is a bit of a
red herring.)

--
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Email: pd....@cbs.dk  Priv: pda...@gmail.com