Re: [R] Simple General Statistics and R question (with 3 line example) - get z value from pairwise.wilcox.test

peter dalgaard Wed, 04 May 2011 07:34:28 -0700

On May 4, 2011, at 15:11 , JP wrote:

> Peter thanks for the fantastically simple and understandable explanation...
> 
> To sum it up... to find the z values of a number of pairwise wilcox
> tests do the following:
> 
> # pairwise tests with bonferroni correction
> x <- pairwise.wilcox.test(a, b, alternative="two.sided",
>                           p.adj="bonferroni", exact=FALSE, paired=TRUE)



You probably don't want the Bonferroni correction there; use p.adj="none" 
instead. You generally correct the p values for multiple testing, not the 
test statistics.

(My sentiment would be to pick apart the stats:::wilcox.test.default function 
and clone the computation of Z from it, but presumably backtracking from the p 
value is a useful expedient.)
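
A minimal sketch of that cloning approach, assuming the paired case and the normal approximation (exact=FALSE). The helper name signed_rank_z is made up for illustration and is not part of stats; the arithmetic follows the signed-rank formulas quoted further down in this thread, plus the usual tie and continuity corrections:

```r
# Hypothetical helper (not part of stats): signed Z for a paired
# Wilcoxon signed-rank test under the normal approximation.
signed_rank_z <- function(x, y, correct = TRUE) {
  d <- x - y
  d <- d[!is.na(d) & d != 0]            # drop missing and zero differences
  n <- length(d)
  r <- rank(abs(d))                     # midranks if there are ties
  V <- sum(r[d > 0])                    # sum of the positive ranks
  mu <- n * (n + 1) / 4                 # E(V) under the null
  nties <- table(r)
  sigma <- sqrt(n * (n + 1) * (2 * n + 1) / 24 -
                sum(nties^3 - nties) / 48)       # tie-corrected SD of V
  cc <- if (correct) sign(V - mu) * 0.5 else 0   # two-sided continuity correction
  (V - mu - cc) / sigma
}

# Made-up data, only to compare against backtracking from the p value
set.seed(1)
a <- rnorm(30)
b <- rnorm(30, mean = 0.5)
z <- signed_rank_z(a, b)
p <- wilcox.test(a, b, paired = TRUE, exact = FALSE)$p.value
c(z = z, back_from_p = qnorm(p / 2))
```

The two numbers should agree in magnitude; only the first one carries the sign of the effect.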

> # what is the data structure we got back
> is.matrix(x$p.value)
> # p vals
> x$p.value
> # z.scores for each
> z.score <- qnorm(x$p.value / 2)
> 

Hmm, you're not actually getting a signed z out of this (qnorm(p/2) can never 
be positive for a two-sided p). You might want to try alternative="greater" 
and drop the division by 2 inside qnorm(). (If the signs come out inverted, I 
meant "less", not "greater"...)
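
Roughly what that could look like, as a sketch on made-up data (the scores and group variables and the three-group layout are assumptions for illustration). With alternative="less", qnorm() of the unadjusted one-sided p value returns the signed z directly, because p = pnorm(z) under the normal approximation; correct=FALSE is set here so that no continuity correction is mixed in:

```r
# Made-up example: three paired measurements on the same 30 subjects
set.seed(42)
n <- 30
scores <- c(rnorm(n, mean = 0), rnorm(n, mean = 1), rnorm(n, mean = 2))
group  <- factor(rep(c("g1", "g2", "g3"), each = n))

# Unadjusted one-sided p values; extra arguments are passed on to wilcox.test()
pw <- pairwise.wilcox.test(scores, group, p.adjust.method = "none",
                           paired = TRUE, exact = FALSE, correct = FALSE,
                           alternative = "less")

# Signed z for each pair (row group minus column group); NA above the diagonal
z <- qnorm(pw$p.value)
z
```

A positive entry means the row group tends to exceed the column group. Entries whose p values are extremely close to 0 or 1 will be imprecise or infinite.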


> 
> 
> On 4 May 2011 13:25, peter dalgaard <pda...@gmail.com> wrote:
>> 
>> On May 4, 2011, at 11:03 , JP wrote:
>> 
>>> On 3 May 2011 20:50, peter dalgaard <pda...@gmail.com> wrote:
>>>> 
>>>> On Apr 28, 2011, at 15:18 , JP wrote:
>>>> 
>>>>> 
>>>>> 
>>>>> I have found that when doing a Wilcoxon signed-rank test you should 
>>>>> report:
>>>>> 
>>>>> - The median value (and not the mean or sd, presumably because the
>>>>> underlying distribution is potentially non-normal)
>>>>> - The Z score (or value)
>>>>> - r
>>>>> - p value
>>>>> 
>>>> 
>>>> ...printed on 40g/m^2 acid free paper with a pencil of 3B softness?
>>>> 
>>>> Seriously, with nonparametrics, the p value is the only thing of real 
>>>> interest, the other stuff is just attempting to check on authors doing 
>>>> their calculations properly. The median difference is of some interest, 
>>>> but it is not actually what is being tested, and in heavily tied data, it 
>>>> could even be zero with a highly significant p-value. The Z score can in 
>>>> principle be extracted from the p value (qnorm(p/2), basically) but it's 
>>>> obviously unstable in the extreme cases. What is r? The correlation? 
>>>> Pearson, not Spearman?
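
To make the instability concrete, a small sketch with made-up z values rather than real test output: for moderate z the round trip through the two-sided p value recovers the (negative) magnitude, but once the p value underflows to zero in double precision the back-calculated z is -Inf.

```r
# Round trip z -> two-sided p -> z for a moderate, a large and an extreme statistic
z_true <- c(-2.5, -9.5, -40)
p <- 2 * pnorm(-abs(z_true))   # two-sided p values; the last one underflows to 0
z_back <- qnorm(p / 2)         # backtracked z, always <= 0
data.frame(z_true, p, z_back)
```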
>>>> 
>>> 
>>> Thanks for this Peter - a couple more questions:
>>> 
>>> a <- rnorm(500)
>>> b <- runif(500, min=0, max=1)
>>> x <- wilcox.test(a, b, alternative="two.sided", exact=TRUE, paired=TRUE)
>>> x$statistic
>>> 
>>>    V
>>> 31835
>>> 
>>> What is V? (is that the value Z of the test statistic)?
>> 
>> No. It's the sum of the positive ranks:
>> 
>>        r <- rank(abs(x))
>>        STATISTIC <- sum(r[x > 0])
>>        names(STATISTIC) <- "V"
>> 
>> (where x is actually x-y in the paired case)
>> 
>> Subtract the expected value of V (sum(1:500)/2 == 62625 in your case) and 
>> divide by the standard deviation (sqrt(500*501*1001/24) == 3232.327), and 
>> you get Z=-9.53. The slight discrepancy from your qnorm() value is likely 
>> due to your use of exact=T (so your p value is not actually computed from Z).
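
The arithmetic above, spelled out as a sketch using the V and n from the example (the variable names are only illustrative, and no ties or zero differences are assumed):

```r
V <- 31835                                      # x$statistic from the example
n <- 500                                        # number of pairs
mu <- n * (n + 1) / 4                           # expected V under the null: 62625
sigma <- sqrt(n * (n + 1) * (2 * n + 1) / 24)   # about 3232.327
z <- (V - mu) / sigma                           # roughly -9.5
z
```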
>> 
>> 
>>> 
>>> z.score <- qnorm(x$p.value/2)
>>> [1] -9.805352
>>> 
>>> But what does this zscore show in practice?
>> 
>> 
>> That your test statistic is approx. 10 standard deviations away from the 
>> mean it would have if the null hypothesis were true.
>> 
>> 
>>> 
>>> The d.f. are suggested to be reported here:
>>> http://staff.bath.ac.uk/pssiw/stats2/page2/page3/page3.html
>>> 
>> 
>> Some software replaces the asymptotic normal distribution of the rank sums 
>> with the t-distribution with the same df as would be used in an ordinary t 
>> test. However, since there is no such thing as an independent variance 
>> estimate in the Wilcoxon test, it is hard to see how that should be an 
>> improvement. I put it down to "coding by non-statistician".
>> 
>> 
>>> And r is mentioned here
>>> http://huberb.people.cofc.edu/Guide/Reporting_Statistics%20in%20Psychology.pdfs
>>> 
>>> 
>> 
>> Aha, so it's supposed to be the effect size. On the referenced site they 
>> suggest using r=Z/sqrt(N). (They even do so for the independent samples 
>> version, which looks wrong to me.)
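
As a sketch of that formula for the paired example in this thread, taking N to be the number of pairs (an assumption; conventions differ on whether N counts pairs or observations) and Z from the normal approximation without continuity correction:

```r
V <- 31835                                                     # sum of positive ranks from the example
N <- 500                                                       # number of pairs (assumed meaning of N)
Z <- (V - N * (N + 1) / 4) / sqrt(N * (N + 1) * (2 * N + 1) / 24)
r <- Z / sqrt(N)                                               # effect size as defined on the referenced site
r                                                              # about -0.43
```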
>> 
>>> 
>>>>> My questions are:
>>>>> 
>>>>> - Are the above enough/correct values to report (some places even
>>>>> quote W and df) ?
>>>> 
>>>> df is silly, and/or blatantly wrong...
>>>> 
>>>>>  What else would you suggest?
>>>>> - How do I calculate the Z score and r for the above example?
>>>>> - How do I get each statistic from the pairwise.wilcox.test call?
>>>>> 
>>>>> Many Thanks
>>>>> JP
>>>>> 
>>>>> ______________________________________________
>>>>> R-help@r-project.org mailing list
>>>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>>>> PLEASE do read the posting guide 
>>>>> http://www.R-project.org/posting-guide.html
>>>>> and provide commented, minimal, self-contained, reproducible code.
>>>> 
>>>> --
>>>> Peter Dalgaard
>>>> Center for Statistics, Copenhagen Business School
>>>> Solbjerg Plads 3, 2000 Frederiksberg, Denmark
>>>> Phone: (+45)38153501
>>>> Email: pd....@cbs.dk  Priv: pda...@gmail.com
>>>> 
>>>> 
>> 
>> 
>> 
>> 

-- 
Peter Dalgaard
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Email: pd....@cbs.dk  Priv: pda...@gmail.com

