[R] Weighted Mann-Whitney-Wilcoxon-Test

2014-08-26 Thread Alexander Sommer
On Tuesday, August 19, 2014 9:46 PM Thomas Lumley <tlum...@uw.edu> wrote:

>> Is anyone aware of an(other) implementation in R?
> survey::svyranktest

Oh, that easy. Thanks a lot.

Actually, in my case the weights do not derive from some selection 
probabilities, but your test works anyway.


For completeness (and this time with a working example):

set.seed(seed = 123)
count.x <- NULL
count.y <- NULL
j <- 1
for (i in sample(x = 10:20, size = 20, replace = TRUE)){
 count.x[j] <- sample(x = 0:i, size = 1)
 count.y[j] <- i - count.x[j]
 j <- j + 1
}
# group is stored as a factor so that coin::wilcox_test accepts it
data <- data.frame(x.portion = (count.x/(count.x + count.y)),
   y.portion = (count.y/(count.x + count.y)),
   group = factor(c(rep("A", 12), rep("B", 8))),
   weight = (count.x + count.y)
  )

I first considered the unweighted case with

library(package = survey)
design <- svydesign(ids = ~0, data = data)
svyranktest(formula = x.portion ~ group, design = design)

and compared it to the default Wilcoxon test by

wilcox.test(formula = x.portion ~ group, data = data, exact = FALSE,
            correct = FALSE)

Or, if you prefer

library(package = coin)
wilcox_test(formula = x.portion ~ group, data = data)

The resulting p-values differ; as I understand it, this is due to an
approximation in package /survey/.

Now, finally, the weighted case:

design <- svydesign(ids = ~0, data = data, weights = ~weight)
svyranktest(formula = x.portion ~ group, design)

And, by the way, package /survey/ also seems to be the preferable way if you
want a parametric test. Once again, the unweighted case first:

design <- svydesign(ids = ~0, data = data)
svyttest(formula = x.portion ~ group, design)

And, yet again, the results differ from the default t test

t.test(formula = x.portion ~ group, data = data)

This time, I guess, it is due to the way standard errors are computed.

Finally (this time for real), the weighted case:

design <- svydesign(ids = ~0, data = data, weights = ~weight)
svyttest(x.portion ~ group, design)

Note that function /wtd.t.test/ from package /weights/ depends on the scale of
the weights, whereas /svyttest/ does not.
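
The scale claim for /svyttest/ can be checked along the following lines. This
is only a minimal sketch: the data frame d and its columns y, g, w and w10 are
made up for illustration and are not part of the example above.

```r
library(survey)

# Hypothetical toy data: 12 observations in group A, 8 in group B
set.seed(1)
d <- data.frame(y = rnorm(20),
                g = factor(rep(c("A", "B"), times = c(12, 8))),
                w = sample(x = 10:20, size = 20, replace = TRUE))
d$w10 <- 10 * d$w   # the same weights on a ten times larger scale

t.w   <- svyttest(y ~ g, svydesign(ids = ~0, data = d, weights = ~w))
t.w10 <- svyttest(y ~ g, svydesign(ids = ~0, data = d, weights = ~w10))

# Compare test statistic and p-value under both weight scales
all.equal(unname(t.w$statistic), unname(t.w10$statistic))
all.equal(unname(t.w$p.value), unname(t.w10$p.value))
```

If both comparisons return TRUE, multiplying all weights by a constant has not
changed the test, which is what one would expect from weights that only enter
as relative sizes.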


Thomas, one more time: thank you for your help.

Cheers,

Alex


-- 
Alexander Sommer
wissenschaftlicher Mitarbeiter

Technische Universität Dortmund 
Fakultät Erziehungswissenschaft, Psychologie und Soziologie
Forschungsverbund Deutsches Jugendinstitut/Technische Universität Dortmund
Vogelpothsweg 78
44227 Dortmund

Telefon: +49 231 755-8189
Fax: +49 231 755-6553
E-Mail:  alexander.som...@tu-dortmund.de
WWW: http://www.forschungsverbund.tu-dortmund.de/

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Weighted Mann-Whitney-Wilcoxon-Test

2014-08-19 Thread Alexander Sommer
Hi fellow R-users,

say I have two groups, A and B. Nested within each group are subgroups, and
in each subgroup are objects that take the value x or y on a certain
attribute. So I can compute the portion of x-objects for each subgroup as
#x/(#x + #y).

Artificial example with 12 subgroups in group A and 8 subgroups in group B:

set.seed(123)
count.x <- NULL
count.y <- NULL
j <- 1
for (i in sample(x = 10:20, size = 20, replace = TRUE)){
 count.x[j] <- sample(x = 0:i, size = 1)
 count.y[j] <- i - count.x[j]
 j <- j + 1
}
# group is stored as a factor so that coin::wilcox_test accepts it
data <- data.frame(x.portion = (count.x/(count.x + count.y)),
   y.portion = (count.y/(count.x + count.y)),
   group = factor(c(rep("A", 12), rep("B", 8))),
   weight = (count.x + count.y)
  )

I am now interested in whether there is a difference in the portions of
x-objects between groups A and B, and I consider it a good idea (as in the
example above) to weight by the total number of objects in each subgroup.
Even though the data are not considered a realization of some normal
distribution, a test that uses ranks still does not look like a natural
solution to this problem. But I guess it is possible. Though Xie & Priebe
(2002)* are not exactly aiming at this, their paper might give an idea of
what weighting could look like in the special case of the
Mann/Whitney/Wilcoxon statistic. (Despite the hint by John & Priebe (2007)**
that this “is not a candidate for the practitioner’s toolbox”.)

Anyway, trying to apply the Wilcoxon rank sum test to weighted data, I was
first tempted to replicate each portion by its weight. (Bad idea: data bloat,
ties and probably a number of even worse problems.) Function wilcox_test in
package coin has a weights argument, but

library(coin)
wilcox_test(formula = x.portion ~ group, data = data, weight = ~ weight)

leads to the warning “Rank transformation doesn’t take weights into account”.
Still, the results differ from

wilcox_test(formula = x.portion ~ group, data = data)

and

wilcox.test(formula = x.portion ~ group, data = data)

The code in wilcox_test() and the functions it depends on looks a little
intertwined to me, but I guess it does not do what I am after.
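
The replication idea dismissed above can be sketched in base R as follows, to
make its problems visible. The vectors portion, group and weight are made-up
values for illustration only, not the simulated data from the example.

```r
# Sketch only: expand each subgroup's portion by its weight (number of
# objects), then run the ordinary rank sum test on the expanded data.
portion <- c(0.20, 0.50, 0.75, 0.40, 0.90, 0.60)
group   <- factor(c("A", "A", "A", "B", "B", "B"))
weight  <- c(10, 12, 15, 20, 11, 18)

expanded <- data.frame(portion = rep(portion, times = weight),
                       group   = rep(group,   times = weight))

nrow(expanded)                       # sum(weight) rows instead of 6
anyDuplicated(expanded$portion) > 0  # replication produces ties

# exact = FALSE, because exact p-values are not available with ties
res <- wilcox.test(formula = portion ~ group, data = expanded, exact = FALSE)
res$p.value
```

Besides the bloat and the ties, the test then treats every replicated row as
an independent observation, so the p-value reflects sum(weight) observations
rather than the six subgroups that were actually measured.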

tl;dr: I am looking for a nonparametric alternative to wtd.t.test in package 
weights.

Is anyone aware of an(other) implementation in R?

Cheers,

Alex


PS: In reality, it is a little trickier, as there are more than two values
and maybe even more than two groups. So the Kruskal-Wallis test might be of
interest, but I thought I would keep it simple for the moment.

* Jingdong Xie & Carey E. Priebe: A weighted generalization of the
Mann–Whitney–Wilcoxon statistic. In: Journal of Statistical Planning and
Inference 102 (2), 2002-04-01, pages 441–466.
(http://dx.doi.org/10.1016/S0378-3758(01)00111-2)
** Majnu John & Carey E. Priebe: A data-adaptive methodology for finding an
optimal weighted generalized Mann-Whitney-Wilcoxon statistic. In: Computational
Statistics & Data Analysis 51 (9), 2007-05-15, pages 4337–4353.
(http://dx.doi.org/10.1016/j.csda.2006.06.003)




Re: [R] Weighted Mann-Whitney-Wilcoxon-Test

2014-08-19 Thread Thomas Lumley
On Wed, Aug 20, 2014 at 2:01 AM, Alexander Sommer
<alexander.som...@tu-dortmund.de> wrote:

> tl;dr: I am looking for a nonparametric alternative to wtd.t.test in package
> weights.
>
> Is anyone aware of an(other) implementation in R?


survey::svyranktest

T. Lumley and A.J. Scott (2013). Two-sample rank tests under complex
sampling. Biometrika, 100, 831-842.


   -thomas


-- 
Thomas Lumley
Professor of Biostatistics
University of Auckland
