[R] Weighted Mann-Whitney-Wilcoxon-Test
On Tuesday, August 19, 2014 9:46 PM Thomas Lumley tlum...@uw.edu wrote:

>> Is anyone aware of an(other) implementation in R?
>
> survey::svyranktest

Oh, that easy. Thanks a lot.

Actually, in my case the weights do not derive from selection probabilities, but your test works anyway.

For completeness (and this time with a working example):

  set.seed(seed = 123)
  count.x <- NULL
  count.y <- NULL
  j <- 1
  for (i in sample(x = 10:20, size = 20, replace = TRUE)) {
    count.x[j] <- sample(x = 0:i, size = 1)
    count.y[j] <- i - count.x[j]
    j <- j + 1
  }
  data <- data.frame(x.portion = count.x / (count.x + count.y),
                     y.portion = count.y / (count.x + count.y),
                     group     = c(rep("A", 12), rep("B", 8)),
                     weight    = count.x + count.y)

I first considered the unweighted case with

  library(package = survey)
  design <- svydesign(ids = ~0, data = data)
  svyranktest(formula = x.portion ~ group, design = design)

and compared it to the default Wilcoxon test:

  wilcox.test(formula = x.portion ~ group, data = data, exact = FALSE, correct = FALSE)

Or, if you prefer,

  library(package = coin)
  wilcox_test(formula = x.portion ~ group, data = data)

The resulting p-values differ; as I understand it, this is due to an approximation in package /survey/.

Now, finally, the weighted case:

  design <- svydesign(ids = ~0, data = data, weights = ~weight)
  svyranktest(formula = x.portion ~ group, design)

And, by the way, package /survey/ also seems to be the preferable route if you want a parametric test. Once again, the unweighted case first:

  design <- svydesign(ids = ~0, data = data)
  svyttest(formula = x.portion ~ group, design)

And, yet again, the results differ from the default t test:

  t.test(formula = x.portion ~ group, data = data)

This time, I guess, the difference is due to the way the standard errors are computed.

Finally (this time for real), the weighted case:

  design <- svydesign(ids = ~0, data = data, weights = ~weight)
  svyttest(x.portion ~ group, design)

Note that the result of function /wtd.t.test/ from package /weights/ depends on the scale of the weights, whereas that of /svyttest/ does not.

Thomas, one more time: thank you for your help.

Cheers,

Alex

--
Alexander Sommer
wissenschaftlicher Mitarbeiter
Technische Universität Dortmund
Fakultät Erziehungswissenschaft, Psychologie und Soziologie
Forschungsverbund Deutsches Jugendinstitut/Technische Universität Dortmund
Vogelpothsweg 78
44227 Dortmund
Telefon: +49 231 755-8189
Fax: +49 231 755-6553
E-Mail: alexander.som...@tu-dortmund.de
WWW: http://www.forschungsverbund.tu-dortmund.de/

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
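The remark that /svyttest/ does not depend on the scale of the weights can be checked directly. The following is only a minimal sketch, assuming package survey is installed (the check is skipped otherwise). The column weight10, the factor 10, and the names p_orig and p_scaled are illustrative and do not appear in the thread; the data are rebuilt with the same construction as in the message above, written without the loop.

```r
# Rebuild the example data (same seed and sampling scheme, vectorized).
set.seed(123)
totals  <- sample(10:20, size = 20, replace = TRUE)
count.x <- vapply(totals, function(n) sample(0:n, size = 1), integer(1))
data <- data.frame(x.portion = count.x / totals,
                   group     = factor(rep(c("A", "B"), times = c(12, 8))),
                   weight    = totals)

# Illustrative assumption: the same weights, multiplied by a constant.
data$weight10 <- 10 * data$weight

p_orig   <- NULL
p_scaled <- NULL
if (requireNamespace("survey", quietly = TRUE)) {
  library(survey)
  d_orig   <- svydesign(ids = ~0, data = data, weights = ~weight)
  d_scaled <- svydesign(ids = ~0, data = data, weights = ~weight10)
  # Extract the p-values of the design-based t test under both scalings.
  p_orig   <- unname(svyttest(x.portion ~ group, d_orig)$p.value)
  p_scaled <- unname(svyttest(x.portion ~ group, d_scaled)$p.value)
  print(c(original = p_orig, scaled = p_scaled))
}
```

If the two printed p-values agree, that is consistent with the statement above; the sketch says nothing about /wtd.t.test/, which is not called here.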
[R] Weighted Mann-Whitney-Wilcoxon-Test
Hi fellow R-users,

well, say I have two groups, A and B. Nested within each group are subgroups, and in each subgroup are objects that take the value x or y on a certain attribute. So I can compute the portion of x-objects for each subgroup as #x/(#x + #y).

Artificial example with 12 subgroups in group A and 8 subgroups in group B:

  set.seed(123)
  count.x <- NULL
  count.y <- NULL
  j <- 1
  for (i in sample(x = 10:20, size = 20, replace = TRUE)) {
    count.x[j] <- sample(x = 0:i, size = 1)
    count.y[j] <- i - count.x[j]
    j <- j + 1
  }
  data <- data.frame(x.portion = count.x / (count.x + count.y),
                     y.portion = count.y / (count.x + count.y),
                     group     = c(rep("A", 12), rep("B", 8)),
                     weight    = count.x + count.y)

I am now interested in whether or not there is a difference in the portions of x-objects between groups A and B, and I consider it a good idea (as seen in the example above) to weight by the total number of objects in each subgroup.

Even for data that cannot be considered a realization of some normal distribution, a test that uses ranks does not look like a natural solution to this problem. But I guess it is possible. Although Xie & Priebe (2002)* are not exactly aiming at this, their paper might give an idea of what weighting may look like in the special case of the Mann/Whitney/Wilcoxon statistic. (Despite the hint by John & Priebe (2007)** that this “is not a candidate for the practitioner’s toolbox”.)

Anyway, trying to apply the Wilcoxon rank sum test to weighted data, I was first tempted to replicate each portion by its weight. (Bad idea: data bloat, ties, and probably a number of even worse problems.)

Function wilcox_test in package coin has a weights argument, but

  library(coin)
  wilcox_test(formula = x.portion ~ group, data = data, weights = ~ weight)

leads to the warning “Rank transformation doesn’t take weights into account”. Still, the results differ from those of

  wilcox_test(formula = x.portion ~ group, data = data)

and

  wilcox.test(formula = x.portion ~ group, data = data)

The code in wilcox_test() and the functions it depends on looks rather intertwined to me, but I guess it does not do what I am after.

tl;dr: I am looking for a nonparametric alternative to wtd.t.test in package weights. Is anyone aware of an(other) implementation in R?

Cheers,

Alex

PS: In the real data it is a little bit trickier, as there are more than two values and maybe even more than two groups. So a Kruskal-Wallis test might be of interest, but I thought I would keep it simple for the moment.

* Jingdong Xie & Carey E. Priebe: A weighted generalization of the Mann–Whitney–Wilcoxon statistic. In: Journal of Statistical Planning and Inference 102 (2), 2002-04-01, pages 441–466. (http://dx.doi.org/10.1016/S0378-3758(01)00111-2.)

** Majnu John & Carey E. Priebe: A data-adaptive methodology for finding an optimal weighted generalized Mann-Whitney-Wilcoxon statistic. In: Computational Statistics & Data Analysis 51 (9), 2007-05-15, pages 4337–4353. (http://dx.doi.org/10.1016/j.csda.2006.06.003.)
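To make the "replicate each portion by its weight" idea from the message above concrete, here is a minimal base-R sketch of what it does to the data. The names bloated, n_original, n_bloated and has_ties are illustrative and not from the thread; the data are rebuilt with the same construction as in the message, written without the loop.

```r
# Rebuild the example data (same seed and sampling scheme, vectorized).
set.seed(123)
totals  <- sample(10:20, size = 20, replace = TRUE)
count.x <- vapply(totals, function(n) sample(0:n, size = 1), integer(1))
data <- data.frame(x.portion = count.x / totals,
                   group     = factor(rep(c("A", "B"), times = c(12, 8))),
                   weight    = totals)

# Naive weighting: repeat every subgroup row as often as its weight says.
bloated <- data[rep(seq_len(nrow(data)), times = data$weight), ]

n_original <- nrow(data)      # 20 subgroups
n_bloated  <- nrow(bloated)   # one row per object, i.e. sum of the weights
has_ties   <- anyDuplicated(bloated$x.portion) > 0

print(c(n_original = n_original, n_bloated = n_bloated))
print(has_ties)

# The rank sum test now treats every replicated row as an independent
# observation; exact = FALSE because exact p-values are unavailable with ties.
print(wilcox.test(x.portion ~ group, data = bloated, exact = FALSE))
```

Every value is tied with its own copies, and the test is computed as if there were sum(weight) independent observations rather than 20, which is why the message calls this a bad idea.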
Re: [R] Weighted Mann-Whitney-Wilcoxon-Test
On Wed, Aug 20, 2014 at 2:01 AM, Alexander Sommer alexander.som...@tu-dortmund.de wrote:

> tl;dr: I am looking for a nonparametric alternative to wtd.t.test in package weights. Is anyone aware of an(other) implementation in R?

survey::svyranktest

T. Lumley and A.J. Scott (2013). Two-sample rank tests under complex sampling. Biometrika, 100, 831-842.

-thomas

--
Thomas Lumley
Professor of Biostatistics
University of Auckland