Hi all,
Newbie question.
I've written a very simple script to process gene expression data
from MySQL output.
One of the things I'm trying to do is feed about 17,000 vectors that I
generate in this script into a Wilcoxon test through rpy2 and
collect the p-value as output.
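A minimal sketch of this kind of call, assuming rpy2's `robjects` interface and R's `wilcox.test`. The function names `wilcoxon_p_value` and `extract_p_value` are hypothetical, and this is not necessarily how the script described here does it. It reads the p-value from the `p.value` element of the result object instead of from the printed text:

```python
def extract_p_value(result):
    """Pull the p-value out of the list-like object returned by wilcox.test."""
    # rx2 is rpy2's equivalent of R's [[ ]] extraction by name;
    # the extracted element is a length-1 numeric vector.
    return float(result.rx2("p.value")[0])


def wilcoxon_p_value(x, y):
    """Run R's wilcox.test on two numeric sequences and return the p-value."""
    # Imported here so the helper above can be used without R installed;
    # this function itself needs both rpy2 and a working R.
    import rpy2.robjects as ro

    wilcox_test = ro.r["wilcox.test"]
    result = wilcox_test(ro.FloatVector(x), ro.FloatVector(y))
    return extract_p_value(result)
```

With R available, `wilcoxon_p_value([1, 2, 3, 4], [5, 6, 7, 9])` would return a plain Python float, so nothing depends on how long the printed report is.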
Some of the vectors are quite long. I begin to see truncated return
values, e.g.:

    'Wilcoxon rank sum test with'

or

    'data: (3, 3, 3, 3, 3, 3, 3, 3,'

once the vector length hits 1500, and truncated strings like these are
returned every time once the vector length reaches around 4000. These
are of course the beginning of the R output, which normally looks like
this:

    Wilcoxon rank sum test with continuity correction
    data: t(a[1]) and t(a[3])
    W = 3052972, p-value < 2.2e-16
    alternative hypothesis: true location shift is not equal to 0

which makes me believe that the test has actually been run.
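To tell complete output from truncated output on the Python side, one option is to check whether the returned text contains a p-value line at all. A rough sketch using only the standard library; `parse_p_value` is a hypothetical helper name, and the regular expression assumes the `p-value < ...` / `p-value = ...` layout shown in the output above:

```python
import re

# Matches the p-value part of R's printed test output, e.g. "p-value < 2.2e-16".
P_VALUE_RE = re.compile(r"p-value\s*([<=])\s*([0-9.eE+-]+)")


def parse_p_value(report):
    """Return (comparison, value) from printed R output, or None if no p-value is found."""
    match = P_VALUE_RE.search(report)
    if match is None:
        return None  # no p-value line, e.g. because the output was truncated
    return match.group(1), float(match.group(2))


full_report = """Wilcoxon rank sum test with continuity correction
data: t(a[1]) and t(a[3])
W = 3052972, p-value < 2.2e-16
alternative hypothesis: true location shift is not equal to 0"""

print(parse_p_value(full_report))                    # complete report
print(parse_p_value("Wilcoxon rank sum test with"))  # truncated report
```

The comparison sign is kept because a report such as `p-value < 2.2e-16` gives only an upper bound, not the exact value.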
I've checked all my inputs and everything is peachy on the Python
side. Given the high correlation of the bug with the vector length, I
think the most likely explanations are the following:
(1) the vector is too long, and this causes some breakdown when R and
Python are talking to each other;
(2) the test takes too long, and I need to somehow slow the script down to
wait for the full output.
Any ideas or help would be appreciated.
Joel