[R] analysis and figure for sign test in setting of high inter-experiment variance

Robin Colgrove Mon, 06 Jun 2005 08:30:35 -0700

Hello all,

Sorry if this is an FAQ. I have been trying to search the archives without success. I have a dataset (ChIP microarray) where the experiment-to-experiment variability is very high but where, within an experiment, the data nearly always go in the "right" (hypothesis-confirming) direction. I am trying to figure out the right way to use R to do the statistical analysis and generate an appropriate figure.

To be specific, we have a virus and a mutant derivative, and the hypothesis is that the wild type virus is specifically suppressing transcription activity in a manner that is abrogated in the mutant.

The experiment is to measure the amount of viral chromatin associated with overall histone (should be the same between wild type and mutant), vs. transcriptionally active chromatin (mutant should be greater than wild type) vs. inactive chromatin (wild type should be greater than mutant).

For each datapoint there are four variables: a histone type (general, active, inactive), a specific gene assayed (four different genes), a virus used for infection (wild type or mutant), and an experiment number (each combination repeated 3-5 times). These are hard experiments to do (involving dissecting out small numbers of cells from a mouse), so the numbers are small, but in each case there are 3-5 pairs of wild type vs. mutant virus for each condition.

If I look at simply whether the hypothesis is confirmed for each condition (whether the wild type/mutant difference goes the way you would expect), then the sign is right 34/35 times, which is way beyond reasonable significance. However, since the inter-experiment variance is so high, if I try to do a simple rank-sum test for a particular chromatin-gene-virus combination (3-5 pairs), the result is usually non-significant (never significant if Bonferroni corrections for multiple tests are applied).
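
For concreteness, the two calculations above can be sketched in R roughly as follows. This is only a sketch under stated assumptions: it treats the 35 pairs as independent (which may not hold when several pairs come from the same experiment), and it assumes 12 per-combination tests (3 histone types x 4 genes) for the Bonferroni threshold. The second part shows the smallest p-value that 3-5 pairs can produce even when every pair agrees in sign.

```r
# Exact sign test on the pooled counts: 34 of 35 pairs in the predicted direction.
# Assumption: the 35 pairs are independent, with P(correct sign) = 0.5 under H0.
pooled <- binom.test(34, 35, p = 0.5, alternative = "greater")
pooled$p.value  # equals (1 + 35) / 2^35, on the order of 1e-9

# Smallest one-sided p-value attainable from n pairs, reached when all n pairs
# agree in sign. It is 0.5^n for both the sign test and the exact Wilcoxon
# signed-rank test.
n_pairs <- 3:5
min_p <- 0.5^n_pairs
names(min_p) <- n_pairs
min_p

# Assumed Bonferroni threshold for 12 per-combination tests (3 histone x 4 genes).
bonferroni_alpha <- 0.05 / 12
min_p < bonferroni_alpha  # FALSE for every n, so no single combination can pass
```

With only 3-5 pairs, a per-combination test cannot reach the corrected threshold no matter how consistent the data are, which is why the pooled sign count looks so much stronger than the individual tests.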


My questions are:

1) Is there a good way within R to do and report a simple sign test on this sort of data (paired samples, non-normal, high inter-experiment variability)?

2) What would be a good way to plot this sort of noisy data (and how to do it in R)? I was thinking of having each (wildtype-mutant) experiment pair as the ends of line segments with different colors or line-types for each gene-chromatin type combination. I know this is a standard kind of plot but I can't figure out how to do it in R.

3) What is the best way to input this data? A 4d array with virus type (wild type or mutant) on one axis, chromatin type (non-specific, active, inactive) on the second, gene (one of four different genes) on the third, and experiment number (1-5) on the last? Is there a good way to do this with data frames?
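
To make questions 2 and 3 concrete, here is one possible sketch of a long-format data frame (one row per measurement) and a paired line-segment plot in base R. The column names (`virus`, `chromatin`, `gene`, `experiment`, `signal`) and all numbers are invented for illustration; they are simulated, not real measurements, and three experiments per combination is an arbitrary choice.

```r
set.seed(1)  # simulated values only, NOT real data

# Long format: one row per measurement, one column per variable.
d <- expand.grid(
  virus      = c("wt", "mutant"),
  chromatin  = c("general", "active", "inactive"),
  gene       = paste0("gene", 1:4),
  experiment = 1:3,
  stringsAsFactors = TRUE
)
# Illustrative signal: a large shift per experiment plus small within-pair noise.
d$signal <- rnorm(3, sd = 2)[d$experiment] + rnorm(nrow(d), sd = 0.3)

# One row per wild type/mutant pair: merge the two virus subsets on the other keys.
keys <- c("chromatin", "gene", "experiment")
wide <- merge(d[d$virus == "wt", c(keys, "signal")],
              d[d$virus == "mutant", c(keys, "signal")],
              by = keys, suffixes = c(".wt", ".mutant"))

# Paired plot: x = 1 is wild type, x = 2 is mutant, one segment per pair,
# colored by chromatin type and with line type set by gene.
plot(NA, xlim = c(0.8, 2.2), ylim = range(d$signal), xaxt = "n",
     xlab = "virus", ylab = "signal")
axis(1, at = 1:2, labels = c("wild type", "mutant"))
segments(1, wide$signal.wt, 2, wide$signal.mutant,
         col = as.integer(wide$chromatin), lty = as.integer(wide$gene))
legend("topright", legend = levels(wide$chromatin), col = 1:3, lty = 1)
```

The wide table also gives the per-pair differences directly (`wide$signal.mutant - wide$signal.wt`), so the sign of each pair can be counted without a 4d array.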

Thanks for any help or pointers to appropriate how-to's. I am really trying to figure this out myself, but as a virologist/bioinformaticist new to R, I still have a lot to learn statistics-wise.


robin colgrove
dept. of microbiology
harvard medical school

______________________________________________
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
