Vijaykumar Muley wrote:
>
> Dear all,
>
> I am Vijaykumar Muley, working as a senior research fellow. By training I am
> a computational biologist without a strong knowledge of statistics. I have
> done some analysis, which is explained as follows.
>
> I have 10340 (X) profiles of binary vectors of the same length (N=845); I
> will call them "gene profiles". For example:
>
>       v1  v2  v3  v4  ...  vN
> a     1   0   1   0        1
> b     0   0   1   0        0
> c     1   0   1   1        1
> d     0   1   1   1        1
> e     0   0   1   1        1
> .     .   .   .   .   ...
> .     .   .   .   .   ...
> .     .   .   .   .   ...
> up to 10340
>
> Then I have some other binary profiles of the same length (N=845); here I
> will call them "expression profiles":
>
>       v1  v2  v3  v4  ...  vN
> f1    1   0   1   0        1
> f2    0   0   1   0        0
> f3    1   0   1   1        1
>
> Now I am comparing profile f1 with all X profiles using the hypergeometric
> distribution function. What I get is the p-value (probability) that the
> similarity between profile f1 and each of the X profiles (i.e. 10340) arises
> by random chance alone.
>
> For example:
>
> #pair       p-value
> f1,a        1e-20
> f1,b        0.01
> .
> .
> up to
> f1,10340    0.05
>
> I am doing the same thing with f2 and f3.
>
> If we arrange this output in a more readable format, it looks like:
>
>       f1      f2      f3
> a     1e-20   0.01    0.10
> b     0.01    1e-9    0.02
> c     1e-3    0.1     0.30
> d     0.03    0.07    1e-5
> e     1e-1    0.01    1e-9
> .     .       .       .
> .     .       .       .
> .     .       .       .
> up to 10340
>
> I hope everyone understands what type of output I am getting.
>
> Now I want to perform a multiple hypothesis comparison (p-value adjustment)
> on this data, so that I get the statistically significant associations
> between the various "expression profiles" and "gene profiles" at a specific
> alpha level.
>
> The most conservative method for p-value adjustment is Bonferroni, and many
> others are less conservative. I don't care which method I use, but the
> problem here is:
>
> according to what parameter should I correct or adjust the p-values?
>
> So, in the case of the Bonferroni correction, should I multiply each p-value
> by 10340 or, as I have compared 3 expression profiles against 10340 gene
> profiles, should I multiply each p-value by 3*10340?
>
> I am asking this for understanding. What I want to do is the following.
> From the above gene p-value table, I want to calculate the false positive
> rate (as a percentage) at each p-value cutoff from 0.0001 to 0.05, so that I
> can use a good cutoff as the significance level (alpha) to exclude the gene
> profiles which are weakly associated with all expression profiles.
> (If I am correct, to do this I need to use other p-value correction methods,
> either simulation-based, resampling, or Benjamini and Hochberg (B&H).)
>
> Please can anyone suggest something about p-value adjustment or p-value
> correction? I mean, statistically or technically, which number should I
> consider for the correction, 10340 or 3 * 10340, as I have three features to
> associate with the same set of 10340 genes? Or, if I am wrong, can anyone
> tell me the protocol I should refer to in order to get a fair number of
> significant associations between genes and expression profiles?
>
> I am using the package "multtest" for p-value adjustment, but I literally do
> not understand whether, for the correction, I should give it the p-values
> for each expression profile alone or all the p-values, i.e. 3*10340.
>
> I have gone through many tutorials and articles on multiple hypothesis
> testing but really couldn't understand exactly what it is.
>
> Please give me some clues; some of you may be actively working on p-value
> adjustment / multiple hypothesis testing, so I expect some suggestions.
>
> I will be grateful for your kind help.
>
> sincerely,
>

Please do NOT reply to a digest when posting to the list; you should start a new thread (or at the very least delete the digest to which you are replying from your email).
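On the 10340 versus 3 * 10340 question in the quoted message: Bonferroni multiplies each p-value by the number of tests in the family being corrected (capped at 1), so the multiplier depends on whether each expression profile is treated as its own family or all three columns are pooled into one. The following is only a minimal sketch in Python (not R, and not the multtest package), using the five example rows from the quoted table; the names `bonferroni`, `table`, `pooled` and `f1_only` are made up for illustration.

```python
def bonferroni(pvalues):
    """Bonferroni adjustment: multiply by the number of tests, cap at 1."""
    m = len(pvalues)
    return [min(1.0, p * m) for p in pvalues]


# Example rows a..e from the quoted table; columns are f1, f2, f3.
table = {
    "a": [1e-20, 0.01, 0.10],
    "b": [0.01, 1e-9, 0.02],
    "c": [1e-3, 0.1, 0.30],
    "d": [0.03, 0.07, 1e-5],
    "e": [1e-1, 0.01, 1e-9],
}

# One family per expression profile: m = number of gene profiles (5 here).
f1_only = [row[0] for row in table.values()]
adj_f1_only = bonferroni(f1_only)

# One family for the whole table: m = 3 * number of gene profiles (15 here).
pooled = [p for row in table.values() for p in row]
adj_pooled = bonferroni(pooled)

print("f1 alone, m =", len(f1_only), adj_f1_only)
print("pooled,   m =", len(pooled), adj_pooled[:3])
```

With the full data the same choice is between m = 10340 and m = 3 * 10340 = 31020; pooling is the more conservative of the two, since every p-value is multiplied by the larger number.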
You may be interested in False Discovery Rate (FDR) methods proposed by Benjamini & Hochberg [1] and various related work/papers/software [2][3].

Neil

[1] Benjamini Y, Hochberg Y (1995) Controlling the false discovery rate: a practical and powerful approach to multiple testing. J. R. Statist. Soc. B 57:289-300
[2] http://genomics.princeton.edu/storeylab/qvalue/

--
View this message in context: http://www.nabble.com/multiple-hypothesis-testing-tp22512331p22557450.html
Sent from the R help mailing list archive at Nabble.com.

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
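For readers who want to see what the Benjamini & Hochberg adjustment in [1] does to a vector of p-values, here is a small sketch of the step-up procedure. It is written in plain Python for illustration only; the function name `benjamini_hochberg` and the toy input are assumptions, not part of the original thread. In R itself the same adjustment is available in base R as `p.adjust(p, method = "BH")`.

```python
def benjamini_hochberg(pvalues):
    """Return BH-adjusted p-values, in the same order as the input."""
    m = len(pvalues)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, keeping the adjusted values monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Toy input (hypothetical values, not taken from the thread).
raw = [0.01, 0.04, 0.03, 0.005]
adj = benjamini_hochberg(raw)
print(adj)

# Tests whose adjusted p-value is at or below alpha are called significant.
alpha = 0.05
significant = [i for i, q in enumerate(adj) if q <= alpha]
print(significant)
```

Whether the input vector holds one column of the table (10340 values) or all three columns pooled (3 * 10340 values) is the same family-definition choice raised in the original question; the procedure simply takes m to be the length of whatever vector it is given.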