Hi.

Thanks to all those who contributed to the discussion on my original query, attached below. Most of you requested more information, so I'll try to flesh out the problem here.

The actual computation I'm attempting is a genetic linkage disequilibrium (LD) test, with the null hypothesis that no LD exists between two genetic sites (loci), i.e. the alleles (genes) at one locus appear randomly with respect to the second locus. An EM algorithm is used to get the likelihood of the observed data (L1). The likelihood under the assumption of no LD is also calculated (L0), and from these a statistic S = 2*ln(L1/L0) is calculated.

For data with low genetic diversity and large sample sizes, S has approximately a chi-square distribution under the null hypothesis, and the P-value is obtained from that. For more complex cases the alleles at one locus were permuted 1000 times (or more), S was calculated for each permutation, and the P-value for the test was the proportion of permutations that produced a value of S equal to or greater than the original S.

In both cases I bootstrapped the original data 1000 times to get the CIs. This was done for a number of pairs of loci. To illustrate what I meant by "narrow" and "enormous", here are some examples with the P-value, the median of the bootstrap values, and the lower and upper limits of the 95% CI:

Locus pair   P-value    Median     Lower      Upper
1            0.002414   0.002422   0.000055   0.037401
2            0.971621   0.512296   0.181935   0.850761
3            0.000000   0.000000   0.000000   0.000000
4            0.018936   0.016100   0.001082   0.193285
5            0.832001   0.505662   0.173286   0.857703

Judging by the comments on the relationship between p-values and confidence intervals, I have a suspicion that going through the computationally expensive process of bootstrapping is not actually teaching me anything new, but I would like to get some sort of estimate of error associated with the computed p-values. Some mention has been made of the variance being proportional to p(1-p) - is the constant of proportionality easy to compute? Presumably the number of permutations is a factor.

Regards,
Enda

> Hi. I have a query regarding whether it is logical to place a
> confidence interval about a P-value. The computation involved uses a
> permutation method to produce a P-value for a hypothesis test. In an
> effort to check the reliability of this P-value I have been
> bootstrapping the raw data to produce a confidence interval about this
> P-value. Curiously, for significant P-values the CI is in general
> quite narrow, whereas for non-significant values, I get enormous CIs.
> This leads me to the suspicion that there is something flawed about
> the process. Am I correct in my suspicions?
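
On the p(1-p) question above, here is a minimal sketch, assuming the permutation P-value is treated as a binomial proportion estimated from N independent random permutations, in which case the constant of proportionality would be 1/N. The function names (perm_pvalue, mc_standard_error, mc_interval) are made up for illustration, and the statistics passed in are placeholders rather than the EM-based S. Note also that this only describes the Monte Carlo error of the P-value for the data in hand; it is not the same quantity that bootstrapping the raw data measures, and the normal approximation in mc_interval is poor when p is very close to 0 or 1.

```python
import math


def perm_pvalue(observed_s, permuted_s):
    """Proportion of permuted statistics equal to or greater than the observed one."""
    hits = sum(1 for s in permuted_s if s >= observed_s)
    return hits / len(permuted_s)


def mc_standard_error(p, n_perm):
    """Binomial standard error of a P-value estimated from n_perm permutations.

    Assumes the permutations are independent random draws, so that
    Var(p_hat) = p * (1 - p) / n_perm.
    """
    return math.sqrt(p * (1.0 - p) / n_perm)


def mc_interval(p, n_perm, z=1.96):
    """Normal-approximation interval for the Monte Carlo error, clipped to [0, 1]."""
    se = mc_standard_error(p, n_perm)
    return max(0.0, p - z * se), min(1.0, p + z * se)


if __name__ == "__main__":
    # P-value for locus pair 4 in the table, with 1000 permutations.
    p, n_perm = 0.018936, 1000
    print("standard error:", mc_standard_error(p, n_perm))
    print("approx. 95% interval:", mc_interval(p, n_perm))
```

For locus pair 4 (P = 0.018936, N = 1000) this gives a standard error of about 0.004. Under the same assumption, halving it would take roughly four times as many permutations.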
