On Wed, Aug 6, 2014 at 5:07 PM, Dmitriy Lyubimov <dlie...@gmail.com> wrote:
> On Wed, Aug 6, 2014 at 5:04 PM, Ted Dunning <ted.dunn...@gmail.com> wrote:
> >
> > On Wed, Aug 6, 2014 at 6:01 PM, Dmitriy Lyubimov <dlie...@gmail.com> wrote:
> > >
> > > > LLR is a classic test.
> > >
> > > What i meant here is that it doesn't produce a p-value. Or does it?
> >
> > It produces an asymptotically chi^2 distributed statistic with 1 degree of
> > freedom (for our case of 2x2 contingency tables) which can be reduced
> > trivially to a p-value in the standard way.
>
> Great. So that means we can do h_0 rejection based on a %-expressed
> significance level?

Yes. You can use LLR (aka G^2) to do hypothesis testing.

But in the context we are using it, we are effectively doing millions or
billions of repeated comparisons. Frequentist testing is hopeless in such
situations, and any p-values that you get will be meaningless (as p-values).
Their only virtue is that they will roughly sort the cases so that the
interesting and anomalous ones come first. The raw score does that just as
well, so computing the p-value is just wasted computation.

A classic hypothesis-testing framework is also somewhat compromised by the
fact that the LLR is only asymptotically chi^2 distributed. For very small
counts, it is not that close to chi^2 distributed, even though it is often
dozens to hundreds of orders of magnitude more accurate than Pearson's
chi^2 test in such cases. This table [1] from my original paper [2] on the
LLR test shows what I mean:

[image: table from [1] comparing LLR and Pearson's chi^2 accuracy at small counts]

For the data that we typically see, np << 1e-3, so the p-values estimated by
conventional tests are really pretty horrible.

Can you say a bit more about what you are trying to do?

[1] https://dl.dropboxusercontent.com/u/36863361/llr-table.png
[2] http://www.aclweb.org/anthology/J93-1003
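To make the "reduced trivially to a p-value" step concrete, here is a small
sketch in plain Python (function names and the example counts are mine, not
from Mahout). It computes G^2 for a 2x2 contingency table and its chi^2
tail probability with 1 degree of freedom, using the identity
P(X > x) = erfc(sqrt(x/2)) for chi^2(1):

```python
import math

def llr_2x2(k11, k12, k21, k22):
    """G^2 (log-likelihood ratio) statistic for a 2x2 contingency table:
    G^2 = 2 * sum_ij k_ij * ln(k_ij * N / (row_i * col_j))."""
    n = k11 + k12 + k21 + k22
    r1, r2 = k11 + k12, k21 + k22          # row totals
    c1, c2 = k11 + k21, k12 + k22          # column totals
    cells = ((k11, r1, c1), (k12, r1, c2),
             (k21, r2, c1), (k22, r2, c2))
    # Zero cells contribute nothing (k * ln k -> 0 as k -> 0).
    return 2.0 * sum(k * math.log(k * n / (r * c))
                     for k, r, c in cells if k > 0)

def chi2_sf_1df(x):
    """Survival function of chi^2 with 1 degree of freedom."""
    return math.erfc(math.sqrt(x / 2.0))

# Hypothetical cooccurrence counts: (A and B, A only, B only, neither).
g2 = llr_2x2(10, 10, 10, 100)
p = chi2_sf_1df(g2)
print(g2, p)
```

For a table with no association at all (all expected counts equal to the
observed counts) the statistic is zero, which is a handy sanity check. Note
that this p-value is exactly the quantity the discussion above argues is
not worth computing at recommendation scale; the raw G^2 score sorts the
cases just as well.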