Re: co-occurrence paper and code

Ted Dunning Wed, 06 Aug 2014 16:58:29 -0700

On Wed, Aug 6, 2014 at 5:49 PM, Dmitriy Lyubimov <[email protected]> wrote:

> On Wed, Aug 6, 2014 at 4:21 PM, Dmitriy Lyubimov <[email protected]>
> wrote:
>
> > I suppose in that context LLR is considered a distance (higher scores mean
> > more `distant` items, co-occurring by chance only)?
> >
>
> Self-correction on this one: having taken another quick look at the LLR
> paper, it looks like it is actually a similarity (higher scores meaning
> more stable co-occurrences, i.e. it moves in the opposite direction of
> the p-value, if it had been a classic test).
>

LLR is a classic test.  It is essentially Pearson's chi^2 test without the
normal approximation.  See my papers[1][2] introducing the test into
computational linguistics (which ultimately brought it into all kinds of
fields including recommendations) and also references for the G^2 test[3].

[1] http://www.aclweb.org/anthology/J93-1003
[2] http://arxiv.org/abs/1207.1847
[3] http://en.wikipedia.org/wiki/G-test
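
For concreteness, here is a minimal sketch of the G^2 statistic on a 2x2
co-occurrence table, following the textbook definition in [3]. The function
name llr_2x2 and the k11..k22 argument names are made up for illustration;
this is not taken from any particular library. Larger values mean stronger
evidence against independence, i.e. a smaller p-value.

```python
import math


def llr_2x2(k11, k12, k21, k22):
    """G^2 (log-likelihood ratio) statistic for a 2x2 contingency table.

    Hypothetical argument layout, chosen for this sketch:
      k11: count of events where A and B occur together
      k12: A without B
      k21: B without A
      k22: neither A nor B
    """
    total = k11 + k12 + k21 + k22
    if total == 0:
        return 0.0

    observed = ((k11, k12), (k21, k22))
    row_sums = (k11 + k12, k21 + k22)
    col_sums = (k11 + k21, k12 + k22)

    g2 = 0.0
    for i in range(2):
        for j in range(2):
            o = observed[i][j]
            if o > 0:  # a zero cell contributes nothing (0 * log 0 -> 0)
                # Expected count under independence of rows and columns.
                e = row_sums[i] * col_sums[j] / total
                g2 += o * math.log(o / e)
    return 2.0 * g2


# Independent counts score zero; perfectly aligned counts score high.
independent = llr_2x2(10, 10, 10, 10)
associated = llr_2x2(100, 0, 0, 100)
```

Under the null hypothesis of independence, the result is asymptotically
chi^2 distributed with one degree of freedom, so it can be compared against
the same critical values as Pearson's statistic.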
