I am pretty sure that this is easily done, but I haven't had much time to work up a definitive answer.
In particular, we had really excellent results back in my academic days using log-likelihood ratio (LLR) tests on genetic systems, where I was looking for associations between different points in the genome. The idea was to look at the distribution of bases at points A and B across a bunch of sequences. That gave a bunch of 4x4 tables. LLR on these gave a very sensitive and informative signal for structure correlations.

I am pretty sure that similar methods could apply to ratings, since the question is whether the ratings on two items are correlated across all of the users who have rated either item. This would give you a 6x6 table for 5 rating levels (NA is a valid rating, of course).

One issue is that you lose the ability to take the square root and apply a sign like you can with the 2x2 tables. That makes the interpretation of the score a bit trickier.

On Tue, Jun 23, 2009 at 6:01 PM, Sean Owen <[email protected]> wrote:

> I don't see how mutual information is applied to this problem?
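
For concreteness, here is a rough sketch of the 6x6 idea from my reply above. It is untested against real rating data and makes some assumptions that are mine, not settled: ratings are integers 1 to 5, each item's ratings live in a dict keyed by user, a missing entry counts as NA, and the names `contingency_table` and `llr` are just made up for illustration. The score is the usual G statistic, 2 * sum(k_ij * ln(k_ij * N / (row_i * col_j))), which is also 2 * N times the mutual information of the table.

```python
import math

# Assumed rating levels: None stands for "not rated" (NA), then 1..5.
LEVELS = [None, 1, 2, 3, 4, 5]


def contingency_table(ratings_a, ratings_b, levels=LEVELS):
    """Count users by (rating on item A, rating on item B).

    ratings_a and ratings_b are hypothetical dicts mapping user -> rating.
    Only users who rated at least one of the two items are counted, so the
    (NA, NA) cell is always zero.
    """
    index = {level: i for i, level in enumerate(levels)}
    table = [[0] * len(levels) for _ in levels]
    for user in set(ratings_a) | set(ratings_b):
        i = index[ratings_a.get(user)]  # .get returns None (NA) if unrated
        j = index[ratings_b.get(user)]
        table[i][j] += 1
    return table


def llr(table):
    """Log-likelihood ratio (G statistic) for a rectangular count table."""
    total = sum(sum(row) for row in table)
    if total == 0:
        return 0.0
    row_sums = [sum(row) for row in table]
    col_sums = [sum(col) for col in zip(*table)]
    score = 0.0
    for i, row in enumerate(table):
        for j, k in enumerate(row):
            if k > 0:  # empty cells contribute nothing (0 * log 0 = 0)
                score += k * math.log(k * total / (row_sums[i] * col_sums[j]))
    return 2.0 * score


if __name__ == "__main__":
    item_a = {"u1": 5, "u2": 4, "u3": 1}
    item_b = {"u1": 5, "u3": 2, "u4": 3}
    print(llr(contingency_table(item_a, item_b)))
```

The score is never negative, which is exactly the interpretation problem mentioned above: for anything bigger than 2x2 it tells you how strongly the two items are associated but not in which direction.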
