On Mon, Nov 9, 2009 at 4:57 AM, Sean Owen wrote:

> Ted will say, and again I agree, that Pearson is not usually the
> best similarity metric, though it is widely mentioned in collaborative
> filtering examples and literature.
>

You said it!  I don't need to.


>  What Ted quotes below is implemented in the framework as
> LogLikelihoodSimilarity. For that, I believe it *is* the pairs with
> the largest resulting similarity score that you do want to keep. Or at
> least it is more reasonable. Ted maybe you can check my thinking on
> that.
>

Yes.  And you don't even need the score in the end, just the fact that it
passed the threshold.  I typically weight the pairing by the IDF (inverse
document frequency) score of the source item.
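
Here is a minimal sketch of that threshold-then-weight idea in Python.  It
is not the framework's LogLikelihoodSimilarity code.  The function names,
the input dictionaries, the threshold value and the exact IDF formula
(log of total users over users who have the source item) are all
assumptions made for illustration.

```python
import math


def x_log_x(x):
    # x * ln(x), with the usual convention that 0 * ln(0) = 0.
    return x * math.log(x) if x > 0 else 0.0


def entropy(*counts):
    # Unnormalized Shannon entropy of a set of counts.
    return x_log_x(sum(counts)) - sum(x_log_x(c) for c in counts)


def llr(k11, k12, k21, k22):
    # Log-likelihood ratio (G^2) for a 2x2 contingency table:
    #   k11 = users with both items, k12 = source only,
    #   k21 = target only,           k22 = neither.
    row = entropy(k11 + k12, k21 + k22)
    col = entropy(k11 + k21, k12 + k22)
    mat = entropy(k11, k12, k21, k22)
    # Clamp tiny negative values caused by floating-point round-off.
    return max(0.0, 2.0 * (row + col - mat))


def idf(num_users, users_with_item):
    # Assumed IDF form; other variants (smoothing, log base) are common.
    return math.log(num_users / users_with_item)


def weighted_pairs(pair_counts, item_counts, num_users, threshold):
    # Keep only pairs whose LLR passes the threshold, then discard the
    # LLR score and weight each kept pair by the IDF of its source item.
    kept = {}
    for (source, target), k11 in pair_counts.items():
        k12 = item_counts[source] - k11
        k21 = item_counts[target] - k11
        k22 = num_users - k11 - k12 - k21
        if llr(k11, k12, k21, k22) >= threshold:
            kept[(source, target)] = idf(num_users, item_counts[source])
    return kept


# Hypothetical toy data: 100 users, three items.
item_counts = {"a": 10, "b": 10, "c": 50}
pair_counts = {("a", "b"): 9, ("a", "c"): 5}
result = weighted_pairs(pair_counts, item_counts, num_users=100, threshold=10.0)
```

In this toy data, "a" and "b" co-occur far more often than chance, so that
pair is kept.  "a" and "c" co-occur exactly as often as independence
predicts, so that pair is dropped.  The threshold of 10.0 is arbitrary and
would need tuning on real data.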



-- 
Ted Dunning, CTO
DeepDyve
