Yes, you are right; it seems counter-intuitive at first. I would
argue it is not so counter-intuitive, however.

A similarity of -1 is low, as low as possible. But the fact that the
two items have any similarity at all is significant. It means there
are a number of users who have rated both items, although they have
rated them quite differently. Note that most pairs of items have no
similarity whatsoever.
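To see where a -1 comes from, consider a Pearson-style item similarity: two items that the same users rated in exactly opposite directions correlate at -1. A rough sketch (the `pearson` helper here is my own illustration, not Mahout's implementation):

```python
import math

def pearson(xs, ys):
    # Pearson correlation over the co-rating vectors of two items.
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# Three users rated both items, but in opposite directions:
print(pearson([1, 3, 5], [5, 3, 1]))  # close to -1.0
```

So a -1 only arises when there actually are co-rating users, which is the sense in which it carries information.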

So a similarity of -1 is still in a way significant. The result is
less counter-intuitive when you think of it this way.
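For reference, the cancellation in your extreme case is easy to reproduce with the plain weighted average from section 3.2.1. A minimal sketch (the `predict` helper is my own illustration, not Mahout code):

```python
def predict(ratings, sims):
    # Plain weighted average from section 3.2.1:
    #   P_{u,i} = sum(s_j * r_j) / sum(s_j)
    num = sum(s * r for s, r in zip(sims, ratings))
    den = sum(sims)
    return num / den

# Ten items all rated 5, every similarity to the target close to -1:
# the similarity factors cancel and the prediction is still 5.0.
print(predict([5.0] * 10, [-0.99] * 10))
```

With all similarities equal, they divide out entirely, which is exactly why the predicted rating stays at 5 no matter how negative they are.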

On Thu, Feb 11, 2010 at 8:35 PM, Guohua Hao <[email protected]> wrote:
> Hello Sean,
>
> First, I like your tweaks there.
>
> Based on your example, I came up with a new extreme case, which may cause
> some trouble. Suppose the user u has rated several items (e.g., 10 items)
> all with rating 5, and we want to predict user u's rating for item i, P_{u,
> i}. If item i's similarities with all those already-rated items are the
> same and very close to -1, we are still going to get P_{u,i} = 5,
> because those similarity factors will cancel out. However, this is
> still counter-intuitive, since we expect P_{u,i} to be very close to 1 (in
> the 1-5 rating range) with more confidence.
>
> Shall we consider this case in the code?
>
> Thanks,
> Guohua
>
> On Wed, Feb 10, 2010 at 6:13 PM, Sean Owen <[email protected]> wrote:
>
>> Yes, great point. It's bad if there's only one item the user has
>> rated that has any similarity to the item being predicted. Even
>> according to the 'corrected' formula, the similarity value doesn't
>> matter; it cancels out. That leads to the counter-intuitive
>> possibility you highlight.
>>
>> For that reason GenericItemBasedRecommender won't make a prediction in
>> this situation. You could argue it's a hack, but I feel the result
>> should simply be undefined here.
>>
>> You could certainly throw out 3.2.1 entirely and think up something
>> better, though I think with the two tweaks I've described here, its
>> core logic is simple and remains sound.
>>
>> Sean
>>
>>
>> On Thu, Feb 11, 2010 at 12:04 AM, Guohua Hao <[email protected]> wrote:
>> > I think you brought up a good point as to dealing with negative
>> > similarities, which I had not realized before. Here is my other thought.
>> > Based on your example and the proposed method, we will get a predicted
>> > rating of 5 in such a case after normalization. This seems
>> > counter-intuitive to me: since we know that these two items are very
>> > dissimilar (actually oppositely correlated), a predicted rating close
>> > to 1 would be more intuitive to me. Maybe we need to think more about
>> > the expression in section 3.2.1 of that paper.
>>
>
