Thanks Sean. That makes sense.
 
-- Young



At 2010-08-31 00:17:08,"Sean Owen" <[email protected]> wrote:

>You can do whatever you like if it works for you, but this sounds
>wrong to me. Yes you got more recommendations, but are those last
>recommendations actually good ones? The algorithm may be "telling you"
>there's not enough information to be sure about recommending many
>items.
>
>A neighborhood of hundreds of users is very large. It's such a crowd
>that much of the neighborhood is undoubtedly "far" from the user. Yes,
>those are the nearest 1000 users, but perhaps 20 of them are really
>similar and the other 980 are introducing more and more noise into
>the computation.
>
>I would actually suggest you use a threshold-based neighborhood
>definition. The cutoff value depends on your similarity metric. If you
>use Pearson... maybe 0.5 or so?
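
To make the difference between the two neighborhood definitions concrete, here is a minimal sketch in plain Python. It does not use Mahout; the function names (`pearson`, `nearest_n`, `threshold_neighborhood`) and the toy ratings are hypothetical and exist only for illustration. The 0.5 cutoff is the value suggested above, not a tuned one.

```python
from math import sqrt

def pearson(a, b):
    """Pearson correlation over co-rated items; None when it is undefined."""
    common = sorted(set(a) & set(b))
    if len(common) < 2:
        return None
    xs = [a[i] for i in common]
    ys = [b[i] for i in common]
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return None
    return cov / sqrt(vx * vy)

def similarities(user, prefs):
    """Similarity of every other user to `user`, skipping undefined values."""
    sims = {}
    for other in prefs:
        if other == user:
            continue
        s = pearson(prefs[user], prefs[other])
        if s is not None:
            sims[other] = s
    return sims

def nearest_n(user, prefs, n):
    """The n most similar users, however dissimilar the last ones are."""
    sims = similarities(user, prefs)
    return sorted(sims, key=sims.get, reverse=True)[:n]

def threshold_neighborhood(user, prefs, threshold):
    """Only users whose similarity reaches the cutoff."""
    sims = similarities(user, prefs)
    return sorted(u for u, s in sims.items() if s >= threshold)

# Hypothetical toy ratings: user -> {item: rating}
prefs = {
    "me":       {"A": 5.0, "B": 3.0, "C": 1.0},
    "close":    {"A": 4.0, "B": 3.0, "C": 2.0},  # correlation  1.0
    "mixed":    {"A": 2.0, "B": 5.0, "C": 2.0},  # correlation  0.0
    "opposite": {"A": 1.0, "B": 3.0, "C": 5.0},  # correlation -1.0
}

print(nearest_n("me", prefs, 3))                 # ['close', 'mixed', 'opposite']
print(threshold_neighborhood("me", prefs, 0.5))  # ['close']
```

With a fixed size of 3, the uncorrelated and the negatively correlated user both end up in the neighborhood; with the cutoff, only the genuinely similar user remains.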
>
>Yes, you may get fewer recommendations, but maybe that's good.
>
>(Another plug: if you are interested in this tradeoff, and evaluating
>metrics and such, this is all written up pretty thoroughly in Mahout
>in Action: http://manning.com/owen/)
>
>2010/8/30 Young <[email protected]>:
>> Hi Sean,
>> Thanks. When I increase the neighborhood size to 1000, the two recommenders
>> have 80 items in common when each returns 500 recommendations. That seems
>> reasonable and acceptable.
>>
>> -- Young
>>
>>
>>
>>
>> At 2010-08-30 23:55:15,"Sean Owen" <[email protected]> wrote:
>>
>>>That result is quite possible. For example, with a user-based
>>>recommender, the only items that can possibly be recommended are those
>>>in the user's neighborhood. If the neighborhood is small, it's
>>>possible that only 23 unique items exist among users in that
>>>neighborhood. You can never get more recommendations than this.
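
The ceiling described above can be sketched in a few lines of plain Python. This is only an illustration of the idea, not Mahout's implementation; `candidate_items` and the toy data are hypothetical.

```python
def candidate_items(user, neighbors, prefs):
    """Items rated by at least one neighbor that the user has not rated yet."""
    seen = set(prefs[user])
    candidates = set()
    for neighbor in neighbors:
        candidates |= set(prefs[neighbor])
    return candidates - seen

# Hypothetical toy ratings: user -> {item: rating}
prefs = {
    "u1": {"A": 5.0, "B": 3.0},
    "u2": {"A": 4.0, "C": 2.0},
    "u3": {"B": 4.0, "C": 5.0, "D": 1.0},
    "u4": {"E": 5.0, "F": 4.0},  # not in u1's neighborhood
}

pool = candidate_items("u1", ["u2", "u3"], prefs)
print(sorted(pool))  # ['C', 'D']
```

Asking for 500 recommendations here can return at most 2 items, because E and F are only rated by a user outside the neighborhood.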
>>>
>>>I don't think this result is "bad" per se, but if you want to try to
>>>get more recommendations, you really need more 'dense' data. Or,
>>>another algorithm may have different properties that are more
>>>desirable to you. Try SlopeOneRecommender.
>>>
>>>2010/8/30 Young <[email protected]>:
>>>> Hi all,
>>>> Using the 1M GroupLens data set, I tried a user-based recommender and an
>>>> item-based recommender to generate recommendations for the same user, but
>>>> the results vary a great deal. There are 4302 items in dataModel. For user
>>>> 3 or user 8, when each recommender returns 500 recommended items, only 23
>>>> items are in common.
>>>> In the item-based recommender, I use PearsonCorrelationSimilarity.
>>>> In the user-based recommender, I use NearestNUserNeighborhood (size 100)
>>>> with PearsonCorrelationSimilarity.
>>>> Is this result acceptable? If not, what should I do to improve the
>>>> situation?
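
For reference, the "items in common" figure can be computed as a simple set intersection. The sketch below is plain Python with a hypothetical `overlap` helper; the two ID lists are synthetic stand-ins arranged to share 23 items, not the actual recommender output.

```python
def overlap(recs_a, recs_b):
    """Return (number of shared items, share relative to the shorter list)."""
    a, b = set(recs_a), set(recs_b)
    common = len(a & b)
    smaller = min(len(a), len(b))
    return common, (common / smaller if smaller else 0.0)

# Synthetic stand-ins for two top-500 lists that share 23 item IDs
user_based = list(range(0, 500))
item_based = list(range(477, 977))

count, fraction = overlap(user_based, item_based)
print(count, fraction)  # 23 0.046
```

So 23 shared items out of 500 is an overlap of about 4.6%.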
>>>>
>>>> Thank you very much.
>>>>
>>>> -- Young
>>>>
>>
