Thanks a lot for the explanation. That makes sense.


srowen wrote:
> 
> You estimate a preference for each of those items, yes, in either
> user-based or item-based recommendation. In item-based recommendation,
> the estimate is a weighted average -- it's the user's preferences for
> various items, weighted by their similarity to the given item.
> 
> In that case you don't need a neighborhood. The items of interest are
> the user's preferred items -- and you want to use all of them, not a
> subset.
> 
> It's not quite symmetrical with user-based recommendation, which is
> based on user similarity. There, you need to constrain yourself to
> examine only a subset of all users, a neighborhood, or else it would
> be wildly inefficient.
> 
> But in item-based recommendation you don't have this issue. *Given an
> item*, you already know the very small number of items it needs to be
> compared to -- the user's preferred items. That takes the place of a
> neighborhood in a sense.
> 
> You could say, well, then the problem is elsewhere: how can
> considering all possible items for recommendation be efficient? If we
> use neighborhoods to get around that in user-based, why not in
> item-based? In fact the algorithm doesn't actually look at every item
> -- it constructs a set of items that are at all connected to any item
> the user prefers, in order to rule out most items that can't possibly
> be recommended.
> 
> In that sense a 'neighborhood' comes into play: the set of all items
> considered is really the union of all maximal neighborhoods around any
> item that the user prefers. That's a big neighborhood, and if this is
> what you mean, you are correct that you could reasonably add
> parameters to constrain that neighborhood.
> 
> The reasons you might not want to do that are:
> 
> 1) Item similarity is often 'fast' in that it is sometimes precomputed
> based on outside information. So sorting through a lot of potential
> items doesn't hurt much.
> 
> 2) It's not part of the canonical item-based algorithm, but that's not
> a great reason.
> 
> 3) Computing this neighborhood gets expensive: it must be defined
> based on distance to all items in the set, not just one. That is, a
> candidate being far from or near to one item doesn't mean anything by
> itself. What matters is how close the candidate is to the whole set.
> By the time you're computing that... you might as well just use the
> canonical algorithm.
> 
> On Sat, Feb 20, 2010 at 11:22 AM, jamborta <[email protected]> wrote:
>>
>> But as far as I understand your implementation, you take user1, then
>> get all the items that the user hasn't rated (getAllOtherItems()), and
>> generate a recommendation for each of these items. Therefore, you have
>> user1 item1, user1 item2, etc. as input, so the neighbourhood can be
>> restricted for each of these items.
>>
>> Tamas
> 
> 
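
To make the weighted average in the quoted explanation concrete, here is a minimal Python sketch. It is not Mahout's code: the names estimate_preference, similarity and SIMILARITIES are hypothetical, the similarity values are made up, and it assumes similarities are positive so that they can be used directly as weights.

```python
# Hypothetical precomputed item-item similarities (made-up values).
# Keys are unordered item pairs, so the lookup is symmetric.
SIMILARITIES = {
    frozenset(("A", "C")): 0.75,
    frozenset(("B", "C")): 0.25,
}


def similarity(item_a, item_b):
    """Return the similarity of two items, or None if it is unknown."""
    return SIMILARITIES.get(frozenset((item_a, item_b)))


def estimate_preference(user_prefs, similarity_fn, target_item):
    """Estimate a preference for target_item as the average of the user's
    existing preferences, weighted by each item's similarity to target_item.

    Every item the user has rated takes part; there is no neighborhood.
    Returns None when no rated item has a known similarity to the target.
    """
    weighted_sum = 0.0
    total_weight = 0.0
    for item, pref in user_prefs.items():
        sim = similarity_fn(item, target_item)
        if sim is None:
            continue  # no similarity available for this pair
        weighted_sum += sim * pref
        total_weight += sim
    if total_weight == 0.0:
        return None
    return weighted_sum / total_weight


# The user rated A with 5.0 and B with 1.0; C is the item being estimated.
user_prefs = {"A": 5.0, "B": 1.0}
estimate = estimate_preference(user_prefs, similarity, "C")
print(estimate)  # (0.75 * 5.0 + 0.25 * 1.0) / (0.75 + 0.25) = 4.0
```

Note that the loop runs over the user's own rated items only, which is the "very small number of items" that takes the place of a neighborhood.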

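The candidate set described in the quoted explanation, items that are at all connected to any item the user prefers, could be sketched as follows in Python. This is an illustration under one assumption that the thread does not confirm: that "connected" means "rated by some other user who also rated one of this user's items". The function name candidate_items and the sample data are hypothetical.

```python
def candidate_items(prefs_by_user, user):
    """Collect items worth scoring for `user`.

    Assumption: an item counts as connected to the user's items when some
    other user has rated both it and at least one item the user rated.
    Items the user has already rated are excluded from the result.
    """
    own_items = set(prefs_by_user[user])
    candidates = set()
    for other_user, prefs in prefs_by_user.items():
        if other_user == user:
            continue
        other_items = set(prefs)
        if own_items & other_items:  # shares at least one item with `user`
            candidates |= other_items
    return candidates - own_items


# Made-up preference data: user -> {item: rating}
prefs_by_user = {
    "u1": {"A": 5.0, "B": 3.0},
    "u2": {"A": 4.0, "C": 2.0},  # shares A with u1, contributes C
    "u3": {"D": 5.0, "E": 1.0},  # shares nothing with u1
    "u4": {"B": 2.0, "D": 4.0},  # shares B with u1, contributes D
}

print(sorted(candidate_items(prefs_by_user, "u1")))  # ['C', 'D']
```

Item E is never scored for u1 because no one who rated it shares an item with u1, which is how most of the catalogue gets ruled out without looking at every item.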
-- 
View this message in context: 
http://old.nabble.com/item-based-recommendation-neighbourhood-size-tp27661482p27666452.html
Sent from the Mahout User List mailing list archive at Nabble.com.
