You estimate a preference for each of those items, yes, in either
user-based or item-based recommendation. In item-based recommendation,
the estimate is a weighted average -- it's the user's preferences for
various items, weighted by their similarity to the given item.
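To make that weighted average concrete, here is a minimal sketch in Python. The function name, the dict of preferences and the similarity lookup are all made up for illustration; this is not the library's actual code, and it assumes similarities are positive weights (non-positive or unknown ones are simply skipped).

```python
def estimate_preference(user_prefs, item, similarity):
    """Estimate a preference for `item` as a similarity-weighted average.

    user_prefs: dict mapping each item the user has rated to a preference value
    similarity: function (item_a, item_b) -> float, or None if unknown
    """
    weighted_sum = 0.0
    total_weight = 0.0
    for rated_item, pref in user_prefs.items():
        sim = similarity(item, rated_item)
        if sim is None or sim <= 0.0:
            continue  # assumption: only positive similarities act as weights
        weighted_sum += sim * pref
        total_weight += sim
    if total_weight == 0.0:
        return None  # nothing comparable, so no estimate
    return weighted_sum / total_weight


# Toy data: the user rated A and B; we estimate a preference for C.
toy_sims = {("C", "A"): 0.5, ("C", "B"): 0.5}

def toy_similarity(a, b):
    return toy_sims.get((a, b))

prefs = {"A": 5.0, "B": 3.0}
estimate = estimate_preference(prefs, "C", toy_similarity)
print(estimate)  # 4.0
```

Note that the loop runs over all of the user's rated items, not over a neighborhood.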

In that case you don't need a neighborhood. The items of interest are
the user's preferred items -- and you want to use all of them, not a
subset.

It's not quite symmetrical with user-based recommendation, which is
based on user similarity. There, you need to constrain yourself to
examine only a subset of all users, a neighborhood, or else it would
be wildly inefficient.

But in item-based recommendation you don't have this issue. *Given an
item*, you already know the very small number of items it needs to be
compared to -- the user's preferred items. That takes the place of a
neighborhood in a sense.

You could say, well, then the problem is elsewhere: how can
considering all possible items for recommendation be efficient? If we
use neighborhoods to get around that in user-based, why not in
item-based? In fact the algorithm doesn't actually look at every item
-- it constructs the set of items that are connected in some way to
any item the user prefers, which rules out most items that can't
possibly be recommended.
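Roughly, that candidate set could be built like the sketch below (Python). The `connected` mapping is a hypothetical stand-in for whatever "connected" means in a given data model, for example items rated together by at least one user; the names are illustrative and not taken from the actual implementation.

```python
def candidate_items(user_prefs, connected):
    """Items connected to at least one item the user prefers, minus rated ones.

    user_prefs: dict mapping each item the user has rated to a preference value
    connected:  dict mapping an item to the set of items it is connected to
    """
    candidates = set()
    for rated_item in user_prefs:
        candidates.update(connected.get(rated_item, ()))
    # The user's own items are not recommendation candidates.
    return candidates - set(user_prefs)


# Toy data: E and F exist in the catalog but are unrelated to what the user rated.
connected = {
    "A": {"B", "C"},
    "B": {"A", "D"},
    "E": {"F"},
}
prefs = {"A": 5.0, "B": 3.0}
candidates = candidate_items(prefs, connected)
print(sorted(candidates))  # ['C', 'D']
```

Items E and F are never examined, which is the point: most of the catalog is ruled out before any preference is estimated.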

In that sense a 'neighborhood' comes into play: the set of all items
considered is really the union of all maximal neighborhoods around any
item that the user prefers. That's a big neighborhood, and if this is
what you mean, you are correct that you could reasonably add
parameters to constrain that neighborhood.

The reasons you might not want to do that are:

1) Item similarity is often 'fast' because it can be precomputed,
sometimes from outside information. So sorting through a lot of
potential items doesn't hurt much.

2) It's not part of the canonical item-based algorithm, but that's not
a great reason.

3) Computing this neighborhood gets expensive: it must be defined
based on distance to all items in the set, not one. That is, a
candidate being far from or near to one item doesn't mean anything by
itself. What matters is how close it is to the whole set. By the time
you're computing that... you might as well just use the canonical
algorithm.
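To illustrate point 3, here is one way such a restricted neighborhood might look (Python). Using the average similarity to the user's items as the measure of "closeness to the whole set" is my assumption for the sketch, and the function names and the parameter `n` are hypothetical.

```python
def closeness_to_set(candidate, user_items, similarity):
    """Average similarity of `candidate` to every item the user prefers."""
    sims = [similarity(candidate, item) for item in user_items]
    return sum(sims) / len(sims) if sims else 0.0


def restricted_neighborhood(candidates, user_items, similarity, n):
    """Keep the n candidates that are closest to the user's whole set."""
    ranked = sorted(
        candidates,
        key=lambda c: closeness_to_set(c, user_items, similarity),
        reverse=True,
    )
    return ranked[:n]


# Toy data: C is very close to A alone, but D is closer to the set {A, B}.
toy_sims = {
    ("C", "A"): 0.9, ("C", "B"): 0.1,
    ("D", "A"): 0.6, ("D", "B"): 0.6,
}

def toy_similarity(a, b):
    return toy_sims.get((a, b), 0.0)

kept = restricted_neighborhood(["C", "D"], ["A", "B"], toy_similarity, 1)
print(kept)  # ['D']
```

Ranking the candidates already costs one similarity lookup per (candidate, user item) pair, which is the same work the weighted-average estimate does.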

On Sat, Feb 20, 2010 at 11:22 AM, jamborta <[email protected]> wrote:
>
> but as far as I understand your implementation you take user1 and then get
> all the items
> that the user hasn't rated (getAllOtherItems()) and generate recommendation
> for each of these items. therefore, you have user1 item1, user1 item2, etc
> as input. so the neighbourhood can be restricted for each of these items.
>
> Tamas
