I've only recently gotten into the recommendation world myself, so I'll take advantage of this thread. Can we rephrase jamborta's idea by saying that an item's neighborhood can be created by putting "around" the item all the items that the same users have rated with similar ratings? That assumption is what should make them similar. Each item is a point in a multi-dimensional space where each dimension is a user and the coordinate along it is that user's rating. In that space you have a notion of neighborhood for items. What you can do now is take the user's items and see what's near them.
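To make the geometric picture concrete, here is a minimal sketch in Python. It is not Mahout code: the toy ratings, the function names, and the choice of cosine similarity over co-rating users are all my own assumptions for illustration.

```python
import math

# Hypothetical toy data, made up for illustration: ratings[user][item] = rating.
ratings = {
    "u1": {"A": 5.0, "B": 4.0, "C": 1.0},
    "u2": {"A": 4.0, "B": 5.0, "C": 2.0},
    "u3": {"A": 1.0, "B": 2.0, "C": 5.0},
}

def item_vectors(ratings):
    """Transpose user -> item ratings so each item becomes a vector over users."""
    vectors = {}
    for user, prefs in ratings.items():
        for item, rating in prefs.items():
            vectors.setdefault(item, {})[user] = rating
    return vectors

def cosine(v, w):
    """Cosine similarity restricted to the users who rated both items."""
    shared = set(v) & set(w)
    if not shared:
        return 0.0
    dot = sum(v[u] * w[u] for u in shared)
    norm_v = math.sqrt(sum(v[u] ** 2 for u in shared))
    norm_w = math.sqrt(sum(w[u] ** 2 for u in shared))
    if norm_v == 0.0 or norm_w == 0.0:
        return 0.0
    return dot / (norm_v * norm_w)

def item_neighborhood(item, vectors, n=2):
    """Return the n items closest to `item`, as (item, similarity) pairs."""
    scored = [
        (other, cosine(vectors[item], vectors[other]))
        for other in vectors
        if other != item
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:n]

vectors = item_vectors(ratings)
print(item_neighborhood("A", vectors, n=1))
```

With this toy data, users who like A also like B and dislike C, so B ends up as A's nearest neighbor. Any other similarity measure could be swapped in for `cosine` without changing the idea.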
This is probably a rephrase of user-based recommendation, though.

Sean Owen wrote:
> You estimate a preference for each of those items, yes, in either
> user-based or item-based recommendation. In item-based recommendation,
> the estimate is a weighted average -- it's the user's preferences for
> various items, weighted by their similarity to the given item.
>
> In that case you don't need a neighborhood. The items of interest are
> the user's preferred items -- and you want to use all of them, not a
> subset.
>
> It's not quite symmetrical with user-based recommendation, which is
> based on user similarity. There, you need to constrain yourself to
> examine only a subset of all users, a neighborhood, or else it would
> be wildly inefficient.
>
> But in item-based recommendation you don't have this issue. *Given an
> item*, you already know the very small number of items it needs to be
> compared to -- the user's preferred items. That takes the place of a
> neighborhood in a sense.
>
> You could say, well, then the problem is elsewhere: how can
> considering all possible items for recommendation be efficient? if we
> use neighborhoods to get around that in user-based, why not
> item-based? In fact the algorithm doesn't actually look at every item
> -- it constructs a set of items that are at all connected to any item
> the user prefers, in order to rule out most items that can't possibly
> be recommended.
>
> In that sense a 'neighborhood' comes into play: the set of all items
> considered is really the union of all maximal neighborhoods around any
> item that the user prefers. That's a big neighborhood, and if this is
> what you mean, you are correct that you could reasonably add
> parameters to constrain that neighborhood.
>
> The reasons maybe you don't want to do that are:
>
> 1) Item similarity is often 'fast' in that it is sometimes precomputed
> based on outside information. So sorting through a lot of potential
> items doesn't hurt much.
>
> 2) It's not part of the canonical item-based algorithm, but that's not
> a great reason.
>
> 3) Computing this neighborhood gets expensive: it must be defined
> based on distance to all items in the set, not one. That is, being far
> from or near to one item doesn't mean anything by itself. It matters
> how close it is to the whole set. By the time you're computing that...
> might as well just use the canonical algorithm.
>
> On Sat, Feb 20, 2010 at 11:22 AM, jamborta <[email protected]> wrote:
>
>> but as far as I understand your implementation you take user1 and then get
>> all the items
>> that the user hasn't rated (getAllOtherItems()) and generate recommendation
>> for each of these items. therefore, you have user1 item1, user1 item2, etc
>> as input. so the neighbourhood can be restricted for each of these items.
>>
>> Tamas
>>
>

--
Claudio Martella
Digital Technologies Unit
Research & Development - Analyst

TIS innovation park
Via Siemens 19 | Siemensstr. 19
39100 Bolzano | 39100 Bozen
Tel. +39 0471 068 123
Fax +39 0471 068 129
[email protected] http://www.tis.bz.it
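
P.S. To check my own understanding of Sean's description, here is how I would sketch the item-based estimate and the candidate set in Python. This is not Mahout's implementation: the toy ratings, the precomputed similarity table, and the names `candidate_items` and `estimate_preference` are hypothetical, and normalizing by the sum of absolute similarities is my assumption.

```python
# Hypothetical toy data, made up for illustration: ratings[user][item] = rating.
ratings = {
    "u1": {"A": 4.0, "B": 2.0},
    "u2": {"A": 5.0, "C": 3.0},
    "u3": {"D": 1.0},
}

# Hypothetical precomputed, symmetric item-item similarities.
similarities = {
    frozenset(("A", "C")): 0.5,
    frozenset(("B", "C")): 1.0,
}

def similarity(item_a, item_b):
    """Look up a precomputed similarity; unknown pairs count as 0."""
    return similarities.get(frozenset((item_a, item_b)), 0.0)

def candidate_items(user, ratings):
    """Items connected to anything the user rated, via users who co-rated it."""
    own = set(ratings[user])
    candidates = set()
    for other_prefs in ratings.values():
        if own & set(other_prefs):
            candidates.update(other_prefs)
    return candidates - own

def estimate_preference(user_prefs, target_item, similarity):
    """Weighted average of the user's ratings, weighted by similarity to the target."""
    weighted_sum = 0.0
    weight_total = 0.0
    for item, rating in user_prefs.items():
        if item == target_item:
            continue
        sim = similarity(item, target_item)
        weighted_sum += sim * rating
        weight_total += abs(sim)
    if weight_total == 0.0:
        return None  # no similar rated items, so no estimate
    return weighted_sum / weight_total

for item in sorted(candidate_items("u1", ratings)):
    print(item, estimate_preference(ratings["u1"], item, similarity))
```

Note that the estimate uses all of the user's rated items, as Sean says, and that item D is never considered for u1 because nothing links it to the items u1 rated.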
