On Wed, Oct 1, 2008 at 3:49 PM, Otis Gospodnetic <[EMAIL PROTECTED]> wrote:
> If I understand the article correctly, they go through several possible
> approaches to building REs - user-item CF (too expensive), user clustering
> (low quality), search-based (low quality) - and then present their novel
> item-to-item CF as something that doesn't have performance/scale/quality
> problems. I think they said it works as my example above, but it doesn't
> make sense to me - by doing the above all I'm doing is looking at
> similarity between items that Joe already consumed, and I'm really after
> *new* things to recommend.
I have not seen this before. I think I just read somewhere that the
item-based system I outlined (as does Collective Intelligence) is "what
Amazon does", and that could be wrong.

They first say, basically: compute only as much of the item-item similarity
matrix as makes sense -- if nobody ever bought X and Y together, don't
bother computing that entry. OK. As far as my stuff goes, similarities are
not all precomputed, in the interest of time, but are cached as much as
memory permits. Taking it to an extreme and having a system that can cache
it all -- yeah, that's going to help speed!

It says: "Given a similar-items table, the algorithm finds items similar to
each of the user's purchases and ratings, aggregates those items, and then
recommends the most popular or correlated items." It's not clear exactly
what this means: whether the top recommendations are merely items that are
most similar to *something* you bought/rated, most similar *on average* to
everything you bought/rated, or a weighted average like we were discussing.
I suspect it has to be the last, for two reasons:

1) If I hated something, I don't want to recommend things like it! So the
   rating seems to matter.
2) Taking the rating into account does not appreciably slow anything down.

>> > Is this interpretation correct? If it is, I don't quite yet see the
>> > benefit of computing similarity between items for a single person. Of
>> > course they will be similar, unless we want to create groups of items
>> > for a single person with a diverse set of interests for some reason.

Oh, no -- I think they are saying they compute similarity between all pairs
of items in the catalog; well, all pairs that actually co-occur in some
user's purchase/rating history.

> In my case I'm interested in working with binary preferences, though, so
> if I understand things correctly weighted average doesn't really make
> sense in that context. Is this correct?

Yeah -- er, it still works to an extent, but not really meaningfully.
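To make the "weighted average" reading concrete, here's a minimal sketch of
item-based recommendation where each candidate's score is a rating-weighted
average of its similarity to what the user already rated. All names here
(recommend, user_prefs, sim) are my own illustration, not Amazon's or PCI's
actual code:

```python
from collections import defaultdict

def recommend(user_prefs, sim, top_n=5):
    """user_prefs: {item: rating}; sim: {item: {neighbor: similarity}}."""
    scores = defaultdict(float)
    weights = defaultdict(float)
    for bought, rating in user_prefs.items():
        for other, s in sim.get(bought, {}).items():
            if other in user_prefs:
                continue  # don't re-recommend items the user already has
            scores[other] += s * rating  # a hated item drags its neighbors down
            weights[other] += abs(s)
    ranked = sorted(scores, key=lambda i: scores[i] / weights[i], reverse=True)
    return ranked[:top_n]
```

Note how a 1-star rating on an item pushes everything similar to it down the
ranking -- the point of reason 1) above -- at no extra asymptotic cost.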
It seems that other approaches are more appropriate and faster in this
special case. This I haven't dug into much. CI covers it reasonably well.
You or I could sit down and adapt the common approaches to this special
case in any number of clever ways, I'm sure.

> This sounds exactly like what's described on pages 22-25 in Programming
> Collective Intelligence. My understanding was that what's in PCI is "an
> old known approach" whereas what is in that article about Amazon is a
> "new, different, and patented approach". Are they both actually talking
> about the same algo? i.e. Amazon article is the same as what's on p22-25
> in PCI?

This I don't know. My impression was that they're basically doing plain ol'
item-based recommendation -- with a heap of tweaks and heuristics and so on
internally, no doubt, that they aren't going to publish. For example, one
area where one can add a lot of proprietary value is translating user
actions (purchases, ratings, views) into the "right" inferred preference
value.

> OK, so the summary is that you still have to compute an item1-item2
> similarity matrix where item1 comes from the set of items consumed by Joe
> and item2 comes from the set of items I want to consider for
> recommendation (i.e. it can be only a subset of all items, e.g. only new
> items since the last time Joe was recommended some items 3 days ago).
>
> If my assumption is right that computing a weighted average doesn't make
> sense in the context of binary preferences, then the above is really it -
> it all boils down to computing an item1-item2 matrix based on item-item
> similarity. While doing that we try to make that item1-item2 matrix as
> small as possible and try to compute it ahead of time, so that at the
> "give Joe N recommendations" time we can simply take the topN from that
> item1-item2 matrix. Is this correct or am I missing some pieces of the
> puzzle?

Yeah, that is my understanding. They build as much of the complete
item-item matrix as makes sense ahead of time.
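For the binary-preference special case, one common adaptation is to build
the item-item matrix only over pairs that actually co-occur in some user's
history (the "don't bother computing it" point above), with something like
Jaccard similarity instead of a rating correlation. A rough sketch, with
names of my own invention:

```python
from collections import defaultdict
from itertools import combinations

def item_item_jaccard(baskets):
    """baskets: iterable of sets of items, one set per user (binary prefs)."""
    item_counts = defaultdict(int)
    pair_counts = defaultdict(int)
    for basket in baskets:
        for item in basket:
            item_counts[item] += 1
        # only pairs that actually co-occur ever get a matrix entry
        for a, b in combinations(sorted(basket), 2):
            pair_counts[(a, b)] += 1
    sims = {}
    for (a, b), both in pair_counts.items():
        union = item_counts[a] + item_counts[b] - both
        sims[(a, b)] = both / union  # Jaccard: |A and B| / |A or B|
    return sims
```

The resulting dict is exactly the sparse, precomputable item1-item2 matrix
being described: at recommendation time you just look up and rank entries.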
The only issue with that is, well, you can't just go update it every time a
pref changes. So either you don't immediately respond to new ratings, or
you have to do so in some clever other way. For my stuff, everything has a
nearly immediate effect -- updates take effect immediately, but due to
caching they may not percolate up the recommender system and have an effect
for a bit. This is a sort of middle ground between recomputing everything
all the time and never. (And it is still a problematic way to approach
this.)
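That middle ground can be sketched as a time-bounded cache in front of the
expensive pairwise computation: a new rating is visible as soon as the
stale cached entry ages out. This is just an illustrative toy, not the
actual implementation being described:

```python
import time

class CachedSimilarity:
    """Toy TTL cache: recompute a pair's similarity once its entry expires."""

    def __init__(self, compute, ttl_seconds=300.0):
        self._compute = compute  # expensive pairwise similarity function
        self._ttl = ttl_seconds
        self._cache = {}         # (a, b) -> (value, timestamp)

    def similarity(self, a, b):
        key = (a, b) if a <= b else (b, a)  # treat similarity as symmetric
        hit = self._cache.get(key)
        if hit is not None and time.monotonic() - hit[1] < self._ttl:
            return hit[0]        # fresh enough: a recent pref change may lag
        value = self._compute(*key)
        self._cache[key] = (value, time.monotonic())
        return value
```

Shortening the TTL trades freshness against recomputation cost, which is
exactly the "all the time vs. never" tension above.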
