On Sat, Nov 28, 2009 at 10:57 AM, Ted Dunning <ted.dunn...@gmail.com> wrote:
> Restricted Boltzmann machines are of real interest, but again, I repeat the
> obligatory warning about replicating all things from the Netflix
> competition.

Totally agree in principle, but I'm not saying everyone should use the best
techniques from the Netflix challenge, as it is, as you say, just minimizing
RMS over a particular recommendation type (one where popularity matters a
lot, blockbusters are a big effect, etc...). But I have a feeling that as
Mahout gets more popular, people are going to want to try out Netflix
techniques (whether they work on their data sets or not!), and will be
asking whether we have them.

> To take a few concrete examples,
>
>    - user biases were a huge advance in terms of RMS error, but they don't
>    affect the ordering of the results presented to a user and thus are of no
>    interest for almost all production recommender applications

Do you really think that correcting for user bias doesn't affect the
ordering of results? The simplest possible thing (just making sure the
final predicted ratings for a given user take into account the user's bias)
certainly doesn't affect the ordering, but even this affects your judgement
of whether the user will "like" the movie, right?

More importantly, there were other techniques regarding user bias which I
seem to recall are good not just for RMS but for ordering effects: the fact
that there is a nonlinear response in ratings. If a user has an average
rating of 3 (or we've adjusted for their bias), then it usually turns out
that a rating of 1 implies they hate said item *far* more than a rating of
5 means they love it (i.e. user bias is not just a centering effect). To
put it another way - finding similar users, at least in the Netflix case,
can be better done by finding groups of people who rate movies *negatively*
than by finding people who rate them positively.
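To make the "simplest possible thing" concrete, here's an illustrative sketch (hypothetical code, not anything from Mahout - the function names, the 3.5-star threshold, and the global mean of 3.0 are all my own assumptions): subtracting a per-user bias shifts every one of that user's predictions by the same constant, so the ranking is untouched, but a prediction can drop below a "will they actually like it" cutoff.

```python
# Hypothetical sketch, not Mahout code: an additive per-user bias correction
# leaves the per-user *ordering* unchanged, but can change whether a
# prediction clears a "likes it" threshold (3.5 stars here, an assumption).

def debias(predictions, user_mean, global_mean=3.0):
    """Subtract the user's bias (user_mean - global_mean) from each prediction."""
    bias = user_mean - global_mean
    return [p - bias for p in predictions]

raw = [4.8, 4.6, 4.2]                    # predictions for a generous rater (mean 4.0)
adjusted = debias(raw, user_mean=4.0)    # shift everything down by 1.0

# Ordering is identical before and after debiasing...
assert sorted(range(3), key=lambda i: raw[i]) == \
       sorted(range(3), key=lambda i: adjusted[i])

# ...but the third item no longer clears the 3.5 "likes it" threshold.
print([p >= 3.5 for p in raw])       # [True, True, True]
print([p >= 3.5 for p in adjusted])  # [True, True, False]
```

Capturing the nonlinear response (a 1 meaning much more than a 5) would need something beyond this additive shift - a learned per-user rating transfer function rather than a single offset.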
This happens to be very domain specific (this particular result), but I've
found the phenomenon is pretty general when it comes to user ratings, and
it's hard to capture using Pearson correlation or even SVD. I don't know
whether it comes out of RBM-based autoencoder techniques, or whether you
have to roll it in by hand in a custom gradient-descent decomposer, but
either way, it's something to keep in mind.

On the other side,

>    - portfolio approaches that increase the diversity of results presented to
>    the user increase the probability of the user clicking, but decrease RMS score
>
>    - dithering of results to give a changing set of recommendations increases
>    users' click rates, but decreases RMS score

Sure, but these two approaches can be done "after the fact" on results
which come from a system designed to optimize for RMS minimization. These
kinds of things tend to be very domain specific, don't you think? They
depend on things you know about your audience, and on business factors: how
much does your company care about increasing click-through, versus
increasing some other conversion *after* click-through, versus just
pleasing users by showing them more varied results (i.e. how much do you
care about *decreasing* the bounce rate from your main page which has the
recommendations on it)?

> The take-away is that the Netflix results can't be used as a blueprint for
> all recommendation needs.

This I totally agree on - not basing all recommenders on those which work
well for Netflix should go without saying. But RMS is a pretty nice generic
evaluation criterion (as well as a pretty ubiquitous one in academic
circles, so we'll have lots to compare against there), so techniques which
are good at improving it are important to look at.

  -jake
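As a sketch of what "after the fact" means here (again hypothetical code, not from Mahout; the log-rank noise scheme and the sigma parameter are my own assumptions): compute RMS error offline against held-out ratings, then dither the final ranked list at serving time without touching the RMS-optimized model.

```python
# Hypothetical sketch, not Mahout code: RMS error as the offline evaluation
# criterion, plus dithering applied "after the fact" to a ranked list.
import math
import random

def rmse(predicted, actual):
    """Root-mean-square error between predictions and held-out actual ratings."""
    return math.sqrt(sum((p - a) ** 2 for p, a in zip(predicted, actual)) / len(actual))

def dither(ranked_items, sigma=1.0, seed=None):
    """Re-rank by log(rank) plus Gaussian noise, so the list changes between
    visits while top-ranked items still tend to stay near the top."""
    rng = random.Random(seed)
    noisy = [(math.log(rank + 1) + rng.gauss(0, sigma), item)
             for rank, item in enumerate(ranked_items)]
    return [item for _, item in sorted(noisy)]

print(round(rmse([3.5, 4.0, 2.0], [4.0, 4.0, 1.0]), 4))  # 0.6455
print(dither(["a", "b", "c", "d", "e"], sigma=1.0, seed=7))
```

The point is that `dither` only permutes the output of whatever model minimized RMS; the trade-off between offline RMS and click-through lives entirely in this serving-time layer, which is why it can be tuned per domain.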