On Sat, Nov 28, 2009 at 10:57 AM, Ted Dunning <ted.dunn...@gmail.com> wrote:

> Restricted Boltzmann machines are of real interest, but again, I repeat
> the obligatory warning about replicating all things from the Netflix
> competition.
>

Totally agree in principle; I'm not saying everyone should use the best
techniques from the Netflix challenge, as it is, as you say, just minimizing
RMS over a particular recommendation type (one where popularity matters a
lot, blockbusters have a big effect, etc.).  But I have a feeling that as
Mahout gets more popular, people are going to want to try out Netflix
techniques (whether they work on their data sets or not!), and will be
asking whether we have them.


> To take a few concrete examples,
>
> - user biases were a huge advance in terms of RMS error, but they don't
> affect the ordering of the results presented to a user and thus are of no
> interest for almost all production recommender applications
>

Do you really think that correcting for user bias doesn't affect the
ordering of results?  The simplest possible thing (just making sure the
final predicted ratings for a given user take the user's bias into account)
certainly doesn't affect the ordering, but even this affects your judgement
of whether the user will "like" the movie, right?
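To make that "simplest possible thing" concrete, here's a toy sketch (plain
Java, nothing Mahout-specific, all the numbers invented) of why a per-user
offset can't reorder that user's list even though it moves the predicted
rating, and hence the "will they like it?" call:

    import java.util.*;

    public class BiasOrderingSketch {
      public static void main(String[] args) {
        double globalMean = 3.5;
        double userBias = -0.5;                  // this user rates harshly
        String[] items = {"itemA", "itemB", "itemC"};
        double[] itemScores = {0.5, -0.25, 1.0}; // per-item part of the prediction

        // Rank item indices by raw item score, and by the full biased prediction.
        Integer[] byScore = {0, 1, 2};
        Integer[] byPrediction = {0, 1, 2};
        Arrays.sort(byScore, (a, b) -> Double.compare(itemScores[b], itemScores[a]));
        Arrays.sort(byPrediction, (a, b) -> Double.compare(
            globalMean + userBias + itemScores[b],
            globalMean + userBias + itemScores[a]));

        // The per-user constant shifts every prediction equally, so the
        // ordering is identical...
        System.out.println(Arrays.equals(byScore, byPrediction));   // true
        // ...but the predicted rating of the top item is 4.0 rather than 4.5,
        // which is what tells you whether the user will actually "like" it.
        System.out.println(items[byPrediction[0]] + " -> "
            + (globalMean + userBias + itemScores[byPrediction[0]]));
      }
    }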

More importantly, I seem to recall other user-bias techniques that are good
not just for RMS but for ordering effects, because the response in ratings
is nonlinear: if a user has an average rating of 3 (or we've adjusted for
their bias), it usually turns out that a rating of 1 implies they hate the
item *far* more than a rating of 5 means they love it (i.e. user bias is
not just a centering effect).

To put it another way: finding similar users, at least in the Netflix case,
can be done better by finding groups of people who rate movies *negatively*
than by finding people who rate them positively.  This particular result
happens to be very domain specific, but I've found the phenomenon is pretty
general when it comes to user ratings, and it's hard to capture using
Pearson correlation or even SVD.  I don't know if it comes out of RBM-based
autoencoder techniques, or if you have to roll it in by hand in a custom
gradient descent decomposer, but either way it's something to keep in mind.
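If you did have to roll it in by hand, the crudest version I can picture is
something like the following (completely hypothetical, not taken from Mahout
or from the Netflix papers; negativeWeight is just a made-up knob): center
each rating on the user's mean, then amplify the below-mean deviations
before the vectors go into whatever similarity measure or decomposer you're
already using.

    public class AsymmetricRatingTransform {

      private final double negativeWeight;  // > 1.0 means dislikes carry extra weight

      public AsymmetricRatingTransform(double negativeWeight) {
        this.negativeWeight = negativeWeight;
      }

      /** Map a raw rating to an asymmetric deviation from the user's mean. */
      public double transform(double rating, double userMean) {
        double deviation = rating - userMean;
        return deviation < 0 ? negativeWeight * deviation : deviation;
      }

      public static void main(String[] args) {
        AsymmetricRatingTransform t = new AsymmetricRatingTransform(2.0);
        System.out.println(t.transform(1.0, 3.0));  // -4.0: a 1 is a strong "hate"
        System.out.println(t.transform(5.0, 3.0));  //  2.0: a 5 is merely a "like"
      }
    }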

> On the other side,
>
> - portfolio approaches that increase the diversity of results presented to
> the user increase the probability of user clicking, but decrease RMS score
>
> - dithering of results to give a changing set of recommendations increases
> users' click rates, but decreases RMS score
>

Sure, but these two approaches can be applied "after the fact" to results
which come from a system designed to optimize for RMS minimization.  These
kinds of things tend to be very domain specific, don't you think?  They
depend on what you know about your audience and on business factors: how
much does your company care about increasing click-through, versus
increasing some other conversion *after* the click-through, versus just
pleasing users by showing them a more varied set of results (i.e. how much
do you care about *decreasing* the bounce rate from the main page that has
the recommendations on it)?
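For what it's worth, here's roughly what I mean by "after the fact" for the
dithering case (a toy sketch, no particular recommender or API in mind;
log-rank plus Gaussian noise is just one plausible scoring choice):

    import java.util.*;

    public class DitheringSketch {
      // Re-sort an already-ranked list by log(rank) plus Gaussian noise, so the
      // top of the list changes per request while staying heavily biased toward
      // the original RMS-optimized order.  epsilon is a made-up knob for how
      // much shuffling you can tolerate.
      public static List<String> dither(List<String> ranked, double epsilon, Random rng) {
        Map<String, Double> noisy = new HashMap<>();
        for (int i = 0; i < ranked.size(); i++) {
          noisy.put(ranked.get(i), Math.log(i + 1) + epsilon * rng.nextGaussian());
        }
        List<String> result = new ArrayList<>(ranked);
        result.sort(Comparator.comparingDouble((String item) -> noisy.get(item)));
        return result;  // lower noisy score = better
      }

      public static void main(String[] args) {
        List<String> ranked = Arrays.asList("a", "b", "c", "d", "e", "f");
        System.out.println(dither(ranked, 0.5, new Random()));  // e.g. [a, c, b, d, f, e]
      }
    }

The portfolio/diversity case would be the same shape: a re-ranking pass over
the RMS-optimized output, tuned by whatever business knob you care about.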


>
> The take-away is that the Netflix results can't be used as a blueprint for
> all recommendation needs.
>

This I totally agree on - not basing all recommenders on those which work
well for Netflix should go without saying.  But RMS is a pretty nice generic
evaluation criterion (as well as a pretty ubiquitous one in academic
circles, so we'll have lots to compare against there), so techniques which
are good at improving it are important to look at.
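(For anyone following along, by RMS here I just mean root-mean-squared error
over a held-out set of ratings, as in the trivial sketch below; nothing
Mahout-specific about it.)

    public class Rmse {
      // Square root of the mean squared difference between predicted and
      // actual ratings over a held-out set.
      public static double rmse(double[] predicted, double[] actual) {
        double sumOfSquares = 0.0;
        for (int i = 0; i < predicted.length; i++) {
          double diff = predicted[i] - actual[i];
          sumOfSquares += diff * diff;
        }
        return Math.sqrt(sumOfSquares / predicted.length);
      }
    }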

  -jake
