The Mahout implementation is a straightforward port of the paper; no changes have been made.

On 03/12/2014 08:36 AM, Nick Pentreath wrote:
It would be helpful to know what parameter inputs you are using.

If the regularization schemes differ (by a factor of alpha, which can
often be quite high), the same parameter settings could give very
different results; a higher lambda would be required with Spark's
version to be comparable.

When I submitted the PR for this, I verified (on ml-100k, ml-1m and ml-10m
data) that this version gives the same RMSE as Mahout's implicit model, as
well as a separate Spark version that I wrote that was a from-scratch port
of the Mahout algorithm (though I didn't compare vs Myrrix/Oryx). I'm
fairly confident things are correct but if there is a bug let's definitely
find and fix it!

@Sean, would it be a good idea to look at changing the regularization in
Spark's ALS to alpha * lambda? What is the thinking behind this? If I
recall, the Mahout version added something like (# ratings * lambda) as
the regularization in each factor update for the explicit case, but for
the implicit case it was just lambda (I may be wrong here).
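
To make the difference concrete, my (possibly wrong) reading of the
per-user solve under each scheme is:

  Spark (per my PR):     (Yt Cu Y + lambda * I) x_u = Yt Cu p(u)
  alpha-scaled variant:  (Yt Cu Y + alpha * lambda * I) x_u = Yt Cu p(u)
  Mahout explicit (WR):  (Yt Y + n_u * lambda * I) x_u = Yt r_u

where n_u is the number of ratings for user u. The same numeric lambda
then means quite different effective regularization under each scheme.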



On Wed, Mar 12, 2014 at 4:57 AM, Xiangrui Meng <men...@gmail.com> wrote:

Line 376 should be correct: it is computing \sum_i (c_i - 1) x_i x_i^T =
\sum_i (alpha * r_i) x_i x_i^T, since c_i = 1 + alpha * r_i. Are you
computing any metrics to tell which recommendation is better? -Xiangrui
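
P.S. For concreteness, here is a minimal standalone sketch (plain Scala
with a hypothetical helper name; this is not the actual MLlib code) of
the quantity line 376 should be accumulating for one user:

// Accumulate Yt (Cu - I) Y = \sum_i (alpha * r_i) x_i x_i^T over the
// items the user has interacted with, where c_i = 1 + alpha * r_i and
// x_i is item i's k-dimensional factor vector.
def implicitOuterSum(alpha: Double,
                     itemFactors: Array[Array[Double]],
                     ratings: Array[Double]): Array[Array[Double]] = {
  val k = itemFactors(0).length
  val acc = Array.fill(k, k)(0.0)
  for (i <- itemFactors.indices) {
    val w = alpha * ratings(i)        // (c_i - 1) for this item
    val x = itemFactors(i)
    for (p <- 0 until k; q <- 0 until k)
      acc(p)(q) += w * x(p) * x(q)    // rank-one update: w * x x^T
  }
  acc
}

The result gets added to the precomputed YtY (plus the regularization
term) before solving for the user's factor vector.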

On Tue, Mar 11, 2014 at 6:38 PM, Xiangrui Meng <men...@gmail.com> wrote:
Hi Michael,

I can help check the current implementation. Would you please go to
https://spark-project.atlassian.net/browse/SPARK and create a ticket
about this issue with component "MLlib"? Thanks!

Best,
Xiangrui

On Tue, Mar 11, 2014 at 3:18 PM, Michael Allman <m...@allman.ms> wrote:
Hi,

I'm implementing a recommender based on the algorithm described in
http://www2.research.att.com/~yifanhu/PUB/cf.pdf. This algorithm forms the
basis for Spark's ALS implementation for data sets with implicit features.
The data set I'm working with is proprietary and I cannot share it; however,
I can say that it's based on the same kind of data as in the paper---relative
viewing time of videos. (Specifically, the "rating" for each video is defined
as total viewing time across all visitors divided by video duration.)

I'm seeing counterintuitive, sometimes nonsensical recommendations. For
comparison, I've run the training data through Oryx's in-VM implementation
of implicit ALS with the same parameters. Oryx uses the same algorithm.
(Source in this file:

https://github.com/cloudera/oryx/blob/master/als-common/src/main/java/com/cloudera/oryx/als/common/factorizer/als/AlternatingLeastSquares.java
)

The recommendations made by the two systems are very different from one
another---more so than I think could be explained by differences in initial
state. The recommendations from the Oryx models look much better, especially
as I increase the number of latent factors and iterations. The Spark models'
recommendations don't improve with more latent factors or iterations;
sometimes they get worse.

Because of the (understandably) highly optimized and terse style of Spark's
ALS implementation, I've had a very hard time following it well enough to
debug the issue definitively. However, I have found a section of code that
looks incorrect. As described in the paper, part of the implicit ALS
algorithm involves computing the matrix product YtCuY (equation 4 in the
paper). To optimize this computation, the expression is rewritten as
YtY + Yt(Cu - I)Y. I believe that's what should be happening here:

https://github.com/apache/incubator-spark/blob/v0.9.0-incubating/mllib/src/main/scala/org/apache/spark/mllib/recommendation/ALS.scala#L376

However, it looks like this code is in fact computing YtY + YtY(Cu - I),
which is the same as YtYCu. If so, that's a bug. Can someone familiar with
this code evaluate my claim?
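
As a sanity check on the algebra (not on the MLlib code itself), here is
a small self-contained Scala program comparing the direct product
Yt Cu Y against the rewritten form YtY + Yt(Cu - I)Y on random data;
the two agree to machine precision, so an implementation computing
YtY + YtY(Cu - I) instead would genuinely diverge from the paper:

object CheckImplicitIdentity {
  type M = Array[Array[Double]]

  def matmul(a: M, b: M): M =
    Array.tabulate(a.length, b(0).length) { (i, j) =>
      b.indices.map(l => a(i)(l) * b(l)(j)).sum
    }

  def transpose(a: M): M =
    Array.tabulate(a(0).length, a.length)((i, j) => a(j)(i))

  def main(args: Array[String]): Unit = {
    val rng   = new scala.util.Random(42)
    val n     = 6                  // number of items
    val k     = 3                  // number of latent factors
    val alpha = 40.0               // confidence scaling from the paper
    val Y: M  = Array.fill(n, k)(rng.nextGaussian())
    val c     = Array.fill(n)(1.0 + alpha * rng.nextDouble())  // c_i = 1 + alpha * r_i
    val Yt    = transpose(Y)

    // Direct form: Yt * Cu * Y, with Cu = diag(c)
    val direct = matmul(Yt, Array.tabulate(n, k)((i, j) => c(i) * Y(i)(j)))

    // Rewritten form: YtY + Yt * (Cu - I) * Y
    val YtY  = matmul(Yt, Y)
    val corr = matmul(Yt, Array.tabulate(n, k)((i, j) => (c(i) - 1.0) * Y(i)(j)))
    val rewritten = Array.tabulate(k, k)((i, j) => YtY(i)(j) + corr(i)(j))

    val maxDiff = (for (i <- 0 until k; j <- 0 until k)
      yield math.abs(direct(i)(j) - rewritten(i)(j))).max
    println(f"max abs difference = $maxDiff%.3e")  // ~1e-14: the forms agree
  }
}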

Cheers,

Michael


