Lisen, did you use all m-by-n pairs during training? The implicit model
penalizes unobserved ratings, while the explicit model doesn't. -Xiangrui
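
To make the point above concrete, here is a minimal sketch of the data term of the two objectives on a toy example. It is not MLlib's code: the class name LossSketch, the helper names, and the toy factors are all hypothetical, the regularization term is left out, and the implicit confidence is assumed to be c = 1 + alpha * r as in the ICDM 2008 paper.

```java
public class LossSketch {
    // Dot product of two factor vectors of equal length.
    static double dot(double[] a, double[] b) {
        double s = 0.0;
        for (int k = 0; k < a.length; k++) {
            s += a[k] * b[k];
        }
        return s;
    }

    // Explicit-style data term: squared error over the observed
    // (user, item, score) triples only. Unobserved pairs contribute nothing.
    static double explicitLoss(int[] users, int[] items, double[] scores,
                               double[][] x, double[][] y) {
        double total = 0.0;
        for (int n = 0; n < scores.length; n++) {
            double err = scores[n] - dot(x[users[n]], y[items[n]]);
            total += err * err;
        }
        return total;
    }

    // Implicit-style data term: every one of the m-by-n pairs contributes.
    // An unobserved pair is treated as preference 0 with confidence 1.
    static double implicitLoss(int[] users, int[] items, double[] scores,
                               double[][] x, double[][] y, double alpha) {
        double[][] r = new double[x.length][y.length]; // defaults to 0.0
        for (int n = 0; n < scores.length; n++) {
            r[users[n]][items[n]] = scores[n];
        }
        double total = 0.0;
        for (int u = 0; u < x.length; u++) {
            for (int i = 0; i < y.length; i++) {
                double p = r[u][i] > 0 ? 1.0 : 0.0;   // binarized preference
                double c = 1.0 + alpha * r[u][i];     // assumed confidence form
                double err = p - dot(x[u], y[i]);
                total += c * err * err;
            }
        }
        return total;
    }

    public static void main(String[] args) {
        // Toy display log: 2 users, 3 items, 3 logged impressions.
        int[] users = {0, 0, 1};
        int[] items = {0, 1, 2};
        double[] scores = {1.0, 0.0, 1.0};
        // Hypothetical rank-1 factors that fit every logged pair exactly.
        double[][] x = {{1.0}, {1.0}};
        double[][] y = {{1.0}, {0.0}, {1.0}};
        System.out.println("explicit: " + explicitLoss(users, items, scores, x, y));
        System.out.println("implicit: " + implicitLoss(users, items, scores, x, y, 0.0));
    }
}
```

With these toy factors the explicit term is zero because every logged pair is fit exactly, while the implicit term with alpha = 0 is still positive: the two pairs that were never displayed, (0, 2) and (1, 0), are predicted as 1 but count as 0. So even with all confidences equal to 1, the two objectives only coincide if the training data covers every user-item pair.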

On Feb 26, 2015 6:26 AM, "Sean Owen" <so...@cloudera.com> wrote:
>
> +user
>
> On Thu, Feb 26, 2015 at 2:26 PM, Sean Owen <so...@cloudera.com> wrote:
>>
>> I think I may have it backwards, and that you are correct to keep the 0
>> elements in train() in order to try to reproduce the same result.
>>
>> The second formulation is called 'weighted regularization' and is used
>> for both implicit and explicit feedback, as far as I can see in the code.
>>
>> Hm, I'm actually not clear why these would produce different results.
>> Different code paths are used to be sure, but I'm not yet sure why they
>> would give different results.
>>
>> In general you wouldn't use train() for data like this though, and would
>> never set alpha=0.
>>
>> On Thu, Feb 26, 2015 at 2:15 PM, lisendong <lisend...@163.com> wrote:
>>>
>>> I want to confirm the loss function you use (sorry, I’m not very familiar
>>> with Scala, so I could not follow the MLlib source code).
>>>
>>> According to the papers:
>>>
>>>
>>> in your implicit feedback ALS, the loss function is (ICDM 2008):
>>>
>>> in the explicit feedback ALS, the loss function is (Netflix 2008):
>>>
>>> Note that besides the difference in the confidence parameter Cui,
>>> the regularization is also different. Does your code also have this
>>> difference?
>>>
>>> Best Regards,
>>> Sendong Li
>>>
>>>
>>>> On Feb 26, 2015, at 9:42 PM, lisendong <lisend...@163.com> wrote:
>>>>
>>>> Hi meng, fotero, sowen:
>>>>
>>>> I’m using ALS with Spark 1.0.0; the code should be:
>>>>
>>>> https://github.com/apache/spark/blob/branch-1.0/mllib/src/main/scala/org/apache/spark/mllib/recommendation/ALS.scala
>>>>
>>>> I think the following two methods should produce the same (or nearly
>>>> the same) result:
>>>>
>>>> MatrixFactorizationModel model = ALS.train(ratings.rdd(), 30, 30,
>>>> 0.01, -1, 1);
>>>>
>>>> MatrixFactorizationModel model = ALS.trainImplicit(ratings.rdd(), 30,
>>>> 30, 0.01, -1, 0, 1);
>>>>
>>>> The data I used is a display log; the format of the log is as follows:
>>>>
>>>> user  item  if-click
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> I use 1.0 as the score for a click pair and 0 as the score for a
>>>> non-click pair.
>>>>
>>>> In the second method, alpha is set to zero, so the confidence for
>>>> positive and negative pairs is 1.0 in both cases (right?)
>>>>
>>>> I think the two methods should produce similar results, but the second
>>>> method’s result is much worse (the AUC of the first result is 0.7, but
>>>> the AUC of the second result is only 0.61).
>>>>
>>>>
>>>> I cannot understand why. Could you help me?
>>>>
>>>>
>>>> Thank you very much!
>>>>
>>>> Best Regards,
>>>> Sendong Li
>>>
>>>
>>
>
