Okay, I have brought this to the user@ list.

I don’t think the negative pairs should be omitted.

If the score of every pair is 1.0, the result is worse. I have tried it.


Best Regards, 
Sendong Li
> On Thu, Feb 26, 2015 at 10:07 PM, Sean Owen <so...@cloudera.com> wrote:
> 
> Yes, I mean, do not generate a Rating for these data points. What then?
> 
> Also, would you care to bring this to the user@ list? It's kind of interesting.
> 
> On Thu, Feb 26, 2015 at 2:02 PM, lisendong <lisend...@163.com> wrote:
>> I set the score of each user-item pair with ‘0’ interactions to 0.0.
>> The code is as follows:
>> 
>> if (ifclick > 0) {
>>    score = 1.0;
>> }
>> else {
>>    score = 0.0;
>> }
>> return new Rating(user_id, photo_id, score);
>> 
>> Both methods use the same ratings RDD.
>> 
>> Because the random seed is the same (1 in my case), the result is stable.
>> 
>> 
>> Best Regards,
>> Sendong Li
>> 
>> 
>> On Thu, Feb 26, 2015 at 9:53 PM, Sean Owen <so...@cloudera.com> wrote:
>> 
>> 
>> I see why you say that, yes.
>> 
>> Are you actually encoding the '0' interactions, or just omitting them?
>> I think you should do the latter.
>> 
>> Is the AUC stable over many runs or did you just run once?
>> 
>> On Thu, Feb 26, 2015 at 1:42 PM, lisendong <lisend...@163.com> wrote:
>> 
>> Hi meng, fotero, sowen:
>> 
>> I’m using ALS with Spark 1.0.0; the code should be:
>> https://github.com/apache/spark/blob/branch-1.0/mllib/src/main/scala/org/apache/spark/mllib/recommendation/ALS.scala
>> 
>> I think the following two methods should produce the same (or nearly the same) result:
>> 
>> MatrixFactorizationModel model = ALS.train(ratings.rdd(), 30, 30, 0.01, -1, 1);
>> 
>> MatrixFactorizationModel model = ALS.trainImplicit(ratings.rdd(), 30, 30, 0.01, -1, 0, 1);
>> 
>> The data I used is a display log. The log format is as follows:
>> 
>> user  item  if-click
>> 
>> I use 1.0 as the score for a click pair and 0 as the score for a non-click pair.
>> 
>> In the second method, alpha is set to zero, so the confidence for both
>> positive and negative pairs is 1.0 (right?).
>> 
>> I think the two methods should produce similar results, but the second
>> method’s result is much worse: the AUC of the first result is 0.7, but the
>> AUC of the second is only 0.61.
>> 
>> 
>> I cannot understand why. Could you help me?
>> 
>> 
>> Thank you very much!
>> 
>> Best Regards,
>> Sendong Li
>> 
>> 
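
The confidence question in the original message (alpha set to zero) can be checked with a small sketch. It assumes the formulation from the implicit-feedback paper by Hu, Koren, and Volinsky that the MLlib documentation cites for trainImplicit: preference p = 1 if r > 0 and 0 otherwise, and confidence c = 1 + alpha * r. The class and method names below are hypothetical and are not part of Spark; whether Spark 1.0.0 matches this formulation in every detail is not verified here.

```java
// Hypothetical helper, not Spark API: per-pair weights under the assumed
// implicit-feedback formulation (p = 1 if r > 0 else 0, c = 1 + alpha * r).
class ImplicitConfidenceSketch {

    // Confidence weight of one user-item pair; r is 0.0 or 1.0 in this thread.
    static double confidence(double alpha, double rating) {
        return 1.0 + alpha * rating;
    }

    // Binary preference that the factorization tries to reconstruct.
    static double preference(double rating) {
        return rating > 0 ? 1.0 : 0.0;
    }

    public static void main(String[] args) {
        double alpha = 0.0; // the value passed to trainImplicit in the thread
        for (double r : new double[] {1.0, 0.0}) {
            System.out.println("r=" + r
                    + " preference=" + preference(r)
                    + " confidence=" + confidence(alpha, r));
        }
    }
}
```

Under this assumption, alpha = 0 does give a confidence of 1.0 to both click and non-click pairs. One difference that may still matter: the same formulation also counts every user-item pair that never appears in the ratings RDD as preference 0 with confidence 1, whereas explicit ALS fits only the pairs that are present.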


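The thread compares the two models by AUC but does not say how it was computed. As a point of reference, here is a minimal sketch of one common definition: the fraction of (clicked, non-clicked) pairs in which the clicked item gets the higher predicted score, with ties counted as half. The class name, method name, and sample numbers are hypothetical, and the quadratic loop is meant for small evaluation samples only.

```java
// Hypothetical helper, not Spark API: pairwise (Mann-Whitney) AUC.
class AucSketch {

    // scores[i] is the predicted score; labels[i] is 1 for click, 0 for non-click.
    static double auc(double[] scores, int[] labels) {
        double wins = 0.0;
        long pairs = 0;
        for (int i = 0; i < scores.length; i++) {
            if (labels[i] != 1) {
                continue; // outer loop visits positives only
            }
            for (int j = 0; j < scores.length; j++) {
                if (labels[j] != 0) {
                    continue; // inner loop visits negatives only
                }
                pairs++;
                if (scores[i] > scores[j]) {
                    wins += 1.0; // positive ranked above negative
                } else if (scores[i] == scores[j]) {
                    wins += 0.5; // tie counts as half
                }
            }
        }
        if (pairs == 0) {
            throw new IllegalArgumentException("need at least one positive and one negative");
        }
        return wins / pairs;
    }

    public static void main(String[] args) {
        double[] scores = {0.9, 0.8, 0.3, 0.1}; // made-up predictions
        int[] labels = {1, 0, 1, 0};            // made-up click labels
        System.out.println(auc(scores, labels)); // 3 of 4 pairs ordered correctly
    }
}
```

Computing AUC this way requires scored non-click pairs as well as click pairs, so the evaluation set has to keep the negatives even if the training set does not.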
