[ https://issues.apache.org/jira/browse/SPARK-24652?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Jerry Lam resolved SPARK-24652.
-------------------------------
    Resolution: Not A Problem

> Strange ALS Implementation for Implicit Feedback
> ------------------------------------------------
>
>                 Key: SPARK-24652
>                 URL: https://issues.apache.org/jira/browse/SPARK-24652
>             Project: Spark
>          Issue Type: Bug
>          Components: ML
>    Affects Versions: 2.3.1
>            Reporter: Jerry Lam
>            Priority: Major
>
> Hi there,
> I'm evaluating the ALS implementation from Spark ML. Does Spark implement the
> algorithm described in "Collaborative Filtering for Implicit Feedback
> Datasets"? If it does, I think the implementation returns an incorrect
> result.
> Here is the example:
> {code:python}
> from pyspark.sql import SparkSession
> from pyspark.ml.recommendation import ALS
>
> # Reuse the active session (the pyspark shell already provides `spark`).
> spark = SparkSession.builder.getOrCreate()
>
> als = ALS(
>     maxIter=100,
>     regParam=0.0,
>     alpha=1.0,
>     nonnegative=False,
>     implicitPrefs=True,
>     rank=1)
> # Two users, two items, one interaction each.
> ratings = spark.createDataFrame([(0, 0, 1), (1, 1, 1)]).toDF('user', 'item',
>                                                              'rating')
> als_model = als.fit(ratings)
> reco = als_model.recommendForAllUsers(10)
> reco.show(truncate=False)
> {code}
> The result is:
> {code}
> +----+---------------------------------+
> |user|recommendations                  |
> +----+---------------------------------+
> |0   |[[0, 0.6666667], [1, -0.6666667]]|
> |1   |[[1, 0.6666667], [0, -0.6666667]]|
> +----+---------------------------------+
> {code}
> I expect the result for the above to be:
> {code}
> +----+---------------------+
> |user|recommendations      |
> +----+---------------------+
> |0   |[[0, 1.0], [1, -1.0]]|
> |1   |[[1, 1.0], [0, -1.0]]|
> +----+---------------------+
> {code}
> I believe the score should be 1.0 for (user=0, item=0) and for
> (user=1, item=1) because, according to the paper, both cases should return
> 1.0 when lambda is 0.0 (no regularization).
>
> Can someone describe which implementation of implicit feedback Spark uses?
> If it implements the same paper, why is the result so different? Thank you.
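For reference, the rank-1 example from the report is small enough to work through with a hand-rolled alternating least squares loop. The sketch below is not Spark's code. It assumes the paper's definitions, preference p_ui = 1 if r_ui > 0 else 0 and confidence c_ui = 1 + alpha * r_ui, and it sums the squared error over all user-item pairs, including the unobserved ones. The function name `implicit_als_rank1` and the fixed starting vector are hypothetical choices made only to keep the run deterministic.

```python
def implicit_als_rank1(ratings, alpha=1.0, reg=0.0, iters=100):
    """Rank-1 ALS on sum_ui c_ui * (p_ui - x_u * y_i)^2 + reg * (|x|^2 + |y|^2).

    Assumed definitions (from the paper, not from Spark's source):
    p_ui = 1 if r_ui > 0 else 0, and c_ui = 1 + alpha * r_ui.
    """
    n_users, n_items = len(ratings), len(ratings[0])
    pref = [[1.0 if r > 0 else 0.0 for r in row] for row in ratings]
    conf = [[1.0 + alpha * r for r in row] for row in ratings]

    # Fixed start with alternating signs: [1.0, -0.5, ...] (illustrative choice).
    y = [(-1.0) ** i / (i + 1) for i in range(n_items)]
    x = [0.0] * n_users

    for _ in range(iters):
        # Closed-form scalar update for each user factor, item factors held fixed.
        for u in range(n_users):
            num = sum(conf[u][i] * pref[u][i] * y[i] for i in range(n_items))
            den = sum(conf[u][i] * y[i] ** 2 for i in range(n_items)) + reg
            x[u] = num / den
        # Same update for each item factor, user factors held fixed.
        for i in range(n_items):
            num = sum(conf[u][i] * pref[u][i] * x[u] for u in range(n_users))
            den = sum(conf[u][i] * x[u] ** 2 for u in range(n_users)) + reg
            y[i] = num / den

    # Predicted score for every (user, item) pair.
    return [[x[u] * y[i] for i in range(n_items)] for u in range(n_users)]


# The 2x2 example from the report: user 0 saw item 0, user 1 saw item 1.
scores = implicit_als_rank1([[1, 0], [0, 1]], alpha=1.0, reg=0.0, iters=100)
for row in scores:
    print([round(s, 7) for s in row])
```

Under these assumptions the diagonal scores settle at about 0.667 and the off-diagonal scores at about -0.667, even with no regularization. The unobserved pairs still carry confidence 1 and a target of 0, so a rank-1 fit cannot reach 1.0 on the diagonal without paying for the off-diagonal error: for scores of the form (s, -s), minimizing 4(1 - s)^2 + 2s^2 gives s = 2/3.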