Hi!

We are using the Spark MLlib (Spark 3.2.0) ALS model for an implicit-feedback
collaborative filtering recommendation job. While looking at
the output of recommendForUserSubset
<https://spark.apache.org/docs/latest/api/scala/org/apache/spark/ml/recommendation/ALSModel.html#recommendForUserSubset(dataset:org.apache.spark.sql.Dataset[_],numItems:Int):org.apache.spark.sql.DataFrame>
, we found duplicate (itemId, rating) pairs. The documentation isn't clear on
whether duplicate pairs can appear in the recommendation output. We are
currently deduplicating the results as a post-processing step, but I wanted
to confirm whether duplicates are expected here in the first place, or
whether some issue with our usage is causing them.
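
For illustration, here is a minimal sketch of the kind of post-processing deduplication described above. It is plain Python rather than Spark code, so it runs on its own. The function name dedupe_recommendations and the sample data are hypothetical. It assumes each user's recommendations are available as a list of (itemId, rating) tuples, mirroring the array of (item, rating) structs in the "recommendations" output column.

```python
from typing import Dict, List, Tuple

Rec = Tuple[int, float]  # (itemId, rating)


def dedupe_recommendations(recs: List[Rec]) -> List[Rec]:
    """Drop repeated (itemId, rating) pairs, keeping the first occurrence of each."""
    seen = set()
    unique: List[Rec] = []
    for pair in recs:
        if pair not in seen:
            seen.add(pair)
            unique.append(pair)
    return unique


# Hypothetical sample data: user 1 has a duplicated pair, user 2 does not.
sample: Dict[int, List[Rec]] = {
    1: [(10, 0.9), (10, 0.9), (20, 0.7)],
    2: [(30, 0.8), (40, 0.6)],
}

# Apply the deduplication per user, preserving the original ranking order.
deduped = {user: dedupe_recommendations(recs) for user, recs in sample.items()}
```

This only removes exact (itemId, rating) repeats. If the same itemId appeared with different ratings, both entries would be kept.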

Thanks in advance!

Regards.
