Hi again!

Ironically, soon after sending the previous email, I found the bug in our
setup that was causing the duplicates, and it wasn't MLlib ALS after all.
Sorry for the confusion.
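
In case it helps anyone who finds this thread with the same symptom, here is
a minimal sketch of the kind of post-processing dedup mentioned in my earlier
message quoted below. It is plain Python rather than Spark code, and the
names dedupe_recommendations and find_duplicates, as well as the
(itemId, rating) tuple shape, are made up for illustration; they are not part
of the MLlib API.

```python
from collections import Counter
from typing import Iterable, List, Tuple

# Hypothetical shape: one (itemId, rating) pair per recommendation,
# e.g. the contents of one user's "recommendations" array after collecting.
Recommendation = Tuple[int, float]


def dedupe_recommendations(recs: Iterable[Recommendation]) -> List[Recommendation]:
    """Drop repeated (itemId, rating) pairs, keeping first occurrences in order."""
    seen = set()
    unique = []
    for rec in recs:
        if rec not in seen:
            seen.add(rec)
            unique.append(rec)
    return unique


def find_duplicates(recs: Iterable[Recommendation]) -> List[Recommendation]:
    """Return each pair that occurs more than once, in first-seen order."""
    counts = Counter(recs)
    return [rec for rec, n in counts.items() if n > 1]


if __name__ == "__main__":
    sample = [(10, 0.9), (11, 0.8), (10, 0.9), (12, 0.7)]
    print(find_duplicates(sample))
    print(dedupe_recommendations(sample))
```

If find_duplicates reports anything, it is probably worth checking the job's
own inputs and joins before suspecting ALS, since that is where our problem
turned out to be.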

Regards.

On Mon, Jan 23, 2023 at 1:09 PM Kartik Ohri <kartikohr...@gmail.com> wrote:

> Hi!
>
> We are using the Spark MLlib (on Spark 3.2.0) ALS model for an
> implicit-feedback collaborative filtering recommendation job. While looking
> at the output of recommendForUserSubset
> <https://spark.apache.org/docs/latest/api/scala/org/apache/spark/ml/recommendation/ALSModel.html#recommendForUserSubset(dataset:org.apache.spark.sql.Dataset[_],numItems:Int):org.apache.spark.sql.DataFrame>
> , we found duplicate (itemId, rating) pairs. The documentation isn't clear
> on whether duplicate pairs can appear in the recommendation output. We are
> currently deduplicating the results as a post-processing step, but I wanted
> to confirm whether duplicates are expected here in the first place or
> whether some issue with our usage is causing buggy results.
>
> Thanks in advance!
>
> Regards.
>
