GitHub user debasish83 commented on the pull request:

https://github.com/apache/spark/pull/3098#issuecomment-62064318

@coderxiang I read the reference paper and I understand the issue now. I had treated this as a regression metric, but it is not: the predicted value does not matter, only the rank of each movieId in the predicted set. I am updating the PR with the following steps, which apply when --validateRecommedation is set (this is focused on user recommendation):

1. For every user, generate a train set and a test set with a (0.8, 0.2) split, using RDD.sampleByKey.
2. For every user, generate a predicted set of size numProducts using the MatrixFactorizationModel.recommendProducts(userId, numProducts) API.
3. For every user, take the labeled set from the test set computed in step 1.
4. Once I have these two arrays for every user, pass them to RankingMetrics and call meanAveragePrecision.
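To make the metric in step 4 concrete, here is a minimal plain-Python sketch of mean average precision over per-user (predicted, labeled) pairs. It is not the Spark code from the PR: the function names average_precision and mean_average_precision and the sample data are hypothetical, and the normalization (dividing by the size of the labeled set, scoring 0 for a user with no labels) is an assumption about how the metric is defined, not something stated in the comment.

```python
from typing import Iterable, List, Sequence, Set, Tuple


def average_precision(predicted: Sequence[int], labeled: Set[int]) -> float:
    """Average precision for one user.

    Only the order of `predicted` matters, not any predicted score.
    Assumption: the sum is normalized by the size of the labeled set,
    and a user with an empty labeled set scores 0.0.
    """
    if not labeled:
        return 0.0
    hits = 0
    precision_sum = 0.0
    for rank, item in enumerate(predicted, start=1):
        if item in labeled:
            hits += 1
            # Precision at this rank, counted only at relevant positions.
            precision_sum += hits / rank
    return precision_sum / len(labeled)


def mean_average_precision(
    pairs: Iterable[Tuple[Sequence[int], Set[int]]]
) -> float:
    """Mean of per-user average precision over (predicted, labeled) pairs."""
    scores: List[float] = [average_precision(p, l) for p, l in pairs]
    return sum(scores) / len(scores) if scores else 0.0


if __name__ == "__main__":
    # Hypothetical data: ranked recommendations and held-out test items.
    per_user = [
        ([1, 2, 3, 4, 5], {1, 3, 6}),  # hits at ranks 1 and 3
        ([7, 8, 9], {9}),              # hit at rank 3
    ]
    print(mean_average_precision(per_user))
```

In this sketch the first user scores (1/1 + 2/3) / 3 and the second scores (1/3) / 1, so swapping two items in a predicted list changes the result even though the set of recommended items is the same, which is the ranking behaviour the comment describes.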