[ https://issues.apache.org/jira/browse/SPARK-3066?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14347742#comment-14347742 ]
Joseph K. Bradley commented on SPARK-3066:
------------------------------------------

Are there approximate methods which would be faster? On single machines, there are data structures for finding approximate nearest neighbors quickly. I'm not sure about distributed data structures.

> Support recommendAll in matrix factorization model
> --------------------------------------------------
>
>                 Key: SPARK-3066
>                 URL: https://issues.apache.org/jira/browse/SPARK-3066
>             Project: Spark
>          Issue Type: New Feature
>          Components: MLlib
>            Reporter: Xiangrui Meng
>            Assignee: Debasish Das
>
> ALS returns a matrix factorization model, which we can use to predict ratings
> for individual queries as well as small batches. In practice, users may want
> to compute top-k recommendations offline for all users. It is very expensive
> but a common problem. We can do some optimization like
> 1) collect one side (either user or product) and broadcast it as a matrix
> 2) use level-3 BLAS to compute inner products
> 3) use Utils.takeOrdered to find top-k
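
For reference, the three optimization steps quoted from the issue can be sketched on a single machine in plain Python. This is only an illustration under stated assumptions, not MLlib's implementation: `recommend_all`, `user_factors`, and `product_factors` are hypothetical names, the list of product rows stands in for the broadcast matrix, the per-user loop of dot products stands in for a level-3 BLAS call, and `heapq.nlargest` stands in for `Utils.takeOrdered`.

```python
import heapq


def recommend_all(user_factors, product_factors, k):
    """Return {user_id: [(product_id, score), ...]} with the k best products per user.

    Both arguments are hypothetical inputs: dicts mapping an id to its
    latent factor vector (a list of floats of equal length).
    """
    # Step 1: collect one side. The product factors are materialised as a
    # list of rows, standing in for a matrix that would be broadcast.
    product_ids = list(product_factors)
    product_matrix = [product_factors[p] for p in product_ids]

    recommendations = {}
    for user_id, user_vector in user_factors.items():
        # Step 2: inner product of this user with every product row.
        # A real implementation would batch users and use a GEMM call here.
        scores = [
            sum(a * b for a, b in zip(user_vector, row))
            for row in product_matrix
        ]
        # Step 3: bounded top-k selection instead of a full sort.
        top = heapq.nlargest(k, zip(scores, product_ids))
        recommendations[user_id] = [(pid, score) for score, pid in top]
    return recommendations


# Made-up factors, for illustration only.
users = {1: [1.0, 0.0], 2: [0.0, 1.0]}
products = {10: [2.0, 0.0], 20: [0.0, 3.0], 30: [1.0, 1.0]}
top2 = recommend_all(users, products, 2)
```

The sketch still scores every user against every product, so its cost grows with the product of the two counts. That is the expense the comment above asks about when it raises approximate methods.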