[ https://issues.apache.org/jira/browse/SPARK-20587?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15995480#comment-15995480 ]
Apache Spark commented on SPARK-20587:
--------------------------------------

User 'MLnick' has created a pull request for this issue:
https://github.com/apache/spark/pull/17845

> Improve performance of ML ALS recommendForAll
> ---------------------------------------------
>
>                 Key: SPARK-20587
>                 URL: https://issues.apache.org/jira/browse/SPARK-20587
>             Project: Spark
>          Issue Type: Improvement
>          Components: ML
>    Affects Versions: 2.2.0
>            Reporter: Nick Pentreath
>            Assignee: Nick Pentreath
>
> SPARK-11968 relates to excessive GC pressure from using the "blocked BLAS 3"
> approach for generating top-k recommendations in
> {{mllib.recommendation.MatrixFactorizationModel}}.
> The solution there is still based on blocking factors, but efficiently
> computes the top-k elements *per block* first (using
> {{BoundedPriorityQueue}}) and then computes the global top-k elements.
> This improves performance and GC pressure substantially for {{mllib}}'s ALS
> model. The same approach is also a lot more efficient than the current
> "crossJoin and score per-row" used in {{ml}}'s {{DataFrame}}-based method.
> This adapts the solution in SPARK-11968 for {{DataFrame}}.
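
The "top-k per block, then global top-k" idea described in the issue can be sketched in plain Python. This is only an illustration under stated assumptions, not the code from the pull request: `heapq.nlargest` stands in for Spark's `BoundedPriorityQueue`, factors are plain lists rather than `DataFrame` columns, and the names `dot`, `blocks`, `top_k_for_block_pair`, `recommend_for_all` and the `block_size` parameter are all hypothetical.

```python
import heapq
from itertools import islice


def dot(u, v):
    """Dot product of two equal-length factor vectors."""
    return sum(a * b for a, b in zip(u, v))


def blocks(rows, size):
    """Yield consecutive lists of at most `size` rows."""
    it = iter(rows)
    while True:
        block = list(islice(it, size))
        if not block:
            return
        yield block


def top_k_for_block_pair(user_block, item_block, k):
    """Score one user block against one item block.

    Returns {user_id: [(score, item_id), ...]} with at most k entries per
    user, so only k candidates per user leave each block pair.
    """
    result = {}
    for user_id, user_vec in user_block:
        scored = ((dot(user_vec, item_vec), item_id)
                  for item_id, item_vec in item_block)
        # Stand-in for a bounded priority queue of capacity k.
        result[user_id] = heapq.nlargest(k, scored)
    return result


def recommend_for_all(user_factors, item_factors, k, block_size=2):
    """Merge per-block top-k lists into a global top-k per user.

    user_factors / item_factors are lists of (id, vector) pairs.
    Returns {user_id: [(item_id, score), ...]} sorted by descending score.
    """
    merged = {}
    for user_block in blocks(user_factors, block_size):
        for item_block in blocks(item_factors, block_size):
            partial = top_k_for_block_pair(user_block, item_block, k)
            for user_id, candidates in partial.items():
                # Global step: keep only the best k seen so far.
                merged[user_id] = heapq.nlargest(
                    k, merged.get(user_id, []) + candidates)
    return {user_id: [(item_id, score) for score, item_id in best]
            for user_id, best in merged.items()}
```

The point of the design is that the full user-by-item score matrix is never materialized: each block pair contributes at most k candidates per user, and the merge step only ever holds 2k candidates per user at a time. In the per-row cross-join approach, by contrast, every (user, item) score becomes its own row before any filtering happens.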