[ https://issues.apache.org/jira/browse/SPARK-20587?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Apache Spark reassigned SPARK-20587:
------------------------------------

    Assignee: Apache Spark  (was: Nick Pentreath)

> Improve performance of ML ALS recommendForAll
> ---------------------------------------------
>
>                 Key: SPARK-20587
>                 URL: https://issues.apache.org/jira/browse/SPARK-20587
>             Project: Spark
>          Issue Type: Improvement
>          Components: ML
>    Affects Versions: 2.2.0
>            Reporter: Nick Pentreath
>            Assignee: Apache Spark
>
> SPARK-11968 relates to excessive GC pressure from using the "blocked BLAS 3"
> approach for generating top-k recommendations in
> {{mllib.recommendation.MatrixFactorizationModel}}.
> The solution there is still based on blocking factors, but efficiently
> computes the top-k elements *per block* first (using
> {{BoundedPriorityQueue}}) and then computes the global top-k elements.
> This improves performance and reduces GC pressure substantially for
> {{mllib}}'s ALS model. The same approach is also a lot more efficient than
> the current "crossJoin and score per-row" approach used in {{ml}}'s
> {{DataFrame}}-based method.
> This adapts the solution in SPARK-11968 for {{DataFrame}}.
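
The "top-k per block, then global top-k" idea described above can be sketched in plain Python. This is only an illustration under stated assumptions, not Spark's implementation: the function names (`blocks`, `dot`, `recommend_for_all`), the `block_size` parameter, and the list-of-pairs input layout are all hypothetical, and `heapq.nlargest` stands in for the role that {{BoundedPriorityQueue}} plays in the issue text.

```python
import heapq
from itertools import islice


def blocks(pairs, size):
    """Yield consecutive lists of at most `size` (id, factor_vector) pairs."""
    it = iter(pairs)
    while True:
        block = list(islice(it, size))
        if not block:
            return
        yield block


def dot(u, v):
    """Plain dot product of two equal-length factor vectors."""
    return sum(a * b for a, b in zip(u, v))


def recommend_for_all(user_factors, item_factors, k, block_size=2):
    """Return {user_id: [(item_id, score), ...]} holding the k best items per user.

    Hypothetical sketch: factors are lists of (id, vector) pairs.
    """
    item_blocks = list(blocks(item_factors, block_size))
    result = {}
    for user_block in blocks(user_factors, block_size):
        # Candidates per user, collected across all item blocks.
        partial = {uid: [] for uid, _ in user_block}
        for item_block in item_blocks:
            for uid, uvec in user_block:
                scored = ((dot(uvec, ivec), iid) for iid, ivec in item_block)
                # Step 1: keep only the top k of this (user, item-block) pair,
                # so at most k candidates per block are ever retained.
                partial[uid].extend(heapq.nlargest(k, scored))
        for uid, candidates in partial.items():
            # Step 2: merge the per-block winners into the global top k.
            top = heapq.nlargest(k, candidates)
            result[uid] = [(iid, score) for score, iid in top]
    return result
```

The point of the two steps is that each user carries at most k candidates per item block into the merge, rather than one scored row per (user, item) pair as in a full cross join.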