[jira] [Assigned] (SPARK-20587) Improve performance of ML ALS recommendForAll
[ https://issues.apache.org/jira/browse/SPARK-20587?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Apache Spark reassigned SPARK-20587:
------------------------------------

    Assignee: Nick Pentreath  (was: Apache Spark)

> Improve performance of ML ALS recommendForAll
> ---------------------------------------------
>
>                 Key: SPARK-20587
>                 URL: https://issues.apache.org/jira/browse/SPARK-20587
>             Project: Spark
>          Issue Type: Improvement
>          Components: ML
>    Affects Versions: 2.2.0
>            Reporter: Nick Pentreath
>            Assignee: Nick Pentreath
>
> SPARK-11968 relates to excessive GC pressure from using the "blocked BLAS 3"
> approach for generating top-k recommendations in
> {{mllib.recommendation.MatrixFactorizationModel}}.
> The solution there is still based on blocking factors, but efficiently
> computes the top-k elements *per block* first (using
> {{BoundedPriorityQueue}}) and then computes the global top-k elements.
> This improves performance and GC pressure substantially for {{mllib}}'s ALS
> model. The same approach is also a lot more efficient than the current
> "crossJoin and score per-row" used in {{ml}}'s {{DataFrame}}-based method.
> This adapts the solution in SPARK-11968 for {{DataFrame}}.

--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
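The "top-k per block, then global top-k" idea in the issue description can be illustrated with a minimal plain-Python sketch. This is not Spark's implementation: the function names (`blocks`, `dot`, `recommend_for_all`), the `block_size` parameter, and the sample factors are hypothetical, and `heapq.nlargest` stands in for Spark's `BoundedPriorityQueue`. The point it shows is that only k candidates per user survive each block, so the full user-by-item score matrix is never held at once.

```python
import heapq


def blocks(rows, block_size):
    """Yield consecutive slices of at most block_size rows."""
    for start in range(0, len(rows), block_size):
        yield rows[start:start + block_size]


def dot(u, v):
    """Dot product of two equal-length factor vectors."""
    return sum(a * b for a, b in zip(u, v))


def recommend_for_all(user_factors, item_factors, k, block_size=2):
    """Return {user_id: [(item_id, score), ...]} with the k best items per user.

    user_factors and item_factors are lists of (id, factor_vector) pairs.
    """
    # Candidates that survived the per-block cut, keyed by user.
    candidates = {user_id: [] for user_id, _ in user_factors}
    for user_block in blocks(user_factors, block_size):
        for item_block in blocks(item_factors, block_size):
            for user_id, u in user_block:
                # Local top-k: score this user against one item block only.
                scored = ((dot(u, v), item_id) for item_id, v in item_block)
                candidates[user_id].extend(heapq.nlargest(k, scored))
    # Global top-k: merge the per-block winners for each user.
    return {
        user_id: [(item_id, score) for score, item_id in heapq.nlargest(k, cands)]
        for user_id, cands in candidates.items()
    }


# Hypothetical factors, chosen only to make the example easy to check by hand.
users = [(0, [1.0, 0.0]), (1, [0.0, 1.0]), (2, [1.0, 1.0])]
items = [
    (10, [1.0, 0.0]),
    (11, [0.0, 2.0]),
    (12, [3.0, 1.0]),
    (13, [0.5, 0.5]),
    (14, [2.0, 1.5]),
]
recs = recommend_for_all(users, items, k=2)
print(recs[0])  # [(12, 3.0), (14, 2.0)]
```

Because every item in the true top-k for a user must also be in the top-k of its own block, the merge step gives the same result as scoring all pairs and sorting, whatever block size is chosen.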
[jira] [Assigned] (SPARK-20587) Improve performance of ML ALS recommendForAll
[ https://issues.apache.org/jira/browse/SPARK-20587?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Apache Spark reassigned SPARK-20587:
------------------------------------

    Assignee: Apache Spark  (was: Nick Pentreath)

> Improve performance of ML ALS recommendForAll
> ---------------------------------------------
>
>                 Key: SPARK-20587
>                 URL: https://issues.apache.org/jira/browse/SPARK-20587
>             Project: Spark
>          Issue Type: Improvement
>          Components: ML
>    Affects Versions: 2.2.0
>            Reporter: Nick Pentreath
>            Assignee: Apache Spark