This is an automated email from the ASF dual-hosted git repository.

srowen pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git


    from f6c4e58b85d [SPARK-40407][SQL] Fix the potential data skew caused by df.repartition
     add 08678456d16 [SPARK-40476][ML][SQL] Reduce the shuffle size of ALS

No new revisions were added by this update.

Summary of changes:
 .../org/apache/spark/ml/recommendation/ALS.scala   |  18 ++--
 .../ml/recommendation/TopByKeyAggregator.scala     |  59 -----------
 .../spark/ml/recommendation/CollectTopKSuite.scala | 111 +++++++++++++++++++++
 .../recommendation/TopByKeyAggregatorSuite.scala   |  73 --------------
 .../catalyst/expressions/aggregate/collect.scala   |  46 ++++++++-
 .../scala/org/apache/spark/sql/functions.scala     |   3 +
 6 files changed, 169 insertions(+), 141 deletions(-)
 delete mode 100644 mllib/src/main/scala/org/apache/spark/ml/recommendation/TopByKeyAggregator.scala
 create mode 100644 mllib/src/test/scala/org/apache/spark/ml/recommendation/CollectTopKSuite.scala
 delete mode 100644 mllib/src/test/scala/org/apache/spark/ml/recommendation/TopByKeyAggregatorSuite.scala


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org
