[GitHub] [spark] zhengruifeng commented on pull request #37918: [SPARK-40476][ML][SQL] Reduce the shuffle size of ALS

2022-09-22 Thread GitBox
zhengruifeng commented on PR #37918: URL: https://github.com/apache/spark/pull/37918#issuecomment-1255672550 Thanks for the reviews! -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific c

[GitHub] [spark] zhengruifeng commented on pull request #37918: [SPARK-40476][ML][SQL] Reduce the shuffle size of ALS

2022-09-19 Thread GitBox
zhengruifeng commented on PR #37918: URL: https://github.com/apache/spark/pull/37918#issuecomment-1250789841 cc @srowen @WeichenXu123

[GitHub] [spark] zhengruifeng commented on pull request #37918: [SPARK-40476][ML][SQL] Reduce the shuffle size of ALS

2022-09-18 Thread GitBox
zhengruifeng commented on PR #37918: URL: https://github.com/apache/spark/pull/37918#issuecomment-1250409906 @dongjoon-hyun > could you make an independent PR moving TopByKeyAggregator to CollectTopK because that is orthogonal from Reduce the shuffle size of ALS? It is just th

[GitHub] [spark] zhengruifeng commented on pull request #37918: [SPARK-40476][ML][SQL] Reduce the shuffle size of ALS

2022-09-16 Thread GitBox
zhengruifeng commented on PR #37918: URL: https://github.com/apache/spark/pull/37918#issuecomment-1249957261 take the [`ALSExample`](https://github.com/apache/spark/blob/e1ea806b3075d279b5f08a29fe4c1ad6d3c4191a/examples/src/main/scala/org/apache/spark/examples/ml/ALSExample.scala) for examp