[GitHub] [spark] huaxingao commented on pull request #34785: [SPARK-37523][SQL] Support optimize skewed partitions in Distribution and Ordering if numPartitions is not specified

2022-05-23 Thread GitBox
huaxingao commented on PR #34785: URL: https://github.com/apache/spark/pull/34785#issuecomment-1135252175 It's a bit too hard to rebase, so I will close this PR. The new one is here: https://github.com/apache/spark/pull/36644. Thank you all very much for reviewing! -- This is an

[GitHub] [spark] huaxingao commented on pull request #34785: [SPARK-37523][SQL] Support optimize skewed partitions in Distribution and Ordering if numPartitions is not specified

2022-05-19 Thread GitBox
huaxingao commented on PR #34785: URL: https://github.com/apache/spark/pull/34785#issuecomment-1132364552 cc @cloud-fan

[GitHub] [spark] huaxingao commented on pull request #34785: [SPARK-37523][SQL] Support optimize skewed partitions in Distribution and Ordering if numPartitions is not specified

2022-05-19 Thread GitBox
huaxingao commented on PR #34785: URL: https://github.com/apache/spark/pull/34785#issuecomment-1132363307 Thanks @aokolnychyi for the proposal. I agree that we should support both strictly required distribution and best-effort distribution. For best-effort distribution, if the user doesn't