huaxingao opened a new pull request, #34785: URL: https://github.com/apache/spark/pull/34785
### What changes were proposed in this pull request?

Support optimizing skewed partitions in Distribution and Ordering when `numPartitions` is not specified.

### Why are the changes needed?

When repartitioning to satisfy a requested distribution and ordering, we will use the `Rebalance` operator instead of `RepartitionByExpression`, so that skewed partitions can be optimized, when both of the following hold:

1. `numPartitions` is not specified by the data source, and
2. `sortOrder` is specified.

This is because the requested distribution needs to be guaranteed, which can only be achieved by using `RangePartitioning`, not `HashPartitioning`.

### Does this PR introduce _any_ user-facing change?

No

### How was this patch tested?

Existing and new tests
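
The selection rule described under "Why are the changes needed?" can be sketched as a small decision function. This is only an illustration under stated assumptions: `WriteRequirements` and `choose_repartition_operator` are hypothetical names invented here, the operators are represented as plain strings, and none of this is Spark's actual API or the code in this PR.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class WriteRequirements:
    """Hypothetical stand-in for what a data source requests for a write."""
    distribution: Sequence[str]           # clustering/ordering keys requested
    sort_order: Sequence[str]             # requested sort columns, may be empty
    num_partitions: Optional[int] = None  # None means "not specified"


def choose_repartition_operator(req: WriteRequirements) -> str:
    """Return the name of the operator the rule in the description would pick.

    Assumption: a repartition is already known to be needed; this function
    only chooses between the two operators named in the PR description.
    """
    # Rebalance is chosen only when the source did not pin a partition count
    # AND a sort order was requested (the two conditions in the description).
    if req.num_partitions is None and req.sort_order:
        return "Rebalance"
    # Otherwise fall back to the existing operator.
    return "RepartitionByExpression"
```

For example, a request with a sort order and no partition count selects `Rebalance`, while the same request with `num_partitions=10`, or with an empty sort order, keeps `RepartitionByExpression`.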