XiDuo You created SPARK-38410:
---------------------------------

             Summary: Support specify initial partition number for rebalance
                 Key: SPARK-38410
                 URL: https://issues.apache.org/jira/browse/SPARK-38410
             Project: Spark
          Issue Type: Sub-task
          Components: SQL
    Affects Versions: 3.3.0
            Reporter: XiDuo You

Rebalance partitions resolves the skew issue when shuffling a dataset. It always returns an indeterminate number of partitions, so initially we did not accept a partition number as a parameter.

However, we have found that the initial partition number can affect the data compression ratio, so it would be better to allow the partition number to be specified separately.

Note that this only affects the initial partition number on the map side during the shuffle.