XiDuo You created SPARK-38410:
---------------------------------

             Summary: Support specify initial partition number for rebalance
                 Key: SPARK-38410
                 URL: https://issues.apache.org/jira/browse/SPARK-38410
             Project: Spark
          Issue Type: Sub-task
          Components: SQL
    Affects Versions: 3.3.0
            Reporter: XiDuo You


Rebalance partitions resolves data skew when shuffling a dataset. It always
returns an indeterminate number of partitions, so initially we did not accept
a partition number as a parameter.

However, we have found that the initial partition number can affect the data
compression ratio. So it would be better to let the initial partition number
for rebalance be specified independently.
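
To illustrate why the partition count can matter for compression, here is a
toy sketch in plain Python, not Spark code. It assumes round-robin assignment
of rows to partitions and zlib as the codec, with each partition compressed
independently. The helper names (make_rows, compressed_size_by_partitions)
and the sample data are made up for illustration.

```python
import random
import zlib


def compressed_size_by_partitions(rows, num_partitions):
    """Split rows round-robin into num_partitions buckets and return the
    total zlib-compressed size in bytes, compressing each bucket separately."""
    buckets = [[] for _ in range(num_partitions)]
    for i, row in enumerate(rows):
        buckets[i % num_partitions].append(row)
    return sum(len(zlib.compress("".join(b).encode("utf-8"))) for b in buckets)


def make_rows(count, seed=42):
    """Generate repetitive toy rows drawn from a small, hypothetical vocabulary."""
    rng = random.Random(seed)
    vocab = [f"user_{k:03d},page_view,region_{k % 5}\n" for k in range(20)]
    return [rng.choice(vocab) for _ in range(count)]


rows = make_rows(20000)
raw_size = sum(len(r) for r in rows)

# Same rows, two different initial partition counts (assumed values).
few = compressed_size_by_partitions(rows, 4)
many = compressed_size_by_partitions(rows, 200)

print(f"raw={raw_size} bytes, 4 partitions={few} bytes, 200 partitions={many} bytes")
```

On this toy data, splitting the same rows into more partitions should give a
larger total compressed size, since every partition has to encode the repeated
values again from scratch. With real data and codecs, the direction and size
of the effect can differ.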

 

Note that this only affects the initial partition number on the map side of
the shuffle.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

