[ https://issues.apache.org/jira/browse/SPARK-37357?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
XiDuo You updated SPARK-37357:
------------------------------
    Summary: Add small partition factor for rebalance partitions  (was: Create skew partition specs should respect min partition size)

> Add small partition factor for rebalance partitions
> ---------------------------------------------------
>
>                 Key: SPARK-37357
>                 URL: https://issues.apache.org/jira/browse/SPARK-37357
>             Project: Spark
>          Issue Type: Sub-task
>          Components: SQL
>    Affects Versions: 3.3.0
>            Reporter: XiDuo You
>            Priority: Major
>
> `Rebalance`, for example, provides functionality that splits a large reduce
> partition into smaller ones. However, we have seen many SQL queries produce
> small files because of the last partition left over after the split.
> Let's say we have one reduce partition and six map partitions, and the blocks
> are:
> [10, 10, 10, 10, 10, 10]
> If the target size is 50, we will get two files of sizes 50 and 10. This gets
> worse when there are thousands of reduce partitions.
> It would be helpful if we could control the min partition size.
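
To make the arithmetic in the description concrete, here is a minimal Python sketch of the splitting behaviour it describes. This is not Spark's implementation: the function name `split_blocks`, the parameter `small_factor`, and the rule "merge the last chunk into its neighbour when it is smaller than `target_size * small_factor`" are assumptions made for illustration, loosely modelled on the "small partition factor" named in the summary.

```python
def split_blocks(block_sizes, target_size, small_factor=0.0):
    """Greedily pack consecutive map blocks into chunks of about target_size.

    small_factor is a hypothetical knob (not a Spark API): a trailing chunk
    smaller than target_size * small_factor is merged into the previous chunk.
    """
    chunks = []
    current = 0
    for size in block_sizes:
        # Close the current chunk when adding this block would exceed the target.
        if current > 0 and current + size > target_size:
            chunks.append(current)
            current = 0
        current += size
    if current > 0:
        chunks.append(current)

    # Assumed merge rule: fold a too-small last chunk into its neighbour.
    if len(chunks) > 1 and chunks[-1] < target_size * small_factor:
        tail = chunks.pop()
        chunks[-1] += tail
    return chunks


blocks = [10, 10, 10, 10, 10, 10]
print(split_blocks(blocks, 50))       # [50, 10], the small-file case from the description
print(split_blocks(blocks, 50, 0.5))  # [60], the 10-sized tail is merged
```

The factor of 0.5 is arbitrary. With this merge rule the result is one file of 60, slightly above the target, instead of files of 50 and 10, which is the trade-off the issue asks to be able to control.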