Manu Zhang created SPARK-32698:
----------------------------------

             Summary: Do not fall back to default parallelism if the minimum 
number of coalesced partitions is not set in AQE
                 Key: SPARK-32698
                 URL: https://issues.apache.org/jira/browse/SPARK-32698
             Project: Spark
          Issue Type: Improvement
          Components: SQL
    Affects Versions: 3.0.0
            Reporter: Manu Zhang


Currently, when AQE coalesces shuffle partitions,
{quote}We fall back to Spark default parallelism if the minimum number of 
coalesced partitions is not set, so to avoid perf regressions compared to no 
coalescing.
{quote}
In our experience, this has caused a lot of uncertainty about the number of 
tasks after coalescing, especially with dynamic allocation, and has also led 
to many small output files. It's complex and hard to reason about.

Hence, I propose that when the minimum number of coalesced partitions is not 
set, we do not fall back to the default parallelism but instead coalesce 
towards the target size.
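
To illustrate the difference, here is a minimal sketch in plain Python. It is 
a simplified model and not Spark's actual implementation: the helper names 
(effective_target_size, coalesce), the sizes, and the default parallelism of 
10 are all made up for illustration. It assumes the fallback works by capping 
the target size at total shuffle size divided by the minimum partition number 
(the value that spark.sql.adaptive.coalescePartitions.minPartitionNum would 
supply if set), and that contiguous partitions are packed greedily.

```python
import math

def effective_target_size(partition_sizes, advisory_target_size, min_num_partitions=None):
    """Hypothetical helper: pick the size that coalescing aims for.

    With a minimum partition number, the target is capped so that roughly
    that many partitions remain. With None (the proposed behavior when the
    minimum is unset), the advisory target size is used as-is.
    """
    if min_num_partitions is None:
        return advisory_target_size
    total = sum(partition_sizes)
    cap = max(math.ceil(total / min_num_partitions), 1)
    return min(advisory_target_size, cap)

def coalesce(partition_sizes, target_size):
    """Greedily pack contiguous partitions without exceeding target_size.

    Returns the start index of each coalesced partition.
    """
    start_indices = []
    current_size = 0
    for i, size in enumerate(partition_sizes):
        if i == 0 or current_size + size > target_size:
            start_indices.append(i)   # begin a new coalesced partition
            current_size = size
        else:
            current_size += size      # merge into the current one
    return start_indices

# 20 shuffle partitions of 10 units each, advisory target of 64 units.
sizes = [10] * 20
assumed_default_parallelism = 10  # depends on the cluster; varies with dynamic allocation

# Current behavior (modeled): fall back to default parallelism as the minimum.
with_fallback = coalesce(sizes, effective_target_size(sizes, 64, assumed_default_parallelism))

# Proposed behavior: no fallback, coalesce towards the advisory target size.
without_fallback = coalesce(sizes, effective_target_size(sizes, 64))

print(len(with_fallback), len(without_fallback))
```

In this model the task count with the fallback follows the default 
parallelism, which can change between runs under dynamic allocation, while 
without it the count depends only on the data size and the target size.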



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
