Manu Zhang created SPARK-32698:
----------------------------------

             Summary: Do not fall back to default parallelism if the minimum number of coalesced partitions is not set in AQE
                 Key: SPARK-32698
                 URL: https://issues.apache.org/jira/browse/SPARK-32698
             Project: Spark
          Issue Type: Improvement
          Components: SQL
    Affects Versions: 3.0.0
            Reporter: Manu Zhang

Currently in AQE, when coalescing shuffle partitions,
{quote}We fall back to Spark default parallelism if the minimum number of coalesced partitions is not set, so to avoid perf regressions compared to no coalescing.
{quote}
In our experience, this has made the number of tasks after coalescing hard to predict, especially with dynamic allocation, and has also led to many small output files. The behavior is complex and hard to reason about.

Hence, I'm proposing not to fall back to the default parallelism, but to coalesce towards the target size when the minimum number of coalesced partitions is not set.
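
To make the difference concrete, here is a minimal sketch in plain Python. It is not Spark's implementation; it only assumes that a minimum partition count caps the effective target size at roughly total size / minimum count, and that adjacent partitions are merged greedily up to that target. The names `coalesce`, `advisory_target` and `min_num_partitions` are hypothetical, and the sizes (200 partitions of 1 MB, a 64 MB target, parallelism of 100 or 400) are chosen for illustration only.

```python
import math


def coalesce(partition_sizes, advisory_target, min_num_partitions):
    """Greedily merge adjacent partitions up to an effective target size.

    Simplified model (an assumption, not Spark's code): the minimum
    partition count caps the target size at ceil(total / min count).
    Returns a list of groups, each a list of original partition indices.
    """
    total = sum(partition_sizes)
    max_target = max(math.ceil(total / min_num_partitions), 1)
    target = min(advisory_target, max_target)

    groups, current, current_size = [], [], 0
    for idx, size in enumerate(partition_sizes):
        # Close the current group if adding this partition would exceed the target.
        if current and current_size + size > target:
            groups.append(current)
            current, current_size = [], 0
        current.append(idx)
        current_size += size
    if current:
        groups.append(current)
    return groups


sizes_mb = [1] * 200   # 200 small shuffle partitions, 1 MB each
target_mb = 64         # illustrative advisory target size

# Current behavior: minimum falls back to default parallelism, which
# varies with the number of cores available (hypothetical values here).
print(len(coalesce(sizes_mb, target_mb, 100)))
print(len(coalesce(sizes_mb, target_mb, 400)))

# Proposed behavior: no fallback, so only the target size matters.
print(len(coalesce(sizes_mb, target_mb, 1)))
```

Under this model the fallback ties the partition count to cluster size (100 and 200 partitions for the two parallelism values), while coalescing purely towards the target size gives 4 partitions regardless of how many executors dynamic allocation happens to have acquired.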