This is an automated email from the ASF dual-hosted git repository. srowen pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/spark.git
The following commit(s) were added to refs/heads/master by this push:
     new 9d4d41c43f1c [SPARK-46760][SQL][DOCS] Make the document of spark.sql.adaptive.coalescePartitions.parallelismFirst clearer
9d4d41c43f1c is described below

commit 9d4d41c43f1cb4cf724e0e27c1762df8bbdf2a54
Author: beliefer <belie...@163.com>
AuthorDate: Sat Feb 3 09:06:38 2024 -0600

    [SPARK-46760][SQL][DOCS] Make the document of spark.sql.adaptive.coalescePartitions.parallelismFirst clearer

    ### What changes were proposed in this pull request?
    This PR proposes to make the document of `spark.sql.adaptive.coalescePartitions.parallelismFirst` clearer.

    ### Why are the changes needed?
    The default value of `spark.sql.adaptive.coalescePartitions.parallelismFirst` is true, but the document says it is `recommended to set this config to false and respect the configured target size`. This is confusing.

    ### Does this PR introduce _any_ user-facing change?
    'Yes'.
    The document is clearer.

    ### How was this patch tested?
    N/A

    ### Was this patch authored or co-authored using generative AI tooling?
    'No'.

    Closes #44787 from beliefer/SPARK-46760.
    Authored-by: beliefer <belie...@163.com>
    Signed-off-by: Sean Owen <sro...@gmail.com>
---
 docs/sql-performance-tuning.md                                 | 2 +-
 .../src/main/scala/org/apache/spark/sql/internal/SQLConf.scala | 5 +++--
 2 files changed, 4 insertions(+), 3 deletions(-)

diff --git a/docs/sql-performance-tuning.md b/docs/sql-performance-tuning.md
index 1dbe1bb7e1a2..25c22d660562 100644
--- a/docs/sql-performance-tuning.md
+++ b/docs/sql-performance-tuning.md
@@ -267,7 +267,7 @@ This feature coalesces the post shuffle partitions based on the map output stati
     <td><code>spark.sql.adaptive.coalescePartitions.parallelismFirst</code></td>
     <td>true</td>
     <td>
-      When true, Spark ignores the target size specified by <code>spark.sql.adaptive.advisoryPartitionSizeInBytes</code> (default 64MB) when coalescing contiguous shuffle partitions, and only respect the minimum partition size specified by <code>spark.sql.adaptive.coalescePartitions.minPartitionSize</code> (default 1MB), to maximize the parallelism. This is to avoid performance regression when enabling adaptive query execution. It's recommended to set this config to false and respect th [...]
+      When true, Spark ignores the target size specified by <code>spark.sql.adaptive.advisoryPartitionSizeInBytes</code> (default 64MB) when coalescing contiguous shuffle partitions, and only respect the minimum partition size specified by <code>spark.sql.adaptive.coalescePartitions.minPartitionSize</code> (default 1MB), to maximize the parallelism. This is to avoid performance regressions when enabling adaptive query execution. It's recommended to set this config to true on a busy clus [...]
     </td>
     <td>3.2.0</td>
   </tr>
diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala
index d88cbed6b27d..1bff0ff1a350 100644
--- a/sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala
+++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala
@@ -713,8 +713,9 @@ object SQLConf {
         "shuffle partitions, but adaptively calculate the target size according to the default " +
         "parallelism of the Spark cluster. The calculated size is usually smaller than the " +
         "configured target size. This is to maximize the parallelism and avoid performance " +
-        "regression when enabling adaptive query execution. It's recommended to set this config " +
-        "to false and respect the configured target size.")
+        "regressions when enabling adaptive query execution. It's recommended to set this " +
+        "config to true on a busy cluster to make resource utilization more efficient (not many " +
+        "small tasks).")
       .version("3.2.0")
       .booleanConf
       .createWithDefault(true)
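
For readers of this commit, the behaviour the config text describes ("adaptively calculate the target size according to the default parallelism", "usually smaller than the configured target size") can be illustrated with a small stand-alone model. This is only a sketch under stated assumptions and is not taken from Spark's source: the object `CoalesceTargetSizeSketch`, the helper `targetSize`, and the 1 GiB / 200-core figures are hypothetical, and capping the computed size at the advisory size is an assumption of the sketch. The 64MB and 1MB defaults are the ones quoted in the documentation above.

```scala
// Simplified, hypothetical model of how the coalescing target size might be
// chosen. It is NOT Spark's implementation; names and numbers are illustrative.
object CoalesceTargetSizeSketch {
  val MiB: Long = 1024L * 1024L

  def targetSize(
      totalShuffleBytes: Long,
      defaultParallelism: Int,
      parallelismFirst: Boolean,
      advisorySize: Long = 64 * MiB,    // spark.sql.adaptive.advisoryPartitionSizeInBytes
      minPartitionSize: Long = 1 * MiB  // spark.sql.adaptive.coalescePartitions.minPartitionSize
  ): Long = {
    if (parallelismFirst) {
      // Aim for at least `defaultParallelism` partitions: split the data evenly,
      // cap at the advisory size (assumption of this sketch), and never go
      // below the minimum partition size.
      val perTask = math.ceil(totalShuffleBytes.toDouble / defaultParallelism).toLong
      math.max(math.min(perTask, advisorySize), minPartitionSize)
    } else {
      // Respect the configured target size.
      math.max(advisorySize, minPartitionSize)
    }
  }

  def main(args: Array[String]): Unit = {
    val total = 1024 * MiB // 1 GiB of shuffle output (made-up figure)
    // With 200 cores and parallelismFirst = true the target is about 5 MiB.
    println(targetSize(total, defaultParallelism = 200, parallelismFirst = true))  // 5368710
    // With parallelismFirst = false the target is the 64 MiB advisory size.
    println(targetSize(total, defaultParallelism = 200, parallelismFirst = false)) // 67108864
  }
}
```

In this model, the `true` setting yields many more, smaller tasks for the same data, which is the trade-off the documentation change is about. In a real session the flag itself would be changed through the usual configuration mechanisms, for example `spark.conf.set("spark.sql.adaptive.coalescePartitions.parallelismFirst", "false")`.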