[GitHub] [spark] beliefer commented on a change in pull request #28096: [SPARK-31295][DOC][FOLLOWUP] Supplement version for configuration appear in doc

GitBox Wed, 01 Apr 2020 23:13:56 -0700

beliefer commented on a change in pull request #28096: 
[SPARK-31295][DOC][FOLLOWUP] Supplement version for configuration appear in doc
URL: https://github.com/apache/spark/pull/28096#discussion_r402073280


 ##########
 File path: docs/sql-performance-tuning.md
 ##########
 @@ -193,34 +200,38 @@ Adaptive Query Execution (AQE) is an optimization technique in Spark SQL that ma
 ### Coalescing Post Shuffle Partitions
 This feature coalesces the post shuffle partitions based on the map output statistics when both `spark.sql.adaptive.enabled` and `spark.sql.adaptive.coalescePartitions.enabled` configurations are true. This feature simplifies the tuning of shuffle partition number when running queries. You do not need to set a proper shuffle partition number to fit your dataset. Spark can pick the proper shuffle partition number at runtime once you set a large enough initial number of shuffle partitions via `spark.sql.adaptive.coalescePartitions.initialPartitionNum` configuration.
  <table class="table">
-   <tr><th>Property Name</th><th>Default</th><th>Meaning</th></tr>
+   <tr><th>Property Name</th><th>Default</th><th>Meaning</th><th>Since Version</th></tr>
    <tr>
      <td><code>spark.sql.adaptive.coalescePartitions.enabled</code></td>
      <td>true</td>
      <td>
        When true and <code>spark.sql.adaptive.enabled</code> is true, Spark will coalesce contiguous shuffle partitions according to the target size (specified by <code>spark.sql.adaptive.advisoryPartitionSizeInBytes</code>), to avoid too many small tasks.
      </td>
+     <td>3.0.0</td>
    </tr>
    <tr>
      <td><code>spark.sql.adaptive.coalescePartitions.minPartitionNum</code></td>
      <td>Default Parallelism</td>
      <td>
        The minimum number of shuffle partitions after coalescing. If not set, the default value is the default parallelism of the Spark cluster. This configuration only has an effect when <code>spark.sql.adaptive.enabled</code> and <code>spark.sql.adaptive.coalescePartitions.enabled</code> are both enabled.
      </td>
+     <td>3.0.0</td>
    </tr>
    <tr>
      <td><code>spark.sql.adaptive.coalescePartitions.initialPartitionNum</code></td>
      <td>200</td>
      <td>
        The initial number of shuffle partitions before coalescing. By default it equals to <code>spark.sql.shuffle.partitions</code>. This configuration only has an effect when <code>spark.sql.adaptive.enabled</code> and <code>spark.sql.adaptive.coalescePartitions.enabled</code> are both enabled.
      </td>
+     <td>3.0.0</td>
    </tr>
    <tr>
      <td><code>spark.sql.adaptive.advisoryPartitionSizeInBytes</code></td>
      <td>64 MB</td>
      <td>
        The advisory size in bytes of the shuffle partition during adaptive optimization (when <code>spark.sql.adaptive.enabled</code> is true). It takes effect when Spark coalesces small shuffle partitions or splits skewed shuffle partition.
      </td>
+     <td>3.0.0</td>
 
 Review comment:
   SPARK-31037, commit ID: 46b7f1796bd0b96977ce9b473601033f397a3b18#diff-9a6b543db706f1a90f790783d6930a13
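
For context on the quoted hunk: the behaviour it documents, merging contiguous shuffle partitions up to an advisory target size, can be illustrated with a small standalone sketch. This is not Spark's implementation. The function name `coalesce_partitions`, its parameters, and the simple greedy rule are all assumptions chosen for illustration, and the sketch ignores the minimum partition number setting.

```python
from typing import List


def coalesce_partitions(sizes: List[int], target_size: int) -> List[List[int]]:
    """Group contiguous partition indices so each group stays near target_size.

    Hypothetical illustration only: a greedy pass that starts a new group
    whenever adding the next partition would push the running total past
    target_size. A single partition larger than the target forms its own group.
    """
    groups: List[List[int]] = []
    current: List[int] = []
    current_size = 0
    for index, size in enumerate(sizes):
        # Close the current group if this partition would overflow the target.
        if current and current_size + size > target_size:
            groups.append(current)
            current, current_size = [], 0
        current.append(index)
        current_size += size
    if current:
        groups.append(current)
    return groups


if __name__ == "__main__":
    # Partition sizes in MB, with a 64 MB advisory target as in the table's default.
    print(coalesce_partitions([10, 20, 30, 40, 5, 5, 70], 64))
```

With these inputs the seven initial partitions collapse into three groups, which is the sense in which a generously large initial partition number can be chosen up front and reduced at runtime.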

