This is an automated email from the ASF dual-hosted git repository.
github-bot pushed a commit to branch asf-site
in repository https://gitbox.apache.org/repos/asf/datafusion-comet.git
The following commit(s) were added to refs/heads/asf-site by this push:
new cff2dc90e Publish built docs triggered by 5ec12d4024e78faa97834f88ab606ecd3d0db5fb
cff2dc90e is described below
commit cff2dc90e2e081e9f288efc9d272a7f491dd3218
Author: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
AuthorDate: Tue Dec 16 01:29:29 2025 +0000
Publish built docs triggered by 5ec12d4024e78faa97834f88ab606ecd3d0db5fb
---
_sources/user-guide/latest/configs.md.txt | 1 +
searchindex.js | 2 +-
user-guide/latest/configs.html | 12 ++++++++----
3 files changed, 10 insertions(+), 5 deletions(-)
diff --git a/_sources/user-guide/latest/configs.md.txt b/_sources/user-guide/latest/configs.md.txt
index 5b416f927..db7d2ce32 100644
--- a/_sources/user-guide/latest/configs.md.txt
+++ b/_sources/user-guide/latest/configs.md.txt
@@ -107,6 +107,7 @@ These settings can be used to determine which parts of the plan are accelerated
| `spark.comet.exec.shuffle.compression.codec` | The codec of Comet native shuffle used to compress shuffle data. lz4, zstd, and snappy are supported. Compression can be disabled by setting spark.shuffle.compress=false. | lz4 |
| `spark.comet.exec.shuffle.compression.zstd.level` | The compression level to use when compressing shuffle files with zstd. | 1 |
| `spark.comet.exec.shuffle.enabled` | Whether to enable Comet native shuffle. Note that this requires setting `spark.shuffle.manager` to `org.apache.spark.sql.comet.execution.shuffle.CometShuffleManager`. `spark.shuffle.manager` must be set before starting the Spark application and cannot be changed during the application. | true |
+| `spark.comet.exec.shuffle.writeBufferSize` | Size of the write buffer in bytes used by the native shuffle writer when writing shuffle data to disk. Larger values may improve write performance by reducing the number of system calls, but will use more memory. The default is 1MB which provides a good balance between performance and memory usage. | 1048576b |
| `spark.comet.native.shuffle.partitioning.hash.enabled` | Whether to enable hash partitioning for Comet native shuffle. | true |
| `spark.comet.native.shuffle.partitioning.range.enabled` | Whether to enable range partitioning for Comet native shuffle. | true |
| `spark.comet.shuffle.preferDictionary.ratio` | The ratio of total values to distinct values in a string column to decide whether to prefer dictionary encoding when shuffling the column. If the ratio is higher than this config, dictionary encoding will be used on shuffling string column. This config is effective if it is higher than 1.0. Note that this config is only used when `spark.comet.exec.shuffle.mode` is `jvm`. | 10.0 |
diff --git a/searchindex.js b/searchindex.js
index c0574860c..21725b7e3 100644
--- a/searchindex.js
+++ b/searchindex.js
@@ -1 +1 @@
-Search.setIndex({"alltitles": {"1. Install Comet": [[19, "install-comet"]],
"1. Native Operators (nativeExecs map)": [[4,
"native-operators-nativeexecs-map"]], "2. Clone Spark and Apply Diff": [[19,
"clone-spark-and-apply-diff"]], "2. Sink Operators (sinks map)": [[4,
"sink-operators-sinks-map"]], "3. Comet JVM Operators": [[4,
"comet-jvm-operators"]], "3. Run Spark SQL Tests": [[19,
"run-spark-sql-tests"]], "ANSI Mode": [[22, "ansi-mode"], [35, "ansi-mode"],
[48, "ansi-mode"], [88, "ans [...]
\ No newline at end of file
+Search.setIndex({"alltitles": {"1. Install Comet": [[19, "install-comet"]],
"1. Native Operators (nativeExecs map)": [[4,
"native-operators-nativeexecs-map"]], "2. Clone Spark and Apply Diff": [[19,
"clone-spark-and-apply-diff"]], "2. Sink Operators (sinks map)": [[4,
"sink-operators-sinks-map"]], "3. Comet JVM Operators": [[4,
"comet-jvm-operators"]], "3. Run Spark SQL Tests": [[19,
"run-spark-sql-tests"]], "ANSI Mode": [[22, "ansi-mode"], [35, "ansi-mode"],
[48, "ansi-mode"], [88, "ans [...]
\ No newline at end of file
diff --git a/user-guide/latest/configs.html b/user-guide/latest/configs.html
index e110a2fff..f54d4c7f8 100644
--- a/user-guide/latest/configs.html
+++ b/user-guide/latest/configs.html
@@ -700,19 +700,23 @@ under the License.
<td><p>Whether to enable Comet native shuffle. Note that this requires setting
<code class="docutils literal notranslate"><span
class="pre">spark.shuffle.manager</span></code> to <code class="docutils
literal notranslate"><span
class="pre">org.apache.spark.sql.comet.execution.shuffle.CometShuffleManager</span></code>.
<code class="docutils literal notranslate"><span
class="pre">spark.shuffle.manager</span></code> must be set before starting the
Spark application and cannot be changed dur [...]
<td><p>true</p></td>
</tr>
-<tr class="row-odd"><td><p><code class="docutils literal notranslate"><span
class="pre">spark.comet.native.shuffle.partitioning.hash.enabled</span></code></p></td>
+<tr class="row-odd"><td><p><code class="docutils literal notranslate"><span
class="pre">spark.comet.exec.shuffle.writeBufferSize</span></code></p></td>
+<td><p>Size of the write buffer in bytes used by the native shuffle writer
when writing shuffle data to disk. Larger values may improve write performance
by reducing the number of system calls, but will use more memory. The default
is 1MB which provides a good balance between performance and memory
usage.</p></td>
+<td><p>1048576b</p></td>
+</tr>
+<tr class="row-even"><td><p><code class="docutils literal notranslate"><span
class="pre">spark.comet.native.shuffle.partitioning.hash.enabled</span></code></p></td>
<td><p>Whether to enable hash partitioning for Comet native shuffle.</p></td>
<td><p>true</p></td>
</tr>
-<tr class="row-even"><td><p><code class="docutils literal notranslate"><span
class="pre">spark.comet.native.shuffle.partitioning.range.enabled</span></code></p></td>
+<tr class="row-odd"><td><p><code class="docutils literal notranslate"><span
class="pre">spark.comet.native.shuffle.partitioning.range.enabled</span></code></p></td>
<td><p>Whether to enable range partitioning for Comet native shuffle.</p></td>
<td><p>true</p></td>
</tr>
-<tr class="row-odd"><td><p><code class="docutils literal notranslate"><span
class="pre">spark.comet.shuffle.preferDictionary.ratio</span></code></p></td>
+<tr class="row-even"><td><p><code class="docutils literal notranslate"><span
class="pre">spark.comet.shuffle.preferDictionary.ratio</span></code></p></td>
<td><p>The ratio of total values to distinct values in a string column to
decide whether to prefer dictionary encoding when shuffling the column. If the
ratio is higher than this config, dictionary encoding will be used on shuffling
string column. This config is effective if it is higher than 1.0. Note that
this config is only used when <code class="docutils literal notranslate"><span
class="pre">spark.comet.exec.shuffle.mode</span></code> is <code
class="docutils literal notranslate"><s [...]
<td><p>10.0</p></td>
</tr>
-<tr class="row-even"><td><p><code class="docutils literal notranslate"><span
class="pre">spark.comet.shuffle.sizeInBytesMultiplier</span></code></p></td>
+<tr class="row-odd"><td><p><code class="docutils literal notranslate"><span
class="pre">spark.comet.shuffle.sizeInBytesMultiplier</span></code></p></td>
<td><p>Comet reports smaller sizes for shuffle due to using Arrow’s columnar
memory format and this can result in Spark choosing a different join strategy
due to the estimated size of the exchange being smaller. Comet will multiple
sizeInBytes by this amount to avoid regressions in join strategy.</p></td>
<td><p>1.0</p></td>
</tr>
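
The new `spark.comet.exec.shuffle.writeBufferSize` entry trades memory for fewer system calls. The sketch below illustrates that trade-off with rough arithmetic. It assumes one write call per full (or final partial) buffer, which is a simplification and not a description of how Comet's native shuffle writer actually behaves. The helper names are hypothetical, and the `--conf` string assumes the setting accepts a plain byte count with a `b` suffix, as the documented default `1048576b` suggests.

```python
import math

# Default taken from the docs table in this commit: 1048576b (1 MiB).
DEFAULT_WRITE_BUFFER_BYTES = 1048576


def parse_byte_size(value: str) -> int:
    """Parse a plain byte-count string such as '1048576b' (hypothetical helper).

    Only a bare integer with an optional trailing 'b' is handled; other
    unit suffixes are deliberately out of scope for this sketch.
    """
    text = value.strip().lower()
    if text.endswith("b"):
        text = text[:-1]
    return int(text)


def estimated_flushes(total_bytes: int, buffer_bytes: int) -> int:
    """Estimate write calls, assuming one call per full or final partial buffer."""
    if buffer_bytes <= 0:
        raise ValueError("buffer_bytes must be positive")
    return math.ceil(total_bytes / buffer_bytes)


def as_spark_conf_arg(buffer_bytes: int) -> str:
    """Render the setting as a spark-submit style --conf argument."""
    return f"--conf spark.comet.exec.shuffle.writeBufferSize={buffer_bytes}b"


if __name__ == "__main__":
    shuffle_bytes = 256 * 1024 * 1024  # illustrative 256 MiB of shuffle output
    for size in (DEFAULT_WRITE_BUFFER_BYTES, 4 * DEFAULT_WRITE_BUFFER_BYTES):
        print(as_spark_conf_arg(size), "->", estimated_flushes(shuffle_bytes, size), "flushes")
```

Under this simplified model, quadrupling the buffer cuts the estimated flush count to a quarter while using four times the buffer memory per writer. Whether that helps in practice depends on the workload and should be measured.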