This is an automated email from the ASF dual-hosted git repository.
github-bot pushed a commit to branch asf-site
in repository https://gitbox.apache.org/repos/asf/datafusion-comet.git
The following commit(s) were added to refs/heads/asf-site by this push:
     new f8eac5ed Publish built docs triggered by 4ede2144316eccbe562078066ef9be8ca1deeeae
f8eac5ed is described below
commit f8eac5edf204c87cb7d616bbcaa9b6d735dbf93a
Author: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
AuthorDate: Mon Sep 16 17:32:28 2024 +0000
Publish built docs triggered by 4ede2144316eccbe562078066ef9be8ca1deeeae
---
_sources/user-guide/configs.md.txt | 1 +
searchindex.js | 2 +-
user-guide/configs.html | 38 +++++++++++++++++++++-----------------
3 files changed, 23 insertions(+), 18 deletions(-)
diff --git a/_sources/user-guide/configs.md.txt b/_sources/user-guide/configs.md.txt
index 1b5fe736..ff2db342 100644
--- a/_sources/user-guide/configs.md.txt
+++ b/_sources/user-guide/configs.md.txt
@@ -54,6 +54,7 @@ Comet provides the following configuration settings.
| spark.comet.exec.shuffle.enabled | Whether to enable Comet native shuffle. Note that this requires setting 'spark.shuffle.manager' to 'org.apache.spark.sql.comet.execution.shuffle.CometShuffleManager'. 'spark.shuffle.manager' must be set before starting the Spark application and cannot be changed during the application. | true |
| spark.comet.exec.sort.enabled | Whether to enable sort by default. | true |
| spark.comet.exec.sortMergeJoin.enabled | Whether to enable sortMergeJoin by default. | true |
+| spark.comet.exec.sortMergeJoinWithJoinFilter.enabled | Experimental support for Sort Merge Join with filter | false |
| spark.comet.exec.stddev.enabled | Whether to enable stddev by default. stddev is slower than Spark's implementation. | true |
| spark.comet.exec.takeOrderedAndProject.enabled | Whether to enable takeOrderedAndProject by default. | true |
| spark.comet.exec.union.enabled | Whether to enable union by default. | true |
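The shuffle row above notes that `spark.comet.exec.shuffle.enabled` only takes effect when `spark.shuffle.manager` is set before the application starts. A minimal sketch of what that could look like with `spark-submit` follows; the plugin class and the application jar name are assumptions, not taken from this diff:

```shell
# Sketch: enable Comet native shuffle at launch time.
# spark.shuffle.manager cannot be changed once the application is running,
# so both shuffle configs must be passed at startup. The plugin class and
# my-app.jar below are placeholders/assumptions.
spark-submit \
  --conf spark.plugins=org.apache.spark.CometPlugin \
  --conf spark.shuffle.manager=org.apache.spark.sql.comet.execution.shuffle.CometShuffleManager \
  --conf spark.comet.exec.shuffle.enabled=true \
  my-app.jar
```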
diff --git a/searchindex.js b/searchindex.js
index 3b1fb6e9..3cbbe3f7 100644
--- a/searchindex.js
+++ b/searchindex.js
@@ -1 +1 @@
-Search.setIndex({"alltitles": {"1. Install Comet": [[9, "install-comet"]], "2. Clone Spark and Apply Diff": [[9, "clone-spark-and-apply-diff"]], "3. Run Spark SQL Tests": [[9, "run-spark-sql-tests"]], "ANSI mode": [[11, "ansi-mode"]], "API Differences Between Spark Versions": [[0, "api-differences-between-spark-versions"]], "ASF Links": [[10, null]], "Adding Spark-side Tests for the New Expression": [[0, "adding-spark-side-tests-for-the-new-expression"]], "Adding a New Expression": [[0, [...]
\ No newline at end of file
+Search.setIndex({"alltitles": {"1. Install Comet": [[9, "install-comet"]], "2. Clone Spark and Apply Diff": [[9, "clone-spark-and-apply-diff"]], "3. Run Spark SQL Tests": [[9, "run-spark-sql-tests"]], "ANSI mode": [[11, "ansi-mode"]], "API Differences Between Spark Versions": [[0, "api-differences-between-spark-versions"]], "ASF Links": [[10, null]], "Adding Spark-side Tests for the New Expression": [[0, "adding-spark-side-tests-for-the-new-expression"]], "Adding a New Expression": [[0, [...]
\ No newline at end of file
diff --git a/user-guide/configs.html b/user-guide/configs.html
index 3ceedde6..88e901ec 100644
--- a/user-guide/configs.html
+++ b/user-guide/configs.html
@@ -447,71 +447,75 @@ under the License.
<td><p>Whether to enable sortMergeJoin by default.</p></td>
<td><p>true</p></td>
</tr>
-<tr class="row-odd"><td><p>spark.comet.exec.stddev.enabled</p></td>
+<tr class="row-odd"><td><p>spark.comet.exec.sortMergeJoinWithJoinFilter.enabled</p></td>
+<td><p>Experimental support for Sort Merge Join with filter</p></td>
+<td><p>false</p></td>
+</tr>
+<tr class="row-even"><td><p>spark.comet.exec.stddev.enabled</p></td>
<td><p>Whether to enable stddev by default. stddev is slower than Spark’s implementation.</p></td>
<td><p>true</p></td>
</tr>
-<tr class="row-even"><td><p>spark.comet.exec.takeOrderedAndProject.enabled</p></td>
+<tr class="row-odd"><td><p>spark.comet.exec.takeOrderedAndProject.enabled</p></td>
<td><p>Whether to enable takeOrderedAndProject by default.</p></td>
<td><p>true</p></td>
</tr>
-<tr class="row-odd"><td><p>spark.comet.exec.union.enabled</p></td>
+<tr class="row-even"><td><p>spark.comet.exec.union.enabled</p></td>
<td><p>Whether to enable union by default.</p></td>
<td><p>true</p></td>
</tr>
-<tr class="row-even"><td><p>spark.comet.exec.window.enabled</p></td>
+<tr class="row-odd"><td><p>spark.comet.exec.window.enabled</p></td>
<td><p>Whether to enable window by default.</p></td>
<td><p>true</p></td>
</tr>
-<tr class="row-odd"><td><p>spark.comet.explain.native.enabled</p></td>
+<tr class="row-even"><td><p>spark.comet.explain.native.enabled</p></td>
<td><p>When this setting is enabled, Comet will provide a tree representation of the native query plan before execution and again after execution, with metrics.</p></td>
<td><p>false</p></td>
</tr>
-<tr class="row-even"><td><p>spark.comet.explain.verbose.enabled</p></td>
+<tr class="row-odd"><td><p>spark.comet.explain.verbose.enabled</p></td>
<td><p>When this setting is enabled, Comet will provide a verbose tree representation of the extended information.</p></td>
<td><p>false</p></td>
</tr>
-<tr class="row-odd"><td><p>spark.comet.explainFallback.enabled</p></td>
+<tr class="row-even"><td><p>spark.comet.explainFallback.enabled</p></td>
<td><p>When this setting is enabled, Comet will provide logging explaining the reason(s) why a query stage cannot be executed natively. Set this to false to reduce the amount of logging.</p></td>
<td><p>false</p></td>
</tr>
-<tr class="row-even"><td><p>spark.comet.memory.overhead.factor</p></td>
+<tr class="row-odd"><td><p>spark.comet.memory.overhead.factor</p></td>
<td><p>Fraction of executor memory to be allocated as additional non-heap memory per executor process for Comet. Default value is 0.2.</p></td>
<td><p>0.2</p></td>
</tr>
-<tr class="row-odd"><td><p>spark.comet.memory.overhead.min</p></td>
+<tr class="row-even"><td><p>spark.comet.memory.overhead.min</p></td>
<td><p>Minimum amount of additional memory to be allocated per executor process for Comet, in MiB.</p></td>
<td><p>402653184b</p></td>
</tr>
-<tr class="row-even"><td><p>spark.comet.nativeLoadRequired</p></td>
+<tr class="row-odd"><td><p>spark.comet.nativeLoadRequired</p></td>
<td><p>Whether to require Comet native library to load successfully when Comet is enabled. If not, Comet will silently fallback to Spark when it fails to load the native lib. Otherwise, an error will be thrown and the Spark job will be aborted.</p></td>
<td><p>false</p></td>
</tr>
-<tr class="row-odd"><td><p>spark.comet.parquet.enable.directBuffer</p></td>
+<tr class="row-even"><td><p>spark.comet.parquet.enable.directBuffer</p></td>
<td><p>Whether to use Java direct byte buffer when reading Parquet. By default, this is false</p></td>
<td><p>false</p></td>
</tr>
-<tr class="row-even"><td><p>spark.comet.regexp.allowIncompatible</p></td>
+<tr class="row-odd"><td><p>spark.comet.regexp.allowIncompatible</p></td>
<td><p>Comet is not currently fully compatible with Spark for all regular expressions. Set this config to true to allow them anyway using Rust’s regular expression engine. See compatibility guide for more information.</p></td>
<td><p>false</p></td>
</tr>
-<tr class="row-odd"><td><p>spark.comet.scan.enabled</p></td>
+<tr class="row-even"><td><p>spark.comet.scan.enabled</p></td>
<td><p>Whether to enable native scans. When this is turned on, Spark will use Comet to read supported data sources (currently only Parquet is supported natively). Note that to enable native vectorized execution, both this config and ‘spark.comet.exec.enabled’ need to be enabled. By default, this config is true.</p></td>
<td><p>true</p></td>
</tr>
-<tr class="row-even"><td><p>spark.comet.scan.preFetch.enabled</p></td>
+<tr class="row-odd"><td><p>spark.comet.scan.preFetch.enabled</p></td>
<td><p>Whether to enable pre-fetching feature of CometScan. By default is disabled.</p></td>
<td><p>false</p></td>
</tr>
-<tr class="row-odd"><td><p>spark.comet.scan.preFetch.threadNum</p></td>
+<tr class="row-even"><td><p>spark.comet.scan.preFetch.threadNum</p></td>
<td><p>The number of threads running pre-fetching for CometScan. Effective if spark.comet.scan.preFetch.enabled is enabled. By default it is 2. Note that more pre-fetching threads means more memory requirement to store pre-fetched row groups.</p></td>
<td><p>2</p></td>
</tr>
-<tr class="row-even"><td><p>spark.comet.shuffle.preferDictionary.ratio</p></td>
+<tr class="row-odd"><td><p>spark.comet.shuffle.preferDictionary.ratio</p></td>
<td><p>The ratio of total values to distinct values in a string column to decide whether to prefer dictionary encoding when shuffling the column. If the ratio is higher than this config, dictionary encoding will be used on shuffling string column. This config is effective if it is higher than 1.0. By default, this config is 10.0. Note that this config is only used when <code class="docutils literal notranslate"><span class="pre">spark.comet.exec.shuffle.mode</span></code> is <code class= [...]
<td><p>10.0</p></td>
</tr>
-<tr class="row-odd"><td><p>spark.comet.sparkToColumnar.supportedOperatorList</p></td>
+<tr class="row-even"><td><p>spark.comet.sparkToColumnar.supportedOperatorList</p></td>
<td><p>A comma-separated list of operators that will be converted to Arrow columnar format when ‘spark.comet.sparkToColumnar.enabled’ is true</p></td>
<td><p>Range,InMemoryTableScan</p></td>
</tr>
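The row this commit adds, `spark.comet.exec.sortMergeJoinWithJoinFilter.enabled`, defaults to `false`, so the experimental feature must be opted into explicitly. A hedged sketch of doing so at launch time; the application jar name is a placeholder:

```shell
# Sketch: opt in to the experimental sort merge join with join filter.
# Per the table above it defaults to false; sortMergeJoin itself defaults
# to true but is shown here for clarity. my-app.jar is a placeholder.
spark-submit \
  --conf spark.comet.exec.sortMergeJoin.enabled=true \
  --conf spark.comet.exec.sortMergeJoinWithJoinFilter.enabled=true \
  my-app.jar
```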
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]