This is an automated email from the ASF dual-hosted git repository.
github-bot pushed a commit to branch asf-site
in repository https://gitbox.apache.org/repos/asf/datafusion.git
The following commit(s) were added to refs/heads/asf-site by this push:
new 7d89a4bc95 Publish built docs triggered by c98fa5616e70ab70579d320bdd68d9b5ca1ead3e
7d89a4bc95 is described below
commit 7d89a4bc95eae2ad58a6a1fe8ab1857310011063
Author: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
AuthorDate: Fri Jan 9 10:09:45 2026 +0000
Publish built docs triggered by c98fa5616e70ab70579d320bdd68d9b5ca1ead3e
---
_sources/user-guide/configs.md.txt | 2 ++
searchindex.js | 2 +-
user-guide/configs.html | 8 ++++++++
3 files changed, 11 insertions(+), 1 deletion(-)
diff --git a/_sources/user-guide/configs.md.txt b/_sources/user-guide/configs.md.txt
index c9222afe8c..b59af0c13d 100644
--- a/_sources/user-guide/configs.md.txt
+++ b/_sources/user-guide/configs.md.txt
@@ -74,6 +74,8 @@ The following configuration settings are available:
| datafusion.catalog.has_header |
true | Default value for `format.has_header` for `CREATE
EXTERNAL TABLE` if not specified explicitly in the statement.
[...]
| datafusion.catalog.newlines_in_values |
false | Specifies whether newlines in (quoted) CSV values
are supported. This is the default value for `format.newlines_in_values` for
`CREATE EXTERNAL TABLE` if not specified explicitly in the statement. Parsing
newlines in quoted values may be affected by execution behaviour such as
parallel file scanning. Setting this to `true` ensures that newlines in values
are parsed successfully, which [...]
| datafusion.execution.batch_size |
8192 | Default batch size while creating new batches, it's
especially useful for buffer-in-memory batches since creating tiny batches
would result in too much metadata memory consumption
[...]
+| datafusion.execution.perfect_hash_join_small_build_threshold |
1024 | A perfect hash join (see `HashJoinExec` for more
details) will be considered if the range of keys (max - min) on the build side
is < this threshold. This provides a fast path for joins with very small key
ranges, bypassing the density check. Currently only supports cases where
build_side.num_rows() < u32::MAX. Support for build_side.num_rows() >= u32::MAX
will be added in the future. [...]
+| datafusion.execution.perfect_hash_join_min_key_density |
0.15 | The minimum required density of join keys on the
build side to consider a perfect hash join (see `HashJoinExec` for more
details). Density is calculated as: `(number of rows) / (max_key - min_key +
1)`. A perfect hash join may be used if the actual key density > this value.
Currently only supports cases where build_side.num_rows() < u32::MAX. Support
for build_side.num_rows() >= u32::M [...]
| datafusion.execution.coalesce_batches |
true | When set to true, record batches will be examined
between each operator and small batches will be coalesced into larger batches.
This is helpful when there are highly selective filters or joins that could
produce tiny output batches. The target batch size is determined by the
configuration setting
[...]
| datafusion.execution.collect_statistics |
true | Should DataFusion collect statistics when first
creating a table. Has no effect after the table is created. Applies to the
default `ListingTableProvider` in DataFusion. Defaults to true.
[...]
| datafusion.execution.target_partitions | 0
| Number of partitions for query execution. Increasing
partitions can increase concurrency. Defaults to the number of CPU cores on the
system
[...]
diff --git a/searchindex.js b/searchindex.js
index 9dff39054e..7397982bde 100644
--- a/searchindex.js
+++ b/searchindex.js
@@ -1 +1 @@
-Search.setIndex({"alltitles":{"!=":[[61,"op-neq"]],"!~":[[61,"op-re-not-match"]],"!~*":[[61,"op-re-not-match-i"]],"!~~":[[61,"id19"]],"!~~*":[[61,"id20"]],"#":[[61,"op-bit-xor"]],"%":[[61,"op-modulo"]],"&":[[61,"op-bit-and"]],"(relation,
name) tuples in logical fields and logical columns are
unique":[[13,"relation-name-tuples-in-logical-fields-and-logical-columns-are-unique"]],"*":[[61,"op-multiply"]],"+":[[61,"op-plus"]],"-":[[61,"op-minus"]],"/":[[61,"op-divide"]],"<":[[61,"op-lt"]],"<
[...]
\ No newline at end of file
+Search.setIndex({"alltitles":{"!=":[[61,"op-neq"]],"!~":[[61,"op-re-not-match"]],"!~*":[[61,"op-re-not-match-i"]],"!~~":[[61,"id19"]],"!~~*":[[61,"id20"]],"#":[[61,"op-bit-xor"]],"%":[[61,"op-modulo"]],"&":[[61,"op-bit-and"]],"(relation,
name) tuples in logical fields and logical columns are
unique":[[13,"relation-name-tuples-in-logical-fields-and-logical-columns-are-unique"]],"*":[[61,"op-multiply"]],"+":[[61,"op-plus"]],"-":[[61,"op-minus"]],"/":[[61,"op-divide"]],"<":[[61,"op-lt"]],"<
[...]
\ No newline at end of file
diff --git a/user-guide/configs.html b/user-guide/configs.html
index f7652e756c..59245d58eb 100644
--- a/user-guide/configs.html
+++ b/user-guide/configs.html
@@ -482,6 +482,14 @@ example, to configure <code class="docutils literal
notranslate"><span class="pr
<td><p>8192</p></td>
<td><p>Default batch size while creating new batches, it’s especially useful
for buffer-in-memory batches since creating tiny batches would result in too
much metadata memory consumption</p></td>
</tr>
+<tr
class="row-odd"><td><p>datafusion.execution.perfect_hash_join_small_build_threshold</p></td>
+<td><p>1024</p></td>
+<td><p>A perfect hash join (see <code class="docutils literal
notranslate"><span class="pre">HashJoinExec</span></code> for more details)
will be considered if the range of keys (max - min) on the build side is <
this threshold. This provides a fast path for joins with very small key ranges,
bypassing the density check. Currently only supports cases where
build_side.num_rows() < u32::MAX. Support for build_side.num_rows() >=
u32::MAX will be added in the future.</p></td>
+</tr>
+<tr
class="row-even"><td><p>datafusion.execution.perfect_hash_join_min_key_density</p></td>
+<td><p>0.15</p></td>
+<td><p>The minimum required density of join keys on the build side to consider
a perfect hash join (see <code class="docutils literal notranslate"><span
class="pre">HashJoinExec</span></code> for more details). Density is calculated
as: <code class="docutils literal notranslate"><span class="pre">(number</span>
<span class="pre">of</span> <span class="pre">rows)</span> <span
class="pre">/</span> <span class="pre">(max_key</span> <span
class="pre">-</span> <span class="pre">min_key</span> [...]
+</tr>
<tr class="row-odd"><td><p>datafusion.execution.coalesce_batches</p></td>
<td><p>true</p></td>
<td><p>When set to true, record batches will be examined between each operator
and small batches will be coalesced into larger batches. This is helpful when
there are highly selective filters or joins that could produce tiny output
batches. The target batch size is determined by the configuration
setting</p></td>
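
The two settings added in this commit describe a two-step eligibility rule for a perfect hash join. The sketch below restates that rule in Python, using only the text visible in the diff. It is an illustration and not DataFusion's implementation: the function names and arguments are hypothetical, and because the descriptions above are truncated (`[...]`), the real check in `HashJoinExec` may include further conditions.

```python
# Illustrative sketch only; names here are hypothetical, not DataFusion APIs.
U32_MAX = 2**32 - 1  # value of Rust's u32::MAX


def key_density(num_rows, min_key, max_key):
    """Density as given in the docs: (number of rows) / (max_key - min_key + 1)."""
    return num_rows / (max_key - min_key + 1)


def may_use_perfect_hash_join(
    num_rows,
    min_key,
    max_key,
    small_build_threshold=1024,  # perfect_hash_join_small_build_threshold default
    min_key_density=0.15,        # perfect_hash_join_min_key_density default
):
    """Return True if the build side passes the documented checks."""
    # Documented limitation: only build sides with num_rows < u32::MAX.
    if num_rows >= U32_MAX:
        return False
    # Fast path: a very small key range (max - min) bypasses the density check.
    if max_key - min_key < small_build_threshold:
        return True
    # Otherwise the actual key density must exceed the configured minimum.
    return key_density(num_rows, min_key, max_key) > min_key_density


if __name__ == "__main__":
    # Key range 500 is below 1024, so the fast path applies.
    print(may_use_perfect_hash_join(100, 0, 500))
    # 1000 rows over 10000 possible keys: density 0.1, not above 0.15.
    print(may_use_perfect_hash_join(1000, 0, 9999))
    # 2000 rows over 10000 possible keys: density 0.2, above 0.15.
    print(may_use_perfect_hash_join(2000, 0, 9999))
```

Under this reading, raising `perfect_hash_join_min_key_density` makes the density path stricter, while raising `perfect_hash_join_small_build_threshold` lets wider key ranges take the fast path.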