This is an automated email from the ASF dual-hosted git repository.

github-bot pushed a commit to branch asf-site
in repository https://gitbox.apache.org/repos/asf/datafusion.git


The following commit(s) were added to refs/heads/asf-site by this push:
     new 7d89a4bc95 Publish built docs triggered by 
c98fa5616e70ab70579d320bdd68d9b5ca1ead3e
7d89a4bc95 is described below

commit 7d89a4bc95eae2ad58a6a1fe8ab1857310011063
Author: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
AuthorDate: Fri Jan 9 10:09:45 2026 +0000

    Publish built docs triggered by c98fa5616e70ab70579d320bdd68d9b5ca1ead3e
---
 _sources/user-guide/configs.md.txt | 2 ++
 searchindex.js                     | 2 +-
 user-guide/configs.html            | 8 ++++++++
 3 files changed, 11 insertions(+), 1 deletion(-)

diff --git a/_sources/user-guide/configs.md.txt 
b/_sources/user-guide/configs.md.txt
index c9222afe8c..b59af0c13d 100644
--- a/_sources/user-guide/configs.md.txt
+++ b/_sources/user-guide/configs.md.txt
@@ -74,6 +74,8 @@ The following configuration settings are available:
 | datafusion.catalog.has_header                                           | 
true                      | Default value for `format.has_header` for `CREATE 
EXTERNAL TABLE` if not specified explicitly in the statement.                   
                                                                                
                                                                                
                                                                                
                   [...]
 | datafusion.catalog.newlines_in_values                                   | 
false                     | Specifies whether newlines in (quoted) CSV values 
are supported. This is the default value for `format.newlines_in_values` for 
`CREATE EXTERNAL TABLE` if not specified explicitly in the statement. Parsing 
newlines in quoted values may be affected by execution behaviour such as 
parallel file scanning. Setting this to `true` ensures that newlines in values 
are parsed successfully, which  [...]
 | datafusion.execution.batch_size                                         | 
8192                      | Default batch size while creating new batches, it's 
especially useful for buffer-in-memory batches since creating tiny batches 
would result in too much metadata memory consumption                            
                                                                                
                                                                                
                      [...]
+| datafusion.execution.perfect_hash_join_small_build_threshold            | 
1024                      | A perfect hash join (see `HashJoinExec` for more 
details) will be considered if the range of keys (max - min) on the build side 
is < this threshold. This provides a fast path for joins with very small key 
ranges, bypassing the density check. Currently only supports cases where 
build_side.num_rows() < u32::MAX. Support for build_side.num_rows() >= u32::MAX 
will be added in the future.   [...]
+| datafusion.execution.perfect_hash_join_min_key_density                  | 
0.15                      | The minimum required density of join keys on the 
build side to consider a perfect hash join (see `HashJoinExec` for more 
details). Density is calculated as: `(number of rows) / (max_key - min_key + 
1)`. A perfect hash join may be used if the actual key density > this value. 
Currently only supports cases where build_side.num_rows() < u32::MAX. Support 
for build_side.num_rows() >= u32::M [...]
 | datafusion.execution.coalesce_batches                                   | 
true                      | When set to true, record batches will be examined 
between each operator and small batches will be coalesced into larger batches. 
This is helpful when there are highly selective filters or joins that could 
produce tiny output batches. The target batch size is determined by the 
configuration setting                                                           
                                [...]
 | datafusion.execution.collect_statistics                                 | 
true                      | Should DataFusion collect statistics when first 
creating a table. Has no effect after the table is created. Applies to the 
default `ListingTableProvider` in DataFusion. Defaults to true.                 
                                                                                
                                                                                
                          [...]
 | datafusion.execution.target_partitions                                  | 0  
                       | Number of partitions for query execution. Increasing 
partitions can increase concurrency. Defaults to the number of CPU cores on the 
system                                                                          
                                                                                
                                                                                
                [...]
diff --git a/searchindex.js b/searchindex.js
index 9dff39054e..7397982bde 100644
--- a/searchindex.js
+++ b/searchindex.js
@@ -1 +1 @@
-Search.setIndex({"alltitles":{"!=":[[61,"op-neq"]],"!~":[[61,"op-re-not-match"]],"!~*":[[61,"op-re-not-match-i"]],"!~~":[[61,"id19"]],"!~~*":[[61,"id20"]],"#":[[61,"op-bit-xor"]],"%":[[61,"op-modulo"]],"&":[[61,"op-bit-and"]],"(relation,
 name) tuples in logical fields and logical columns are 
unique":[[13,"relation-name-tuples-in-logical-fields-and-logical-columns-are-unique"]],"*":[[61,"op-multiply"]],"+":[[61,"op-plus"]],"-":[[61,"op-minus"]],"/":[[61,"op-divide"]],"<":[[61,"op-lt"]],"<
 [...]
\ No newline at end of file
+Search.setIndex({"alltitles":{"!=":[[61,"op-neq"]],"!~":[[61,"op-re-not-match"]],"!~*":[[61,"op-re-not-match-i"]],"!~~":[[61,"id19"]],"!~~*":[[61,"id20"]],"#":[[61,"op-bit-xor"]],"%":[[61,"op-modulo"]],"&":[[61,"op-bit-and"]],"(relation,
 name) tuples in logical fields and logical columns are 
unique":[[13,"relation-name-tuples-in-logical-fields-and-logical-columns-are-unique"]],"*":[[61,"op-multiply"]],"+":[[61,"op-plus"]],"-":[[61,"op-minus"]],"/":[[61,"op-divide"]],"<":[[61,"op-lt"]],"<
 [...]
\ No newline at end of file
diff --git a/user-guide/configs.html b/user-guide/configs.html
index f7652e756c..59245d58eb 100644
--- a/user-guide/configs.html
+++ b/user-guide/configs.html
@@ -482,6 +482,14 @@ example, to configure <code class="docutils literal 
notranslate"><span class="pr
 <td><p>8192</p></td>
 <td><p>Default batch size while creating new batches, it’s especially useful 
for buffer-in-memory batches since creating tiny batches would result in too 
much metadata memory consumption</p></td>
 </tr>
+<tr 
class="row-odd"><td><p>datafusion.execution.perfect_hash_join_small_build_threshold</p></td>
+<td><p>1024</p></td>
+<td><p>A perfect hash join (see <code class="docutils literal 
notranslate"><span class="pre">HashJoinExec</span></code> for more details) 
will be considered if the range of keys (max - min) on the build side is &lt; 
this threshold. This provides a fast path for joins with very small key ranges, 
bypassing the density check. Currently only supports cases where 
build_side.num_rows() &lt; u32::MAX. Support for build_side.num_rows() &gt;= 
u32::MAX will be added in the future.</p></td>
+</tr>
+<tr 
class="row-even"><td><p>datafusion.execution.perfect_hash_join_min_key_density</p></td>
+<td><p>0.15</p></td>
+<td><p>The minimum required density of join keys on the build side to consider 
a perfect hash join (see <code class="docutils literal notranslate"><span 
class="pre">HashJoinExec</span></code> for more details). Density is calculated 
as: <code class="docutils literal notranslate"><span class="pre">(number</span> 
<span class="pre">of</span> <span class="pre">rows)</span> <span 
class="pre">/</span> <span class="pre">(max_key</span> <span 
class="pre">-</span> <span class="pre">min_key</span> [...]
+</tr>
 <tr class="row-odd"><td><p>datafusion.execution.coalesce_batches</p></td>
 <td><p>true</p></td>
 <td><p>When set to true, record batches will be examined between each operator 
and small batches will be coalesced into larger batches. This is helpful when 
there are highly selective filters or joins that could produce tiny output 
batches. The target batch size is determined by the configuration 
setting</p></td>
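Note on the two added settings: the eligibility rule they describe can be sketched in a few lines of Rust. This is only an illustration built from the descriptions in the diff above, not DataFusion's implementation (see `HashJoinExec` for that). The function names, parameter types, and the handling of an inverted key range are assumptions made for the example.

```rust
/// Hypothetical helper: key density as described in the docs,
/// `(number of rows) / (max_key - min_key + 1)`.
/// Assumes `max_key >= min_key`.
fn key_density(min_key: i64, max_key: i64, num_rows: u64) -> f64 {
    // Widen to i128 so that `max_key - min_key + 1` cannot overflow.
    let slots = (max_key as i128 - min_key as i128 + 1) as f64;
    num_rows as f64 / slots
}

/// Hypothetical sketch of the eligibility check described by
/// `perfect_hash_join_small_build_threshold` and
/// `perfect_hash_join_min_key_density`. Not DataFusion's actual code.
fn may_use_perfect_hash_join(
    min_key: i64,
    max_key: i64,
    num_rows: u64,
    small_build_threshold: u64, // documented default: 1024
    min_key_density: f64,       // documented default: 0.15
) -> bool {
    // Assumption: an inverted key range is never eligible.
    if max_key < min_key {
        return false;
    }
    // Documented limit: only build sides with num_rows < u32::MAX.
    if num_rows >= u32::MAX as u64 {
        return false;
    }
    // Fast path: a very small key range (max - min) bypasses the density check.
    let range = (max_key as i128 - min_key as i128) as u128;
    if range < small_build_threshold as u128 {
        return true;
    }
    // Otherwise the actual density must exceed the configured minimum.
    key_density(min_key, max_key, num_rows) > min_key_density
}
```

With the documented defaults (1024 and 0.15), a build side with keys spanning 0 to 999 qualifies through the small-range fast path regardless of density. A build side with keys spanning 0 to 9999 needs more than 1500 rows, because 1500 / 10000 equals 0.15 and the comparison is strict.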

