This is an automated email from the ASF dual-hosted git repository.
github-bot pushed a commit to branch asf-site
in repository https://gitbox.apache.org/repos/asf/datafusion.git
The following commit(s) were added to refs/heads/asf-site by this push:
new 43a1f2ff48 Publish built docs triggered by 8a91db56b38a09f528d2bd13732a195cf69d58dc
43a1f2ff48 is described below
commit 43a1f2ff481621f47c1638f757513fb97b46f639
Author: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
AuthorDate: Thu Nov 20 23:57:30 2025 +0000
Publish built docs triggered by 8a91db56b38a09f528d2bd13732a195cf69d58dc
---
_sources/library-user-guide/upgrading.md.txt | 81 ++++++++++++++++++++++++++++
library-user-guide/upgrading.html | 72 +++++++++++++++++++++++++
searchindex.js | 2 +-
3 files changed, 154 insertions(+), 1 deletion(-)
diff --git a/_sources/library-user-guide/upgrading.md.txt b/_sources/library-user-guide/upgrading.md.txt
index 763432626c..e116bfffed 100644
--- a/_sources/library-user-guide/upgrading.md.txt
+++ b/_sources/library-user-guide/upgrading.md.txt
@@ -25,6 +25,87 @@
You can see the current [status of the `52.0.0` release here](https://github.com/apache/datafusion/issues/18566)
+### Statistics handling moved from `FileSource` to `FileScanConfig`
+
+Statistics are now managed directly by `FileScanConfig` instead of being delegated to `FileSource` implementations. This simplifies the `FileSource` trait and provides more consistent statistics handling across all file formats.
+
+**Who is affected:**
+
+- Users who have implemented custom `FileSource` implementations
+
+**Breaking changes:**
+
+Two methods have been removed from the `FileSource` trait:
+
+- `with_statistics(&self, statistics: Statistics) -> Arc<dyn FileSource>`
+- `statistics(&self) -> Result<Statistics>`
+
+**Migration guide:**
+
+If you have a custom `FileSource` implementation, you need to:
+
+1. Remove the `with_statistics` method implementation
+2. Remove the `statistics` method implementation
+3. Remove any internal state that was storing statistics
+
+**Before:**
+
+```rust,ignore
+#[derive(Clone)]
+struct MyCustomSource {
+    table_schema: TableSchema,
+    projected_statistics: Option<Statistics>,
+    // other fields...
+}
+
+impl FileSource for MyCustomSource {
+    fn with_statistics(&self, statistics: Statistics) -> Arc<dyn FileSource> {
+        Arc::new(Self {
+            table_schema: self.table_schema.clone(),
+            projected_statistics: Some(statistics),
+            // other fields...
+        })
+    }
+
+    fn statistics(&self) -> Result<Statistics> {
+        Ok(self.projected_statistics.clone().unwrap_or_else(||
+            Statistics::new_unknown(self.table_schema.file_schema())
+        ))
+    }
+
+    // other methods...
+}
+```
+
+**After:**
+
+```rust,ignore
+#[derive(Clone)]
+struct MyCustomSource {
+    table_schema: TableSchema,
+    // projected_statistics field removed
+    // other fields...
+}
+
+impl FileSource for MyCustomSource {
+    // with_statistics method removed
+    // statistics method removed
+
+    // other methods...
+}
+```
+
+**Accessing statistics:**
+
+Statistics are now accessed through `FileScanConfig` instead of `FileSource`:
+
+```diff
+- let stats = config.file_source.statistics()?;
++ let stats = config.statistics();
+```
+
+Note that `FileScanConfig::statistics()` automatically marks statistics as inexact when filters are present, ensuring correctness when filters are pushed down.
+
### Planner now requires explicit opt-in for WITHIN GROUP syntax
The SQL planner now enforces the aggregate UDF contract more strictly: the
diff --git a/library-user-guide/upgrading.html b/library-user-guide/upgrading.html
index 8b41810481..3fda9880ec 100644
--- a/library-user-guide/upgrading.html
+++ b/library-user-guide/upgrading.html
@@ -407,6 +407,77 @@
<h2>DataFusion <code class="docutils literal notranslate"><span
class="pre">52.0.0</span></code><a class="headerlink" href="#datafusion-52-0-0"
title="Link to this heading">#</a></h2>
<p><strong>Note:</strong> DataFusion <code class="docutils literal
notranslate"><span class="pre">52.0.0</span></code> has not been released yet.
The information provided in this section pertains to features and changes that
have already been merged to the main branch and are awaiting release in this
version.</p>
<p>You can see the current <a class="reference external"
href="https://github.com/apache/datafusion/issues/18566">status of the <code
class="docutils literal notranslate"><span class="pre">52.0.0</span></code>
release here</a></p>
+<section id="statistics-handling-moved-from-filesource-to-filescanconfig">
+<h3>Statistics handling moved from <code class="docutils literal
notranslate"><span class="pre">FileSource</span></code> to <code
class="docutils literal notranslate"><span
class="pre">FileScanConfig</span></code><a class="headerlink"
href="#statistics-handling-moved-from-filesource-to-filescanconfig" title="Link
to this heading">#</a></h3>
+<p>Statistics are now managed directly by <code class="docutils literal
notranslate"><span class="pre">FileScanConfig</span></code> instead of being
delegated to <code class="docutils literal notranslate"><span
class="pre">FileSource</span></code> implementations. This simplifies the <code
class="docutils literal notranslate"><span class="pre">FileSource</span></code>
trait and provides more consistent statistics handling across all file
formats.</p>
+<p><strong>Who is affected:</strong></p>
+<ul class="simple">
+<li><p>Users who have implemented custom <code class="docutils literal
notranslate"><span class="pre">FileSource</span></code> implementations</p></li>
+</ul>
+<p><strong>Breaking changes:</strong></p>
+<p>Two methods have been removed from the <code class="docutils literal
notranslate"><span class="pre">FileSource</span></code> trait:</p>
+<ul class="simple">
+<li><p><code class="docutils literal notranslate"><span
class="pre">with_statistics(&self,</span> <span
class="pre">statistics:</span> <span class="pre">Statistics)</span> <span
class="pre">-></span> <span class="pre">Arc<dyn</span> <span
class="pre">FileSource></span></code></p></li>
+<li><p><code class="docutils literal notranslate"><span
class="pre">statistics(&self)</span> <span class="pre">-></span> <span
class="pre">Result<Statistics></span></code></p></li>
+</ul>
+<p><strong>Migration guide:</strong></p>
+<p>If you have a custom <code class="docutils literal notranslate"><span
class="pre">FileSource</span></code> implementation, you need to:</p>
+<ol class="arabic simple">
+<li><p>Remove the <code class="docutils literal notranslate"><span
class="pre">with_statistics</span></code> method implementation</p></li>
+<li><p>Remove the <code class="docutils literal notranslate"><span
class="pre">statistics</span></code> method implementation</p></li>
+<li><p>Remove any internal state that was storing statistics</p></li>
+</ol>
+<p><strong>Before:</strong></p>
+<div class="highlight-rust notranslate"><div
class="highlight"><pre><span></span><span class="cp">#[derive(Clone)]</span>
+<span class="k">struct</span><span class="w"> </span><span
class="nc">MyCustomSource</span><span class="w"> </span><span class="p">{</span>
+<span class="w"> </span><span class="n">table_schema</span><span
class="p">:</span><span class="w"> </span><span
class="nc">TableSchema</span><span class="p">,</span>
+<span class="w"> </span><span class="n">projected_statistics</span><span
class="p">:</span><span class="w"> </span><span class="nb">Option</span><span
class="o"><</span><span class="n">Statistics</span><span
class="o">></span><span class="p">,</span>
+<span class="w"> </span><span class="c1">// other fields...</span>
+<span class="p">}</span>
+
+<span class="k">impl</span><span class="w"> </span><span
class="n">FileSource</span><span class="w"> </span><span
class="k">for</span><span class="w"> </span><span
class="n">MyCustomSource</span><span class="w"> </span><span class="p">{</span>
+<span class="w"> </span><span class="k">fn</span><span class="w">
</span><span class="nf">with_statistics</span><span class="p">(</span><span
class="o">&</span><span class="bp">self</span><span class="p">,</span><span
class="w"> </span><span class="n">statistics</span><span
class="p">:</span><span class="w"> </span><span
class="nc">Statistics</span><span class="p">)</span><span class="w">
</span><span class="p">-></span><span class="w"> </span><span
class="nc">Arc</span><span c [...]
+<span class="w"> </span><span class="n">Arc</span><span
class="p">::</span><span class="n">new</span><span class="p">(</span><span
class="bp">Self</span><span class="w"> </span><span class="p">{</span>
+<span class="w"> </span><span class="n">table_schema</span><span
class="p">:</span><span class="w"> </span><span class="nc">self</span><span
class="p">.</span><span class="n">table_schema</span><span
class="p">.</span><span class="n">clone</span><span class="p">(),</span>
+<span class="w"> </span><span
class="n">projected_statistics</span><span class="p">:</span><span class="w">
</span><span class="nb">Some</span><span class="p">(</span><span
class="n">statistics</span><span class="p">),</span>
+<span class="w"> </span><span class="c1">// other fields...</span>
+<span class="w"> </span><span class="p">})</span>
+<span class="w"> </span><span class="p">}</span>
+
+<span class="w"> </span><span class="k">fn</span><span class="w">
</span><span class="nf">statistics</span><span class="p">(</span><span
class="o">&</span><span class="bp">self</span><span class="p">)</span><span
class="w"> </span><span class="p">-></span><span class="w"> </span><span
class="nb">Result</span><span class="o"><</span><span
class="n">Statistics</span><span class="o">></span><span class="w">
</span><span class="p">{</span>
+<span class="w"> </span><span class="nb">Ok</span><span
class="p">(</span><span class="bp">self</span><span class="p">.</span><span
class="n">projected_statistics</span><span class="p">.</span><span
class="n">clone</span><span class="p">().</span><span
class="n">unwrap_or_else</span><span class="p">(</span><span class="o">||</span>
+<span class="w"> </span><span class="n">Statistics</span><span
class="p">::</span><span class="n">new_unknown</span><span
class="p">(</span><span class="bp">self</span><span class="p">.</span><span
class="n">table_schema</span><span class="p">.</span><span
class="n">file_schema</span><span class="p">())</span>
+<span class="w"> </span><span class="p">))</span>
+<span class="w"> </span><span class="p">}</span>
+
+<span class="w"> </span><span class="c1">// other methods...</span>
+<span class="p">}</span>
+</pre></div>
+</div>
+<p><strong>After:</strong></p>
+<div class="highlight-rust notranslate"><div
class="highlight"><pre><span></span><span class="cp">#[derive(Clone)]</span>
+<span class="k">struct</span><span class="w"> </span><span
class="nc">MyCustomSource</span><span class="w"> </span><span class="p">{</span>
+<span class="w"> </span><span class="n">table_schema</span><span
class="p">:</span><span class="w"> </span><span
class="nc">TableSchema</span><span class="p">,</span>
+<span class="w"> </span><span class="c1">// projected_statistics field
removed</span>
+<span class="w"> </span><span class="c1">// other fields...</span>
+<span class="p">}</span>
+
+<span class="k">impl</span><span class="w"> </span><span
class="n">FileSource</span><span class="w"> </span><span
class="k">for</span><span class="w"> </span><span
class="n">MyCustomSource</span><span class="w"> </span><span class="p">{</span>
+<span class="w"> </span><span class="c1">// with_statistics method
removed</span>
+<span class="w"> </span><span class="c1">// statistics method removed</span>
+
+<span class="w"> </span><span class="c1">// other methods...</span>
+<span class="p">}</span>
+</pre></div>
+</div>
+<p><strong>Accessing statistics:</strong></p>
+<p>Statistics are now accessed through <code class="docutils literal
notranslate"><span class="pre">FileScanConfig</span></code> instead of <code
class="docutils literal notranslate"><span
class="pre">FileSource</span></code>:</p>
+<div class="highlight-diff notranslate"><div
class="highlight"><pre><span></span><span class="gd">- let stats =
config.file_source.statistics()?;</span>
+<span class="gi">+ let stats = config.statistics();</span>
+</pre></div>
+</div>
+<p>Note that <code class="docutils literal notranslate"><span
class="pre">FileScanConfig::statistics()</span></code> automatically marks
statistics as inexact when filters are present, ensuring correctness when
filters are pushed down.</p>
+</section>
<section id="planner-now-requires-explicit-opt-in-for-within-group-syntax">
<h3>Planner now requires explicit opt-in for WITHIN GROUP syntax<a
class="headerlink"
href="#planner-now-requires-explicit-opt-in-for-within-group-syntax"
title="Link to this heading">#</a></h3>
<p>The SQL planner now enforces the aggregate UDF contract more strictly: the
@@ -1684,6 +1755,7 @@ take care of constructing the <code class="docutils
literal notranslate"><span c
<nav class="bd-toc-nav page-toc"
aria-labelledby="pst-page-navigation-heading-2">
<ul class="visible nav section-nav flex-column">
<li class="toc-h2 nav-item toc-entry"><a class="reference internal nav-link"
href="#datafusion-52-0-0">DataFusion <code class="docutils literal
notranslate"><span class="pre">52.0.0</span></code></a><ul class="nav
section-nav flex-column">
+<li class="toc-h3 nav-item toc-entry"><a class="reference internal nav-link"
href="#statistics-handling-moved-from-filesource-to-filescanconfig">Statistics
handling moved from <code class="docutils literal notranslate"><span
class="pre">FileSource</span></code> to <code class="docutils literal
notranslate"><span class="pre">FileScanConfig</span></code></a></li>
<li class="toc-h3 nav-item toc-entry"><a class="reference internal nav-link"
href="#planner-now-requires-explicit-opt-in-for-within-group-syntax">Planner
now requires explicit opt-in for WITHIN GROUP syntax</a></li>
<li class="toc-h3 nav-item toc-entry"><a class="reference internal nav-link"
href="#aggregateudfimpl-supports-null-handling-clause-now-defaults-to-false"><code
class="docutils literal notranslate"><span
class="pre">AggregateUDFImpl::supports_null_handling_clause</span></code> now
defaults to <code class="docutils literal notranslate"><span
class="pre">false</span></code></a></li>
<li class="toc-h3 nav-item toc-entry"><a class="reference internal nav-link"
href="#api-change-for-cacheaccessor-trait">API change for <code class="docutils
literal notranslate"><span class="pre">CacheAccessor</span></code>
trait</a></li>
diff --git a/searchindex.js b/searchindex.js
index 6841d576f4..916d59792f 100644
--- a/searchindex.js
+++ b/searchindex.js
@@ -1 +1 @@
-Search.setIndex({"alltitles":{"!=":[[60,"op-neq"]],"!~":[[60,"op-re-not-match"]],"!~*":[[60,"op-re-not-match-i"]],"!~~":[[60,"id19"]],"!~~*":[[60,"id20"]],"#":[[60,"op-bit-xor"]],"%":[[60,"op-modulo"]],"&":[[60,"op-bit-and"]],"(relation,
name) tuples in logical fields and logical columns are
unique":[[13,"relation-name-tuples-in-logical-fields-and-logical-columns-are-unique"]],"*":[[60,"op-multiply"]],"+":[[60,"op-plus"]],"-":[[60,"op-minus"]],"/":[[60,"op-divide"]],"<":[[60,"op-lt"]],"<
[...]
\ No newline at end of file
+Search.setIndex({"alltitles":{"!=":[[60,"op-neq"]],"!~":[[60,"op-re-not-match"]],"!~*":[[60,"op-re-not-match-i"]],"!~~":[[60,"id19"]],"!~~*":[[60,"id20"]],"#":[[60,"op-bit-xor"]],"%":[[60,"op-modulo"]],"&":[[60,"op-bit-and"]],"(relation,
name) tuples in logical fields and logical columns are
unique":[[13,"relation-name-tuples-in-logical-fields-and-logical-columns-are-unique"]],"*":[[60,"op-multiply"]],"+":[[60,"op-plus"]],"-":[[60,"op-minus"]],"/":[[60,"op-divide"]],"<":[[60,"op-lt"]],"<
[...]
\ No newline at end of file