This is an automated email from the ASF dual-hosted git repository.

github-bot pushed a commit to branch asf-site
in repository https://gitbox.apache.org/repos/asf/datafusion.git


The following commit(s) were added to refs/heads/asf-site by this push:
     new 43a1f2ff48 Publish built docs triggered by 8a91db56b38a09f528d2bd13732a195cf69d58dc
43a1f2ff48 is described below

commit 43a1f2ff481621f47c1638f757513fb97b46f639
Author: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
AuthorDate: Thu Nov 20 23:57:30 2025 +0000

    Publish built docs triggered by 8a91db56b38a09f528d2bd13732a195cf69d58dc
---
 _sources/library-user-guide/upgrading.md.txt | 81 ++++++++++++++++++++++++++++
 library-user-guide/upgrading.html            | 72 +++++++++++++++++++++++++
 searchindex.js                               |  2 +-
 3 files changed, 154 insertions(+), 1 deletion(-)

diff --git a/_sources/library-user-guide/upgrading.md.txt b/_sources/library-user-guide/upgrading.md.txt
index 763432626c..e116bfffed 100644
--- a/_sources/library-user-guide/upgrading.md.txt
+++ b/_sources/library-user-guide/upgrading.md.txt
@@ -25,6 +25,87 @@
 
 You can see the current [status of the `52.0.0` release here](https://github.com/apache/datafusion/issues/18566)
 
+### Statistics handling moved from `FileSource` to `FileScanConfig`
+
+Statistics are now managed directly by `FileScanConfig` instead of being delegated to `FileSource` implementations. This simplifies the `FileSource` trait and provides more consistent statistics handling across all file formats.
+
+**Who is affected:**
+
+- Users who have implemented custom `FileSource` implementations
+
+**Breaking changes:**
+
+Two methods have been removed from the `FileSource` trait:
+
+- `with_statistics(&self, statistics: Statistics) -> Arc<dyn FileSource>`
+- `statistics(&self) -> Result<Statistics>`
+
+**Migration guide:**
+
+If you have a custom `FileSource` implementation, you need to:
+
+1. Remove the `with_statistics` method implementation
+2. Remove the `statistics` method implementation
+3. Remove any internal state that was storing statistics
+
+**Before:**
+
+```rust,ignore
+#[derive(Clone)]
+struct MyCustomSource {
+    table_schema: TableSchema,
+    projected_statistics: Option<Statistics>,
+    // other fields...
+}
+
+impl FileSource for MyCustomSource {
+    fn with_statistics(&self, statistics: Statistics) -> Arc<dyn FileSource> {
+        Arc::new(Self {
+            table_schema: self.table_schema.clone(),
+            projected_statistics: Some(statistics),
+            // other fields...
+        })
+    }
+
+    fn statistics(&self) -> Result<Statistics> {
+        Ok(self.projected_statistics.clone().unwrap_or_else(||
+            Statistics::new_unknown(self.table_schema.file_schema())
+        ))
+    }
+
+    // other methods...
+}
+```
+
+**After:**
+
+```rust,ignore
+#[derive(Clone)]
+struct MyCustomSource {
+    table_schema: TableSchema,
+    // projected_statistics field removed
+    // other fields...
+}
+
+impl FileSource for MyCustomSource {
+    // with_statistics method removed
+    // statistics method removed
+
+    // other methods...
+}
+```
+
+**Accessing statistics:**
+
+Statistics are now accessed through `FileScanConfig` instead of `FileSource`:
+
+```diff
+- let stats = config.file_source.statistics()?;
++ let stats = config.statistics();
+```
+
+Note that `FileScanConfig::statistics()` automatically marks statistics as inexact when filters are present, ensuring correctness when filters are pushed down.
+
 ### Planner now requires explicit opt-in for WITHIN GROUP syntax
 
 The SQL planner now enforces the aggregate UDF contract more strictly: the
diff --git a/library-user-guide/upgrading.html b/library-user-guide/upgrading.html
index 8b41810481..3fda9880ec 100644
--- a/library-user-guide/upgrading.html
+++ b/library-user-guide/upgrading.html
@@ -407,6 +407,77 @@
 <h2>DataFusion <code class="docutils literal notranslate"><span 
class="pre">52.0.0</span></code><a class="headerlink" href="#datafusion-52-0-0" 
title="Link to this heading">#</a></h2>
 <p><strong>Note:</strong> DataFusion <code class="docutils literal 
notranslate"><span class="pre">52.0.0</span></code> has not been released yet. 
The information provided in this section pertains to features and changes that 
have already been merged to the main branch and are awaiting release in this 
version.</p>
 <p>You can see the current <a class="reference external" 
href="https://github.com/apache/datafusion/issues/18566">status of the <code 
class="docutils literal notranslate"><span class="pre">52.0.0</span></code> 
release here</a></p>
+<section id="statistics-handling-moved-from-filesource-to-filescanconfig">
+<h3>Statistics handling moved from <code class="docutils literal 
notranslate"><span class="pre">FileSource</span></code> to <code 
class="docutils literal notranslate"><span 
class="pre">FileScanConfig</span></code><a class="headerlink" 
href="#statistics-handling-moved-from-filesource-to-filescanconfig" title="Link 
to this heading">#</a></h3>
+<p>Statistics are now managed directly by <code class="docutils literal 
notranslate"><span class="pre">FileScanConfig</span></code> instead of being 
delegated to <code class="docutils literal notranslate"><span 
class="pre">FileSource</span></code> implementations. This simplifies the <code 
class="docutils literal notranslate"><span class="pre">FileSource</span></code> 
trait and provides more consistent statistics handling across all file 
formats.</p>
+<p><strong>Who is affected:</strong></p>
+<ul class="simple">
+<li><p>Users who have implemented custom <code class="docutils literal 
notranslate"><span class="pre">FileSource</span></code> implementations</p></li>
+</ul>
+<p><strong>Breaking changes:</strong></p>
+<p>Two methods have been removed from the <code class="docutils literal 
notranslate"><span class="pre">FileSource</span></code> trait:</p>
+<ul class="simple">
+<li><p><code class="docutils literal notranslate"><span 
class="pre">with_statistics(&amp;self,</span> <span 
class="pre">statistics:</span> <span class="pre">Statistics)</span> <span 
class="pre">-&gt;</span> <span class="pre">Arc&lt;dyn</span> <span 
class="pre">FileSource&gt;</span></code></p></li>
+<li><p><code class="docutils literal notranslate"><span 
class="pre">statistics(&amp;self)</span> <span class="pre">-&gt;</span> <span 
class="pre">Result&lt;Statistics&gt;</span></code></p></li>
+</ul>
+<p><strong>Migration guide:</strong></p>
+<p>If you have a custom <code class="docutils literal notranslate"><span 
class="pre">FileSource</span></code> implementation, you need to:</p>
+<ol class="arabic simple">
+<li><p>Remove the <code class="docutils literal notranslate"><span 
class="pre">with_statistics</span></code> method implementation</p></li>
+<li><p>Remove the <code class="docutils literal notranslate"><span 
class="pre">statistics</span></code> method implementation</p></li>
+<li><p>Remove any internal state that was storing statistics</p></li>
+</ol>
+<p><strong>Before:</strong></p>
+<div class="highlight-rust notranslate"><div 
class="highlight"><pre><span></span><span class="cp">#[derive(Clone)]</span>
+<span class="k">struct</span><span class="w"> </span><span 
class="nc">MyCustomSource</span><span class="w"> </span><span class="p">{</span>
+<span class="w">    </span><span class="n">table_schema</span><span 
class="p">:</span><span class="w"> </span><span 
class="nc">TableSchema</span><span class="p">,</span>
+<span class="w">    </span><span class="n">projected_statistics</span><span 
class="p">:</span><span class="w"> </span><span class="nb">Option</span><span 
class="o">&lt;</span><span class="n">Statistics</span><span 
class="o">&gt;</span><span class="p">,</span>
+<span class="w">    </span><span class="c1">// other fields...</span>
+<span class="p">}</span>
+
+<span class="k">impl</span><span class="w"> </span><span 
class="n">FileSource</span><span class="w"> </span><span 
class="k">for</span><span class="w"> </span><span 
class="n">MyCustomSource</span><span class="w"> </span><span class="p">{</span>
+<span class="w">    </span><span class="k">fn</span><span class="w"> 
</span><span class="nf">with_statistics</span><span class="p">(</span><span 
class="o">&amp;</span><span class="bp">self</span><span class="p">,</span><span 
class="w"> </span><span class="n">statistics</span><span 
class="p">:</span><span class="w"> </span><span 
class="nc">Statistics</span><span class="p">)</span><span class="w"> 
</span><span class="p">-&gt;</span><span class="w"> </span><span 
class="nc">Arc</span><span c [...]
+<span class="w">        </span><span class="n">Arc</span><span 
class="p">::</span><span class="n">new</span><span class="p">(</span><span 
class="bp">Self</span><span class="w"> </span><span class="p">{</span>
+<span class="w">            </span><span class="n">table_schema</span><span 
class="p">:</span><span class="w"> </span><span class="nc">self</span><span 
class="p">.</span><span class="n">table_schema</span><span 
class="p">.</span><span class="n">clone</span><span class="p">(),</span>
+<span class="w">            </span><span 
class="n">projected_statistics</span><span class="p">:</span><span class="w"> 
</span><span class="nb">Some</span><span class="p">(</span><span 
class="n">statistics</span><span class="p">),</span>
+<span class="w">            </span><span class="c1">// other fields...</span>
+<span class="w">        </span><span class="p">})</span>
+<span class="w">    </span><span class="p">}</span>
+
+<span class="w">    </span><span class="k">fn</span><span class="w"> 
</span><span class="nf">statistics</span><span class="p">(</span><span 
class="o">&amp;</span><span class="bp">self</span><span class="p">)</span><span 
class="w"> </span><span class="p">-&gt;</span><span class="w"> </span><span 
class="nb">Result</span><span class="o">&lt;</span><span 
class="n">Statistics</span><span class="o">&gt;</span><span class="w"> 
</span><span class="p">{</span>
+<span class="w">        </span><span class="nb">Ok</span><span 
class="p">(</span><span class="bp">self</span><span class="p">.</span><span 
class="n">projected_statistics</span><span class="p">.</span><span 
class="n">clone</span><span class="p">().</span><span 
class="n">unwrap_or_else</span><span class="p">(</span><span class="o">||</span>
+<span class="w">            </span><span class="n">Statistics</span><span 
class="p">::</span><span class="n">new_unknown</span><span 
class="p">(</span><span class="bp">self</span><span class="p">.</span><span 
class="n">table_schema</span><span class="p">.</span><span 
class="n">file_schema</span><span class="p">())</span>
+<span class="w">        </span><span class="p">))</span>
+<span class="w">    </span><span class="p">}</span>
+
+<span class="w">    </span><span class="c1">// other methods...</span>
+<span class="p">}</span>
+</pre></div>
+</div>
+<p><strong>After:</strong></p>
+<div class="highlight-rust notranslate"><div 
class="highlight"><pre><span></span><span class="cp">#[derive(Clone)]</span>
+<span class="k">struct</span><span class="w"> </span><span 
class="nc">MyCustomSource</span><span class="w"> </span><span class="p">{</span>
+<span class="w">    </span><span class="n">table_schema</span><span 
class="p">:</span><span class="w"> </span><span 
class="nc">TableSchema</span><span class="p">,</span>
+<span class="w">    </span><span class="c1">// projected_statistics field 
removed</span>
+<span class="w">    </span><span class="c1">// other fields...</span>
+<span class="p">}</span>
+
+<span class="k">impl</span><span class="w"> </span><span 
class="n">FileSource</span><span class="w"> </span><span 
class="k">for</span><span class="w"> </span><span 
class="n">MyCustomSource</span><span class="w"> </span><span class="p">{</span>
+<span class="w">    </span><span class="c1">// with_statistics method 
removed</span>
+<span class="w">    </span><span class="c1">// statistics method removed</span>
+
+<span class="w">    </span><span class="c1">// other methods...</span>
+<span class="p">}</span>
+</pre></div>
+</div>
+<p><strong>Accessing statistics:</strong></p>
+<p>Statistics are now accessed through <code class="docutils literal 
notranslate"><span class="pre">FileScanConfig</span></code> instead of <code 
class="docutils literal notranslate"><span 
class="pre">FileSource</span></code>:</p>
+<div class="highlight-diff notranslate"><div 
class="highlight"><pre><span></span><span class="gd">- let stats = 
config.file_source.statistics()?;</span>
+<span class="gi">+ let stats = config.statistics();</span>
+</pre></div>
+</div>
+<p>Note that <code class="docutils literal notranslate"><span 
class="pre">FileScanConfig::statistics()</span></code> automatically marks 
statistics as inexact when filters are present, ensuring correctness when 
filters are pushed down.</p>
+</section>
 <section id="planner-now-requires-explicit-opt-in-for-within-group-syntax">
 <h3>Planner now requires explicit opt-in for WITHIN GROUP syntax<a 
class="headerlink" 
href="#planner-now-requires-explicit-opt-in-for-within-group-syntax" 
title="Link to this heading">#</a></h3>
 <p>The SQL planner now enforces the aggregate UDF contract more strictly: the
@@ -1684,6 +1755,7 @@ take care of constructing the <code class="docutils 
literal notranslate"><span c
   <nav class="bd-toc-nav page-toc" 
aria-labelledby="pst-page-navigation-heading-2">
     <ul class="visible nav section-nav flex-column">
 <li class="toc-h2 nav-item toc-entry"><a class="reference internal nav-link" 
href="#datafusion-52-0-0">DataFusion <code class="docutils literal 
notranslate"><span class="pre">52.0.0</span></code></a><ul class="nav 
section-nav flex-column">
+<li class="toc-h3 nav-item toc-entry"><a class="reference internal nav-link" 
href="#statistics-handling-moved-from-filesource-to-filescanconfig">Statistics 
handling moved from <code class="docutils literal notranslate"><span 
class="pre">FileSource</span></code> to <code class="docutils literal 
notranslate"><span class="pre">FileScanConfig</span></code></a></li>
 <li class="toc-h3 nav-item toc-entry"><a class="reference internal nav-link" 
href="#planner-now-requires-explicit-opt-in-for-within-group-syntax">Planner 
now requires explicit opt-in for WITHIN GROUP syntax</a></li>
 <li class="toc-h3 nav-item toc-entry"><a class="reference internal nav-link" 
href="#aggregateudfimpl-supports-null-handling-clause-now-defaults-to-false"><code
 class="docutils literal notranslate"><span 
class="pre">AggregateUDFImpl::supports_null_handling_clause</span></code> now 
defaults to <code class="docutils literal notranslate"><span 
class="pre">false</span></code></a></li>
 <li class="toc-h3 nav-item toc-entry"><a class="reference internal nav-link" 
href="#api-change-for-cacheaccessor-trait">API change for <code class="docutils 
literal notranslate"><span class="pre">CacheAccessor</span></code> 
trait</a></li>
diff --git a/searchindex.js b/searchindex.js
index 6841d576f4..916d59792f 100644
--- a/searchindex.js
+++ b/searchindex.js
@@ -1 +1 @@
-Search.setIndex({"alltitles":{"!=":[[60,"op-neq"]],"!~":[[60,"op-re-not-match"]],"!~*":[[60,"op-re-not-match-i"]],"!~~":[[60,"id19"]],"!~~*":[[60,"id20"]],"#":[[60,"op-bit-xor"]],"%":[[60,"op-modulo"]],"&":[[60,"op-bit-and"]],"(relation,
 name) tuples in logical fields and logical columns are 
unique":[[13,"relation-name-tuples-in-logical-fields-and-logical-columns-are-unique"]],"*":[[60,"op-multiply"]],"+":[[60,"op-plus"]],"-":[[60,"op-minus"]],"/":[[60,"op-divide"]],"<":[[60,"op-lt"]],"<
 [...]
\ No newline at end of file
+Search.setIndex({"alltitles":{"!=":[[60,"op-neq"]],"!~":[[60,"op-re-not-match"]],"!~*":[[60,"op-re-not-match-i"]],"!~~":[[60,"id19"]],"!~~*":[[60,"id20"]],"#":[[60,"op-bit-xor"]],"%":[[60,"op-modulo"]],"&":[[60,"op-bit-and"]],"(relation,
 name) tuples in logical fields and logical columns are 
unique":[[13,"relation-name-tuples-in-logical-fields-and-logical-columns-are-unique"]],"*":[[60,"op-multiply"]],"+":[[60,"op-plus"]],"-":[[60,"op-minus"]],"/":[[60,"op-divide"]],"<":[[60,"op-lt"]],"<
 [...]
\ No newline at end of file


