This is an automated email from the ASF dual-hosted git repository.

github-bot pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/datafusion.git


The following commit(s) were added to refs/heads/main by this push:
     new da05287c0f Fix FileStream scanning_total to include sync next-file open time (#20627)
da05287c0f is described below

commit da05287c0f11f5450c05ddc5a9fdc5fb5bb1abee
Author: Ratul Dawar <[email protected]>
AuthorDate: Wed Mar 11 16:28:09 2026 +0530

    Fix FileStream scanning_total to include sync next-file open time (#20627)
    
    ## Summary
    - include synchronous `start_next_file()` / `FileOpener::open()` setup
    time in `time_elapsed_scanning_total`
    - keep the existing `time_opening` and scanning timer lifecycles intact
    - avoid timer overlap by scoping the temporary timer so that it is
    dropped before `time_scanning_total.start()` is called
    
    ## Details
    In `FileStreamState::Open`, `start_next_file()` is invoked before
    `time_scanning_total.start()`. If `open()` performs synchronous work
    before returning the future, that time was previously unaccounted for in
    `time_elapsed_scanning_total`.
    
    This change wraps the `start_next_file()` call in a scoped timer on the
    same `time_scanning_total` metric so the missing segment is recorded.
    
    - Fixes #20571
    
    ## Validation
    I tested by reading CSV files via AWS S3.
    
    ---------
    
    Co-authored-by: Andrew Lamb <[email protected]>
---
 datafusion/datasource/src/file_stream.rs | 10 +++++++++-
 1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/datafusion/datasource/src/file_stream.rs b/datafusion/datasource/src/file_stream.rs
index 514a7e0a0b..b75e66849b 100644
--- a/datafusion/datasource/src/file_stream.rs
+++ b/datafusion/datasource/src/file_stream.rs
@@ -127,7 +127,15 @@ impl FileStream {
                         self.file_stream_metrics.files_opened.add(1);
                         // include time needed to start opening in `start_next_file`
                         self.file_stream_metrics.time_opening.stop();
-                        let next = self.start_next_file().transpose();
+                        let next = {
+                            let scanning_total_metric = self
+                                .file_stream_metrics
+                                .time_scanning_total
+                                .metrics
+                                .clone();
+                            let _timer = scanning_total_metric.timer();
+                            self.start_next_file().transpose()
+                        };
                         self.file_stream_metrics.time_scanning_until_data.start();
                         self.file_stream_metrics.time_scanning_total.start();
 


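The scoped-timer pattern used in the hunk above can be sketched in isolation. This is a minimal illustration using only the standard library, not DataFusion's metrics types: `ElapsedMetric`, `ScopedTimer`, `open_sync`, and `timed_open` are hypothetical names introduced here. The idea is that a guard created from a cloned metric handle adds its elapsed time to the shared counter when it goes out of scope, so synchronous setup work is recorded and the guard is gone before any longer-lived timer on the same metric starts.

```rust
use std::sync::atomic::{AtomicU64, Ordering};
use std::sync::Arc;
use std::time::{Duration, Instant};

/// Hypothetical stand-in for a shared elapsed-time metric.
/// Clones share the same underlying counter.
#[derive(Clone, Default)]
struct ElapsedMetric {
    nanos: Arc<AtomicU64>,
}

impl ElapsedMetric {
    /// Start a guard that records its lifetime into this metric on drop.
    fn timer(&self) -> ScopedTimer {
        ScopedTimer {
            metric: self.clone(),
            start: Instant::now(),
        }
    }

    /// Add a duration to the shared counter.
    fn add(&self, elapsed: Duration) {
        self.nanos
            .fetch_add(elapsed.as_nanos() as u64, Ordering::Relaxed);
    }

    /// Total time recorded so far.
    fn value(&self) -> Duration {
        Duration::from_nanos(self.nanos.load(Ordering::Relaxed))
    }
}

/// Guard that adds the time since its creation to the metric when dropped.
struct ScopedTimer {
    metric: ElapsedMetric,
    start: Instant,
}

impl Drop for ScopedTimer {
    fn drop(&mut self) {
        self.metric.add(self.start.elapsed());
    }
}

/// Hypothetical synchronous setup step, standing in for work done
/// before an "open" call returns its future.
fn open_sync(work: Duration) -> u32 {
    std::thread::sleep(work);
    42
}

/// Time only the synchronous step: the guard is dropped when this
/// function returns, before the caller starts any other timer.
fn timed_open(metric: &ElapsedMetric, work: Duration) -> u32 {
    let _timer = metric.timer();
    open_sync(work)
}
```

Calling `timed_open(&metric, Duration::from_millis(20))` and then reading `metric.value()` should report at least 20 ms. Cloning the handle before creating the guard mirrors the commit's approach of not holding a borrow of the metrics struct while the timed call needs mutable access to `self`.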
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]