This is an automated email from the ASF dual-hosted git repository.
github-bot pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/datafusion.git
The following commit(s) were added to refs/heads/main by this push:
new da05287c0f Fix FileStream scanning_total to include sync next-file open time (#20627)
da05287c0f is described below
commit da05287c0f11f5450c05ddc5a9fdc5fb5bb1abee
Author: Ratul Dawar <[email protected]>
AuthorDate: Wed Mar 11 16:28:09 2026 +0530
Fix FileStream scanning_total to include sync next-file open time (#20627)
## Summary
- include the synchronous setup time of `start_next_file()` /
`FileOpener::open()` in `time_elapsed_scanning_total`
- keep the lifecycle of the existing `time_opening` and scanning timers intact
- avoid timer overlap by dropping the temporary scoped timer before
`time_scanning_total.start()` is called
## Details
In `FileStreamState::Open`, `start_next_file()` is invoked before
`time_scanning_total.start()`. If `open()` performs synchronous work
before returning the future, that time was previously unaccounted for in
`time_elapsed_scanning_total`.
This change wraps the `start_next_file()` call in a scoped timer on the
same `time_scanning_total` metric so the missing segment is recorded.
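
The scoped-timer pattern described above can be sketched with the standard library alone. This is a minimal illustration, not DataFusion's implementation: `Time`, `ScopedTimer`, `Stream`, and `open_with_accounting` are hypothetical stand-ins, and the `sleep` only simulates synchronous setup work inside `open()`. It assumes the guard borrows the metric, which is why the metric handle is cloned first so that the `&mut self` call remains possible while the guard is alive.

```rust
use std::sync::atomic::{AtomicU64, Ordering};
use std::sync::Arc;
use std::time::{Duration, Instant};

/// Hypothetical shared elapsed-time metric; clones share one counter.
#[derive(Clone, Default)]
struct Time {
    nanos: Arc<AtomicU64>,
}

impl Time {
    /// Returns a guard that adds its own lifetime to the metric on drop.
    fn timer(&self) -> ScopedTimer<'_> {
        ScopedTimer { metric: self, start: Instant::now() }
    }

    /// Total time recorded so far.
    fn value(&self) -> Duration {
        Duration::from_nanos(self.nanos.load(Ordering::Relaxed))
    }
}

/// RAII guard borrowing the metric it reports to.
struct ScopedTimer<'a> {
    metric: &'a Time,
    start: Instant,
}

impl Drop for ScopedTimer<'_> {
    fn drop(&mut self) {
        let nanos = self.start.elapsed().as_nanos() as u64;
        self.metric.nanos.fetch_add(nanos, Ordering::Relaxed);
    }
}

/// Hypothetical stream with one metric and a simulated setup cost.
struct Stream {
    scanning_total: Time,
    setup: Duration,
    files_started: usize,
}

impl Stream {
    /// Simulates synchronous work done before an open future is returned.
    fn start_next_file(&mut self) -> Option<&'static str> {
        std::thread::sleep(self.setup);
        self.files_started += 1;
        Some("reader")
    }

    /// Mirrors the shape of the change: time the call inside a block.
    fn open_with_accounting(&mut self) -> Option<&'static str> {
        let next = {
            // Clone the handle so the guard borrows `metric`, not `self`.
            let metric = self.scanning_total.clone();
            let _timer = metric.timer();
            self.start_next_file()
        }; // `_timer` is dropped here, recording the elapsed time.
        // A long-lived scanning timer could be started here without overlap.
        next
    }
}
```

Binding the guard to `_timer` rather than `_` matters in this sketch: a bare `_` pattern would drop the guard immediately, so nothing inside the block would be measured.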
- Fixes #20571
## Validation
Tested by reading CSV files from AWS S3.
---------
Co-authored-by: Andrew Lamb <[email protected]>
---
datafusion/datasource/src/file_stream.rs | 10 +++++++++-
1 file changed, 9 insertions(+), 1 deletion(-)
diff --git a/datafusion/datasource/src/file_stream.rs b/datafusion/datasource/src/file_stream.rs
index 514a7e0a0b..b75e66849b 100644
--- a/datafusion/datasource/src/file_stream.rs
+++ b/datafusion/datasource/src/file_stream.rs
@@ -127,7 +127,15 @@ impl FileStream {
self.file_stream_metrics.files_opened.add(1);
// include time needed to start opening in `start_next_file`
self.file_stream_metrics.time_opening.stop();
- let next = self.start_next_file().transpose();
+ let next = {
+ let scanning_total_metric = self
+ .file_stream_metrics
+ .time_scanning_total
+ .metrics
+ .clone();
+ let _timer = scanning_total_metric.timer();
+ self.start_next_file().transpose()
+ };
self.file_stream_metrics.time_scanning_until_data.start();
self.file_stream_metrics.time_scanning_total.start();