This is an automated email from the ASF dual-hosted git repository.
alamb pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/arrow-rs.git
The following commit(s) were added to refs/heads/main by this push:
new 138368cc9c fix: reset the offset of 'file_for_view' (#8381)
138368cc9c is described below
commit 138368cc9c9aec2fd40afe2050b1054caaa3dd55
Author: Van De Bio <[email protected]>
AuthorDate: Sat Sep 20 02:58:26 2025 +0800
fix: reset the offset of 'file_for_view' (#8381)
# Which issue does this PR close?
- Closes #8380
# Rationale for this change
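The example reads the same file twice to compare `StringArray` and
`StringViewArray`, but the second read reused a cloned file handle whose
offset had not been reset after the first read. This fixes the example so
that users get a valid utf8view performance result.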
# What changes are included in this PR?
Reset the offset of `file_for_view` to the start of the file before reusing it for the second read.
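
The following is a minimal standalone sketch of why the reset is needed, not code from the example itself. It relies on the documented behavior of `File::try_clone`, where the clone shares the read position with the original handle. The helper `read_via_clone` and the temporary file name are hypothetical and exist only for illustration.

```rust
use std::fs::{self, File};
use std::io::{Read, Seek, SeekFrom};
use std::path::Path;

/// Hypothetical helper: reads a file through the original handle, then
/// through a clone, first without and then with a seek back to the start.
/// Returns (bytes via original, bytes via clone before seek, bytes via clone after seek).
fn read_via_clone(path: &Path) -> std::io::Result<(usize, usize, usize)> {
    let mut file = File::open(path)?;
    // The clone shares the underlying handle, including the read position.
    let mut clone = file.try_clone()?;

    let mut buf = Vec::new();
    let first = file.read_to_end(&mut buf)?;

    // The shared position is now at end of file, so the clone reads nothing.
    buf.clear();
    let before = clone.read_to_end(&mut buf)?;

    // Rewinding the clone makes the whole file readable again.
    clone.seek(SeekFrom::Start(0))?;
    buf.clear();
    let after = clone.read_to_end(&mut buf)?;

    Ok((first, before, after))
}

fn main() -> std::io::Result<()> {
    // Assumed scratch file; any readable file would do.
    let path = std::env::temp_dir().join("try_clone_offset_demo.txt");
    fs::write(&path, b"hello avro")?;

    let (first, before, after) = read_via_clone(&path)?;
    println!("original: {first} bytes, clone before seek: {before}, clone after seek: {after}");

    fs::remove_file(&path)?;
    Ok(())
}
```

In the example, the same situation arises because `file` is fully consumed by the first reader before `file_for_view` is wrapped in its own `BufReader`.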
# Are these changes tested?
```
arrow-avro/examples/read_with_utf8view.rs
```
The example runs successfully:
```shell
(base) ➜ arrow-avro git:(fix/example_of_utf8view) cargo run --package arrow-avro --example read_with_utf8view -- test/data/nested_record_reuse.avro
Finished `dev` profile [unoptimized + debuginfo] target(s) in 0.30s
Running `/Users/trevor.wang/Workspace/rust/fix-arrow-rs/arrow-rs/target/debug/examples/read_with_utf8view test/data/nested_record_reuse.avro`
Read 2 rows from test/data/nested_record_reuse.avro
Reading with StringArray: 2.095417ms
Reading with StringViewArray: 179.333µs
StringViewArray was 11.68x faster
```
# Are there any user-facing changes?
Users who run the example will now get correct performance data.
---------
Co-authored-by: Trevor Wang <[email protected]>
---
arrow-avro/examples/read_with_utf8view.rs | 5 +++--
1 file changed, 3 insertions(+), 2 deletions(-)
diff --git a/arrow-avro/examples/read_with_utf8view.rs b/arrow-avro/examples/read_with_utf8view.rs
index 707be57516..85b07c8d03 100644
--- a/arrow-avro/examples/read_with_utf8view.rs
+++ b/arrow-avro/examples/read_with_utf8view.rs
@@ -22,7 +22,7 @@
use std::env;
use std::fs::File;
-use std::io::BufReader;
+use std::io::{BufReader, Seek, SeekFrom};
use std::time::Instant;
use arrow_array::{RecordBatch, StringArray, StringViewArray};
@@ -39,7 +39,7 @@ fn main() -> Result<(), Box<dyn std::error::Error>> {
};
let file = File::open(file_path)?;
- let file_for_view = file.try_clone()?;
+ let mut file_for_view = file.try_clone()?;
let start = Instant::now();
let reader = BufReader::new(file);
@@ -48,6 +48,7 @@ fn main() -> Result<(), Box<dyn std::error::Error>> {
let batches: Vec<RecordBatch> = avro_reader.collect::<Result<_, _>>()?;
let regular_duration = start.elapsed();
+ file_for_view.seek(SeekFrom::Start(0))?;
let start = Instant::now();
let reader_view = BufReader::new(file_for_view);
let avro_reader_view = ReaderBuilder::new()