AdamGS commented on code in PR #628:
URL:
https://github.com/apache/arrow-rs-object-store/pull/628#discussion_r2742994396
##########
src/local.rs:
##########
@@ -988,26 +988,79 @@ pub(crate) fn read_range(
// Don't read past end of file
let to_read = range.end.min(file_len) - range.start;
+ let mut buf = Vec::with_capacity(to_read as usize);
- file.seek(SeekFrom::Start(range.start)).map_err(|source| {
- let path = path.into();
- Error::Seek { source, path }
- })?;
+ #[cfg(any(target_family = "unix", target_family = "windows"))]
+ {
+ #[cfg(target_family = "unix")]
+ use std::os::unix::fs::FileExt;
+ #[cfg(target_family = "windows")]
+ use std::os::windows::fs::FileExt;
+
+ // Safety:
+ // Setting the buffer's length to its capacity is safe as it remains
within its allocated memory.
+ // In cases where `read_exact_at` errors, the contents of the buffer
are undefined,
+ // but we discard it without using it.
+ unsafe {
+ buf.set_len(to_read as usize);
+ }
Review Comment:
just preallocated the buffer with 0s (`buf.resize(to_read as usize, 0_u8)`),
results are:
```
read_ranges/local_fs/10 time: [24.271 µs 24.605 µs 24.979 µs]
thrpt: [3.0543 GiB/s 3.1008 GiB/s 3.1434 GiB/s]
change:
time: [−9.2694% −7.9576% −6.6172%] (p = 0.00 <
0.05)
thrpt: [+7.0861% +8.6456% +10.216%]
Performance has improved.
Found 9 outliers among 100 measurements (9.00%)
5 (5.00%) high mild
4 (4.00%) high severe
read_ranges/local_fs/100
time: [91.591 µs 92.250 µs 92.981 µs]
thrpt: [8.2054 GiB/s 8.2703 GiB/s 8.3299 GiB/s]
change:
time: [−19.988% −19.050% −18.121%] (p = 0.00 <
0.05)
thrpt: [+22.131% +23.532% +24.982%]
Performance has improved.
Found 12 outliers among 100 measurements (12.00%)
2 (2.00%) low mild
5 (5.00%) high mild
5 (5.00%) high severe
read_ranges/local_fs/1000
time: [877.00 µs 891.02 µs 907.75 µs]
thrpt: [8.4047 GiB/s 8.5626 GiB/s 8.6994 GiB/s]
change:
time: [−32.377% −30.518% −28.468%] (p = 0.00 <
0.05)
thrpt: [+39.797% +43.921% +47.879%]
Performance has improved.
Found 8 outliers among 100 measurements (8.00%)
4 (4.00%) high mild
4 (4.00%) high severe
get_opts_whole_file/local_fs
time: [8.2127 ms 8.2487 ms 8.2891 ms]
thrpt: [7.5400 GiB/s 7.5770 GiB/s 7.6101 GiB/s]
change:
time: [−3.7223% −2.3673% −1.0502%] (p = 0.00 <
0.05)
thrpt: [+1.0613% +2.4247% +3.8663%]
Performance has improved.
Found 12 outliers among 100 measurements (12.00%)
3 (3.00%) high mild
9 (9.00%) high severe
get_range_sequential/local_fs/10
time: [223.74 µs 225.81 µs 228.25 µs]
thrpt: [342.28 MiB/s 345.97 MiB/s 349.18 MiB/s]
change:
time: [−8.4366% −7.0562% −5.8343%] (p = 0.00 <
0.05)
thrpt: [+6.1958% +7.5919% +9.2139%]
Performance has improved.
Found 5 outliers among 100 measurements (5.00%)
3 (3.00%) high mild
2 (2.00%) high severe
get_range_sequential/local_fs/100
time: [2.2283 ms 2.2480 ms 2.2706 ms]
thrpt: [344.07 MiB/s 347.54 MiB/s 350.61 MiB/s]
change:
time: [−8.9263% −7.8699% −6.7098%] (p = 0.00 <
0.05)
thrpt: [+7.1924% +8.5421% +9.8012%]
Performance has improved.
Found 9 outliers among 100 measurements (9.00%)
6 (6.00%) high mild
3 (3.00%) high severe
get_range_sequential/local_fs/1000
time: [23.699 ms 23.843 ms 24.011 ms]
thrpt: [325.37 MiB/s 327.66 MiB/s 329.65 MiB/s]
change:
time: [+1.8221% +2.7832% +3.8109%] (p = 0.00 <
0.05)
thrpt: [−3.6710% −2.7079% −1.7895%]
Performance has regressed.
Found 8 outliers among 100 measurements (8.00%)
2 (2.00%) low mild
6 (6.00%) high severe
```
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]