pchintar opened a new issue, #9777:
URL: https://github.com/apache/arrow-rs/issues/9777
## Description
In `arrow-ipc/src/reader.rs`, buffers are currently allocated with
`MutableBuffer::from_len_zeroed(len)` and then passed to `read_exact`, which
overwrites the entire buffer.
This results in an unnecessary full pass over the memory:
* first zero-initialization
* then complete overwrite by the incoming data
Because `read_exact` is documented to fill the provided slice completely on
success, the initial zeroing step is redundant. It can be avoided by
allocating with capacity and setting the length before the read.
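The overwrite behaviour described above can be checked with a small standard-library sketch. It is an illustration only: `read_over_sentinel` and `SENTINEL` are hypothetical names, and a `Vec<u8>` filled with a non-zero sentinel stands in for the zeroed `MutableBuffer` so that any byte left untouched would be visible.

```rust
use std::io::{self, Cursor, Read};

// Non-zero fill value (assumption for illustration) so untouched bytes would stand out.
const SENTINEL: u8 = 0xAA;

/// Hypothetical helper: fills a sentinel-initialized buffer of `len` bytes from `src`.
fn read_over_sentinel(src: &[u8], len: usize) -> io::Result<Vec<u8>> {
    // First pass: the allocation is initialized (here with the sentinel, in the issue with zeros).
    let mut buf = vec![SENTINEL; len];
    // Second pass: on success, `read_exact` has replaced every byte of `buf`.
    Cursor::new(src).read_exact(&mut buf)?;
    Ok(buf)
}

fn main() {
    // Enough input: no sentinel byte survives the read.
    let out = read_over_sentinel(&[1, 2, 3, 4], 4).expect("source holds 4 bytes");
    assert_eq!(out, [1, 2, 3, 4]);

    // Too little input: the read fails and the buffer is never handed back.
    let err = read_over_sentinel(&[1, 2], 4).unwrap_err();
    assert_eq!(err.kind(), io::ErrorKind::UnexpectedEof);
}
```

On the error path the helper returns before the buffer is used, which mirrors the early-return behaviour the issue relies on.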
## Proposed Change
Replace patterns of the form:
```rust
let mut buf = MutableBuffer::from_len_zeroed(len);
reader.read_exact(&mut buf)?;
```
with:
```rust
let mut buf = MutableBuffer::with_capacity(len);
unsafe { buf.set_len(len) };
reader.read_exact(buf.as_slice_mut())?;
```
This removes one full memory write pass per read.
## Safety Considerations
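The use of `unsafe set_len` is sound in this context only if the `Read` implementation behind `reader` never reads from the slice it is given; the `std::io::Read` documentation warns that passing an uninitialized buffer to `read` is not safe in general. Under that condition:
* the buffer is not accessed or observed between `set_len` and `read_exact`
* `read_exact` either fully initializes the buffer or returns an error
* on error, the function returns immediately and the buffer is not used
* no partially initialized data is exposed to safe Rust code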
Interrupted execution (e.g., a panic, an early return, or process termination) does not
introduce unsoundness because:
* if execution stops before `read_exact` completes, the buffer is never
observed
* there is no path where uninitialized memory is read after interruption
This invariant is local and can be maintained by ensuring no intermediate
access is introduced between `set_len` and the successful completion of
`read_exact`.
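For comparison, the same invariant (no uninitialized byte is ever observable) can be kept without `unsafe` by letting the standard library manage the buffer's length. The sketch below is not the change proposed in this issue: it uses a plain `Vec<u8>` as a stand-in for `MutableBuffer`, and `read_exact_into_vec` is a hypothetical helper name. Whether the standard library skips zero-filling the spare capacity depends on the reader type and toolchain, so no performance claim is made here.

```rust
use std::io::{self, Cursor, Read};

/// Hypothetical helper: reads exactly `len` bytes into a freshly allocated Vec.
fn read_exact_into_vec<R: Read>(reader: &mut R, len: usize) -> io::Result<Vec<u8>> {
    // Capacity is reserved, but the length stays 0, so nothing uninitialized is exposed.
    let mut buf = Vec::with_capacity(len);
    // `take` caps the read at `len` bytes; `read_to_end` only grows `buf`
    // by the number of bytes that were actually read.
    let n = reader.by_ref().take(len as u64).read_to_end(&mut buf)?;
    if n != len {
        // Match `read_exact` semantics: a short read is an error.
        return Err(io::Error::new(
            io::ErrorKind::UnexpectedEof,
            "failed to fill whole buffer",
        ));
    }
    Ok(buf)
}

fn main() -> io::Result<()> {
    let mut src = Cursor::new(vec![1u8, 2, 3, 4, 5]);
    let head = read_exact_into_vec(&mut src, 3)?;
    assert_eq!(head, [1, 2, 3]);

    // Only two bytes remain, so asking for three reports a short read.
    let err = read_exact_into_vec(&mut src, 3).unwrap_err();
    assert_eq!(err.kind(), io::ErrorKind::UnexpectedEof);
    Ok(())
}
```

The trade-off is that the length is set by the read itself rather than up front, so there is no window between `set_len` and `read_exact` that later edits could accidentally violate.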