pchintar opened a new pull request, #9971: URL: https://github.com/apache/arrow-rs/pull/9971
# Which issue does this PR close?

- Closes #9777.

# Rationale for this change

This PR is a follow-up to the alignment concerns raised in #9778 when using `Vec<u8>` for IPC body reads to replace the current `MutableBuffer::from_len_zeroed` in IPC Reader.

My [earlier approach](https://github.com/apache/arrow-rs/pull/9778/changes) showed that reading directly into `Vec<u8>` could substantially reduce redundant zero-filling in IPC reader paths, but some decode paths still relied on fixed-width typed buffers that could require additional alignment handling cost later during array construction.

This PR keeps the `Vec<u8>`-based read path for IPC message and block bodies, while adding typed IPC buffer handling for fixed-width physical buffers before array construction. This preserves the existing alignment behavior for those fixed-width decode paths while avoiding the additional alignment handling/copying costs that could otherwise occur later during array construction.

The typed-buffer handling now covers:

* primitive and primitive-like arrays
* binary/string offset buffers
* list and list-view offsets/sizes
* dictionary index buffers
* union type id and offset buffers
* view buffers

These paths now read their physical buffers through `next_typed_buffer::<T>()` so the expected physical buffer lengths are derived from the native value type before array construction.

Container types such as `Struct`, `FixedSizeList`, `RunEndEncoded`, and similar nested/container arrays were intentionally left on their existing decode paths because they do not directly own fixed-width value buffers at that level. Their child arrays continue to decode recursively through the updated typed-buffer paths where applicable.

# What changes are included in this PR?

<!-- There is no need to duplicate the description in the issue here but it is sometimes worth providing a summary of the individual changes in this PR. -->

# Are these changes tested?

Yes.
The existing IPC reader test suite was run with:

```bash
cargo test -p arrow-ipc --lib
```

IPC reader benchmark was also run with:

```bash
cargo bench -p arrow-ipc --bench ipc_reader --features zstd
```

The non-compressed, non-mmap IPC reader paths showed consistent improvements locally. Compressed and mmap-heavy paths were mostly neutral, as expected.

# Are there any user-facing changes?

No.
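
# Illustrative sketch (not the PR's code)

To make the rationale concrete, here is a minimal, std-only sketch of the two ideas described above: filling a `Vec<u8>` without zeroing it first, and deriving a fixed-width buffer's expected byte length from its native value type. All names here (`read_body`, `expected_byte_len`, `is_aligned_for`, `read_i32s`) are hypothetical and are not the arrow-rs API; in particular this is not how `next_typed_buffer::<T>()` is implemented. It assumes little-endian `i32` values and, for simplicity, copies them out rather than reinterpreting an aligned buffer in place.

```rust
use std::io::{self, Read};
use std::mem::{align_of, size_of};

/// Hypothetical helper: read exactly `len` bytes into a fresh `Vec<u8>`.
/// `with_capacity` allocates without zero-filling; `read_to_end` appends
/// only the bytes actually read.
fn read_body<R: Read>(reader: &mut R, len: usize) -> io::Result<Vec<u8>> {
    let mut buf = Vec::with_capacity(len);
    reader.by_ref().take(len as u64).read_to_end(&mut buf)?;
    if buf.len() != len {
        return Err(io::Error::new(io::ErrorKind::UnexpectedEof, "short IPC body"));
    }
    Ok(buf)
}

/// Hypothetical helper: byte length of `len` values of native type `T`,
/// or `None` on overflow.
fn expected_byte_len<T>(len: usize) -> Option<usize> {
    len.checked_mul(size_of::<T>())
}

/// Hypothetical helper: does the slice start at an address aligned for `T`?
/// A `Vec<u8>` allocation only guarantees alignment 1, which is why typed
/// buffers need this kind of check (or a typed allocation) up front.
fn is_aligned_for<T>(bytes: &[u8]) -> bool {
    (bytes.as_ptr() as usize) % align_of::<T>() == 0
}

/// Hypothetical helper: decode `len` little-endian `i32` values that start
/// at `offset` in `body`. Returns `None` if the range is out of bounds.
/// This copies the values; it is a stand-in for a zero-copy typed view.
fn read_i32s(body: &[u8], offset: usize, len: usize) -> Option<Vec<i32>> {
    let byte_len = expected_byte_len::<i32>(len)?;
    let end = offset.checked_add(byte_len)?;
    let bytes = body.get(offset..end)?;
    Some(
        bytes
            .chunks_exact(size_of::<i32>())
            .map(|c| i32::from_le_bytes([c[0], c[1], c[2], c[3]]))
            .collect(),
    )
}
```

For example, reading a 12-byte body from a `std::io::Cursor` and calling `read_i32s(&body, 0, 3)` yields the three encoded values, while a request that runs past the end of the body returns `None` instead of panicking.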
