goutamadwant opened a new pull request, #10230:
URL: https://github.com/apache/arrow-rs/pull/10230

   # Which issue does this PR close?
   
   - Closes #10213.
   
   # Rationale for this change
   
   Arrow IPC dictionary batches write a dictionary field's values as the batch 
payload. When those values are themselves dictionary-encoded, the writer can 
produce IPC data that readers cannot decode; the read fails later with a 
buffer metadata mismatch.
   
   # What changes are included in this PR?
   
   This adds schema validation to the IPC file and stream writer constructors, 
so a schema containing a direct dictionary-of-dictionary field is rejected with 
a clear `InvalidArgumentError` before any IPC bytes are written. A low-level 
dictionary encoding guard is kept as a backstop.
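
As a rough illustration of the kind of check described above, here is a minimal, self-contained sketch. It is not the code in this PR: the `DataType` enum is a simplified, hypothetical stand-in for `arrow_schema::DataType`, the function names `check_no_nested_dictionary` and `validate_schema` are invented for the example, and errors are plain `String`s rather than `ArrowError::InvalidArgumentError`.

```rust
// Hypothetical, simplified stand-in for `arrow_schema::DataType`.
// Only the variants needed for the illustration are modelled.
#[allow(dead_code)]
#[derive(Debug, Clone, PartialEq)]
enum DataType {
    Int32,
    Utf8,
    List(Box<DataType>),
    Struct(Vec<(String, DataType)>),
    // Dictionary(key type, value type)
    Dictionary(Box<DataType>, Box<DataType>),
}

/// Rejects any `Dictionary(_, Dictionary(_, _))` found in `data_type`,
/// including ones reached through list items or struct children.
/// (Assumed helper name; the real writer reports an `ArrowError`.)
fn check_no_nested_dictionary(name: &str, data_type: &DataType) -> Result<(), String> {
    match data_type {
        DataType::Dictionary(_, value) => {
            if matches!(value.as_ref(), DataType::Dictionary(_, _)) {
                return Err(format!(
                    "field '{name}': dictionary values must not be directly dictionary-encoded"
                ));
            }
            // The value type may still contain further nested types to inspect.
            check_no_nested_dictionary(name, value)
        }
        DataType::List(item) => check_no_nested_dictionary(name, item),
        DataType::Struct(children) => children
            .iter()
            .try_for_each(|(child, dt)| check_no_nested_dictionary(child, dt)),
        DataType::Int32 | DataType::Utf8 => Ok(()),
    }
}

/// Validates every top-level field before any bytes would be written.
fn validate_schema(fields: &[(String, DataType)]) -> Result<(), String> {
    fields
        .iter()
        .try_for_each(|(name, dt)| check_no_nested_dictionary(name, dt))
}
```

Running a check like this in the writer constructor means an unsupported schema fails up front, rather than producing a file or stream that only fails when it is read back.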
   
   A regression test covers the direct `Dictionary(_, Dictionary(_, _))` case 
for both IPC stream and file writers.
   
   # Are these changes tested?
   
   Yes.
   
   - `cargo fmt --all -- --check`
   - `cargo test -p arrow-ipc`
   - `cargo test -p arrow-ipc --all-features`
   - `cargo clippy -p arrow-ipc --all-targets -- -D warnings`
   - `cargo clippy -p arrow-ipc --all-targets --all-features -- -D warnings`
   
   # Are there any user-facing changes?
   
   Yes. The IPC file and stream writers now return an `InvalidArgumentError` 
for direct dictionary-of-dictionary schemas instead of writing IPC data that 
fails during read. There are no public API changes.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
