vigneshsiva11 commented on PR #9369:
URL: https://github.com/apache/arrow-rs/pull/9369#issuecomment-3901843325

   Thanks for the question.
   
   PR #9362 prevents the overflow by stopping the batch early inside the byte 
array decoder when the accumulated value bytes approach the 2 GiB limit of i32 
offsets (i32::MAX bytes).
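
   To make the limit concrete, here is a minimal sketch of the kind of check a 
decoder could perform. It is not the code from #9362: `values_that_fit` is a 
hypothetical helper, and it assumes the byte length of each value is known up 
front. It only illustrates why the running offset has to be checked before each 
value is appended.

```rust
/// Hypothetical helper (not part of arrow-rs): returns how many leading
/// values can be appended before the running offset would exceed `limit`.
/// For 32-bit offsets the limit would be `i32::MAX as usize`.
fn values_that_fit(value_lens: &[usize], limit: usize) -> usize {
    let mut offset: usize = 0;
    for (i, len) in value_lens.iter().enumerate() {
        // `checked_add` guards against overflowing `usize` itself.
        match offset.checked_add(*len) {
            Some(next) if next <= limit => offset = next,
            // Appending value `i` would pass the limit, so stop early.
            _ => return i,
        }
    }
    value_lens.len()
}
```

   For example, two values of 2^30 bytes each sum to 2^31 bytes, which is one 
more than i32::MAX, so with `limit = i32::MAX as usize` only the first value 
fits and the rest must wait for the next batch.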
   
   This PR (#9369) operates at a higher level in the ParquetRecordBatchReader, 
ensuring that when a batch would overflow due to accumulated binary offsets, we 
emit the current partial RecordBatch safely and continue processing remaining 
rows in subsequent batches.
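
   The reader-level idea can be sketched in the same spirit. This is an 
illustration under stated assumptions, not the implementation in this PR: 
`plan_batches` is a hypothetical function that takes per-row byte lengths and 
returns the row ranges a reader could emit so that no batch exceeds either the 
requested row count or the byte limit.

```rust
use std::ops::Range;

/// Hypothetical sketch (not the arrow-rs API): split rows into consecutive
/// ranges so that each range holds at most `batch_size` rows and at most
/// `byte_limit` bytes of values.
fn plan_batches(
    value_lens: &[usize],
    batch_size: usize,
    byte_limit: usize,
) -> Result<Vec<Range<usize>>, String> {
    if batch_size == 0 {
        return Err("batch_size must be greater than zero".to_string());
    }
    let mut batches = Vec::new();
    let mut start = 0; // first row of the batch being built
    let mut bytes = 0usize; // value bytes accumulated in that batch
    for (row, &len) in value_lens.iter().enumerate() {
        if len > byte_limit {
            // A single value that cannot fit in any batch is an error.
            return Err(format!(
                "row {row} has {len} bytes, exceeding the limit of {byte_limit}"
            ));
        }
        // `bytes <= byte_limit` always holds here, so the subtraction is safe.
        if row - start == batch_size || len > byte_limit - bytes {
            // Emit the partial batch and start a new one at this row.
            batches.push(start..row);
            start = row;
            bytes = 0;
        }
        bytes += len;
    }
    if start < value_lens.len() {
        batches.push(start..value_lens.len());
    }
    Ok(batches)
}
```

   With five rows of 4 bytes each, a `batch_size` of 10 and a `byte_limit` of 
10, the sketch yields the ranges 0..2, 2..4 and 4..5: every row is emitted 
exactly once, and the batches are simply smaller than requested.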
   
   So in short:
   
   - #9362 → fixes the issue at the decoder level (low-level safety)
   - #9369 → handles safe batch splitting at the reader level (high-level batch 
management)
   
   They address the same root problem but at different layers of the stack, and 
this PR ensures correct batch semantics when overflow conditions occur.

