bvolpato-dd opened a new pull request, #9390:
URL: https://github.com/apache/arrow-rs/pull/9390

   # Which issue does this PR close?
   
   - Closes #9389.
   
   # Rationale for this change
   
   `BatchCoalescer::push_batch` currently uses a hard `assert_eq!` to check 
that the incoming batch has the same number of columns as the coalescer's 
schema. If there's a mismatch, the whole process panics.
   
   The function already returns `Result<(), ArrowError>`, so there's no reason 
this can't be an error return instead. This is also how the rest of the arrow 
API handles the same situation — `RecordBatch::try_new` returns 
`Err(ArrowError::InvalidArgumentError)` for column count mismatches, and other 
checks in the same struct use `debug_assert!`.
   
   We ran into this in production where a connector returned batches with a 
different schema than the plan expected. Instead of a query-level error, the 
whole process went down.
   
   # What changes are included in this PR?
   
   - Replace `assert_eq!(arrays.len(), self.in_progress_arrays.len())` with an 
`if` check that returns `Err(ArrowError::InvalidArgumentError(...))`.
   - Add three tests covering column count mismatches in both directions 
(fewer columns, more columns, and the two-vs-zero case).
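
   As a rough illustration of the change, the sketch below swaps the assertion for an early error return. It is not the code from this PR: `SketchError`, `SketchCoalescer`, the `Vec<Vec<i64>>` column representation, and the error message text are hypothetical stand-ins so the example compiles without the arrow crates. Only the `in_progress_arrays` field name and the `InvalidArgumentError` variant name are taken from the description above.

```rust
use std::fmt;

// Hypothetical stand-in for `ArrowError`; only the variant discussed here.
#[derive(Debug, PartialEq)]
enum SketchError {
    InvalidArgumentError(String),
}

impl fmt::Display for SketchError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            SketchError::InvalidArgumentError(msg) => write!(f, "Invalid argument error: {}", msg),
        }
    }
}

// Hypothetical stand-in for the coalescer: one buffer per expected column.
struct SketchCoalescer {
    in_progress_arrays: Vec<Vec<i64>>,
}

impl SketchCoalescer {
    fn new(num_columns: usize) -> Self {
        Self { in_progress_arrays: vec![Vec::new(); num_columns] }
    }

    fn push_batch(&mut self, arrays: Vec<Vec<i64>>) -> Result<(), SketchError> {
        // Previously: assert_eq!(arrays.len(), self.in_progress_arrays.len());
        // Now the mismatch is reported through the existing Result return type.
        if arrays.len() != self.in_progress_arrays.len() {
            return Err(SketchError::InvalidArgumentError(format!(
                "Batch has {} columns but the coalescer expects {}",
                arrays.len(),
                self.in_progress_arrays.len()
            )));
        }
        // Column counts match, so append each incoming column to its buffer.
        for (buffer, column) in self.in_progress_arrays.iter_mut().zip(arrays) {
            buffer.extend(column);
        }
        Ok(())
    }
}

fn main() {
    let mut coalescer = SketchCoalescer::new(2);
    assert!(coalescer.push_batch(vec![vec![1, 2], vec![3, 4]]).is_ok());

    // A one-column batch no longer aborts; the caller decides what to do.
    match coalescer.push_batch(vec![vec![5]]) {
        Ok(()) => println!("batch accepted"),
        Err(e) => println!("query failed: {}", e),
    }
}
```

   Because the check runs before any buffer is touched, a rejected batch leaves the buffered data unchanged in this sketch.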
   
   # Are these changes tested?
   
   Yes — three new tests:
   
   - `test_push_batch_schema_mismatch_fewer_columns` — coalescer expects 0 
columns, batch has 1
   - `test_push_batch_schema_mismatch_more_columns` — coalescer expects 2 
columns, batch has 1
   - `test_push_batch_schema_mismatch_two_vs_zero` — coalescer expects 0 
columns, batch has 2 (matches the exact error we saw in production)
   
   # Are there any user-facing changes?
   
   `BatchCoalescer::push_batch` now returns 
`Err(ArrowError::InvalidArgumentError)` on column count mismatch instead of 
panicking. Any caller that was relying on the panic (unlikely) would need to 
handle the error instead.
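
   For a caller that did rely on the panic, the migration might look roughly like the sketch below. `push_batch_panicking`, `push_batch_checked`, `caller_before`, and `caller_after` are hypothetical stand-ins that model only the column count check with plain integers and a `String` error; they are not part of the arrow API.

```rust
use std::panic;

// Hypothetical stand-in for the old behavior: panics on a column count mismatch.
fn push_batch_panicking(expected: usize, actual: usize) -> Result<(), String> {
    assert_eq!(actual, expected);
    Ok(())
}

// Hypothetical stand-in for the new behavior: reports the mismatch as an error.
fn push_batch_checked(expected: usize, actual: usize) -> Result<(), String> {
    if actual != expected {
        return Err(format!("expected {} columns, got {}", expected, actual));
    }
    Ok(())
}

// Before: surviving a mismatch meant catching the unwind, which does not
// work at all when the binary is built with panic = "abort".
fn caller_before(expected: usize, actual: usize) -> bool {
    panic::catch_unwind(|| push_batch_panicking(expected, actual)).is_ok()
}

// After: the mismatch is an ordinary error that can be mapped or propagated.
fn caller_after(expected: usize, actual: usize) -> Result<(), String> {
    push_batch_checked(expected, actual).map_err(|e| format!("query failed: {}", e))
}

fn main() {
    // The old path panics internally (a panic message is printed to stderr).
    assert!(!caller_before(0, 2));

    // The new path surfaces a query-level error and the process keeps running.
    assert_eq!(
        caller_after(0, 2),
        Err("query failed: expected 0 columns, got 2".to_string())
    );
    assert!(caller_after(2, 2).is_ok());
}
```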
   
   Made with [Cursor](https://cursor.com)

