damacek opened a new issue, #9235:
URL: https://github.com/apache/arrow-rs/issues/9235

   **Describe the bug**
   
   Recently, I encountered an assertion failure when passing a sliced 
RecordBatch to the Python FFI. The problem seems to appear only when slicing 
RecordBatches that contain complex data: nested lists and/or structs.
   
   **To Reproduce**
   
   Add the following test case to 
`arrow-pyarrow-integration-testing/tests/test_sql.py`:
   
   ```python
   def test_nested_struct_with_list_slice():
       """
       Test round-tripping sliced record batches with deeply nested struct types.
       
       This tests struct<struct<list<struct>>> with variable-length lists,
       ensuring that slicing at different row offsets works correctly.
       """
       # Build the nested type: struct<struct<list<struct>>>
       item_type = pa.struct([("x", pa.int64())])
       inner_struct_type = pa.struct([("items", pa.list_(item_type))])
       outer_struct_type = pa.struct([("inner", inner_struct_type)])
   
       # Key: variable-length inner lists (1, 2, 1 items)
       batch = pa.record_batch(
           [
               pa.array([1, 2, 3], type=pa.int64()),
               pa.array([
                   {"inner": {"items": [{"x": 1}]}},
                   {"inner": {"items": [{"x": 2}, {"x": 3}]}},
                   {"inner": {"items": [{"x": 4}]}},
               ], type=outer_struct_type),
           ],
           names=["id", "outer"]
       )
   
       # Test round-trip of each sliced row
       for i in range(batch.num_rows):
           print(i)
           sliced = batch.slice(i, 1)
           result = rust.round_trip_record_batch(sliced)
           result.validate(full=True)
           assert result.to_pydict() == sliced.to_pydict()
           assert result.schema == sliced.schema
   
   ```
   
   When I run `pytest -v .`:
   
   ```
           # Test round-trip of each sliced row
           for i in range(batch.num_rows):
               print(i)
               sliced = batch.slice(i, 1)
   >           result = rust.round_trip_record_batch(sliced)
                        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
   E           pyo3_runtime.PanicException: assertion failed: (offset + length) <= self.len()
   
   tests\test_sql.py:757: PanicException
   ------------------------------------------- Captured stdout call --------------------------------------------
   0
   1
   ------------------------------------------- Captured stderr call --------------------------------------------
   
   thread '<unnamed>' (12544) panicked at C:\Code\arrow-rs\arrow-data\src\data.rs:581:9:
   assertion failed: (offset + length) <= self.len()
   note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
   ```
   
   **Expected behavior**
   
   The provided test should pass.
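
For context, here is a minimal pure-Python sketch of how a row slice maps onto the child values of a variable-length list column in the Arrow columnar layout (an offsets buffer with one more entry than there are rows, plus a flat child array). It is a toy model, not arrow-rs or pyarrow code. The helper `list_row_child_range` is a hypothetical name introduced for illustration, and the sketch does not claim to identify the cause of the panic. It only shows that a row range on the parent and the matching range on the child generally differ once list lengths vary.

```python
# Toy model of Arrow's variable-size list layout (no pyarrow required).
# Row i of a list column owns the child values offsets[i]:offsets[i + 1].

def list_row_child_range(offsets, row_offset, length):
    """Return (child_start, child_length) for rows [row_offset, row_offset + length)."""
    num_rows = len(offsets) - 1
    # Same shape of bounds check as the assertion in the panic message,
    # evaluated here against the parent's row count.
    assert row_offset + length <= num_rows
    start = offsets[row_offset]
    end = offsets[row_offset + length]
    return start, end - start

# The "items" lists in the test hold 1, 2 and 1 elements respectively.
offsets = [0, 1, 3, 4]
x_values = [1, 2, 3, 4]

# Child range for each single-row slice, as in batch.slice(i, 1).
ranges = [list_row_child_range(offsets, i, 1) for i in range(3)]
rows = [x_values[start:start + n] for start, n in ranges]

# Reusing the parent's (offset=1, length=1) directly on the child array
# selects a different, shorter range than the one row 1 actually owns.
naive_row_1 = x_values[1:1 + 1]
```

With these inputs, row 1 owns two child values starting at child index 1, while the parent slice itself has offset 1 and length 1, so the two ranges cannot be used interchangeably.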


