kylebarron opened a new issue, #6151: URL: https://github.com/apache/arrow-rs/issues/6151
**Describe the bug**

In https://github.com/kylebarron/arro3 I'm exporting arrow-rs functionality for general Python use. I seem to have hit a bug importing sliced arrays.

In [`import_array_pycapsules`](https://github.com/kylebarron/arro3/blob/9673b62161f94d41a06c71f895abb8662b8e2864/pyo3-arrow/src/ffi/from_python/utils.rs#L72-L89) (which is vendored from arrow-rs code [here](https://github.com/apache/arrow-rs/blob/80ed7128510bac114c6feec08c34ef3beed3a44a/arrow/src/pyarrow.rs#L265-L275)) I have:

```rs
pub(crate) fn import_array_pycapsules(
    schema_capsule: &Bound<PyCapsule>,
    array_capsule: &Bound<PyCapsule>,
) -> PyResult<(ArrayRef, Field)> {
    validate_pycapsule_name(schema_capsule, "arrow_schema")?;
    validate_pycapsule_name(array_capsule, "arrow_array")?;

    let schema_ptr = unsafe { schema_capsule.reference::<FFI_ArrowSchema>() };
    let array = unsafe { FFI_ArrowArray::from_raw(array_capsule.pointer() as _) };

    let array_data = unsafe { arrow::ffi::from_ffi(array, schema_ptr) }
        .map_err(|err| PyTypeError::new_err(err.to_string()))?;
    dbg!(array_data.offset());
    let field = Field::try_from(schema_ptr).map_err(|err| PyTypeError::new_err(err.to_string()))?;
    let array = make_array(array_data);
    dbg!(array.offset());
    Ok((array, field))
}
```

Note the two `dbg!` macros. When invoked from Python with a pyarrow `StructArray`, the array offset is lost.

```
import pyarrow as pa
import pytest
from arro3.compute import struct_field

a = pa.array([1, 2, 3])
b = pa.array([3, 4, 5])
struct_arr = pa.StructArray.from_arrays([a, b], names=["a", "b"])
sliced = struct_arr.slice(1, 2)
sliced.offset
# 1

pa.array(struct_field(sliced, [0]))
# <pyarrow.lib.Int64Array object at 0x10fa94700>
# [
#   1,
#   2
# ]
```

Note that the _first two_ elements of `a` are kept, with the `offset` not used. I've isolated this to the two lines with `dbg!`.
Those print:

```
[pyo3-arrow/src/ffi/from_python/utils.rs:84:5] array_data.offset() = 1
[pyo3-arrow/src/ffi/from_python/utils.rs:87:5] array.offset() = 0
```

In particular `make_array` does not check the `offset` from the base array.

**To Reproduce**

Here's the way to reproduce the upstream bug:

```
git clone https://github.com/kylebarron/arro3
cd arro3
git checkout 9673b62
poetry install
poetry run maturin develop -m arro3-core/Cargo.toml
poetry run maturin develop -m arro3-compute/Cargo.toml
poetry run pytest
```

I can _try_ to reproduce this in pure rust if needed, but that may not be possible because the `StructArray` seems to always export an `offset` of `0`, and so it may not be easy to reproduce this importing behavior.

**Expected behavior**

Expected the array offset to be maintained.
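To make the expected behavior concrete, here is a toy pure-Python model of the situation. It is only a sketch and does not use the arrow-rs or pyarrow APIs: `ToyStruct`, `field_respecting_offset`, and `field_ignoring_offset` are hypothetical names made up for illustration. It assumes the children are stored unsliced and that the parent's `offset` has to be applied when reading a field, which is how the sliced `StructArray` above appears to be exported.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ToyStruct:
    """Toy stand-in for a struct array: unsliced children plus a parent offset and length."""

    children: List[List[int]]
    offset: int
    length: int


def field_respecting_offset(arr: ToyStruct, i: int) -> List[int]:
    # Logical row k of the parent maps to physical row (offset + k) of the child.
    child = arr.children[i]
    return child[arr.offset : arr.offset + arr.length]


def field_ignoring_offset(arr: ToyStruct, i: int) -> List[int]:
    # Models the behavior observed in this report: the parent offset is dropped,
    # so reading starts at physical row 0 of the child.
    return arr.children[i][: arr.length]


# Mirrors struct_arr.slice(1, 2) from the Python snippet in this report.
sliced = ToyStruct(children=[[1, 2, 3], [3, 4, 5]], offset=1, length=2)

expected = field_respecting_offset(sliced, 0)
observed = field_ignoring_offset(sliced, 0)
```

In this model `expected` holds the last two elements of `a`, while `observed` holds the first two, which matches the output shown in the report.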
