nvictus opened a new issue, #8913: URL: https://github.com/apache/arrow-rs/issues/8913
cc @kylebarron

**Which part is this question about**

Code base and/or documentation.

**Describe your question**

Should arrow-rs intercept panics before they cross the FFI boundary in the C bindings? If not, should that be a documented policy?

**Additional context**

A panic crossing a foreign function interface is undefined behavior, and it usually causes the process to abort. An obvious example where this is undesirable is in bindings to computational kernel processes in Jupyter. For example, when binding Rust code to Python directly with PyO3, panics are caught and converted into [PanicExceptions](https://pyo3.rs/main/doc/pyo3/panic/struct.panicexception), allowing them to propagate all the way up the stack.

I recently encountered an initially surprising [issue](https://github.com/kylebarron/arro3/issues/460) exporting `RecordBatchReader`s to Python using `arro3`: panics were converted to `PanicException`s as expected when iterating over the exported Python object directly, but when the object was passed to PyArrow as a capsule, a panic caused a process abort. Basically:

```python
reader = create_panicky_reader()
reader.read_next_batch()  # raises PanicException

import pyarrow as pa

reader = pa.RecordBatchReader.from_stream(create_panicky_reader())
reader.read_next_batch()  # Abort!
```

My understanding is that this happens because PyArrow accesses the reader from native code through the C stream API, i.e. [here](https://github.com/apache/arrow-rs/blob/main/arrow-array/src/ffi_stream.rs#L236). In principle, panics could be intercepted at that interface and returned as errors to the caller via this API.

My current workaround is to patch our `RecordBatchReader`s in Rust to catch panics and return them as `ArrowError`. I suppose this is fine, but then it may be worth documenting that it is the implementor's responsibility to put guardrails in place if abort-on-panic is not acceptable behavior.
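For illustration, the workaround described above (catching panics inside the reader and returning them as errors) could look roughly like the following. This is a minimal sketch, not arrow-rs code: `ReaderError` is a stand-in for `ArrowError` so the example compiles without the `arrow` crate, and `CatchPanic` and `panic_message` are hypothetical names. It relies on `std::panic::catch_unwind`, which only helps when the crate is built with the default `panic = "unwind"` strategy.

```rust
use std::any::Any;
use std::panic::{catch_unwind, AssertUnwindSafe};

/// Stand-in for `ArrowError` (assumption: keeps the sketch dependency-free).
#[derive(Debug, PartialEq)]
pub struct ReaderError(pub String);

/// Extract a readable message from a panic payload.
/// `panic!` payloads are usually `&'static str` or `String`.
fn panic_message(payload: Box<dyn Any + Send>) -> String {
    if let Some(s) = payload.downcast_ref::<&str>() {
        (*s).to_string()
    } else if let Some(s) = payload.downcast_ref::<String>() {
        s.clone()
    } else {
        "unknown panic payload".to_string()
    }
}

/// Hypothetical adapter: wraps a fallible iterator (playing the role of a
/// `RecordBatchReader`) and turns a panic in `next()` into an `Err` item.
pub struct CatchPanic<I> {
    inner: I,
    /// Set after a panic; the inner state may be inconsistent, so stop.
    poisoned: bool,
}

impl<I> CatchPanic<I> {
    pub fn new(inner: I) -> Self {
        Self { inner, poisoned: false }
    }
}

impl<T, I> Iterator for CatchPanic<I>
where
    I: Iterator<Item = Result<T, ReaderError>>,
{
    type Item = Result<T, ReaderError>;

    fn next(&mut self) -> Option<Self::Item> {
        if self.poisoned {
            return None;
        }
        // `AssertUnwindSafe` is a promise by the caller: here it is justified
        // by never touching `inner` again once a panic has been observed.
        match catch_unwind(AssertUnwindSafe(|| self.inner.next())) {
            Ok(item) => item,
            Err(payload) => {
                self.poisoned = true;
                Some(Err(ReaderError(format!(
                    "reader panicked: {}",
                    panic_message(payload)
                ))))
            }
        }
    }
}
```

In a real binding, the same pattern would wrap the concrete reader before it is exported, so that a consumer going through the C stream interface sees an error return rather than an unwinding panic. Whether the error is reported once and the stream then ends, as in this sketch, is a design choice rather than anything the source prescribes.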
