Thank you both for the prompt response. Just to check I understand, Antoine, your recommendation is:
1. The Rust implementation should expose the ABI
2. The Rust implementation should be able to consume (and use) the ABI (without owning it, but still calling `release`)

And, likewise,

1. C/pyarrow should expose this ABI
2. C/pyarrow should be able to create an array from this ABI

If yes, I agree with you that this is the way to go, especially due to how it handles alloc/dealloc (passing a pointer to the free function). I would be willing to help with this, if others agree with it. However, I would need someone to mentor this, as I am outside my comfort zone wrt FFI and C ABIs.

From Rust's end, I think that we need to declare a new struct that is #[repr(C)], and write some functions that convert it from/to `ArrayData`, which is the struct that stores this data. Furthermore, we need to cater for what Jörn pointed out about `typed_data`, which requires alignment.

I am less certain about the following:

1. In pyarrow, I was only able to find Array.from_buffers and from_pandas. Is the ABI implemented but not documented?
2. In pyarrow, I was unable to find Array.to_abi() or equivalent. Is the ABI implemented but not documented?
3. Do we have a place in the project where we test these things (maybe integration)? IMO we need to compile both projects and have both communicate in the same process. I have been doing this via Python (pyarrow from PyPI and pyo3 for Rust), but for this both need to be compiled from master.
4. Is there a "source of truth" that we can use to generate and consume these in-memory structs, e.g. to perform round-trips?
5. If an implementation for C/pyarrow is required, is there anyone willing to pair up to help on that side? I am not familiar with that code-base.

Finally and most importantly, are there concerns/objections to this? Personally, I think that it would be awesome to have C and Rust be able to share Arrow arrays back and forth through pointers.
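
As a rough sketch of the Rust side described above (not actual Arrow code): a #[repr(C)] struct whose field order follows `struct ArrowArray` from the C data interface, a producer that exports a `Vec<i32>`, and a consumer that reads the values and calls `release` instead of deallocating. The names `FfiArrowArray`, `PrivateData`, `export_i32` and `ImportedArray` are hypothetical, and only a non-null int32 array without children is handled.

```rust
use std::mem::align_of;
use std::os::raw::c_void;
use std::ptr;
use std::slice;

/// Hypothetical Rust mirror of `struct ArrowArray` (C data interface).
#[repr(C)]
pub struct FfiArrowArray {
    pub length: i64,
    pub null_count: i64,
    pub offset: i64,
    pub n_buffers: i64,
    pub n_children: i64,
    pub buffers: *mut *const c_void,
    pub children: *mut *mut FfiArrowArray,
    pub dictionary: *mut FfiArrowArray,
    pub release: Option<unsafe extern "C" fn(*mut FfiArrowArray)>,
    pub private_data: *mut c_void,
}

/// Producer-side state that must stay alive until `release` is called.
struct PrivateData {
    _values: Vec<i32>,
    _buffers: Vec<*const c_void>,
}

/// Release callback: frees the producer's memory with the producer's own
/// allocator, then marks the struct as released by clearing `release`.
unsafe extern "C" fn release_i32(array: *mut FfiArrowArray) {
    if array.is_null() {
        return;
    }
    let array = unsafe { &mut *array };
    if array.release.is_none() {
        return; // already released
    }
    drop(unsafe { Box::from_raw(array.private_data as *mut PrivateData) });
    array.release = None;
}

/// Producer: expose a Vec<i32> through the ABI struct.
pub fn export_i32(values: Vec<i32>) -> FfiArrowArray {
    let length = values.len() as i64;
    // buffers[0] is the validity bitmap (null here: no nulls), buffers[1] the values.
    let mut buffers = vec![ptr::null(), values.as_ptr() as *const c_void];
    let buffers_ptr = buffers.as_mut_ptr();
    let private = Box::new(PrivateData { _values: values, _buffers: buffers });
    FfiArrowArray {
        length,
        null_count: 0,
        offset: 0,
        n_buffers: 2,
        n_children: 0,
        buffers: buffers_ptr,
        children: ptr::null_mut(),
        dictionary: ptr::null_mut(),
        release: Some(release_i32),
        private_data: Box::into_raw(private) as *mut c_void,
    }
}

/// Consumer: uses the memory but never deallocates it directly.
pub struct ImportedArray {
    array: FfiArrowArray,
}

impl ImportedArray {
    /// Safety: `array` must describe a valid, unreleased, non-null int32 array.
    pub unsafe fn new(array: FfiArrowArray) -> Self {
        Self { array }
    }

    pub fn values(&self) -> &[i32] {
        unsafe {
            let data = *self.array.buffers.add(1) as *const i32;
            // A typed view needs an aligned pointer; panic otherwise.
            assert_eq!(data as usize % align_of::<i32>(), 0, "misaligned buffer");
            slice::from_raw_parts(
                data.add(self.array.offset as usize),
                self.array.length as usize,
            )
        }
    }
}

impl Drop for ImportedArray {
    fn drop(&mut self) {
        // Call the producer's destructor rather than Rust's dealloc.
        if let Some(release) = self.array.release {
            unsafe { release(&mut self.array) };
        }
    }
}
```

Here the consumer never needs to know the producer's allocator or alignment: `unsafe { ImportedArray::new(export_i32(vec![1, 2, 3])) }` gives a value whose `values()` slice borrows the producer's memory, and dropping it hands the memory back via `release`.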

Best,
Jorge

On Tue, Sep 22, 2020 at 7:20 PM Antoine Pitrou <anto...@python.org> wrote:

>
> On 22/09/2020 at 19:16, Jorge Cardoso Leitão wrote:
> > Hi,
> >
> > I had some time to look at
> > https://issues.apache.org/jira/browse/ARROW-10039,
> > wrt to the alignment requirements that rust implementation currently
> > imposes.
> >
> > The gist is that it is not that easy, and I would like to request some
> > guidance.
> >
> > Some facts:
> > 1. Our current implementation does not accept a pointer if that pointer is
> > not memory aligned (we panic)
> > 2. Our rust implementation's alignment is a static/const that depends on
> > the architecture and therefore constant throughout the program
> > 3. Rust alloc/dealloc both require an argument denoting memory alignment.
> > 4. calling dealloc with the wrong alignment is undefined behavior
> >
> > 3-4 imply that removing our safeguard against unaligned memory (wrt to the
> > constant alignment) leads to undefined behavior: we take ownership of a
> > pointer with an alignment X != our alignment and when we try to free it, we
> > enter undefined world.
>
> If you are given a foreign pointer (e.g. coming from Python or C++), you
> should simply never deallocate it yourself. You don't know which
> allocator gave you the pointer, and it's probably not the Rust allocator
> (so it can't manage the pointer anyway).
>
> What you should do is call the destructor, if any, that comes with the
> buffer pointer.
>
> I'll note again that the C data interface addresses those issues ;-)
>
> Regards
>
> Antoine.
>