Related: Gandiva invented its own particular way of passing memory
addresses through the JNI boundary rather than using Flatbuffers messages

https://github.com/apache/arrow/blob/master/cpp/src/gandiva/jni/jni_common.cc#L505

I'm all for language-agnostic in-memory data passing, but there is a use
case for a C API to pass pointers at call sites while avoiding flattening
(disassembly) and unflattening (reassembly) steps.

On Thu, Oct 3, 2019 at 4:34 AM Antoine Pitrou <anto...@python.org> wrote:

>
>
> Hi Jacques,
>
> Le 03/10/2019 à 02:46, Jacques Nadeau a écrit :
> >
> > I think it is reasonable to argue that keeping any ABI (or header/struct
> > pattern) as narrow as possible would allow us to minimize overlap with the
> > existing in-memory specification. In Arrow's case, this could be as simple
> > as a single memory pointer for schema (backed by flatbuffers) and a single
> > memory location for data (that references the record batch header, which in
> > turn provides pointers into the actual arrow data). [...]
> >
> > [...] (For example, in a JVM
> > view of the world, working with a plain struct in java rather than a set of
> > memory pointers against our existing IPC formats would be quite painful and
> > we'd definitely need to create some glue code for users. I worry the same
> > pattern would occur in many other languages.)
>
> I'm trying to understand the point you're making. Here you say that it
> was difficult for the JVM to deal with raw pointers. But above you seem
> to argue for a flatbuffers-based serialization containing raw pointers.
>
> Here's another way to frame the question: how do you propose to do
> zero-copy between different languages if not by passing raw pointers to
> the Arrow data? And if passing raw pointers is acceptable, what is
> wrong with the spec as proposed?
>
>
> As for creating glue code: yes, of course, that would be needed in most
> languages that want to provide this interface (including C++). You do
> need a C FFI for that. I'm quite sure it would be possible to implement
> this proposal in pure Python with ctypes / cffi, for example (as a toy
> example, since PyArrow exists :-)). When writing the spec, I also took
> a look at the Go and Rust FFIs, and they seem good enough to interact
> with it. I tried to take a look at JNI, but of course I got lost in the
> documentation :-)
>
> If you are worried that people start thinking that this proposal is part
> of the Arrow specification, perhaps we can make it clear that exposing
> this interface is optional for implementations.
>
> Regards
>
> Antoine.
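
To make the ctypes remark above concrete, here is a toy sketch of handing a
raw pointer across a C-style boundary without copying. The struct layout
(ExampleArray with a length and a data pointer) and the helper names
export_int32 / import_int32 are hypothetical, chosen only for illustration;
they are not the struct or functions from the proposal under discussion.

```python
import ctypes


class ExampleArray(ctypes.Structure):
    # Hypothetical layout for illustration only; NOT the proposed spec.
    _fields_ = [
        ("length", ctypes.c_int64),  # number of int32 values
        ("data", ctypes.c_void_p),   # raw address of the first value
    ]


def export_int32(values):
    """Producer side: allocate a buffer and describe it with a struct."""
    buf = (ctypes.c_int32 * len(values))(*values)
    arr = ExampleArray(len(values), ctypes.addressof(buf))
    # The caller must keep `buf` alive for as long as `arr` is in use.
    return arr, buf


def import_int32(struct_address):
    """Consumer side: receives only an integer address, as across an FFI."""
    arr = ExampleArray.from_address(struct_address)
    # Build a view over the producer's memory; no bytes are copied.
    return (ctypes.c_int32 * arr.length).from_address(arr.data)


arr, owner = export_int32([1, 2, 3])
view = import_int32(ctypes.addressof(arr))

# A write through the producer's buffer is visible through the consumer's
# view, which shows that both refer to the same memory.
owner[0] = 42
print(list(view))
```

The only thing that crosses the boundary here is one integer address, so
there is no flatten/unflatten step. The cost is that lifetime management is
left entirely to the caller, which a real interface would have to specify.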