Hello,
One thing that was discussed in the sync call is the ability to easily pass arrays at runtime between Arrow implementations or Arrow-supporting libraries in the same process, without bearing the cost of linking to e.g. the C++ Arrow library. (for example: "Duckdb wants to provide an option to return Arrow data of result sets, but they don't like having Arrow as a dependency") One possibility would be to define a C-level protocol similar in spirit to the Python buffer protocol, which some people may be familiar with (*). The basic idea is to define a simple C struct, which is ABI-stable and describes an Arrow away adequately. The struct can be stack-allocated. Its definition can also be copied in another project (or interfaced with using a C FFI layer, depending on the language). There is no formal proposal, this message is meant to stir the discussion. Issues to work out: * Memory lifetime issues: where Python simply associates the Py_buffer with a PyObject owner (a garbage-collected Python object), we need another means to control lifetime of pointed areas. One simple possibility is to include a destructor function pointer in the protocol struct. * Arrow type representation. We probably need some kind of "format" mini-language to represent Arrow types, so that a type can be described using a `const char*`. Ideally, primitives types at least should be trivially parsable. We may take inspiration from Python here (`struct` module format characters, PEP 3118 format additions). Example C struct definition (not a formal proposal!): struct ArrowBuffer { void* data; int64_t nbytes; // Called by the consumer when it doesn't need the buffer anymore void (*release)(struct ArrowBuffer*); // Opaque user data (for e.g. the release callback) void* user_data; }; struct ArrowArray { // Type description const char* format; // Data description int64_t length; int64_t null_count; int64_t n_buffers; // Note: this pointers are probably owned by the ArrowArray struct // and will be released and free()ed by the release callback. struct BufferDescriptor* buffers; struct ArrowDescriptor* dictionary; // Called by the consumer when it doesn't need the array anymore void (*release)(struct ArrowArrayDescriptor*); // Opaque user data (for e.g. the release callback) void* user_data; }; Thoughts? (*) For the record, the reference for the Python buffer protocol: https://docs.python.org/3/c-api/buffer.html#buffer-structure and its C struct definition: https://github.com/python/cpython/blob/v3.7.4/Include/object.h#L181-L195 Regards Antoine.