I'm looking for any advice folks may have on a generic way to document and represent expected Arrow schemas as part of an interface definition.
For context, our library provides a cross-language (Python, C++, Rust) SDK for logging semantic multi-modal data (point clouds, images, geometric transforms, bounding boxes, etc.). Each of these primitive types has an associated Arrow schema, but to date we have largely hidden that from our users behind language-native object types and a bunch of generated code that "serializes" the data into Arrow buffers before transmitting it over our IPC.

We're trying to make it easier for advanced users to write and read data from the store directly using Arrow, without needing to go in and out of an intermediate object-oriented representation. However, doing this means documenting things for users such as: "This is the Arrow schema to use when sending a point cloud with a color channel".

I would love it if, eventually, the Arrow project had a way of defining a spec file similar to a .proto or a .fbs, with all libraries able to load a schema object by parsing the spec directly. Has anyone taken steps in this direction?

The best alternative I have at the moment is to define the schema redundantly for each of the three languages, implicitly, by providing the code that constructs a datatype instance with the correct schema. But this feels unfortunately messy and hard to maintain.

Thanks,
Jeremy