I think we should find a proper descriptive name for the "high-performance protocol", because "high-performance" is vague and context-dependent, and also spreads unnecessary confusion about existing alternatives such as regular Arrow IPC.
I would for example propose "Dissociated Arrow IPC" to stress the idea that metadata and data can be on separate transports.
Le 03/02/2024 à 00:22, Matt Topol a écrit :
Hey all, In my current work I've been experimenting and playing around with utilizing Arrow and non-cpu memory data. While the creation of the ArrowDeviceArray struct and the enhancements to the Arrow library Device abstractions were necessary, there is also a need to extend the communications specs we utilize, i.e. Flight. Currently there is no real way to utilize Arrow Flight with shared memory or with non-CPU memory (without an expensive Device -> Host copy first). To this end I've done a bunch of research and toying around and came up with a protocol to propose and a reference implementation using UCX[1]. Attached to the proposal is also a couple extensions for Flight itself to make it easier for users to still use Flight for metadata / dataset information and then point consumers elsewhere to actually retrieve the data. The idea here is that this would be a new specification for how to transport Arrow data across these high-performance transports such as UCX / libfabric / shared memory / etc. We wouldn't necessarily expose / directly add implementations of the spec to the Arrow libraries, just provide reference/example implementations. I've written the proposal up on a google doc[2] that everyone should be able to comment on. Once we get some community discussion on there, if everyone is okay with it I'd like eventually do a vote on adopting this spec and if we do, I'll then make a PR to start adding it to the Arrow documentation, etc. Anyways, thank you everyone in advance for your feedback and comments! --Matt [1]: https://github.com/openucx/ucx/ [2]: https://docs.google.com/document/d/1zHbnyK1r6KHpMOtEdIg1EZKNzHx-MVgUMOzB87GuXyk/edit?usp=sharing