This is very nice; I look forward to trying it out. Hardware with better
interconnects should show performance improvements, so TCP-only results
are not representative of all cases.
On 12/28/21 11:41 PM, David Li wrote:
Thanks for the feedback!
Collective operations: unfortunately, these look very different from the model
Flight provides (which is just RPC). If there's interest, we could consider
implementing or exposing them in the future, or looking at making sure Arrow's
IPC APIs play well with the MPI APIs.
Non-blocking operations: I was thinking about this. In Flight, this would
probably mean async APIs, which have been discussed before (either here or on
JIRA). The UCX APIs lend themselves naturally to implementing an async API and
I would like to explore this further.
Serialization: what you probably want is GetRecordBatchPayload[1] and related
functions, which is also what Flight uses as part of its zero-copy
optimizations. This function will allocate a buffer for the IPC metadata and
return that along with a list of buffers to be written; you can then
individually send the buffers. You do have to remember to add padding yourself.
(It's what this proof-of-concept does: [2])
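
To make the padding step concrete, here is a minimal sketch of the arithmetic
involved. It assumes the 8-byte alignment that the Arrow IPC format uses for
body buffers. The names PaddingFor and PaddedBodyLength are hypothetical
helpers for illustration only, not part of the Arrow API, and the sketch does
not call GetRecordBatchPayload itself.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Assumption: body buffers are padded to 8 bytes, as in the Arrow IPC format.
constexpr int64_t kAlignment = 8;

// Hypothetical helper: number of zero bytes to append after a buffer of
// `size` bytes so that the next buffer starts on an aligned offset.
inline int64_t PaddingFor(int64_t size, int64_t alignment = kAlignment) {
  // The bit trick below only works for power-of-two alignments.
  assert(alignment > 0 && (alignment & (alignment - 1)) == 0);
  return (alignment - (size & (alignment - 1))) & (alignment - 1);
}

// Hypothetical helper: total bytes sent for a list of buffer sizes when each
// buffer is sent individually and followed by its padding.
inline int64_t PaddedBodyLength(const std::vector<int64_t>& sizes) {
  int64_t total = 0;
  for (int64_t size : sizes) {
    total += size + PaddingFor(size);
  }
  return total;
}
```

With these helpers, a sender would write each buffer and then PaddingFor(size)
zero bytes before moving on to the next buffer; a 5-byte buffer, for instance,
would be followed by 3 bytes of padding.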
@Antoine: sorry, I was a little loose with things there. I don't see an HTTP/2
implementation for UCX, unfortunately.
[1]: https://github.com/apache/arrow/blob/06b10133e486ff736e657f79ffad7b029150cfcd/cpp/src/arrow/ipc/writer.h#L389
[2]: https://github.com/lidavidm/arrow/blob/cf804e3505b6dab996d03f8fab658aea02504090/cpp/src/arrow/flight/transport/ucx/ucx_internal.cc#L341
On Tue, Dec 28, 2021, at 15:35, Antoine Pitrou wrote:
On 28/12/2021 at 20:09, David Li wrote:
Antoine/Micah raised the possibility of extending gRPC instead. That would
be preferable, frankly, since otherwise we might have to re-implement a
lot of what gRPC and HTTP2 provide ourselves. However, the necessary
proposal stalled and was dropped without much discussion:
https://groups.google.com/g/grpc-io/c/oIbBfPVO0lY
I'm not sure whether I proposed extending gRPC :-) Is there an HTTP2
implementation above UCX? If so, we could devise a Flight
implementation over REST/HTTP2, which might also make the TCP backend
faster than with gRPC.
Regards
Antoine.