Another possible technical approach would be to investigate whether a custom gRPC "channel" implementation could be written for new transports. Searching around, it seems there were some defunct PRs trying to enable UCX as one; I didn't look closely enough to see why they might have failed.

On Thu, Sep 9, 2021 at 11:07 AM David Li <lidav...@apache.org> wrote:

> I would support doing the work necessary to get UCX (or really any other
> transport) supported, even if it is a lot of work. (I'm hoping this clears
> the path to supporting a Flight-to-browser transport as well; a few
> projects seem to have rolled their own approaches but I think Flight itself
> should really handle this, too.)
>
> From what I understand, you could tunnel gRPC over UCX as Keith mentions,
> or directly use UCX, which is what it sounds like you are thinking about.
> One idea we had previously was to stick to gRPC for 'control plane'
> methods, and support alternate protocols only for 'data plane' methods like
> DoGet - this might be more manageable, depending on what you have in mind.
>
> In general - there's quite a bit of work here, so it would help to
> separate the work into phases, and share some more detailed
> design/implementation plans, to make review more manageable. (I realize of
> course this is just a general interest check right now.) Just splitting
> gRPC/Flight is going to take a decent amount of work, and (from what little
> I understand) using UCX means choosing from various communication methods
> it offers and writing a decent amount of scaffolding code, so it would be
> good to establish what exactly a 'UCX' transport means. (For instance,
> presumably there's no need to stick to the Protobuf-based wire format, but
> what format would we use?)
>
> It would also be good to expand the benchmarks, to validate the
> performance we get from UCX and have a way to compare it against gRPC.
> Anecdotally I've found gRPC isn't quite able to saturate a connection so it
> would be interesting to see what other transports can do.
>
> Jed - how would you see MPI and Flight interacting? As another
> transport/alternative to UCX? I admit I'm not familiar with the HPC space.
>
> About transferring commands with data: Flight already has an app_metadata
> field in various places to allow things like this, it may be interesting to
> combine with the ComputeIR proposal on this mailing list, and hopefully you
> & your colleagues can take a look there as well.
>
> -David
>
> On Thu, Sep 9, 2021, at 11:24, Jed Brown wrote:
> > Yibo Cai <yibo....@arm.com> writes:
> >
> > > HPC infrastructure normally leverages RDMA for fast data transfer among
> > > storage nodes and compute nodes. Computation tasks are dispatched to
> > > compute nodes with best fit resources.
> > >
> > > Concretely, we are investigating porting UCX as Flight transport layer.
> > > UCX is a communication framework for modern networks. [1]
> > > Besides HPC usage, many projects (spark, dask, blazingsql, etc) also
> > > adopt UCX to accelerate network transmission. [2][3]
> >
> > I'm interested in this topic and think it's important that even if the
> > focus is direct to UCX, that there be some thought into MPI
> > interoperability and support for scalable collectives. MPI considers UCX to
> > be an implementation detail, but the two main implementations (MPICH and
> > Open MPI) support it and vendor implementations are all derived from these
> > two.
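To make the "gRPC for control plane, alternate protocol for data plane" idea above a bit more concrete, here is a minimal sketch of how a client might dispatch DoGet to a transport chosen by the location's URI scheme. This is purely illustrative: the class and registry names (`DataPlaneTransport`, `TRANSPORTS`, etc.) are hypothetical and not part of Arrow Flight; the only grounded detail is that Flight locations already carry a URI scheme (e.g. `grpc+tcp://`), which could naturally name a transport.

```python
from urllib.parse import urlparse

class DataPlaneTransport:
    """Interface a data-plane transport (gRPC, UCX, ...) might implement.
    Hypothetical sketch, not an Arrow Flight API."""
    def do_get(self, ticket: bytes) -> bytes:
        raise NotImplementedError

class GrpcDataPlane(DataPlaneTransport):
    def do_get(self, ticket: bytes) -> bytes:
        # Stand-in for streaming record batches over a gRPC DoGet call.
        return b"grpc-stream:" + ticket

class UcxDataPlane(DataPlaneTransport):
    def do_get(self, ticket: bytes) -> bytes:
        # Stand-in for an RDMA-capable UCX transfer of the same data.
        return b"ucx-stream:" + ticket

# Registry mapping location URI schemes to data-plane transports.
# Control-plane methods (GetFlightInfo, ListFlights, ...) would stay on gRPC.
TRANSPORTS = {
    "grpc+tcp": GrpcDataPlane(),
    "ucx": UcxDataPlane(),
}

def do_get(location: str, ticket: bytes) -> bytes:
    """Dispatch a DoGet to whichever transport the location's scheme names."""
    scheme = urlparse(location).scheme
    return TRANSPORTS[scheme].do_get(ticket)
```

Under this sketch, a server could advertise a `ucx://` location in FlightInfo for clients that support it, while other clients fall back to a `grpc+tcp://` location for the same ticket; the control plane never changes.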