>
> I would support doing the work necessary to get UCX (or really any other
> transport) supported, even if it is a lot of work. (I'm hoping this clears
> the path to supporting a Flight-to-browser transport as well; a few
> projects seem to have rolled their own approaches but I think Flight itself
> should really handle this, too.)


Another possible technical approach is to investigate whether a custom gRPC
"channel" implementation could be written for new transports. Searching
around, it seems there were some defunct PRs trying to enable UCX as one; I
didn't look closely enough to see why they might have failed.

On Thu, Sep 9, 2021 at 11:07 AM David Li <lidav...@apache.org> wrote:

> I would support doing the work necessary to get UCX (or really any other
> transport) supported, even if it is a lot of work. (I'm hoping this clears
> the path to supporting a Flight-to-browser transport as well; a few
> projects seem to have rolled their own approaches but I think Flight itself
> should really handle this, too.)
>
> From what I understand, you could tunnel gRPC over UCX as Keith mentions,
> or directly use UCX, which is what it sounds like you are thinking about.
> One idea we had previously was to stick to gRPC for 'control plane'
> methods, and support alternate protocols only for 'data plane' methods like
> DoGet - this might be more manageable, depending on what you have in mind.
>
> In general - there's quite a bit of work here, so it would help to
> separate the work into phases, and share some more detailed
> design/implementation plans, to make review more manageable. (I realize of
> course this is just a general interest check right now.) Just splitting
> gRPC/Flight is going to take a decent amount of work, and (from what little
> I understand) using UCX means choosing from various communication methods
> it offers and writing a decent amount of scaffolding code, so it would be
> good to establish what exactly a 'UCX' transport means. (For instance,
> presumably there's no need to stick to the Protobuf-based wire format, but
> what format would we use?)
>
> It would also be good to expand the benchmarks, to validate the
> performance we get from UCX and have a way to compare it against gRPC.
> Anecdotally I've found gRPC isn't quite able to saturate a connection so it
> would be interesting to see what other transports can do.
>
> Jed - how would you see MPI and Flight interacting? As another
> transport/alternative to UCX? I admit I'm not familiar with the HPC space.
>
> About transferring commands with data: Flight already has an app_metadata
> field in various places to allow things like this, it may be interesting to
> combine with the ComputeIR proposal on this mailing list, and hopefully you
> & your colleagues can take a look there as well.
>
> -David
>
> On Thu, Sep 9, 2021, at 11:24, Jed Brown wrote:
> > Yibo Cai <yibo....@arm.com> writes:
> >
> > > HPC infrastructure normally leverages RDMA for fast data transfer among
> > > storage nodes and compute nodes. Computation tasks are dispatched to
> > > compute nodes with best fit resources.
> > >
> > > Concretely, we are investigating porting UCX as Flight transport layer.
> > > UCX is a communication framework for modern networks. [1]
> > > Besides HPC usage, many projects (spark, dask, blazingsql, etc) also
> > > adopt UCX to accelerate network transmission. [2][3]
> >
> > I'm interested in this topic and think it's important that even if the
> > focus is direct to UCX, that there be some thought into MPI
> > interoperability and support for scalable collectives. MPI considers UCX to
> > be an implementation detail, but the two main implementations (MPICH and
> > Open MPI) support it and vendor implementations are all derived from these
> > two.
> >
>
