I think the original proposal is sufficient.

Also, it is not obvious to me how one would switch from e.g. grpc+tls to http without an explicit server location (unless both Flight servers are hosted on the same port?). So the "+" proposal seems a bit weird.


On 12 Feb 2024 at 23:39, David Li wrote:
The idea is that the client would reuse the existing connection, in which case 
the protocol and such are implicit. (If the client doesn't have a connection 
anymore, it can't use the fallback anyways.)

I suppose this has the advantage that you could "fall back" to a known hostname 
with a different protocol, but I'm not sure that always applies anyways. (Correct me if 
I'm wrong, Matt, but as I recall, UCX addresses aren't hostnames but rather opaque byte 
blobs, for instance.)

If we do prefer this, to avoid overloading the hostname, there's also the 
informal convention of using + in the scheme, so it could be 
arrow-flight-fallback+grpc+tls://, arrow-flight-fallback+http://, etc.
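
As a rough illustration of the "+" convention mentioned above, a client could split the scheme on the first "+" to separate the fallback marker from the transport. This is only a sketch: the helper name split_fallback_scheme is hypothetical, and the scheme name itself is still under discussion in this thread.

```python
from urllib.parse import urlsplit

# Marker name taken from this thread; it is a proposal, not a settled spec.
FALLBACK_PREFIX = "arrow-flight-fallback"


def split_fallback_scheme(uri):
    """Return (is_fallback, transport) for a location URI.

    transport is the part of the scheme after the fallback marker,
    or None when the marker appears without a transport suffix.
    """
    scheme = urlsplit(uri).scheme
    head, _, rest = scheme.partition("+")
    if head == FALLBACK_PREFIX:
        return True, rest or None
    # Not a fallback URI: the whole scheme names the transport.
    return False, scheme


print(split_fallback_scheme("arrow-flight-fallback+grpc+tls://"))
print(split_fallback_scheme("grpc://1.2.3.4"))
```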

On Mon, Feb 12, 2024, at 17:03, Joel Lubinitsky wrote:
Thanks for clarifying.

Given the relationship between these two proposals, would it also be
necessary to distinguish the scheme (or schemes) supported by the
originating Flight RPC service?

If that is the case, it may be preferred to use the "host" portion of the
URI rather than the "scheme" to denote the location of the data. In this
scenario, the host "0.0.0.0" could be used. This IP address is defined in
IETF RFC1122 [1] as "This host on this network", which seems most
consistent with the intended use-case. There are some caveats to this usage
but in my experience it's not uncommon for protocols to extend the
definition of this address in their own usage.

A benefit of this convention is that the scheme remains available in the
URI to specify the transport available. For example, the following list of
locations may be included in the response:

["grpc://0.0.0.0", "ucx://0.0.0.0", "grpc://1.2.3.4", <other_locations>...]

This would indicate that the grpc and ucx transports are available from the
current service, that grpc is available at 1.2.3.4, and possibly more
combinations of scheme/host.
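
To make the host-based convention concrete, here is a minimal sketch of how a client might separate "this service" locations from remote ones. It assumes the "0.0.0.0" convention suggested above is adopted, which is not the case in any published spec; group_locations is a hypothetical helper name.

```python
from urllib.parse import urlsplit

# Assumption: "0.0.0.0" marks the originating Flight RPC service,
# as suggested in this message. This is not part of any spec.
SAME_SERVICE_HOST = "0.0.0.0"


def group_locations(locations):
    """Split location URIs into (same_service_schemes, remote_pairs)."""
    same_service, remote = [], []
    for uri in locations:
        parts = urlsplit(uri)
        if parts.hostname == SAME_SERVICE_HOST:
            # Only the transport matters; the host is the current service.
            same_service.append(parts.scheme)
        else:
            remote.append((parts.scheme, parts.hostname))
    return same_service, remote


local, remote = group_locations(
    ["grpc://0.0.0.0", "ucx://0.0.0.0", "grpc://1.2.3.4"]
)
print(local)   # transports offered by the current service
print(remote)  # (transport, host) pairs for other services
```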

[1] https://datatracker.ietf.org/doc/html/rfc1122#section-3.2.1.3

On Mon, Feb 12, 2024 at 2:53 PM David Li <lidav...@apache.org> wrote:

Ah, while I was thinking of it as useful for a fallback, I'm not
specifying it that way.  Better ideas for names would be appreciated.

The actual precedence has never been specified. All endpoints are
equivalent, so clients may use what is "best". For instance, with Matt
Topol's concurrent proposal, a GPU-enabled client may preferentially try
UCX endpoints while other clients may choose to ignore them entirely (e.g.
because they don't have UCX installed).

In practice the ADBC/JDBC drivers just scan the list left to right and try
each endpoint in turn for lack of a better heuristic.
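
For illustration only, the left-to-right scan described above might look roughly like the sketch below. It is not the actual ADBC/JDBC driver code; connect is a hypothetical callable supplied by the caller that either returns a stream for a URI or raises.

```python
def fetch_first_available(locations, connect):
    """Try each location in list order and return the first success.

    connect is a hypothetical callable: connect(uri) returns a result
    or raises an exception if the endpoint cannot be used.
    """
    errors = []
    for uri in locations:
        try:
            return connect(uri)
        except Exception as exc:  # e.g. unsupported transport, unreachable host
            errors.append((uri, exc))
    raise ConnectionError(f"no usable endpoint among {len(locations)}: {errors}")
```

A client with a better heuristic (for example, preferring UCX when it is installed) could reorder or filter the list before calling such a function.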

On Mon, Feb 12, 2024, at 14:28, Joel Lubinitsky wrote:
Thanks for proposing this David.

I think the ability to include the Flight RPC service itself in the list of
endpoints from which data can be fetched is a helpful addition.

The current choice of name for the URI (arrow-flight-fallback://) seems to
imply that there is an order of precedence that should be considered in the
list of URIs. Specifically, as a developer receiving the list of locations,
I might assume that I should try fetching from other locations first. If
those do not succeed, I may try the original service as a fallback.

Are these the intended semantics? If so, is there a way to include the
original service in the list of locations without the implied precedence?

Thanks,
Joel

On Mon, Feb 12, 2024 at 11:52 James Duong <james.du...@improving.com.invalid> wrote:

This seems like a good idea, and also improves consistency with clients
that erroneously assumed that the service endpoint was always in the list
of endpoints.

From: Antoine Pitrou <anto...@python.org>
Date: Monday, February 12, 2024 at 6:05 AM
To: dev@arrow.apache.org <dev@arrow.apache.org>
Subject: Re: [DISCUSS] Flight RPC: add 'fallback' URI scheme

Hello,

This looks fine to me.

Regards

Antoine.


On 12 Feb 2024 at 14:46, David Li wrote:
Hello,

I'd like to propose a slight update to Flight RPC to make Flight SQL
work better in different deployment scenarios.  Comments on the doc would
be appreciated:



https://docs.google.com/document/d/1g9M9FmsZhkewlT1mLibuceQO8ugI0-fqumVAXKFjVGg/edit?usp=sharing

The gist is that FlightEndpoint allows specifying either (1) a list of
concrete URIs to fetch data from or (2) no URIs, meaning to fetch from the
Flight RPC service itself; but it would be useful to combine both behaviors
(try these concrete URIs and fall back to the Flight RPC service itself)
without requiring the service to know its own public address.
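
A minimal sketch of the combined behavior, under stated assumptions: the URI name arrow-flight-fallback:// is the one proposed in this thread and may change, candidate_sources is a hypothetical helper, and current_connection stands in for whatever object the client already uses to talk to the Flight RPC service.

```python
# Proposed name from this thread; not final.
FALLBACK_URI = "arrow-flight-fallback://"


def candidate_sources(locations, current_connection):
    """Resolve an endpoint's locations into sources the client can try."""
    if not locations:
        # Case (2): no URIs means fetch from the originating service.
        return [current_connection]
    # Case (1), plus the proposal: the special URI maps back to the
    # connection the client already has, so the service never needs
    # to know its own public address.
    return [
        current_connection if uri == FALLBACK_URI else uri
        for uri in locations
    ]
```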

Best,
David

