With 3 binding and 1 non-binding +1 votes, the proposal passes. I will clean up the PR and move it out of draft.

On Fri, Mar 1, 2024, at 16:40, Sutou Kouhei wrote:
> +1
>
> In <421fbc7b-f441-4b0a-8626-a8d2dfff0...@app.fastmail.com>
> "[VOTE] Flight RPC: add 'fallback' URI scheme" on Tue, 27 Feb 2024
> 09:01:36 -0500,
> "David Li" <lidav...@apache.org> wrote:
>
>> I would like to propose a 'reuse connection' URI scheme for Flight RPC. This
>> proposal was previously discussed at [1]. A candidate implementation for
>> C++, Java, and Go is at [2].
>>
>> The vote will be open for at least 72 hours.
>>
>> [ ] +1
>> [ ] +0
>> [ ] -1 Do not accept this proposal because...
>>
>> [1]: https://lists.apache.org/thread/pc9fs0hf8t5ylj9os00r9vg8d2xv2npz
>> [2]: https://github.com/apache/arrow/pull/40084
>>
>> On Tue, Feb 20, 2024, at 14:14, David Li wrote:
>>> Thanks for the comments - I've updated the implementation [1] and added
>>> Go + integration tests. If this all checks out I'd like to start a vote
>>> soon.
>>>
>>> [1]: https://github.com/apache/arrow/pull/40084
>>>
>>> On Fri, Feb 16, 2024, at 13:43, Andrew Lamb wrote:
>>>> Thank you -- I think the use case is great, but agree with the other
>>>> reviewers that the name may be confusing. I left some notes on the ticket
>>>>
>>>> Andrew
>>>>
>>>> On Wed, Feb 14, 2024 at 3:52 PM David Li <lidav...@apache.org> wrote:
>>>>
>>>>> I've put up a candidate implementation sans integration test [1].
>>>>>
>>>>> Some caveats:
>>>>> - java.net.URI doesn't accept 'scheme://', only 'scheme:/' or 'scheme://?'
>>>>>   (yes, an empty query string pacifies it). I've chosen the latter since the
>>>>>   former is technically a URI with a non-empty path but neither are ideal.
>>>>> - I've changed the scheme to 'arrow-flight-reuse-connection' to be more
>>>>>   faithful to the intended use than 'fallback'.
>>>>>
>>>>> [1]: https://github.com/apache/arrow/pull/40084
>>>>>
>>>>> On Tue, Feb 13, 2024, at 13:01, Jean-Baptiste Onofré wrote:
>>>>> > Hi David,
>>>>> >
>>>>> > It's reasonable. I think we can start with your initial proposal (it
>>>>> > sounds fine to me) and we can always improve step by step.
>>>>> >
>>>>> > Thanks !
>>>>> > Regards
>>>>> > JB
>>>>> >
>>>>> > On Tue, Feb 13, 2024 at 4:53 PM David Li <lidav...@apache.org> wrote:
>>>>> >>
>>>>> >> I'm going to keep the proposal as-is then. It can be extended if this
>>>>> >> use case comes up.
>>>>> >>
>>>>> >> I'll start work on candidate implementations now.
>>>>> >>
>>>>> >> On Tue, Feb 13, 2024, at 03:22, Antoine Pitrou wrote:
>>>>> >> > I think the original proposal is sufficient.
>>>>> >> >
>>>>> >> > Also, it is not obvious to me how one would switch from e.g. grpc+tls
>>>>> >> > to http without an explicit server location (unless both Flight servers
>>>>> >> > are hosted under the same port?). So the "+" proposal seems a bit weird.
>>>>> >> >
>>>>> >> >
>>>>> >> > On 12/02/2024 at 23:39, David Li wrote:
>>>>> >> >> The idea is that the client would reuse the existing connection, in
>>>>> >> >> which case the protocol and such are implicit. (If the client doesn't have
>>>>> >> >> a connection anymore, it can't use the fallback anyways.)
>>>>> >> >>
>>>>> >> >> I suppose this has the advantage that you could "fall back" to a
>>>>> >> >> known hostname with a different protocol, but I'm not sure that always
>>>>> >> >> applies anyways. (Correct me if I'm wrong Matt, but as I recall, UCX
>>>>> >> >> addresses aren't hostnames but rather opaque byte blobs, for instance.)
>>>>> >> >>
>>>>> >> >> If we do prefer this, to avoid overloading the hostname, there's
>>>>> >> >> also the informal convention of using + in the scheme, so it could be
>>>>> >> >> arrow-flight-fallback+grpc+tls://, arrow-flight-fallback+http://, etc.
>>>>> >> >>
>>>>> >> >> On Mon, Feb 12, 2024, at 17:03, Joel Lubinitsky wrote:
>>>>> >> >>> Thanks for clarifying.
>>>>> >> >>>
>>>>> >> >>> Given the relationship between these two proposals, would it also be
>>>>> >> >>> necessary to distinguish the scheme (or schemes) supported by the
>>>>> >> >>> originating Flight RPC service?
>>>>> >> >>>
>>>>> >> >>> If that is the case, it may be preferred to use the "host" portion of the
>>>>> >> >>> URI rather than the "scheme" to denote the location of the data. In this
>>>>> >> >>> scenario, the host "0.0.0.0" could be used. This IP address is defined in
>>>>> >> >>> IETF RFC1122 [1] as "This host on this network", which seems most
>>>>> >> >>> consistent with the intended use-case. There are some caveats to this usage
>>>>> >> >>> but in my experience it's not uncommon for protocols to extend the
>>>>> >> >>> definition of this address in their own usage.
>>>>> >> >>>
>>>>> >> >>> A benefit of this convention is that the scheme remains available in the
>>>>> >> >>> URI to specify the transport available. For example, the following list of
>>>>> >> >>> locations may be included in the response:
>>>>> >> >>>
>>>>> >> >>> ["grpc://0.0.0.0", "ucx://0.0.0.0", "grpc://1.2.3.4", <other_locations>...]
>>>>> >> >>>
>>>>> >> >>> This would indicate that grpc and ucx transport is available from the
>>>>> >> >>> current service, grpc is available at 1.2.3.4, and possibly more
>>>>> >> >>> combinations of scheme/host.
>>>>> >> >>>
>>>>> >> >>> [1] https://datatracker.ietf.org/doc/html/rfc1122#section-3.2.1.3
>>>>> >> >>>
>>>>> >> >>> On Mon, Feb 12, 2024 at 2:53 PM David Li <lidav...@apache.org> wrote:
>>>>> >> >>>
>>>>> >> >>>> Ah, while I was thinking of it as useful for a fallback, I'm not
>>>>> >> >>>> specifying it that way. Better ideas for names would be appreciated.
>>>>> >> >>>>
>>>>> >> >>>> The actual precedence has never been specified. All endpoints are
>>>>> >> >>>> equivalent, so clients may use what is "best". For instance, with Matt
>>>>> >> >>>> Topol's concurrent proposal, a GPU-enabled client may preferentially try
>>>>> >> >>>> UCX endpoints while other clients may choose to ignore them entirely (e.g.
>>>>> >> >>>> because they don't have UCX installed).
>>>>> >> >>>>
>>>>> >> >>>> In practice the ADBC/JDBC drivers just scan the list left to right and try
>>>>> >> >>>> each endpoint in turn for lack of a better heuristic.
>>>>> >> >>>>
>>>>> >> >>>> On Mon, Feb 12, 2024, at 14:28, Joel Lubinitsky wrote:
>>>>> >> >>>>> Thanks for proposing this David.
>>>>> >> >>>>>
>>>>> >> >>>>> I think the ability to include the Flight RPC service itself in the list of
>>>>> >> >>>>> endpoints from which data can be fetched is a helpful addition.
>>>>> >> >>>>>
>>>>> >> >>>>> The current choice of name for the URI (arrow-flight-fallback://) seems to
>>>>> >> >>>>> imply that there is an order of precedence that should be considered in the
>>>>> >> >>>>> list of URIs. Specifically, as a developer receiving the list of locations
>>>>> >> >>>>> I might assume that I should try fetching from other locations first. If
>>>>> >> >>>>> those do not succeed, I may try the original service as a fallback.
>>>>> >> >>>>>
>>>>> >> >>>>> Are these the intended semantics? If so, is there a way to include the
>>>>> >> >>>>> original service in the list of locations without the implied precedence?
>>>>> >> >>>>>
>>>>> >> >>>>> Thanks,
>>>>> >> >>>>> Joel
>>>>> >> >>>>>
>>>>> >> >>>>> On Mon, Feb 12, 2024 at 11:52 James Duong <james.du...@improving.com.invalid> wrote:
>>>>> >> >>>>>
>>>>> >> >>>>>> This seems like a good idea, and also improves consistency with clients
>>>>> >> >>>>>> that erroneously assumed that the service endpoint was always in the list
>>>>> >> >>>>>> of endpoints.
>>>>> >> >>>>>>
>>>>> >> >>>>>> From: Antoine Pitrou <anto...@python.org>
>>>>> >> >>>>>> Date: Monday, February 12, 2024 at 6:05 AM
>>>>> >> >>>>>> To: dev@arrow.apache.org <dev@arrow.apache.org>
>>>>> >> >>>>>> Subject: Re: [DISCUSS] Flight RPC: add 'fallback' URI scheme
>>>>> >> >>>>>>
>>>>> >> >>>>>> Hello,
>>>>> >> >>>>>>
>>>>> >> >>>>>> This looks fine to me.
>>>>> >> >>>>>>
>>>>> >> >>>>>> Regards
>>>>> >> >>>>>>
>>>>> >> >>>>>> Antoine.
>>>>> >> >>>>>>
>>>>> >> >>>>>>
>>>>> >> >>>>>> On 12/02/2024 at 14:46, David Li wrote:
>>>>> >> >>>>>>> Hello,
>>>>> >> >>>>>>>
>>>>> >> >>>>>>> I'd like to propose a slight update to Flight RPC to make Flight SQL
>>>>> >> >>>>>>> work better in different deployment scenarios. Comments on the doc
>>>>> >> >>>>>>> would be appreciated:
>>>>> >> >>>>>>>
>>>>> >> >>>>>>> https://docs.google.com/document/d/1g9M9FmsZhkewlT1mLibuceQO8ugI0-fqumVAXKFjVGg/edit?usp=sharing
>>>>> >> >>>>>>>
>>>>> >> >>>>>>> The gist is that FlightEndpoint allows specifying either (1) a list of
>>>>> >> >>>>>>> concrete URIs to fetch data from or (2) no URIs, meaning to fetch from the
>>>>> >> >>>>>>> Flight RPC service itself; but it would be useful to combine both behaviors
>>>>> >> >>>>>>> (try these concrete URIs and fall back to the Flight RPC service itself)
>>>>> >> >>>>>>> without requiring the service to know its own public address.
>>>>> >> >>>>>>>
>>>>> >> >>>>>>> Best,
>>>>> >> >>>>>>> David
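
For readers skimming the thread, here is a rough sketch of how a client might treat the 'arrow-flight-reuse-connection' scheme while scanning an endpoint's locations left to right, as the ADBC/JDBC drivers are described as doing. It is not taken from the candidate implementation in the PR. The function name and the `connect` and `fetch` callables are hypothetical placeholders, not Flight APIs, and catching OSError as the "unreachable" signal is an assumption made for illustration.

```python
from urllib.parse import urlsplit

# Scheme name taken from the thread; everything else here is hypothetical.
REUSE_SCHEME = "arrow-flight-reuse-connection"


def fetch_from_first_available(locations, current_conn, connect, fetch):
    """Try each location left to right and return the first successful result.

    locations    -- location URIs from one endpoint (may be empty)
    current_conn -- the connection that returned the endpoint
    connect      -- hypothetical callable: URI string -> new connection
    fetch        -- hypothetical callable: connection -> data
    """
    if not locations:
        # No URIs means "fetch from the Flight RPC service itself".
        return fetch(current_conn)

    last_error = None
    for uri in locations:
        try:
            if urlsplit(uri).scheme == REUSE_SCHEME:
                # Reuse the existing connection; protocol and host are implicit.
                conn = current_conn
            else:
                conn = connect(uri)
            return fetch(conn)
        except OSError as exc:
            # Assumption: an unreachable location surfaces as OSError.
            last_error = exc
    raise ConnectionError("no location could be reached") from last_error
```

Since the thread notes that precedence among locations was never specified, the left-to-right order here is only one possible heuristic; a client could just as well sort the list by preferred transport first.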