My vote: +1

On Wed, Feb 28, 2024, at 15:50, Joel Lubinitsky wrote:
> +1
>
> On Wed, Feb 28, 2024 at 3:22 PM Andrew Lamb <al...@influxdata.com> wrote:
>
>> +1
>>
>>
>> On Tue, Feb 27, 2024 at 9:06 AM David Li <lidav...@apache.org> wrote:
>>
>> > I would like to propose a 'reuse connection' URI scheme for Flight RPC.
>> > This proposal was previously discussed at [1]. A candidate implementation
>> > for C++, Java, and Go is at [2].
>> >
>> > The vote will be open for at least 72 hours.
>> >
>> > [ ] +1
>> > [ ] +0
>> > [ ] -1 Do not accept this proposal because...
>> >
>> > [1]: https://lists.apache.org/thread/pc9fs0hf8t5ylj9os00r9vg8d2xv2npz
>> > [2]: https://github.com/apache/arrow/pull/40084
>> >
>> > On Tue, Feb 20, 2024, at 14:14, David Li wrote:
>> > > Thanks for the comments - I've updated the implementation [1] and added
>> > > Go + integration tests. If this all checks out I'd like to start a vote
>> > > soon.
>> > >
>> > > [1]: https://github.com/apache/arrow/pull/40084
>> > >
>> > > On Fri, Feb 16, 2024, at 13:43, Andrew Lamb wrote:
>> > >> Thank you -- I think the use case is great, but agree with the other
>> > >> reviewers that the name may be confusing. I left some notes on the ticket.
>> > >>
>> > >> Andrew
>> > >>
>> > >> On Wed, Feb 14, 2024 at 3:52 PM David Li <lidav...@apache.org> wrote:
>> > >>
>> > >>> I've put up a candidate implementation sans integration test [1].
>> > >>>
>> > >>> Some caveats:
>> > >>> - java.net.URI doesn't accept 'scheme://', only 'scheme:/' or 'scheme://?'
>> > >>>   (yes, an empty query string pacifies it). I've chosen the latter since the
>> > >>>   former is technically a URI with a non-empty path but neither is ideal.
>> > >>> - I've changed the scheme to 'arrow-flight-reuse-connection' to be more
>> > >>>   faithful to the intended use than 'fallback'.
>> > >>>
>> > >>> [1]: https://github.com/apache/arrow/pull/40084
>> > >>>
>> > >>> On Tue, Feb 13, 2024, at 13:01, Jean-Baptiste Onofré wrote:
>> > >>> > Hi David,
>> > >>> >
>> > >>> > It's reasonable. I think we can start with your initial proposal (it
>> > >>> > sounds fine to me) and we can always improve step by step.
>> > >>> >
>> > >>> > Thanks!
>> > >>> > Regards
>> > >>> > JB
>> > >>> >
>> > >>> > On Tue, Feb 13, 2024 at 4:53 PM David Li <lidav...@apache.org> wrote:
>> > >>> >>
>> > >>> >> I'm going to keep the proposal as-is then. It can be extended if this
>> > >>> >> use case comes up.
>> > >>> >>
>> > >>> >> I'll start work on candidate implementations now.
>> > >>> >>
>> > >>> >> On Tue, Feb 13, 2024, at 03:22, Antoine Pitrou wrote:
>> > >>> >> > I think the original proposal is sufficient.
>> > >>> >> >
>> > >>> >> > Also, it is not obvious to me how one would switch from e.g. grpc+tls to
>> > >>> >> > http without an explicit server location (unless both Flight servers are
>> > >>> >> > hosted under the same port?). So the "+" proposal seems a bit weird.
>> > >>> >> >
>> > >>> >> >
>> > >>> >> > On 12/02/2024 at 23:39, David Li wrote:
>> > >>> >> >> The idea is that the client would reuse the existing connection, in
>> > >>> >> >> which case the protocol and such are implicit. (If the client doesn't
>> > >>> >> >> have a connection anymore, it can't use the fallback anyways.)
>> > >>> >> >>
>> > >>> >> >> I suppose this has the advantage that you could "fall back" to a
>> > >>> >> >> known hostname with a different protocol, but I'm not sure that always
>> > >>> >> >> applies anyways. (Correct me if I'm wrong Matt, but as I recall, UCX
>> > >>> >> >> addresses aren't hostnames but rather opaque byte blobs, for instance.)
>> > >>> >> >>
>> > >>> >> >> If we do prefer this, to avoid overloading the hostname, there's
>> > >>> >> >> also the informal convention of using + in the scheme, so it could be
>> > >>> >> >> arrow-flight-fallback+grpc+tls://, arrow-flight-fallback+http://, etc.
>> > >>> >> >>
>> > >>> >> >> On Mon, Feb 12, 2024, at 17:03, Joel Lubinitsky wrote:
>> > >>> >> >>> Thanks for clarifying.
>> > >>> >> >>>
>> > >>> >> >>> Given the relationship between these two proposals, would it also be
>> > >>> >> >>> necessary to distinguish the scheme (or schemes) supported by the
>> > >>> >> >>> originating Flight RPC service?
>> > >>> >> >>>
>> > >>> >> >>> If that is the case, it may be preferred to use the "host" portion of the
>> > >>> >> >>> URI rather than the "scheme" to denote the location of the data. In this
>> > >>> >> >>> scenario, the host "0.0.0.0" could be used. This IP address is defined in
>> > >>> >> >>> IETF RFC1122 [1] as "This host on this network", which seems most
>> > >>> >> >>> consistent with the intended use case. There are some caveats to this usage
>> > >>> >> >>> but in my experience it's not uncommon for protocols to extend the
>> > >>> >> >>> definition of this address in their own usage.
>> > >>> >> >>>
>> > >>> >> >>> A benefit of this convention is that the scheme remains available in the
>> > >>> >> >>> URI to specify the transport available. For example, the following list of
>> > >>> >> >>> locations may be included in the response:
>> > >>> >> >>>
>> > >>> >> >>> ["grpc://0.0.0.0", "ucx://0.0.0.0", "grpc://1.2.3.4", <other_locations>...]
>> > >>> >> >>>
>> > >>> >> >>> This would indicate that grpc and ucx transport is available from the
>> > >>> >> >>> current service, grpc is available at 1.2.3.4, and possibly more
>> > >>> >> >>> combinations of scheme/host.
>> > >>> >> >>>
>> > >>> >> >>> [1] https://datatracker.ietf.org/doc/html/rfc1122#section-3.2.1.3
>> > >>> >> >>>
>> > >>> >> >>> On Mon, Feb 12, 2024 at 2:53 PM David Li <lidav...@apache.org> wrote:
>> > >>> >> >>>
>> > >>> >> >>>> Ah, while I was thinking of it as useful for a fallback, I'm not
>> > >>> >> >>>> specifying it that way. Better ideas for names would be appreciated.
>> > >>> >> >>>>
>> > >>> >> >>>> The actual precedence has never been specified. All endpoints are
>> > >>> >> >>>> equivalent, so clients may use what is "best". For instance, with Matt
>> > >>> >> >>>> Topol's concurrent proposal, a GPU-enabled client may preferentially try
>> > >>> >> >>>> UCX endpoints while other clients may choose to ignore them entirely (e.g.
>> > >>> >> >>>> because they don't have UCX installed).
>> > >>> >> >>>>
>> > >>> >> >>>> In practice the ADBC/JDBC drivers just scan the list left to right and try
>> > >>> >> >>>> each endpoint in turn for lack of a better heuristic.
>> > >>> >> >>>>
>> > >>> >> >>>> On Mon, Feb 12, 2024, at 14:28, Joel Lubinitsky wrote:
>> > >>> >> >>>>> Thanks for proposing this David.
>> > >>> >> >>>>>
>> > >>> >> >>>>> I think the ability to include the Flight RPC service itself in the list of
>> > >>> >> >>>>> endpoints from which data can be fetched is a helpful addition.
>> > >>> >> >>>>>
>> > >>> >> >>>>> The current choice of name for the URI (arrow-flight-fallback://) seems to
>> > >>> >> >>>>> imply that there is an order of precedence that should be considered in the
>> > >>> >> >>>>> list of URIs. Specifically, as a developer receiving the list of locations
>> > >>> >> >>>>> I might assume that I should try fetching from other locations first. If
>> > >>> >> >>>>> those do not succeed, I may try the original service as a fallback.
>> > >>> >> >>>>>
>> > >>> >> >>>>> Are these the intended semantics? If so, is there a way to include the
>> > >>> >> >>>>> original service in the list of locations without the implied precedence?
>> > >>> >> >>>>>
>> > >>> >> >>>>> Thanks,
>> > >>> >> >>>>> Joel
>> > >>> >> >>>>>
>> > >>> >> >>>>> On Mon, Feb 12, 2024 at 11:52 James Duong <james.du...@improving.com.invalid>
>> > >>> >> >>>>> wrote:
>> > >>> >> >>>>>
>> > >>> >> >>>>>> This seems like a good idea, and also improves consistency with clients
>> > >>> >> >>>>>> that erroneously assumed that the service endpoint was always in the list
>> > >>> >> >>>>>> of endpoints.
>> > >>> >> >>>>>>
>> > >>> >> >>>>>> From: Antoine Pitrou <anto...@python.org>
>> > >>> >> >>>>>> Date: Monday, February 12, 2024 at 6:05 AM
>> > >>> >> >>>>>> To: dev@arrow.apache.org <dev@arrow.apache.org>
>> > >>> >> >>>>>> Subject: Re: [DISCUSS] Flight RPC: add 'fallback' URI scheme
>> > >>> >> >>>>>>
>> > >>> >> >>>>>> Hello,
>> > >>> >> >>>>>>
>> > >>> >> >>>>>> This looks fine to me.
>> > >>> >> >>>>>>
>> > >>> >> >>>>>> Regards
>> > >>> >> >>>>>>
>> > >>> >> >>>>>> Antoine.
>> > >>> >> >>>>>>
>> > >>> >> >>>>>>
>> > >>> >> >>>>>> On 12/02/2024 at 14:46, David Li wrote:
>> > >>> >> >>>>>>> Hello,
>> > >>> >> >>>>>>>
>> > >>> >> >>>>>>> I'd like to propose a slight update to Flight RPC to make Flight SQL
>> > >>> >> >>>>>>> work better in different deployment scenarios. Comments on the doc would
>> > >>> >> >>>>>>> be appreciated:
>> > >>> >> >>>>>>>
>> > >>> >> >>>>>>> https://docs.google.com/document/d/1g9M9FmsZhkewlT1mLibuceQO8ugI0-fqumVAXKFjVGg/edit?usp=sharing
>> > >>> >> >>>>>>>
>> > >>> >> >>>>>>> The gist is that FlightEndpoint allows specifying either (1) a list of
>> > >>> >> >>>>>>> concrete URIs to fetch data from or (2) no URIs, meaning to fetch from the
>> > >>> >> >>>>>>> Flight RPC service itself; but it would be useful to combine both behaviors
>> > >>> >> >>>>>>> (try these concrete URIs and fall back to the Flight RPC service itself)
>> > >>> >> >>>>>>> without requiring the service to know its own public address.
>> > >>> >> >>>>>>>
>> > >>> >> >>>>>>> Best,
>> > >>> >> >>>>>>> David
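
For anyone skimming the thread, the client-side behaviour discussed above could be sketched roughly as follows. This is only an illustration in Python, not the candidate implementation in the pull request: `choose_location`, `can_connect`, and the `REUSE` marker are hypothetical names made up for this sketch. The scheme string and the left-to-right scan come from the messages above, and the sketch assumes that an empty location list means "fetch from the originating service".

```python
from urllib.parse import urlsplit

# Scheme name as proposed in the thread.
REUSE_SCHEME = "arrow-flight-reuse-connection"
# Hypothetical marker meaning "fetch over the connection we already have".
REUSE = "reuse"


def choose_location(locations, can_connect):
    """Pick where to fetch data for one endpoint.

    locations:   list of URI strings taken from a FlightEndpoint.
    can_connect: hypothetical callback returning True if this client
                 is able to use the given URI.
    Returns REUSE, a URI string, or None if nothing is usable.
    """
    if not locations:
        # No URIs at all: fetch from the Flight RPC service itself.
        return REUSE
    # Scan left to right, as the ADBC/JDBC drivers are described as doing.
    for uri in locations:
        if urlsplit(uri).scheme == REUSE_SCHEME:
            return REUSE
        if can_connect(uri):
            return uri
    return None
```

A server that cannot know its own public address could then return something like ["grpc://1.2.3.4", "arrow-flight-reuse-connection://?"], and a client unable to reach 1.2.3.4 would end up fetching over its existing connection.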