My vote: +1

On Wed, Feb 28, 2024, at 15:50, Joel Lubinitsky wrote:
> +1
>
> On Wed, Feb 28, 2024 at 3:22 PM Andrew Lamb <al...@influxdata.com> wrote:
>
>> +1
>>
>>
>> On Tue, Feb 27, 2024 at 9:06 AM David Li <lidav...@apache.org> wrote:
>>
>> > I would like to propose a 'reuse connection' URI scheme for Flight RPC.
>> > This proposal was previously discussed at [1]. A candidate implementation
>> > for C++, Java, and Go is at [2].
>> >
>> > The vote will be open for at least 72 hours.
>> >
>> > [ ] +1
>> > [ ] +0
>> > [ ] -1 Do not accept this proposal because...
>> >
>> > [1]: https://lists.apache.org/thread/pc9fs0hf8t5ylj9os00r9vg8d2xv2npz
>> > [2]: https://github.com/apache/arrow/pull/40084
>> >
>> > On Tue, Feb 20, 2024, at 14:14, David Li wrote:
>> > > Thanks for the comments - I've updated the implementation [1] and added
>> > > Go + integration tests. If this all checks out I'd like to start a vote
>> > > soon.
>> > >
>> > > [1]: https://github.com/apache/arrow/pull/40084
>> > >
>> > > On Fri, Feb 16, 2024, at 13:43, Andrew Lamb wrote:
>> > >> Thank you -- I think the use case is great, but agree with the other
>> > >> reviewers that the name may be confusing. I left some notes on the
>> > >> ticket.
>> > >>
>> > >> Andrew
>> > >>
>> > >> On Wed, Feb 14, 2024 at 3:52 PM David Li <lidav...@apache.org> wrote:
>> > >>
>> > >>> I've put up a candidate implementation sans integration test [1].
>> > >>>
>> > >>> Some caveats:
>> > >>> - java.net.URI doesn't accept 'scheme://', only 'scheme:/' or
>> > >>> 'scheme://?' (yes, an empty query string pacifies it). I've chosen
>> > >>> the latter since the former is technically a URI with a non-empty
>> > >>> path, but neither is ideal.
>> > >>> - I've changed the scheme to 'arrow-flight-reuse-connection' to be
>> > >>> more faithful to the intended use than 'fallback'.
>> > >>>
>> > >>> [1]: https://github.com/apache/arrow/pull/40084
>> > >>>
>> > >>> On Tue, Feb 13, 2024, at 13:01, Jean-Baptiste Onofré wrote:
>> > >>> > Hi David,
>> > >>> >
>> > >>> > It's reasonable. I think we can start with your initial proposal
>> > >>> > (it sounds fine to me) and we can always improve step by step.
>> > >>> >
>> > >>> > Thanks !
>> > >>> > Regards
>> > >>> > JB
>> > >>> >
>> > >>> > On Tue, Feb 13, 2024 at 4:53 PM David Li <lidav...@apache.org> wrote:
>> > >>> >>
>> > >>> >> I'm going to keep the proposal as-is then. It can be extended if
>> > >>> >> this use case comes up.
>> > >>> >>
>> > >>> >> I'll start work on candidate implementations now.
>> > >>> >>
>> > >>> >> On Tue, Feb 13, 2024, at 03:22, Antoine Pitrou wrote:
>> > >>> >> > I think the original proposal is sufficient.
>> > >>> >> >
>> > >>> >> > Also, it is not obvious to me how one would switch from e.g.
>> > >>> >> > grpc+tls to http without an explicit server location (unless both
>> > >>> >> > Flight servers are hosted under the same port?). So the "+"
>> > >>> >> > proposal seems a bit weird.
>> > >>> >> >
>> > >>> >> >
>> > >>> >> > On 12/02/2024 at 23:39, David Li wrote:
>> > >>> >> >> The idea is that the client would reuse the existing connection,
>> > >>> >> >> in which case the protocol and such are implicit. (If the client
>> > >>> >> >> doesn't have a connection anymore, it can't use the fallback
>> > >>> >> >> anyway.)
>> > >>> >> >>
>> > >>> >> >> I suppose this has the advantage that you could "fall back" to a
>> > >>> >> >> known hostname with a different protocol, but I'm not sure that
>> > >>> >> >> always applies anyway. (Correct me if I'm wrong, Matt, but as I
>> > >>> >> >> recall, UCX addresses aren't hostnames but rather opaque byte
>> > >>> >> >> blobs, for instance.)
>> > >>> >> >>
>> > >>> >> >> If we do prefer this, to avoid overloading the hostname, there's
>> > >>> >> >> also the informal convention of using + in the scheme, so it
>> > >>> >> >> could be arrow-flight-fallback+grpc+tls://,
>> > >>> >> >> arrow-flight-fallback+http://, etc.
>> > >>> >> >>
>> > >>> >> >> On Mon, Feb 12, 2024, at 17:03, Joel Lubinitsky wrote:
>> > >>> >> >>> Thanks for clarifying.
>> > >>> >> >>>
>> > >>> >> >>> Given the relationship between these two proposals, would it also be
>> > >>> >> >>> necessary to distinguish the scheme (or schemes) supported by the
>> > >>> >> >>> originating Flight RPC service?
>> > >>> >> >>>
>> > >>> >> >>> If that is the case, it may be preferable to use the "host" portion of
>> > >>> >> >>> the URI rather than the "scheme" to denote the location of the data. In
>> > >>> >> >>> this scenario, the host "0.0.0.0" could be used. This IP address is
>> > >>> >> >>> defined in IETF RFC 1122 [1] as "This host on this network", which seems
>> > >>> >> >>> most consistent with the intended use case. There are some caveats to
>> > >>> >> >>> this usage, but in my experience it's not uncommon for protocols to
>> > >>> >> >>> extend the definition of this address in their own usage.
>> > >>> >> >>>
>> > >>> >> >>> A benefit of this convention is that the scheme remains available in the
>> > >>> >> >>> URI to specify the transport available. For example, the following list
>> > >>> >> >>> of locations may be included in the response:
>> > >>> >> >>>
>> > >>> >> >>> ["grpc://0.0.0.0", "ucx://0.0.0.0", "grpc://1.2.3.4", <other_locations>...]
>> > >>> >> >>>
>> > >>> >> >>> This would indicate that grpc and ucx transports are available from the
>> > >>> >> >>> current service, grpc is available at 1.2.3.4, and possibly more
>> > >>> >> >>> combinations of scheme/host.
>> > >>> >> >>>
>> > >>> >> >>> [1] https://datatracker.ietf.org/doc/html/rfc1122#section-3.2.1.3
>> > >>> >> >>>
>> > >>> >> >>> On Mon, Feb 12, 2024 at 2:53 PM David Li <lidav...@apache.org> wrote:
>> > >>> >> >>>
>> > >>> >> >>>> Ah, while I was thinking of it as useful for a fallback, I'm not
>> > >>> >> >>>> specifying it that way.  Better ideas for names would be appreciated.
>> > >>> >> >>>>
>> > >>> >> >>>> The actual precedence has never been specified. All endpoints are
>> > >>> >> >>>> equivalent, so clients may use what is "best". For instance, with Matt
>> > >>> >> >>>> Topol's concurrent proposal, a GPU-enabled client may preferentially try
>> > >>> >> >>>> UCX endpoints while other clients may choose to ignore them entirely
>> > >>> >> >>>> (e.g. because they don't have UCX installed).
>> > >>> >> >>>>
>> > >>> >> >>>> In practice the ADBC/JDBC drivers just scan the list left to right and
>> > >>> >> >>>> try each endpoint in turn for lack of a better heuristic.
>> > >>> >> >>>>
>> > >>> >> >>>> On Mon, Feb 12, 2024, at 14:28, Joel Lubinitsky wrote:
>> > >>> >> >>>>> Thanks for proposing this David.
>> > >>> >> >>>>>
>> > >>> >> >>>>> I think the ability to include the Flight RPC service itself in the list
>> > >>> >> >>>>> of endpoints from which data can be fetched is a helpful addition.
>> > >>> >> >>>>>
>> > >>> >> >>>>> The current choice of name for the URI (arrow-flight-fallback://) seems
>> > >>> >> >>>>> to imply that there is an order of precedence that should be considered
>> > >>> >> >>>>> in the list of URIs. Specifically, as a developer receiving the list of
>> > >>> >> >>>>> locations I might assume that I should try fetching from other locations
>> > >>> >> >>>>> first. If those do not succeed, I may try the original service as a
>> > >>> >> >>>>> fallback.
>> > >>> >> >>>>>
>> > >>> >> >>>>> Are these the intended semantics? If so, is there a way to include the
>> > >>> >> >>>>> original service in the list of locations without the implied precedence?
>> > >>> >> >>>>>
>> > >>> >> >>>>> Thanks,
>> > >>> >> >>>>> Joel
>> > >>> >> >>>>>
>> > >>> >> >>>>> On Mon, Feb 12, 2024 at 11:52 James Duong <james.du...@improving.com.invalid> wrote:
>> > >>> >> >>>>>
>> > >>> >> >>>>>> This seems like a good idea, and also improves consistency with clients
>> > >>> >> >>>>>> that erroneously assumed that the service endpoint was always in the
>> > >>> >> >>>>>> list of endpoints.
>> > >>> >> >>>>>>
>> > >>> >> >>>>>> From: Antoine Pitrou <anto...@python.org>
>> > >>> >> >>>>>> Date: Monday, February 12, 2024 at 6:05 AM
>> > >>> >> >>>>>> To: dev@arrow.apache.org <dev@arrow.apache.org>
>> > >>> >> >>>>>> Subject: Re: [DISCUSS] Flight RPC: add 'fallback' URI scheme
>> > >>> >> >>>>>>
>> > >>> >> >>>>>> Hello,
>> > >>> >> >>>>>>
>> > >>> >> >>>>>> This looks fine to me.
>> > >>> >> >>>>>>
>> > >>> >> >>>>>> Regards
>> > >>> >> >>>>>>
>> > >>> >> >>>>>> Antoine.
>> > >>> >> >>>>>>
>> > >>> >> >>>>>>
>> > >>> >> >>>>>> On 12/02/2024 at 14:46, David Li wrote:
>> > >>> >> >>>>>>> Hello,
>> > >>> >> >>>>>>>
>> > >>> >> >>>>>>> I'd like to propose a slight update to Flight RPC to make Flight SQL
>> > >>> >> >>>>>>> work better in different deployment scenarios.  Comments on the doc
>> > >>> >> >>>>>>> would be appreciated:
>> > >>> >> >>>>>>>
>> > >>> >> >>>>>>> https://docs.google.com/document/d/1g9M9FmsZhkewlT1mLibuceQO8ugI0-fqumVAXKFjVGg/edit?usp=sharing
>> > >>> >> >>>>>>>
>> > >>> >> >>>>>>> The gist is that FlightEndpoint allows specifying either (1) a list of
>> > >>> >> >>>>>>> concrete URIs to fetch data from or (2) no URIs, meaning to fetch from
>> > >>> >> >>>>>>> the Flight RPC service itself; but it would be useful to combine both
>> > >>> >> >>>>>>> behaviors (try these concrete URIs and fall back to the Flight RPC
>> > >>> >> >>>>>>> service itself) without requiring the service to know its own public
>> > >>> >> >>>>>>> address.
>> > >>> >> >>>>>>>
>> > >>> >> >>>>>>> Best,
>> > >>> >> >>>>>>> David
>> > >>> >> >>>>>>
>> > >>> >> >>>>
>> > >>>
>> >
>>
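
For readers skimming the thread, the client-side behavior discussed above (an empty location list means "fetch from the originating service", the drivers scan the list left to right, and the 'arrow-flight-reuse-connection' scheme means "use the connection you already have") can be sketched roughly as follows. This is only an illustration in plain Python, not the candidate implementation from the pull request. The function name choose_location, the REUSE marker, the supported_schemes set, and the can_connect callback are all hypothetical; only the scheme name and the 'scheme://?' spelling come from the thread.

```python
from urllib.parse import urlsplit

# Scheme name taken from the thread; every other name here is hypothetical.
REUSE_SCHEME = "arrow-flight-reuse-connection"
REUSE = "reuse"  # marker meaning "fetch over the existing connection"


def choose_location(locations, supported_schemes, can_connect):
    """Pick where to fetch data for one endpoint.

    locations: list of URI strings attached to the endpoint (may be empty).
    supported_schemes: transports this client can speak, e.g. {"grpc"}.
    can_connect: callable(uri) -> bool, a stand-in for a real connection attempt.

    Returns REUSE, a URI string, or None if nothing in the list is usable.
    """
    # No locations at all: fetch from the originating Flight RPC service.
    if not locations:
        return REUSE

    # Scan left to right, as the thread says the ADBC/JDBC drivers do.
    for uri in locations:
        scheme = urlsplit(uri).scheme
        if scheme == REUSE_SCHEME:
            return REUSE
        # Skip transports the client lacks (e.g. no UCX installed).
        if scheme in supported_schemes and can_connect(uri):
            return uri
    return None


if __name__ == "__main__":
    locations = [
        "ucx://0.0.0.0",
        "grpc://1.2.3.4:443",
        "arrow-flight-reuse-connection://?",
    ]
    # A client without UCX whose attempt at 1.2.3.4 fails ends up reusing
    # its existing connection.
    print(choose_location(locations, {"grpc"}, lambda uri: False))
```

A strict left-to-right scan is just one possible policy. Since the thread notes that precedence has never been specified, a client could equally reorder the list by its own preferences before scanning.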
