Hey Andy,

I've been rather busy unfortunately. I had started on an
implementation in C++ to provide as part of this discussion, but it's
not complete. I'm hoping to have more done in March.

Best,
David

On 2/25/20, Andy Grove <andygrov...@gmail.com> wrote:
> I was wondering if there had been any momentum on this (the BiDirectional
> RPC design)?
>
> I'm interested in this for the use case of Apache Spark sending a stream of
> data to another process to invoke custom code and then receive a stream
> back with the transformed data.
>
> Thanks,
>
> Andy.
>
>
>
> On Fri, Dec 13, 2019 at 12:12 PM Jacques Nadeau <jacq...@apache.org> wrote:
>
>> I support moving forward with the current proposal.
>>
>> On Thu, Dec 12, 2019 at 12:20 PM David Li <li.david...@gmail.com> wrote:
>>
>> > Just following up here again, any other thoughts?
>> >
>> > I think we do have justifications for potentially separate streams in
>> > a call, but that's more of an orthogonal question - it doesn't need to
>> > be addressed here. I do agree that it very much complicates things.
>> >
>> > Thanks,
>> > David
>> >
>> > On 11/29/19, Wes McKinney <wesmck...@gmail.com> wrote:
>> > > I would generally agree with this. Note that you have the possibility
>> > > to use unions-of-structs to send record batches with different schemas
>> > > in the same stream, though with some added complexity on each side.
>> > >
>> > > On Thu, Nov 28, 2019 at 10:37 AM Jacques Nadeau <jacq...@apache.org> wrote:
>> > >>
>> > >> I'd vote for explicitly not supported. We should keep our primitives
>> > >> narrow.
>> > >>
>> > >> On Wed, Nov 27, 2019, 1:17 PM David Li <li.david...@gmail.com>
>> > >> wrote:
>> > >>
>> > >> > Thanks for the feedback.
>> > >> >
>> > >> > I do think if we had explicitly embraced gRPC from the beginning,
>> > >> > there are a lot of places where things could be made more ergonomic,
>> > >> > including with the metadata fields. But it would also have locked us
>> > >> > out of potential future transports.
>> > >> >
>> > >> > On another note: I hesitate to put too much into this method, but we
>> > >> > are looking at use cases where potentially, a client may want to
>> > >> > upload multiple distinct datasets (with differing schemas). (This is
>> > >> > a little tentative, and I can get more details...) Right now, each
>> > >> > logical stream in Flight must have a single, consistent schema; would
>> > >> > it make sense to look at ways to relax this, or declare this
>> > >> > explicitly out of scope (and require multiple calls and coordination
>> > >> > with the deployment topology) in order to accomplish this?
>> > >> >
>> > >> > Best,
>> > >> > David
>> > >> >
>> > >> > On 11/27/19, Jacques Nadeau <jacq...@apache.org> wrote:
>> > >> > > Fair enough. I'm okay with the bytes approach and the proposal looks
>> > >> > > good to me.
>> > >> > >
>> > >> > > On Fri, Nov 8, 2019 at 11:37 AM David Li <li.david...@gmail.com>
>> > >> > > wrote:
>> > >> > >
>> > >> > >> I've updated the proposal.
>> > >> > >>
>> > >> > >> On the subject of Protobuf Any vs bytes, and how to handle
>> > >> > >> errors/metadata, I still think using bytes is preferable:
>> > >> > >> - It doesn't require (conditionally) exposing or wrapping Protobuf
>> > >> > >> types,
>> > >> > >> - We wouldn't be able to practically expose the Protobuf field to C++
>> > >> > >> users without causing build pains,
>> > >> > >> - We can't let Python users take advantage of the Protobuf field
>> > >> > >> without somehow being compatible with the Protobuf wheels (by linking
>> > >> > >> to the same version, and doing magic to turn the C++ Protobufs into
>> > >> > >> the Python ones),
>> > >> > >> - All our other application-defined fields are already bytes.
>> > >> > >>
>> > >> > >> Applications that want structure can encode JSON or Protobuf Any into
>> > >> > >> the bytes field themselves, much as you can already do for Ticket,
>> > >> > >> commands in FlightDescriptors, and application metadata in
>> > >> > >> DoGet/DoPut. I don't think this is (much) less efficient than using
>> > >> > >> Any directly, since Any itself is a bytes field with a tag, and must
>> > >> > >> invoke the Protobuf deserializer again to read the actual message.
>> > >> > >>
>> > >> > >> If we decide on using bytes, then I don't think it makes sense to
>> > >> > >> define a new message with a oneof either, since it would be
>> > >> > >> redundant.
>> > >> > >>
>> > >> > >> Thanks,
>> > >> > >> David
>> > >> > >>
>> > >> > >> On 11/7/19, David Li <li.david...@gmail.com> wrote:
>> > >> > >> > I've been extremely backlogged; I will update the proposal when I
>> > >> > >> > get a chance and reply here when done.
>> > >> > >> >
>> > >> > >> > Best,
>> > >> > >> > David
>> > >> > >> >
>> > >> > >> > On 11/7/19, Wes McKinney <wesmck...@gmail.com> wrote:
>> > >> > >> >> Bumping this discussion since a couple of weeks have passed. It
>> > >> > >> >> seems there are still some questions here. Could we summarize what
>> > >> > >> >> the alternatives are, along with any public API implications, so we
>> > >> > >> >> can try to render a decision?
>> > >> > >> >>
>> > >> > >> >> On Sat, Oct 26, 2019 at 7:19 PM David Li <li.david...@gmail.com> wrote:
>> > >> > >> >>>
>> > >> > >> >>> Hi Wes,
>> > >> > >> >>>
>> > >> > >> >>> Responses inline:
>> > >> > >> >>>
>> > >> > >> >>> On Sat, Oct 26, 2019, 13:46 Wes McKinney <wesmck...@gmail.com> wrote:
>> > >> > >> >>>
>> > >> > >> >>> > On Mon, Oct 21, 2019 at 7:40 PM David Li <li.david...@gmail.com> wrote:
>> > >> > >> >>> > >
>> > >> > >> >>> > > The question is whether to repurpose the existing FlightData
>> > >> > >> >>> > > structure, and allow for the metadata field to be filled in and
>> > >> > >> >>> > > data fields to be blank (as a control message), or to wrap the
>> > >> > >> >>> > > FlightData structure in another structure that explicitly
>> > >> > >> >>> > > distinguishes between control and data messages.
>> > >> > >> >>> >
>> > >> > >> >>> > I'm not super against having metadata-only FlightData with empty
>> > >> > >> >>> > body. One question to consider is what changes (if any) would need
>> > >> > >> >>> > to be made to public APIs in either scenario.
>> > >> > >> >>> >
>> > >> > >> >>>
>> > >> > >> >>> We could leave DoGet/DoPut as-is for now, and allow empty data
>> > >> > >> >>> messages in the future. This would be a breaking change, but
>> > >> > >> >>> wouldn't change the wire format. I think the APIs could be changed
>> > >> > >> >>> backwards compatibly, though.
>> > >> > >> >>>
>> > >> > >> >>>
>> > >> > >> >>>
>> > >> > >> >>> > > The other question is how to handle the metadata fields. So far,
>> > >> > >> >>> > > we've used bytestring fields for application-defined data. This is
>> > >> > >> >>> > > workable if you want to use Protobuf to define the contents of
>> > >> > >> >>> > > those fields, but requires you to pack/unpack your Protobuf
>> > >> > >> >>> > > into/from the bytestring field. If we instead used the Protobuf
>> > >> > >> >>> > > Any field, a dynamically typed field, this would be more
>> > >> > >> >>> > > convenient, but then we'd be exposing Protobuf types. We could
>> > >> > >> >>> > > alternatively use a combination of a type field and a bytestring
>> > >> > >> >>> > > field, mimicking what the Protobuf Any type looks like on the
>> > >> > >> >>> > > wire. I'm not sure this is actually cleaner in any of the language
>> > >> > >> >>> > > APIs, though.
>> > >> > >> >>> >
>> > >> > >> >>> > Leaving the deserialization of the app metadata to the particular
>> > >> > >> >>> > Flight implementation seems on first principles like the most
>> > >> > >> >>> > flexible thing. If Any is used, does that mean the metadata _must_
>> > >> > >> >>> > be a protobuf?
>> > >> > >> >>> >
>> > >> > >> >>>
>> > >> > >> >>>
>> > >> > >> >>> If Any is used, we could still expose a bytes-based API, but it
>> > >> > >> >>> would have some more wrapping. (We could put a ByteString in Any.)
>> > >> > >> >>> Then the question would just be how to expose this (would be easier
>> > >> > >> >>> in Java, harder in C++).
>> > >> > >> >>>
>> > >> > >> >>>
>> > >> > >> >>>
>> > >> > >> >>> > > David
>> > >> > >> >>> > >
>> > >> > >> >>> > > On 10/21/19, Antoine Pitrou <anto...@python.org> wrote:
>> > >> > >> >>> > > >
>> > >> > >> >>> > > > Can one of you explain what is being proposed in non-protobuf
>> > >> > >> >>> > > > terms? Knowledge of protobuf shouldn't be required to use Flight.
>> > >> > >> >>> > > >
>> > >> > >> >>> > > > Regards
>> > >> > >> >>> > > >
>> > >> > >> >>> > > > Antoine.
>> > >> > >> >>> > > >
>> > >> > >> >>> > > >
>> > >> > >> >>> > > > On 21/10/2019 at 15:46, David Li wrote:
>> > >> > >> >>> > > >> Oneof doesn't actually change the wire encoding; it would just be
>> > >> > >> >>> > > >> application-level logic. (The official guide doesn't even mention
>> > >> > >> >>> > > >> it in the encoding docs; I found
>> > >> > >> >>> > > >> https://stackoverflow.com/questions/52226409/how-protobuf-encodes-oneof-message-construct
>> > >> > >> >>> > > >> as well.)
>> > >> > >> >>> > > >>
>> > >> > >> >>> > > >> If I follow you, Jacques, then you are proposing essentially
>> > >> > >> >>> > > >> inlining the definition of Any, e.g.
>> > >> > >> >>> > > >>
>> > >> > >> >>> > > >> message FlightMessage {
>> > >> > >> >>> > > >>   oneof message {
>> > >> > >> >>> > > >>     FlightData data = 1;
>> > >> > >> >>> > > >>     FlightAny metadata = 2;
>> > >> > >> >>> > > >>   }
>> > >> > >> >>> > > >> }
>> > >> > >> >>> > > >>
>> > >> > >> >>> > > >> message FlightAny {
>> > >> > >> >>> > > >>   string type = 1;
>> > >> > >> >>> > > >>   bytes data = 2;
>> > >> > >> >>> > > >> }
>> > >> > >> >>> > > >>
>> > >> > >> >>> > > >> Is this correct?
>> > >> > >> >>> > > >>
>> > >> > >> >>> > > >> It might be nice to consider the wrapper message for DoGet/DoPut
>> > >> > >> >>> > > >> as well, but at that point, I'd rather we be consistent with all
>> > >> > >> >>> > > >> of them, rather than have one of the three methods do its own
>> > >> > >> >>> > > >> thing.
>> > >> > >> >>> > > >>
>> > >> > >> >>> > > >> Thanks,
>> > >> > >> >>> > > >> David
>> > >> > >> >>> > > >>
>> > >> > >> >>> > > >> On 10/20/19, Jacques Nadeau <jacq...@apache.org> wrote:
>> > >> > >> >>> > > >>> I think we could probably expose the oneof behavior without
>> > >> > >> >>> > > >>> exposing the protobuf functions. On the any... hmm. I guess we
>> > >> > >> >>> > > >>> could expose as two fields: type and data. Then users could use
>> > >> > >> >>> > > >>> it for whatever but if people wanted to treat it as any, it would
>> > >> > >> >>> > > >>> work. (Basically a user could use any with it easily but they
>> > >> > >> >>> > > >>> could also use any other mechanism). At least in java, the any
>> > >> > >> >>> > > >>> concepts are pretty simple/diy. Are other language bindings less
>> > >> > >> >>> > > >>> diy?
>> > >> > >> >>> > > >>>
>> > >> > >> >>> > > >>> I'm *not* hardcore against the empty FlightData + metadata but it
>> > >> > >> >>> > > >>> just seemed a bit janky.
>> > >> > >> >>> > > >>>
>> > >> > >> >>> > > >>> Thinking about the control message/wrapper object thing, I wonder
>> > >> > >> >>> > > >>> if we should redefine DoPut and DoGet to have the same property
>> > >> > >> >>> > > >>> if we think it is a good idea...
>> > >> > >> >>> > > >>>
>> > >> > >> >>> > > >>> On Wed, Oct 16, 2019 at 5:13 PM David Li <li.david...@gmail.com> wrote:
>> > >> > >> >>> > > >>>
>> > >> > >> >>> > > >>>> I was definitely considering having control messages without
>> > >> > >> >>> > > >>>> data, and I thought that could be encoded by a FlightData with
>> > >> > >> >>> > > >>>> only app_metadata set. I think I understand your position now:
>> > >> > >> >>> > > >>>> FlightData should always carry (some) data (with optional
>> > >> > >> >>> > > >>>> metadata)?
>> > >> > >> >>> > > >>>>
>> > >> > >> >>> > > >>>> That makes sense to me, and is consistent with the documentation
>> > >> > >> >>> > > >>>> on FlightData in the Protobuf file. I was worried about having a
>> > >> > >> >>> > > >>>> redundant metadata field, but oneof prevents that from
>> > >> > >> >>> > > >>>> happening, and overall having a clear separation between data
>> > >> > >> >>> > > >>>> and control messages is cleaner.
>> > >> > >> >>> > > >>>>
>> > >> > >> >>> > > >>>> As for using Protobuf's Any: so far, we've refrained from
>> > >> > >> >>> > > >>>> exposing Protobuf by using bytes; would we want to change that
>> > >> > >> >>> > > >>>> now?
>> > >> > >> >>> > > >>>>
>> > >> > >> >>> > > >>>> Best,
>> > >> > >> >>> > > >>>> David
>> > >> > >> >>> > > >>>>
>> > >> > >> >>> > > >>>> On 10/16/19, Jacques Nadeau <jacq...@apache.org> wrote:
>> > >> > >> >>> > > >>>>> Hey David,
>> > >> > >> >>> > > >>>>>
>> > >> > >> >>> > > >>>>> RE: Async: I was trying to match the pattern we use for
>> > >> > >> >>> > > >>>>> doget/doput for async. Yes, more thinking java given java
>> > >> > >> >>> > > >>>>> grpc's async always pattern.
>> > >> > >> >>> > > >>>>>
>> > >> > >> >>> > > >>>>> On the comment around the FlightData, I think it is overloading
>> > >> > >> >>> > > >>>>> the message to use metadata for this. If I want to send a
>> > >> > >> >>> > > >>>>> control message independently of the data message, I would have
>> > >> > >> >>> > > >>>>> to define something like an empty flight data message that has
>> > >> > >> >>> > > >>>>> custom metadata. Why not support a container object with a
>> > >> > >> >>> > > >>>>> oneof{FlightData, Any} in it instead so users can add more data
>> > >> > >> >>> > > >>>>> as desired? The default impl could be a noop for the Any
>> > >> > >> >>> > > >>>>> messages.
>> > >> > >> >>> > > >>>>>
>> > >> > >> >>> > > >>>>> On Tue, Oct 15, 2019 at 6:50 PM David Li <li.david...@gmail.com> wrote:
>> > >> > >> >>> > > >>>>>
>> > >> > >> >>> > > >>>>>> Hi Jacques,
>> > >> > >> >>> > > >>>>>>
>> > >> > >> >>> > > >>>>>> Thanks for the comments.
>> > >> > >> >>> > > >>>>>>
>> > >> > >> >>> > > >>>>>> - I do agree DoExchange is a better name!
>> > >> > >> >>> > > >>>>>> - FlightData already has metadata fields as a result of prior
>> > >> > >> >>> > > >>>>>> proposals, so I don't think we need a new message to carry
>> > >> > >> >>> > > >>>>>> that kind of information.
>> > >> > >> >>> > > >>>>>> - I like the suggestion of an async handler to handle incoming
>> > >> > >> >>> > > >>>>>> messages as the fundamental API; it would actually be quite
>> > >> > >> >>> > > >>>>>> natural to implement in Flight/Java. I will note that it's not
>> > >> > >> >>> > > >>>>>> possible in C++/Python without spawning a thread, though. (In
>> > >> > >> >>> > > >>>>>> essence, gRPC-Java is async-always and gRPC-C++ is
>> > >> > >> >>> > > >>>>>> sync-always.) There are experimental C++ APIs that would let
>> > >> > >> >>> > > >>>>>> us do something similar to Java, but those are only in
>> > >> > >> >>> > > >>>>>> relatively recent gRPC versions and are still under
>> > >> > >> >>> > > >>>>>> development (contrary to the interceptor APIs which have been
>> > >> > >> >>> > > >>>>>> around for quite a while).
>> > >> > >> >>> > > >>>>>>
>> > >> > >> >>> > > >>>>>> Thanks,
>> > >> > >> >>> > > >>>>>> David
>> > >> > >> >>> > > >>>>>>
>> > >> > >> >>> > > >>>>>> On 10/15/19, Jacques Nadeau <jacq...@apache.org>
>> > >> > >> >>> > > >>>>>> wrote:
>> > >> > >> >>> > > >>>>>>> I like it. Added some comments to the doc. Might be worth
>> > >> > >> >>> > > >>>>>>> discussion here, depending on your thoughts.
>> > >> > >> >>> > > >>>>>>>
>> > >> > >> >>> > > >>>>>>> On Tue, Oct 15, 2019 at 7:11 AM David Li <li.david...@gmail.com> wrote:
>> > >> > >> >>> > > >>>>>>>
>> > >> > >> >>> > > >>>>>>>> Hey Ryan,
>> > >> > >> >>> > > >>>>>>>>
>> > >> > >> >>> > > >>>>>>>> Thanks for the comments.
>> > >> > >> >>> > > >>>>>>>>
>> > >> > >> >>> > > >>>>>>>> Concrete example: I've edited the doc to provide a Python
>> > >> > >> >>> > > >>>>>>>> strawman.
>> > >> > >> >>> > > >>>>>>>>
>> > >> > >> >>> > > >>>>>>>> Sync vs async: while I don't touch on it, you could
>> > >> > >> >>> > > >>>>>>>> interleave uploads and downloads if you were so inclined.
>> > >> > >> >>> > > >>>>>>>> Right now, synchronous APIs make this error-prone, e.g. if
>> > >> > >> >>> > > >>>>>>>> both client and server wait for each other due to an
>> > >> > >> >>> > > >>>>>>>> application logic bug. (gRPC doesn't give us the ability to
>> > >> > >> >>> > > >>>>>>>> have per-read timeouts, only an overall timeout.) As an
>> > >> > >> >>> > > >>>>>>>> example of this happening with DoPut, see ARROW-6063:
>> > >> > >> >>> > > >>>>>>>> https://issues.apache.org/jira/browse/ARROW-6063
>> > >> > >> >>> > > >>>>>>>>
>> > >> > >> >>> > > >>>>>>>> This is mostly tangential, though; eventually we will want
>> > >> > >> >>> > > >>>>>>>> to design asynchronous APIs for Flight as a whole. A
>> > >> > >> >>> > > >>>>>>>> bidirectional stream like this (and like DoPut) just makes
>> > >> > >> >>> > > >>>>>>>> these pitfalls easier to run into.
>> > >> > >> >>> > > >>>>>>>>
>> > >> > >> >>> > > >>>>>>>> Using DoPut+DoGet: I discussed this in the proposal, but the
>> > >> > >> >>> > > >>>>>>>> main concern is that depending on how you deploy, two
>> > >> > >> >>> > > >>>>>>>> separate calls could get routed to different instances.
>> > >> > >> >>> > > >>>>>>>> Additionally, gRPC has some reconnection behaviors; if the
>> > >> > >> >>> > > >>>>>>>> server goes away in between the two calls, but it then
>> > >> > >> >>> > > >>>>>>>> restarts or there is another instance available, the client
>> > >> > >> >>> > > >>>>>>>> will happily reconnect to the new server without warning.
>> > >> > >> >>> > > >>>>>>>>
>> > >> > >> >>> > > >>>>>>>> Thanks,
>> > >> > >> >>> > > >>>>>>>> David
>> > >> > >> >>> > > >>>>>>>>
>> > >> > >> >>> > > >>>>>>>> On 10/15/19, Ryan Murray <rym...@dremio.com> wrote:
>> > >> > >> >>> > > >>>>>>>>> Hey David,
>> > >> > >> >>> > > >>>>>>>>>
>> > >> > >> >>> > > >>>>>>>>> I think this proposal makes a lot of sense. I like it and
>> > >> > >> >>> > > >>>>>>>>> the possibility of remote compute via Arrow buffers. One
>> > >> > >> >>> > > >>>>>>>>> thing that would help me would be a concrete example of the
>> > >> > >> >>> > > >>>>>>>>> API in a real-life use case. Also, what would the client
>> > >> > >> >>> > > >>>>>>>>> experience be in terms of sync vs async? Would the client
>> > >> > >> >>> > > >>>>>>>>> block until the bidirectional call returns, i.e.
>> > >> > >> >>> > > >>>>>>>>> c = flight.vector_mult(a, b), or would the client wait to
>> > >> > >> >>> > > >>>>>>>>> be signaled that computation was done? If the latter, how
>> > >> > >> >>> > > >>>>>>>>> is that different from a DoPut then DoGet? I suppose that
>> > >> > >> >>> > > >>>>>>>>> this could be implemented without extending the RPC
>> > >> > >> >>> > > >>>>>>>>> interface, but rather by a function/util?
>> > >> > >> >>> > > >>>>>>>>>
>> > >> > >> >>> > > >>>>>>>>>
>> > >> > >> >>> > > >>>>>>>>> Best,
>> > >> > >> >>> > > >>>>>>>>>
>> > >> > >> >>> > > >>>>>>>>> Ryan
>> > >> > >> >>> > > >>>>>>>>>
>> > >> > >> >>> > > >>>>>>>>> On Sun, Oct 13, 2019 at 9:24 PM David Li <li.david...@gmail.com> wrote:
>> > >> > >> >>> > > >>>>>>>>>
>> > >> > >> >>> > > >>>>>>>>>> Hi all,
>> > >> > >> >>> > > >>>>>>>>>>
>> > >> > >> >>> > > >>>>>>>>>> We've been using Flight quite successfully so far, but we
>> > >> > >> >>> > > >>>>>>>>>> have identified a new use case on the horizon: being able
>> > >> > >> >>> > > >>>>>>>>>> to both send and retrieve Arrow data within a single RPC
>> > >> > >> >>> > > >>>>>>>>>> call. To that end, I've written up a proposal for a new
>> > >> > >> >>> > > >>>>>>>>>> RPC method:
>> > >> > >> >>> > > >>>>>>>>>>
>> > >> > >> >>> > > >>>>>>>>>> https://docs.google.com/document/d/1Hh-3Z0hK5PxyEYFxwVxp77jens3yAgC_cpp0TGW-dcw/edit?usp=sharing
>> > >> > >> >>> > > >>>>>>>>>>
>> > >> > >> >>> > > >>>>>>>>>> Please let me know if you can't view or comment on the
>> > >> > >> >>> > > >>>>>>>>>> document. I'd appreciate any feedback; I think this is a
>> > >> > >> >>> > > >>>>>>>>>> relatively straightforward addition - it is essentially
>> > >> > >> >>> > > >>>>>>>>>> "DoPutThenGet".
>> > >> > >> >>> > > >>>>>>>>>>
>> > >> > >> >>> > > >>>>>>>>>> This is a format change and would require a vote. I've
>> > >> > >> >>> > > >>>>>>>>>> decided to table the other format change I had proposed
>> > >> > >> >>> > > >>>>>>>>>> (on DoPut), as it doesn't functionally change Flight, just
>> > >> > >> >>> > > >>>>>>>>>> the interpretation of the semantics.
>> > >> > >> >>> > > >>>>>>>>>>
>> > >> > >> >>> > > >>>>>>>>>> Thanks,
>> > >> > >> >>> > > >>>>>>>>>> David
>> > >> > >> >>> > > >>>>>>>>>>
>> > >> > >> >>> > > >>>>>>>>>
>> > >> > >> >>> > > >>>>>>>>>
>> > >> > >> >>> > > >>>>>>>>
>> > >> > >> >>> > > >>>>>>>
>> > >> > >> >>> > > >>>>>>
>> > >> > >> >>> > > >>>>>
>> > >> > >> >>> > > >>>>
>> > >> > >> >>> > > >>>
>> > >> > >> >>> > > >
>> > >> > >> >>> >
>> > >> > >> >>
>> > >> > >> >
>> > >> > >>
>> > >> > >
>> > >> >
>> > >
>> >
>>
>
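
As a footnote to the bytes-vs-Any discussion quoted above: a minimal sketch, in Python, of what "applications that want structure can encode JSON into the bytes field themselves" could look like. The helper names (pack_metadata, unpack_metadata) and the example keys are hypothetical and not part of Flight; only the standard json module is used, and the bytes value simply stands in for an application-defined metadata field.

```python
import json


def pack_metadata(payload: dict) -> bytes:
    """Serialize application-defined metadata into an opaque byte string."""
    # sort_keys keeps the encoding deterministic for identical payloads.
    return json.dumps(payload, sort_keys=True).encode("utf-8")


def unpack_metadata(raw: bytes) -> dict:
    """Inverse of pack_metadata; an empty field is treated as 'no metadata'."""
    if not raw:
        return {}
    return json.loads(raw.decode("utf-8"))


if __name__ == "__main__":
    # Hypothetical control payload; the keys are made up for illustration.
    app_metadata = pack_metadata({"command": "flush", "batch": 3})
    print(unpack_metadata(app_metadata))
```

The transport only ever sees bytes here, so swapping JSON for Protobuf (or anything else) would be an application-level decision on both ends.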

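
For readers who prefer a non-protobuf view of the wrapper idea quoted above (the FlightMessage/FlightAny sketch): a rough Python analogue, assuming a message is either a data message or a control message made of a type tag plus opaque bytes. The class and function names are hypothetical stand-ins, not Flight APIs, and this models only the shape of the idea, not the wire format.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class DataMessage:
    """Stand-in for a data-carrying message with optional app metadata."""
    body: bytes
    app_metadata: bytes = b""


@dataclass(frozen=True)
class ControlMessage:
    """Stand-in for the type-plus-bytes pair from the quoted FlightAny sketch."""
    type: str
    data: bytes


# Mirrors the "oneof" in the quoted sketch: exactly one of the two kinds.
Message = Union[DataMessage, ControlMessage]


def describe(message: Message) -> str:
    """Dispatch on the message kind, as a receiver of the wrapper would."""
    if isinstance(message, DataMessage):
        return f"data: {len(message.body)} bytes"
    if isinstance(message, ControlMessage):
        return f"control: {message.type}"
    raise TypeError(f"unexpected message kind: {type(message).__name__}")


if __name__ == "__main__":
    print(describe(DataMessage(body=b"abc")))
    print(describe(ControlMessage(type="example.Flush", data=b"")))
```

With the bytes approach that the thread settled on, the ControlMessage case would instead collapse into a data message whose body is empty and whose app_metadata carries the control payload.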