Hey Andy, I've been rather busy unfortunately. I had started on an implementation in C++ to provide as part of this discussion, but it's not complete. I'm hoping to have more done in March.
Best, David On 2/25/20, Andy Grove <andygrov...@gmail.com> wrote: > I was wondering if there had been any momentum on this (the BiDirectional > RPC design)? > > I'm interested in this for the use case of Apache Spark sending a stream of > data to another process to invoke custom code and then receive a stream > back with the transformed data. > > Thanks, > > Andy. > > > > On Fri, Dec 13, 2019 at 12:12 PM Jacques Nadeau <jacq...@apache.org> wrote: > >> I support moving forward with the current proposal. >> >> On Thu, Dec 12, 2019 at 12:20 PM David Li <li.david...@gmail.com> wrote: >> >> > Just following up here again, any other thoughts? >> > >> > I think we do have justifications for potentially separate streams in >> > a call, but that's more of an orthogonal question - it doesn't need to >> > be addressed here. I do agree that it very much complicates things. >> > >> > Thanks, >> > David >> > >> > On 11/29/19, Wes McKinney <wesmck...@gmail.com> wrote: >> > > I would generally agree with this. Note that you have the possibility >> > > to use unions-of-structs to send record batches with different >> > > schemas >> > > in the same stream, though with some added complexity on each side >> > > >> > > On Thu, Nov 28, 2019 at 10:37 AM Jacques Nadeau <jacq...@apache.org> >> > wrote: >> > >> >> > >> I'd vote for explicitly not supported. We should keep our primitives >> > >> narrow. >> > >> >> > >> On Wed, Nov 27, 2019, 1:17 PM David Li <li.david...@gmail.com> >> > >> wrote: >> > >> >> > >> > Thanks for the feedback. >> > >> > >> > >> > I do think if we had explicitly embraced gRPC from the beginning, >> > >> > there are a lot of places where things could be made more >> > >> > ergonomic, >> > >> > including with the metadata fields. But it would also have locked >> out >> > >> > us of potential future transports. >> > >> > >> > >> > On another note: I hesitate to put too much into this method, but >> > >> > we >> > >> > are looking at use cases where potentially, a client may want to >> > >> > upload multiple distinct datasets (with differing schemas). (This >> is a >> > >> > little tentative, and I can get more details...) Right now, each >> > >> > logical stream in Flight must have a single, consistent schema; >> would >> > >> > it make sense to look at ways to relax this, or declare this >> > >> > explicitly out of scope (and require multiple calls and >> > >> > coordination >> > >> > with the deployment topology) in order to accomplish this? >> > >> > >> > >> > Best, >> > >> > David >> > >> > >> > >> > On 11/27/19, Jacques Nadeau <jacq...@apache.org> wrote: >> > >> > > Fair enough. I'm okay with the bytes approach and the proposal >> looks >> > >> > > good >> > >> > > to me. >> > >> > > >> > >> > > On Fri, Nov 8, 2019 at 11:37 AM David Li <li.david...@gmail.com> >> > >> > > wrote: >> > >> > > >> > >> > >> I've updated the proposal. >> > >> > >> >> > >> > >> On the subject of Protobuf Any vs bytes, and how to handle >> > >> > >> errors/metadata, I still think using bytes is preferable: >> > >> > >> - It doesn't require (conditionally) exposing or wrapping >> Protobuf >> > >> > types, >> > >> > >> - We wouldn't be able to practically expose the Protobuf field >> > >> > >> to >> > >> > >> C++ >> > >> > >> users without causing build pains, >> > >> > >> - We can't let Python users take advantage of the Protobuf >> > >> > >> field >> > >> > >> without somehow being compatible with the Protobuf wheels (by >> > >> > >> linking >> > >> > >> to the same version, and doing magic to turn the C++ Protobufs >> into >> > >> > >> the Python ones), >> > >> > >> - All our other application-defined fields are already bytes. >> > >> > >> >> > >> > >> Applications that want structure can encode JSON or Protobuf >> > >> > >> Any >> > >> > >> into >> > >> > >> the bytes field themselves, much as you can already do for >> Ticket, >> > >> > >> commands in FlightDescriptors, and application metadata in >> > >> > >> DoGet/DoPut. I don't think this is (much) less efficient than >> using >> > >> > >> Any directly, since Any itself is a bytes field with a tag, and >> > must >> > >> > >> invoke the Protobuf deserializer again to read the actual >> message. >> > >> > >> >> > >> > >> If we decide on using bytes, then I don't think it makes sense >> > >> > >> to >> > >> > >> define a new message with a oneof either, since it would be >> > >> > >> redundant. >> > >> > >> >> > >> > >> Thanks, >> > >> > >> David >> > >> > >> >> > >> > >> On 11/7/19, David Li <li.david...@gmail.com> wrote: >> > >> > >> > I've been extremely backlogged, I will update the proposal >> when I >> > >> > >> > get >> > >> > >> > a chance and reply here when done. >> > >> > >> > >> > >> > >> > Best, >> > >> > >> > David >> > >> > >> > >> > >> > >> > On 11/7/19, Wes McKinney <wesmck...@gmail.com> wrote: >> > >> > >> >> Bumping this discussion since a couple of weeks have passed. >> It >> > >> > >> >> seems >> > >> > >> >> there are still some questions here, could we summarize what >> are >> > >> > >> >> the >> > >> > >> >> alternatives along with any public API implications so we >> > >> > >> >> can >> > try >> > >> > >> >> to >> > >> > >> >> render a decision? >> > >> > >> >> >> > >> > >> >> On Sat, Oct 26, 2019 at 7:19 PM David Li < >> li.david...@gmail.com >> > > >> > >> > >> >> wrote: >> > >> > >> >>> >> > >> > >> >>> Hi Wes, >> > >> > >> >>> >> > >> > >> >>> Responses inline: >> > >> > >> >>> >> > >> > >> >>> On Sat, Oct 26, 2019, 13:46 Wes McKinney < >> wesmck...@gmail.com> >> > >> > wrote: >> > >> > >> >>> >> > >> > >> >>> > On Mon, Oct 21, 2019 at 7:40 PM David Li >> > >> > >> >>> > <li.david...@gmail.com> >> > >> > >> >>> > wrote: >> > >> > >> >>> > > >> > >> > >> >>> > > The question is whether to repurpose the existing >> > FlightData >> > >> > >> >>> > > structure, and allow for the metadata field to be >> > >> > >> >>> > > filled >> in >> > >> > >> >>> > > and >> > >> > >> data >> > >> > >> >>> > > fields to be blank (as a control message), or to wrap >> > >> > >> >>> > > the >> > >> > >> FlightData >> > >> > >> >>> > > structure in another structure that explicitly >> > distinguishes >> > >> > >> between >> > >> > >> >>> > > control and data messages. >> > >> > >> >>> > >> > >> > >> >>> > I'm not super against having metadata-only FlightData >> > >> > >> >>> > with >> > >> > >> >>> > empty >> > >> > >> body. >> > >> > >> >>> > One question to consider is what changes (if any) would >> need >> > to >> > >> > >> >>> > be >> > >> > >> >>> > made to public APIs in either scenario. >> > >> > >> >>> > >> > >> > >> >>> >> > >> > >> >>> We could leave DoGet/DoPut as-is for now, and allow empty >> data >> > >> > >> >>> messages >> > >> > >> >>> in >> > >> > >> >>> the future. This would be a breaking change, but wouldn't >> > change >> > >> > >> >>> the >> > >> > >> >>> wire >> > >> > >> >>> format. I think the APIs could be changed backwards >> compatibly, >> > >> > >> >>> though. >> > >> > >> >>> >> > >> > >> >>> >> > >> > >> >>> >> > >> > >> >>> > > The other question is how to handle the metadata >> > >> > >> >>> > > fields. >> So >> > >> > >> >>> > > far, >> > >> > >> >>> > > we've >> > >> > >> >>> > > used bytestring fields for application-defined data. >> > >> > >> >>> > > This >> > is >> > >> > >> >>> > > workable >> > >> > >> >>> > > if you want to use Protobuf to define the contents of >> those >> > >> > >> >>> > > fields, >> > >> > >> >>> > > but requires you to pack/unpack your Protobuf into/from >> the >> > >> > >> >>> > > bytestring >> > >> > >> >>> > > field. If we instead used the Protobuf Any field, a >> > >> > >> >>> > > dynamically >> > >> > >> >>> > > typed >> > >> > >> >>> > > field, this would be more convenient, but then we'd be >> > >> > >> >>> > > exposing >> > >> > >> >>> > > Protobuf types. We could alternatively use a >> > >> > >> >>> > > combination >> of >> > >> > >> >>> > > a >> > >> > >> >>> > > type >> > >> > >> >>> > > field and a bytestring field, mimicking what the >> > >> > >> >>> > > Protobuf >> > >> > >> >>> > > Any >> > >> > >> >>> > > type >> > >> > >> >>> > > looks like on the wire. I'm not sure this is actually >> > cleaner >> > >> > >> >>> > > in >> > >> > >> any >> > >> > >> >>> > > of the language APIs, though. >> > >> > >> >>> > >> > >> > >> >>> > Leaving the deserialization of the app metadata to the >> > >> > >> >>> > particular >> > >> > >> >>> > Flight implementation seems on first principles like the >> most >> > >> > >> flexible >> > >> > >> >>> > thing, if Any is used, does that mean the metadata _must_ >> be >> > a >> > >> > >> >>> > protobuf? >> > >> > >> >>> > >> > >> > >> >>> >> > >> > >> >>> >> > >> > >> >>> If Any is used, we could still expose a bytes-based API, >> > >> > >> >>> but >> it >> > >> > would >> > >> > >> >>> have >> > >> > >> >>> some more wrapping. (We could put a ByteString in Any.) >> > >> > >> >>> Then >> > the >> > >> > >> >>> question >> > >> > >> >>> would just be how to expose this (would be easier in Java, >> > harder >> > >> > >> >>> in >> > >> > >> >>> C++). >> > >> > >> >>> >> > >> > >> >>> >> > >> > >> >>> >> > >> > >> >>> > > David >> > >> > >> >>> > > >> > >> > >> >>> > > On 10/21/19, Antoine Pitrou <anto...@python.org> wrote: >> > >> > >> >>> > > > >> > >> > >> >>> > > > Can one of you explain what is being proposed in >> > >> > >> >>> > > > non-protobuf >> > >> > >> >>> > > > terms? >> > >> > >> >>> > > > Knowledge of protobuf shouldn't be required to use >> > Flight. >> > >> > >> >>> > > > >> > >> > >> >>> > > > Regards >> > >> > >> >>> > > > >> > >> > >> >>> > > > Antoine. >> > >> > >> >>> > > > >> > >> > >> >>> > > > >> > >> > >> >>> > > > Le 21/10/2019 à 15:46, David Li a écrit : >> > >> > >> >>> > > >> Oneof doesn't actually change the wire encoding; it >> > would >> > >> > just >> > >> > >> be >> > >> > >> >>> > > >> application-level logic. (The official guide doesn't >> > even >> > >> > >> mention >> > >> > >> >>> > > >> it >> > >> > >> >>> > > >> in the encoding docs; I found >> > >> > >> >>> > > >> >> > >> > >> >>> > >> > >> > >> >> > >> > >> > >> https://stackoverflow.com/questions/52226409/how-protobuf-encodes-oneof-message-construct >> > >> > >> >>> > > >> as well.) >> > >> > >> >>> > > >> >> > >> > >> >>> > > >> If I follow you, Jacques, then you are proposing >> > >> > >> >>> > > >> essentially >> > >> > >> >>> > > >> inlining >> > >> > >> >>> > > >> the definition of Any, e.g. >> > >> > >> >>> > > >> >> > >> > >> >>> > > >> message FlightMessage { >> > >> > >> >>> > > >> oneof message { >> > >> > >> >>> > > >> FlightData data = 1; >> > >> > >> >>> > > >> FlightAny metadata = 2; >> > >> > >> >>> > > >> } >> > >> > >> >>> > > >> } >> > >> > >> >>> > > >> >> > >> > >> >>> > > >> message FlightAny { >> > >> > >> >>> > > >> string type = 1; >> > >> > >> >>> > > >> bytes data = 2; >> > >> > >> >>> > > >> } >> > >> > >> >>> > > >> >> > >> > >> >>> > > >> Is this correct? >> > >> > >> >>> > > >> >> > >> > >> >>> > > >> It might be nice to consider the wrapper message for >> > >> > >> >>> > > >> DoGet/DoPut >> > >> > >> >>> > > >> as >> > >> > >> >>> > > >> well, but at that point, I'd rather we be consistent >> > with >> > >> > >> >>> > > >> all >> > >> > >> >>> > > >> of >> > >> > >> >>> > > >> them, >> > >> > >> >>> > > >> rather than have one of the three methods do its own >> > >> > >> >>> > > >> thing. >> > >> > >> >>> > > >> >> > >> > >> >>> > > >> Thanks, >> > >> > >> >>> > > >> David >> > >> > >> >>> > > >> >> > >> > >> >>> > > >> On 10/20/19, Jacques Nadeau <jacq...@apache.org> >> wrote: >> > >> > >> >>> > > >>> I think we could probably expose the oneof behavior >> > >> > >> >>> > > >>> without >> > >> > >> >>> > > >>> exposing >> > >> > >> >>> > the >> > >> > >> >>> > > >>> protobuf functions. On the any... hmm. I guess we >> could >> > >> > >> >>> > > >>> expose >> > >> > >> >>> > > >>> as >> > >> > >> >>> > > >>> two >> > >> > >> >>> > > >>> fields: type and data. Then users could use it for >> > >> > >> >>> > > >>> whatever >> > >> > >> >>> > > >>> but >> > >> > >> >>> > > >>> if >> > >> > >> >>> > > >>> people >> > >> > >> >>> > > >>> wanted to treat it as any, it would work. >> > >> > >> >>> > > >>> (Basically >> a >> > >> > >> >>> > > >>> user >> > >> > >> >>> > > >>> could >> > >> > >> >>> > > >>> use >> > >> > >> >>> > > >>> any >> > >> > >> >>> > > >>> with it easily but they could also use any other >> > >> > >> >>> > > >>> mechanism). >> > >> > >> >>> > > >>> At >> > >> > >> >>> > least in >> > >> > >> >>> > > >>> java, the any concepts are pretty simple/diy. Are >> other >> > >> > >> language >> > >> > >> >>> > > >>> bindings >> > >> > >> >>> > > >>> less diy? >> > >> > >> >>> > > >>> >> > >> > >> >>> > > >>> I'm *not* hardcore against the empty FlightData + >> > >> > >> >>> > > >>> metadata >> > >> > >> >>> > > >>> but >> > >> > >> >>> > > >>> it >> > >> > >> >>> > just >> > >> > >> >>> > > >>> seemed a bit janky. >> > >> > >> >>> > > >>> >> > >> > >> >>> > > >>> Thinking about the control message/wrapper object >> > thing, >> > >> > >> >>> > > >>> I >> > >> > >> >>> > > >>> wonder >> > >> > >> >>> > > >>> if >> > >> > >> >>> > we >> > >> > >> >>> > > >>> should redefine DoPut and DoGet to have the same >> > property >> > >> > >> >>> > > >>> if >> > >> > >> >>> > > >>> we >> > >> > >> >>> > think it >> > >> > >> >>> > > >>> is >> > >> > >> >>> > > >>> a good idea... >> > >> > >> >>> > > >>> >> > >> > >> >>> > > >>> On Wed, Oct 16, 2019 at 5:13 PM David Li < >> > >> > >> li.david...@gmail.com> >> > >> > >> >>> > wrote: >> > >> > >> >>> > > >>> >> > >> > >> >>> > > >>>> I was definitely considering having control >> > >> > >> >>> > > >>>> messages >> > >> > without >> > >> > >> >>> > > >>>> data, >> > >> > >> >>> > and >> > >> > >> >>> > > >>>> I thought that could be encoded by a FlightData >> > >> > >> >>> > > >>>> with >> > >> > >> >>> > > >>>> only >> > >> > >> >>> > app_metadata >> > >> > >> >>> > > >>>> set. I think I understand your position now: >> > FlightData >> > >> > >> >>> > > >>>> should >> > >> > >> >>> > always >> > >> > >> >>> > > >>>> carry (some) data (with optional metadata)? >> > >> > >> >>> > > >>>> >> > >> > >> >>> > > >>>> That makes sense to me, and is consistent with the >> > >> > >> >>> > > >>>> documentation >> > >> > >> >>> > > >>>> on >> > >> > >> >>> > > >>>> FlightData in the Protobuf file. I was worried >> > >> > >> >>> > > >>>> about >> > >> > >> >>> > > >>>> having >> > >> > >> >>> > > >>>> a >> > >> > >> >>> > > >>>> redundant metadata field, but oneof prevents that >> from >> > >> > >> >>> > > >>>> happening, >> > >> > >> >>> > and >> > >> > >> >>> > > >>>> overall having a clear separation between data and >> > >> > >> >>> > > >>>> control >> > >> > >> >>> > > >>>> messages >> > >> > >> >>> > is >> > >> > >> >>> > > >>>> cleaner. >> > >> > >> >>> > > >>>> >> > >> > >> >>> > > >>>> As for using Protobuf's Any: so far, we've >> > >> > >> >>> > > >>>> refrained >> > >> > >> >>> > > >>>> from >> > >> > >> >>> > > >>>> exposing >> > >> > >> >>> > > >>>> Protobuf by using bytes, would we want to change >> that >> > >> > >> >>> > > >>>> now? >> > >> > >> >>> > > >>>> >> > >> > >> >>> > > >>>> Best, >> > >> > >> >>> > > >>>> David >> > >> > >> >>> > > >>>> >> > >> > >> >>> > > >>>> On 10/16/19, Jacques Nadeau <jacq...@apache.org> >> > wrote: >> > >> > >> >>> > > >>>>> Hey David, >> > >> > >> >>> > > >>>>> >> > >> > >> >>> > > >>>>> RE: Async: I was trying to match the pattern we >> > >> > >> >>> > > >>>>> use >> > >> > >> >>> > > >>>>> for >> > >> > >> >>> > > >>>>> doget/doput >> > >> > >> >>> > > >>>>> for >> > >> > >> >>> > > >>>>> async. Yes, more thinking java given java grpc's >> > async >> > >> > >> >>> > > >>>>> always >> > >> > >> >>> > pattern. >> > >> > >> >>> > > >>>>> >> > >> > >> >>> > > >>>>> On the comment around the FlightData, I think it >> > >> > >> >>> > > >>>>> is >> > >> > >> >>> > > >>>>> overloading >> > >> > >> >>> > > >>>>> the >> > >> > >> >>> > > >>>> message >> > >> > >> >>> > > >>>>> to use metadata for this. If I want to send a >> control >> > >> > >> >>> > > >>>>> message >> > >> > >> >>> > > >>>> independently >> > >> > >> >>> > > >>>>> of the data message, I would have to define >> something >> > >> > >> >>> > > >>>>> like >> > >> > >> >>> > > >>>>> an >> > >> > >> >>> > > >>>>> empty >> > >> > >> >>> > > >>>> flight >> > >> > >> >>> > > >>>>> data message that has custom metadata. Why not >> > support >> > >> > >> >>> > > >>>>> a >> > >> > >> >>> > > >>>>> container >> > >> > >> >>> > > >>>>> object >> > >> > >> >>> > > >>>>> with a oneof{FlightData, Any} in it instead so >> users >> > >> > >> >>> > > >>>>> can >> > >> > >> >>> > > >>>>> add >> > >> > >> >>> > > >>>>> more >> > >> > >> >>> > data >> > >> > >> >>> > > >>>>> as >> > >> > >> >>> > > >>>>> desired. The default impl could be a noop for the >> Any >> > >> > >> >>> > > >>>>> messages. >> > >> > >> >>> > > >>>>> >> > >> > >> >>> > > >>>>> On Tue, Oct 15, 2019 at 6:50 PM David Li >> > >> > >> >>> > > >>>>> <li.david...@gmail.com> >> > >> > >> >>> > > >>>>> wrote: >> > >> > >> >>> > > >>>>> >> > >> > >> >>> > > >>>>>> Hi Jacques, >> > >> > >> >>> > > >>>>>> >> > >> > >> >>> > > >>>>>> Thanks for the comments. >> > >> > >> >>> > > >>>>>> >> > >> > >> >>> > > >>>>>> - I do agree DoExchange is a better name! >> > >> > >> >>> > > >>>>>> - FlightData already has metadata fields as a >> result >> > >> > >> >>> > > >>>>>> of >> > >> > >> prior >> > >> > >> >>> > > >>>>>> proposals, so I don't think we need a new >> > >> > >> >>> > > >>>>>> message >> to >> > >> > carry >> > >> > >> >>> > > >>>>>> that >> > >> > >> >>> > kind >> > >> > >> >>> > > >>>>>> of information. >> > >> > >> >>> > > >>>>>> - I like the suggestion of an async handler to >> > handle >> > >> > >> >>> > > >>>>>> incoming >> > >> > >> >>> > > >>>>>> messages as the fundamental API; it would >> > >> > >> >>> > > >>>>>> actually >> > be >> > >> > >> >>> > > >>>>>> quite >> > >> > >> >>> > natural >> > >> > >> >>> > > >>>>>> to >> > >> > >> >>> > > >>>>>> implement in Flight/Java. I will note that it's >> not >> > >> > >> >>> > > >>>>>> possible >> > >> > >> >>> > > >>>>>> in >> > >> > >> >>> > > >>>>>> C++/Python without spawning a thread, though. >> > >> > >> >>> > > >>>>>> (In >> > >> > essence, >> > >> > >> >>> > gRPC-Java >> > >> > >> >>> > > >>>>>> is async-always and gRPC-C++ is sync-always.) >> There >> > >> > >> >>> > > >>>>>> are >> > >> > >> >>> > experimental >> > >> > >> >>> > > >>>>>> C++ APIs that would let us do something similar >> > >> > >> >>> > > >>>>>> to >> > >> > >> >>> > > >>>>>> Java, >> > >> > >> >>> > > >>>>>> but >> > >> > >> >>> > > >>>>>> those >> > >> > >> >>> > > >>>>>> are >> > >> > >> >>> > > >>>>>> only in relatively recent gRPC versions and are >> > still >> > >> > >> >>> > > >>>>>> under >> > >> > >> >>> > > >>>>>> development (contrary to the interceptor APIs >> which >> > >> > >> >>> > > >>>>>> have >> > >> > >> been >> > >> > >> >>> > around >> > >> > >> >>> > > >>>>>> for quite a while). >> > >> > >> >>> > > >>>>>> >> > >> > >> >>> > > >>>>>> Thanks, >> > >> > >> >>> > > >>>>>> David >> > >> > >> >>> > > >>>>>> >> > >> > >> >>> > > >>>>>> On 10/15/19, Jacques Nadeau <jacq...@apache.org> >> > >> > >> >>> > > >>>>>> wrote: >> > >> > >> >>> > > >>>>>>> I like it. Added some comments to the doc. >> > >> > >> >>> > > >>>>>>> Might >> > >> > >> >>> > > >>>>>>> worth >> > >> > >> >>> > > >>>>>>> discussion >> > >> > >> >>> > > >>>>>>> here >> > >> > >> >>> > > >>>>>>> depending on your thoughts. >> > >> > >> >>> > > >>>>>>> >> > >> > >> >>> > > >>>>>>> On Tue, Oct 15, 2019 at 7:11 AM David Li >> > >> > >> >>> > > >>>>>>> <li.david...@gmail.com> >> > >> > >> >>> > > >>>> wrote: >> > >> > >> >>> > > >>>>>>> >> > >> > >> >>> > > >>>>>>>> Hey Ryan, >> > >> > >> >>> > > >>>>>>>> >> > >> > >> >>> > > >>>>>>>> Thanks for the comments. >> > >> > >> >>> > > >>>>>>>> >> > >> > >> >>> > > >>>>>>>> Concrete example: I've edited the doc to >> provide a >> > >> > >> >>> > > >>>>>>>> Python >> > >> > >> >>> > strawman. >> > >> > >> >>> > > >>>>>>>> >> > >> > >> >>> > > >>>>>>>> Sync vs async: while I don't touch on it, you >> > could >> > >> > >> >>> > > >>>>>>>> interleave >> > >> > >> >>> > > >>>> uploads >> > >> > >> >>> > > >>>>>>>> and downloads if you were so inclined. Right >> now, >> > >> > >> >>> > > >>>>>>>> synchronous >> > >> > >> >>> > APIs >> > >> > >> >>> > > >>>>>>>> make this error-prone, e.g. if both client and >> > >> > >> >>> > > >>>>>>>> server >> > >> > >> >>> > > >>>>>>>> wait >> > >> > >> >>> > > >>>>>>>> for >> > >> > >> >>> > each >> > >> > >> >>> > > >>>>>>>> other due to an application logic bug. (gRPC >> > >> > >> >>> > > >>>>>>>> doesn't >> > >> > >> >>> > > >>>>>>>> give >> > >> > >> >>> > > >>>>>>>> us >> > >> > >> >>> > > >>>>>>>> the >> > >> > >> >>> > > >>>>>>>> ability to have per-read timeouts, only an >> overall >> > >> > >> >>> > > >>>>>>>> timeout.) >> > >> > >> >>> > > >>>>>>>> As >> > >> > >> >>> > an >> > >> > >> >>> > > >>>>>>>> example of this happening with DoPut, see >> > >> > >> >>> > > >>>>>>>> ARROW-6063: >> > >> > >> >>> > > >>>>>>>> >> https://issues.apache.org/jira/browse/ARROW-6063 >> > >> > >> >>> > > >>>>>>>> >> > >> > >> >>> > > >>>>>>>> This is mostly tangential though, eventually >> > >> > >> >>> > > >>>>>>>> we >> > >> > >> >>> > > >>>>>>>> will >> > >> > >> >>> > > >>>>>>>> want >> > >> > >> >>> > > >>>>>>>> to >> > >> > >> >>> > design >> > >> > >> >>> > > >>>>>>>> asynchronous APIs for Flight as a whole. A >> > >> > bidirectional >> > >> > >> >>> > > >>>>>>>> stream >> > >> > >> >>> > > >>>>>>>> like >> > >> > >> >>> > > >>>>>>>> this (and like DoPut) just makes these >> > >> > >> >>> > > >>>>>>>> pitfalls >> > >> > >> >>> > > >>>>>>>> easier >> > >> > >> >>> > > >>>>>>>> to >> > >> > >> >>> > > >>>>>>>> run >> > >> > >> >>> > into. >> > >> > >> >>> > > >>>>>>>> >> > >> > >> >>> > > >>>>>>>> Using DoPut+DoGet: I discussed this in the >> > >> > >> >>> > > >>>>>>>> proposal, >> > >> > but >> > >> > >> >>> > > >>>>>>>> the >> > >> > >> >>> > main >> > >> > >> >>> > > >>>>>>>> concern is that depending on how you deploy, >> > >> > >> >>> > > >>>>>>>> two >> > >> > >> >>> > > >>>>>>>> separate >> > >> > >> >>> > > >>>>>>>> calls >> > >> > >> >>> > > >>>>>>>> could >> > >> > >> >>> > > >>>>>>>> get routed to different instances. >> > >> > >> >>> > > >>>>>>>> Additionally, >> > >> > >> >>> > > >>>>>>>> gRPC >> > >> > >> >>> > > >>>>>>>> has >> > >> > >> >>> > > >>>>>>>> some >> > >> > >> >>> > > >>>>>>>> reconnection behaviors; if the server goes >> > >> > >> >>> > > >>>>>>>> away >> in >> > >> > >> >>> > > >>>>>>>> between >> > >> > >> >>> > > >>>>>>>> the >> > >> > >> >>> > two >> > >> > >> >>> > > >>>>>>>> calls, but it then restarts or there is >> > >> > >> >>> > > >>>>>>>> another >> > >> > instance >> > >> > >> >>> > available, >> > >> > >> >>> > > >>>>>>>> the client will happily reconnect to the new >> > server >> > >> > >> without >> > >> > >> >>> > > >>>>>>>> warning. >> > >> > >> >>> > > >>>>>>>> >> > >> > >> >>> > > >>>>>>>> Thanks, >> > >> > >> >>> > > >>>>>>>> David >> > >> > >> >>> > > >>>>>>>> >> > >> > >> >>> > > >>>>>>>> On 10/15/19, Ryan Murray <rym...@dremio.com> >> > wrote: >> > >> > >> >>> > > >>>>>>>>> Hey David, >> > >> > >> >>> > > >>>>>>>>> >> > >> > >> >>> > > >>>>>>>>> I think this proposal makes a lot of sense. I >> > like >> > >> > >> >>> > > >>>>>>>>> it >> > >> > >> >>> > > >>>>>>>>> and >> > >> > >> >>> > > >>>>>>>>> the >> > >> > >> >>> > > >>>>>>>>> possibility >> > >> > >> >>> > > >>>>>>>>> of remote compute via arrow buffers. One >> > >> > >> >>> > > >>>>>>>>> thing >> > >> > >> >>> > > >>>>>>>>> that >> > >> > >> >>> > > >>>>>>>>> would >> > >> > >> >>> > > >>>>>>>>> help >> > >> > >> >>> > me >> > >> > >> >>> > > >>>>>> would >> > >> > >> >>> > > >>>>>>>> be >> > >> > >> >>> > > >>>>>>>>> a concrete example of the API in a real life >> use >> > >> > >> >>> > > >>>>>>>>> case. >> > >> > >> >>> > > >>>>>>>>> Also, >> > >> > >> >>> > what >> > >> > >> >>> > > >>>>>> would >> > >> > >> >>> > > >>>>>>>> the >> > >> > >> >>> > > >>>>>>>>> client experience be in terms of sync vs >> > >> > >> >>> > > >>>>>>>>> asyc? >> > >> > >> >>> > > >>>>>>>>> Would >> > >> > >> >>> > > >>>>>>>>> the >> > >> > >> >>> > > >>>>>>>>> client >> > >> > >> >>> > > >>>>>>>>> block >> > >> > >> >>> > > >>>>>>>> till >> > >> > >> >>> > > >>>>>>>>> the bidirectional call return ie c = >> > >> > >> flight.vector_mult(a, >> > >> > >> >>> > > >>>>>>>>> b) >> > >> > >> >>> > or >> > >> > >> >>> > > >>>>>>>>> would >> > >> > >> >>> > > >>>>>>>> the >> > >> > >> >>> > > >>>>>>>>> client wait to be signaled that computation >> > >> > >> >>> > > >>>>>>>>> was >> > >> > >> >>> > > >>>>>>>>> done. >> > >> > >> >>> > > >>>>>>>>> If >> > >> > >> >>> > > >>>>>>>>> the >> > >> > >> >>> > > >>>>>>>>> later >> > >> > >> >>> > > >>>>>>>>> how >> > >> > >> >>> > > >>>>>>>>> is >> > >> > >> >>> > > >>>>>>>>> that different from a DoPut then DoGet? I >> suppose >> > >> > >> >>> > > >>>>>>>>> that >> > >> > >> >>> > > >>>>>>>>> this >> > >> > >> >>> > could >> > >> > >> >>> > > >>>> be >> > >> > >> >>> > > >>>>>>>>> implemented without extending the RPC >> > >> > >> >>> > > >>>>>>>>> interface >> > >> > >> >>> > > >>>>>>>>> but >> > >> > >> rather >> > >> > >> >>> > > >>>>>>>>> by a >> > >> > >> >>> > > >>>>>>>>> function/util? >> > >> > >> >>> > > >>>>>>>>> >> > >> > >> >>> > > >>>>>>>>> >> > >> > >> >>> > > >>>>>>>>> Best, >> > >> > >> >>> > > >>>>>>>>> >> > >> > >> >>> > > >>>>>>>>> Ryan >> > >> > >> >>> > > >>>>>>>>> >> > >> > >> >>> > > >>>>>>>>> On Sun, Oct 13, 2019 at 9:24 PM David Li < >> > >> > >> >>> > li.david...@gmail.com> >> > >> > >> >>> > > >>>>>> wrote: >> > >> > >> >>> > > >>>>>>>>> >> > >> > >> >>> > > >>>>>>>>>> Hi all, >> > >> > >> >>> > > >>>>>>>>>> >> > >> > >> >>> > > >>>>>>>>>> We've been using Flight quite successfully >> > >> > >> >>> > > >>>>>>>>>> so >> > >> > >> >>> > > >>>>>>>>>> far, >> > >> > but >> > >> > >> we >> > >> > >> >>> > > >>>>>>>>>> have >> > >> > >> >>> > > >>>>>>>>>> identified a new use case on the horizon: >> being >> > >> > >> >>> > > >>>>>>>>>> able >> > >> > >> >>> > > >>>>>>>>>> to >> > >> > >> >>> > > >>>>>>>>>> both >> > >> > >> >>> > > >>>>>>>>>> send >> > >> > >> >>> > > >>>>>>>>>> and >> > >> > >> >>> > > >>>>>>>>>> retrieve Arrow data within a single RPC >> > >> > >> >>> > > >>>>>>>>>> call. >> To >> > >> > >> >>> > > >>>>>>>>>> that >> > >> > >> >>> > > >>>>>>>>>> end, >> > >> > >> >>> > I've >> > >> > >> >>> > > >>>>>>>>>> written up a proposal for a new RPC method: >> > >> > >> >>> > > >>>>>>>>>> >> > >> > >> >>> > > >>>>>>>>>> >> > >> > >> >>> > > >>>>>>>> >> > >> > >> >>> > > >>>>>> >> > >> > >> >>> > > >>>> >> > >> > >> >>> > >> > >> > >> >> > >> > >> > >> https://docs.google.com/document/d/1Hh-3Z0hK5PxyEYFxwVxp77jens3yAgC_cpp0TGW-dcw/edit?usp=sharing >> > >> > >> >>> > > >>>>>>>>>> >> > >> > >> >>> > > >>>>>>>>>> Please let me know if you can't view or >> comment >> > >> > >> >>> > > >>>>>>>>>> on >> > >> > the >> > >> > >> >>> > document. >> > >> > >> >>> > > >>>>>>>>>> I'd >> > >> > >> >>> > > >>>>>>>>>> appreciate any feedback; I think this is a >> > >> > >> >>> > > >>>>>>>>>> relatively >> > >> > >> >>> > > >>>>>>>>>> straightforward >> > >> > >> >>> > > >>>>>>>>>> addition - it is essentially "DoPutThenGet". >> > >> > >> >>> > > >>>>>>>>>> >> > >> > >> >>> > > >>>>>>>>>> This is a format change and would require a >> > vote. >> > >> > I've >> > >> > >> >>> > > >>>>>>>>>> decided >> > >> > >> >>> > > >>>>>>>>>> to >> > >> > >> >>> > > >>>>>>>>>> table the other format change I had proposed >> (on >> > >> > >> >>> > > >>>>>>>>>> DoPut), >> > >> > >> >>> > > >>>>>>>>>> as >> > >> > >> >>> > > >>>>>>>>>> it >> > >> > >> >>> > > >>>>>> doesn't >> > >> > >> >>> > > >>>>>>>>>> functionally change Flight, just the >> > >> > >> >>> > > >>>>>>>>>> interpretation >> > >> > of >> > >> > >> >>> > > >>>>>>>>>> the >> > >> > >> >>> > > >>>>>>>>>> semantics. >> > >> > >> >>> > > >>>>>>>>>> >> > >> > >> >>> > > >>>>>>>>>> Thanks, >> > >> > >> >>> > > >>>>>>>>>> David >> > >> > >> >>> > > >>>>>>>>>> >> > >> > >> >>> > > >>>>>>>>> >> > >> > >> >>> > > >>>>>>>>> >> > >> > >> >>> > > >>>>>>>>> -- >> > >> > >> >>> > > >>>>>>>>> >> > >> > >> >>> > > >>>>>>>>> Ryan Murray | Principal Consulting Engineer >> > >> > >> >>> > > >>>>>>>>> >> > >> > >> >>> > > >>>>>>>>> +447540852009 | rym...@dremio.com >> > >> > >> >>> > > >>>>>>>>> >> > >> > >> >>> > > >>>>>>>>> <https://www.dremio.com/> >> > >> > >> >>> > > >>>>>>>>> Check out our GitHub >> > >> > >> >>> > > >>>>>>>>> <https://www.github.com/dremio>, >> > >> > >> join >> > >> > >> >>> > > >>>>>>>>> our >> > >> > >> >>> > > >>>>>>>>> community >> > >> > >> >>> > > >>>>>>>>> site <https://community.dremio.com/> & >> Download >> > >> > Dremio >> > >> > >> >>> > > >>>>>>>>> <https://www.dremio.com/download> >> > >> > >> >>> > > >>>>>>>>> >> > >> > >> >>> > > >>>>>>>> >> > >> > >> >>> > > >>>>>>> >> > >> > >> >>> > > >>>>>> >> > >> > >> >>> > > >>>>> >> > >> > >> >>> > > >>>> >> > >> > >> >>> > > >>> >> > >> > >> >>> > > > >> > >> > >> >>> > >> > >> > >> >> >> > >> > >> > >> > >> > >> >> > >> > > >> > >> > >> > > >> > >> >