Cool, makes a ton of sense now. Thanks!

On Tue, Oct 15, 2019 at 3:11 PM David Li <li.david...@gmail.com> wrote:
> Hey Ryan,
>
> Thanks for the comments.
>
> Concrete example: I've edited the doc to provide a Python strawman.
>
> Sync vs async: while I don't touch on it, you could interleave uploads
> and downloads if you were so inclined. Right now, synchronous APIs
> make this error-prone, e.g. if both client and server wait for each
> other due to an application logic bug. (gRPC doesn't give us the
> ability to have per-read timeouts, only an overall timeout.) As an
> example of this happening with DoPut, see ARROW-6063:
> https://issues.apache.org/jira/browse/ARROW-6063
>
> This is mostly tangential though; eventually we will want to design
> asynchronous APIs for Flight as a whole. A bidirectional stream like
> this (and like DoPut) just makes these pitfalls easier to run into.
>
> Using DoPut+DoGet: I discussed this in the proposal, but the main
> concern is that depending on how you deploy, two separate calls could
> get routed to different instances. Additionally, gRPC has some
> reconnection behaviors; if the server goes away in between the two
> calls, but it then restarts or there is another instance available,
> the client will happily reconnect to the new server without warning.
>
> Thanks,
> David
>
> On 10/15/19, Ryan Murray <rym...@dremio.com> wrote:
> > Hey David,
> >
> > I think this proposal makes a lot of sense. I like it and the possibility
> > of remote compute via Arrow buffers. One thing that would help me would be
> > a concrete example of the API in a real-life use case. Also, what would the
> > client experience be in terms of sync vs async? Would the client block till
> > the bidirectional call returns, i.e. c = flight.vector_mult(a, b), or would
> > the client wait to be signaled that computation was done? If the latter, how
> > is that different from a DoPut then DoGet? I suppose that this could be
> > implemented without extending the RPC interface but rather by a
> > function/util?
> >
> > Best,
> >
> > Ryan
> >
> > On Sun, Oct 13, 2019 at 9:24 PM David Li <li.david...@gmail.com> wrote:
> >
> >> Hi all,
> >>
> >> We've been using Flight quite successfully so far, but we have
> >> identified a new use case on the horizon: being able to both send and
> >> retrieve Arrow data within a single RPC call. To that end, I've
> >> written up a proposal for a new RPC method:
> >>
> >> https://docs.google.com/document/d/1Hh-3Z0hK5PxyEYFxwVxp77jens3yAgC_cpp0TGW-dcw/edit?usp=sharing
> >>
> >> Please let me know if you can't view or comment on the document. I'd
> >> appreciate any feedback; I think this is a relatively straightforward
> >> addition - it is essentially "DoPutThenGet".
> >>
> >> This is a format change and would require a vote. I've decided to
> >> table the other format change I had proposed (on DoPut), as it doesn't
> >> functionally change Flight, just the interpretation of the semantics.
> >>
> >> Thanks,
> >> David
> >>
> >
> > --
> >
> > Ryan Murray | Principal Consulting Engineer
> >
> > +447540852009 | rym...@dremio.com
> >
> > <https://www.dremio.com/>
> > Check out our GitHub <https://www.github.com/dremio>, join our community
> > site <https://community.dremio.com/> & Download Dremio
> > <https://www.dremio.com/download>
>

--
Ryan Murray | Principal Consulting Engineer