Oneof doesn't actually change the wire encoding; it would just be
application-level logic. (The official guide doesn't even mention it
in the encoding docs; I found
https://stackoverflow.com/questions/52226409/how-protobuf-encodes-oneof-message-construct
as well.)

If I follow you, Jacques, then you are proposing essentially inlining
the definition of Any, e.g.

message FlightMessage {
  oneof message {
    FlightData data = 1;
    FlightAny metadata = 2;
  }
}

message FlightAny {
  string type = 1;
  bytes data = 2;
}

Is this correct?

It might be nice to consider the wrapper message for DoGet/DoPut as
well, but at that point, I'd rather we be consistent with all of them,
rather than have one of the three methods do its own thing.

Thanks,
David

On 10/20/19, Jacques Nadeau <jacq...@apache.org> wrote:
> I think we could probably expose the oneof behavior without exposing the
> protobuf functions. On the any... hmm. I guess we could expose as two
> fields: type and data. Then users could use it for whatever but if people
> wanted to treat it as any, it would work. (Basically a user could use any
> with it easily but they could also use any other mechanism). At least in
> java, the any concepts are pretty simple/diy. Are other language bindings
> less diy?
>
> I'm *not* hardcore against the empty FlightData + metadata but it just
> seemed a bit janky.
>
> Thinking about the control message/wrapper object thing, I wonder if we
> should redefine DoPut and DoGet to have the same property if we think it is
> a good idea...
>
> On Wed, Oct 16, 2019 at 5:13 PM David Li <li.david...@gmail.com> wrote:
>
>> I was definitely considering having control messages without data, and
>> I thought that could be encoded by a FlightData with only app_metadata
>> set. I think I understand your position now: FlightData should always
>> carry (some) data (with optional metadata)?
>>
>> That makes sense to me, and is consistent with the documentation on
>> FlightData in the Protobuf file. I was worried about having a
>> redundant metadata field, but oneof prevents that from happening, and
>> overall having a clear separation between data and control messages is
>> cleaner.
>>
>> As for using Protobuf's Any: so far, we've refrained from exposing
>> Protobuf by using bytes, would we want to change that now?
>>
>> Best,
>> David
>>
>> On 10/16/19, Jacques Nadeau <jacq...@apache.org> wrote:
>> > Hey David,
>> >
>> > RE: Async: I was trying to match the pattern we use for doget/doput for
>> > async. Yes, more thinking java given java grpc's async always pattern.
>> >
>> > On the comment around the FlightData, I think it is overloading the
>> message
>> > to use metadata for this. If I want to send a control message
>> independently
>> > of the data message, I would have to define something like an empty
>> flight
>> > data message that has custom metadata. Why not support a container
>> > object
>> > with a oneof{FlightData, Any} in it instead so users can add more data
>> > as
>> > desired. The default impl could be a noop for the Any messages.
>> >
>> > On Tue, Oct 15, 2019 at 6:50 PM David Li <li.david...@gmail.com> wrote:
>> >
>> >> Hi Jacques,
>> >>
>> >> Thanks for the comments.
>> >>
>> >> - I do agree DoExchange is a better name!
>> >> - FlightData already has metadata fields as a result of prior
>> >> proposals, so I don't think we need a new message to carry that kind
>> >> of information.
>> >> - I like the suggestion of an async handler to handle incoming
>> >> messages as the fundamental API; it would actually be quite natural to
>> >> implement in Flight/Java. I will note that it's not possible in
>> >> C++/Python without spawning a thread, though. (In essence, gRPC-Java
>> >> is async-always and gRPC-C++ is sync-always.) There are experimental
>> >> C++ APIs that would let us do something similar to Java, but those are
>> >> only in relatively recent gRPC versions and are still under
>> >> development (contrary to the interceptor APIs which have been around
>> >> for quite a while).
>> >>
>> >> Thanks,
>> >> David
>> >>
>> >> On 10/15/19, Jacques Nadeau <jacq...@apache.org> wrote:
>> >> > I like it. Added some comments to the doc. Might worth discussion
>> >> > here
>> >> > depending on your thoughts.
>> >> >
>> >> > On Tue, Oct 15, 2019 at 7:11 AM David Li <li.david...@gmail.com>
>> wrote:
>> >> >
>> >> >> Hey Ryan,
>> >> >>
>> >> >> Thanks for the comments.
>> >> >>
>> >> >> Concrete example: I've edited the doc to provide a Python strawman.
>> >> >>
>> >> >> Sync vs async: while I don't touch on it, you could interleave
>> uploads
>> >> >> and downloads if you were so inclined. Right now, synchronous APIs
>> >> >> make this error-prone, e.g. if both client and server wait for each
>> >> >> other due to an application logic bug. (gRPC doesn't give us the
>> >> >> ability to have per-read timeouts, only an overall timeout.) As an
>> >> >> example of this happening with DoPut, see ARROW-6063:
>> >> >> https://issues.apache.org/jira/browse/ARROW-6063
>> >> >>
>> >> >> This is mostly tangential though, eventually we will want to design
>> >> >> asynchronous APIs for Flight as a whole. A bidirectional stream
>> >> >> like
>> >> >> this (and like DoPut) just makes these pitfalls easier to run into.
>> >> >>
>> >> >> Using DoPut+DoGet: I discussed this in the proposal, but the main
>> >> >> concern is that depending on how you deploy, two separate calls
>> >> >> could
>> >> >> get routed to different instances. Additionally, gRPC has some
>> >> >> reconnection behaviors; if the server goes away in between the two
>> >> >> calls, but it then restarts or there is another instance available,
>> >> >> the client will happily reconnect to the new server without
>> >> >> warning.
>> >> >>
>> >> >> Thanks,
>> >> >> David
>> >> >>
>> >> >> On 10/15/19, Ryan Murray <rym...@dremio.com> wrote:
>> >> >> > Hey David,
>> >> >> >
>> >> >> > I think this proposal makes a lot of sense. I like it and the
>> >> >> > possibility
>> >> >> > of remote compute via arrow buffers. One thing that would help me
>> >> would
>> >> >> be
>> >> >> > a concrete example of the API in a real life use case. Also, what
>> >> would
>> >> >> the
>> >> >> > client experience be in terms of sync vs asyc? Would the client
>> >> >> > block
>> >> >> till
>> >> >> > the bidirectional call return ie c = flight.vector_mult(a, b) or
>> >> >> > would
>> >> >> the
>> >> >> > client wait to be signaled that computation was done. If the
>> >> >> > later
>> >> >> > how
>> >> >> > is
>> >> >> > that different from a DoPut then DoGet? I suppose that this could
>> be
>> >> >> > implemented without extending the RPC interface but rather by a
>> >> >> > function/util?
>> >> >> >
>> >> >> >
>> >> >> > Best,
>> >> >> >
>> >> >> > Ryan
>> >> >> >
>> >> >> > On Sun, Oct 13, 2019 at 9:24 PM David Li <li.david...@gmail.com>
>> >> wrote:
>> >> >> >
>> >> >> >> Hi all,
>> >> >> >>
>> >> >> >> We've been using Flight quite successfully so far, but we have
>> >> >> >> identified a new use case on the horizon: being able to both
>> >> >> >> send
>> >> >> >> and
>> >> >> >> retrieve Arrow data within a single RPC call. To that end, I've
>> >> >> >> written up a proposal for a new RPC method:
>> >> >> >>
>> >> >> >>
>> >> >>
>> >>
>> https://docs.google.com/document/d/1Hh-3Z0hK5PxyEYFxwVxp77jens3yAgC_cpp0TGW-dcw/edit?usp=sharing
>> >> >> >>
>> >> >> >> Please let me know if you can't view or comment on the document.
>> >> >> >> I'd
>> >> >> >> appreciate any feedback; I think this is a relatively
>> >> >> >> straightforward
>> >> >> >> addition - it is essentially "DoPutThenGet".
>> >> >> >>
>> >> >> >> This is a format change and would require a vote. I've decided
>> >> >> >> to
>> >> >> >> table the other format change I had proposed (on DoPut), as it
>> >> doesn't
>> >> >> >> functionally change Flight, just the interpretation of the
>> >> >> >> semantics.
>> >> >> >>
>> >> >> >> Thanks,
>> >> >> >> David
>> >> >> >>
>> >> >> >
>> >> >> >
>> >> >> > --
>> >> >> >
>> >> >> > Ryan Murray  | Principal Consulting Engineer
>> >> >> >
>> >> >> > +447540852009 | rym...@dremio.com
>> >> >> >
>> >> >> > <https://www.dremio.com/>
>> >> >> > Check out our GitHub <https://www.github.com/dremio>, join our
>> >> >> > community
>> >> >> > site <https://community.dremio.com/> & Download Dremio
>> >> >> > <https://www.dremio.com/download>
>> >> >> >
>> >> >>
>> >> >
>> >>
>> >
>>
>

Reply via email to