Thanks for all the pointers. I finally got around to implementing a POC based on gRPC, and I am super happy with it so far. It has all the modern support we need in Airflow and seems performant enough for our use case.

J.

On Thu, Feb 17, 2022 at 6:52 PM Kenneth Knowles <[email protected]> wrote:
>
> Another TL;DR that may not be covered in the history is that we initially set
> out with a couple of goals that have since been abandoned:
>
> 1. Allow Beam to be used in a particular language/ecosystem without a
> dependency on the portability framework (NO - we want everything to use the
> portability framework)
> 2. Allow Beam's portable model to be independent of transport (NO - using
> protobuf for the messages it really only makes sense to use protobuf + gRPC
> for transport)
> 2a. Potentially allow Beam's portable model to be represented in multiple
> serialization formats (NO - there are enough impedance mismatches that it is
> just not worthwhile, even though proto has lots of problems at least we can
> develop workarounds only once)
>
> We never did develop with anything other than proto+gRPC in mind.
>
> Kenn
>
> On Thu, Feb 17, 2022 at 4:55 AM Jarek Potiuk <[email protected]> wrote:
>>
>> Thank you ! I will dive deeper - but having just those pointers is a good
>> start (I likely mixed up gRPC - Thrift bridges with replacing of Thrift
>> Luke!)
>>
>> On Thu, Feb 17, 2022 at 5:28 AM Kenneth Knowles <[email protected]> wrote:
>>>
>>> I can find you that fun mailing list pointer, if you like. Here's a
>>> starting point with the subject "[DISCUSS] Beam data plane serialization
>>> tech"
>>>
>>> https://lists.apache.org/thread/dz24chmm18skzgcmxl2jxookd3yn79r1
>>>
>>> Kenn
>>>
>>> On Wed, Feb 16, 2022 at 10:23 AM Luke Cwik <[email protected]> wrote:
>>>>
>>>> Apache Beam never had an RPC layer for the internal workings of the
>>>> project until the portability project[1] started so there never was a
>>>> transition from Apache Thrift to gRPC.
>>>>
>>>> Generally the support for HTTP2 and long lived streaming connections were
>>>> the key differentiators for gRPC.
>>>>
>>>> 1: https://beam.apache.org/roadmap/portability/
>>>>
>>>> On Wed, Feb 16, 2022 at 2:38 AM Jarek Potiuk <[email protected]> wrote:
>>>>>
>>>>> Hello Beam friends,
>>>>>
>>>>> I have a question, we are preparing (as part of
>>>>> https://cwiki.apache.org/confluence/display/AIRFLOW/AIP-44+Airflow+Internal+API)
>>>>> to split Airflow into more components which will be communicating using
>>>>> RPC.
>>>>>
>>>>> Basically we need to extract some of the internal methods into a "remote
>>>>> procedure calls" which then we would like to be able to call either
>>>>> "really remotely" (over HTTPS) or locally (via local TCP/Unix domain
>>>>> sockets).
>>>>>
>>>>> I have narrowed down the options we have to Apache Thrift and gRPC. I
>>>>> know that Apache Beam was (is ?) in a transition period Thrift -> GRPC
>>>>> and I am sure you have some experiences to share and (following your
>>>>> mailing lists) I am sure there was a deep analysis done for those two
>>>>> before you decided to switch.
>>>>>
>>>>> Before I start searching through your mailing list, maybe someone knows a
>>>>> document or some summary of the two that you could share with us - that
>>>>> probably could save us a lot of effort deciding which of those two might
>>>>> be better for our needs.
>>>>>
>>>>> Is there something that you know of easily that can be shared?
>>>>>
>>>>> J,
>>>>>
