Another TL;DR that may not be covered in the history is that we initially set out with a couple of goals that have since been abandoned:
1. Allow Beam to be used in a particular language/ecosystem without a dependency on the portability framework (NO - we want everything to use the portability framework)
2. Allow Beam's portable model to be independent of transport (NO - using protobuf for the messages it really only makes sense to use protobuf + gRPC for transport)
2a. Potentially allow Beam's portable model to be represented in multiple serialization formats (NO - there are enough impedance mismatches that it is just not worthwhile, even though proto has lots of problems at least we can develop workarounds only once)

We never did develop with anything other than proto+gRPC in mind.

Kenn

On Thu, Feb 17, 2022 at 4:55 AM Jarek Potiuk <[email protected]> wrote:

> Thank you ! I will dive deeper - but having just those pointers is a good
> start (I likely mixed up gRPC - Thrift bridges with replacing of Thrift
> Luke!)
>
> On Thu, Feb 17, 2022 at 5:28 AM Kenneth Knowles <[email protected]> wrote:
>
>> I can find you that fun mailing list pointer, if you like. Here's a
>> starting point with the subject "[DISCUSS] Beam data plane serialization
>> tech"
>>
>> https://lists.apache.org/thread/dz24chmm18skzgcmxl2jxookd3yn79r1
>>
>> Kenn
>>
>> On Wed, Feb 16, 2022 at 10:23 AM Luke Cwik <[email protected]> wrote:
>>
>>> Apache Beam never had an RPC layer for the internal workings of the
>>> project until the portability project[1] started so there never was a
>>> transition from Apache Thrift to gRPC.
>>>
>>> Generally the support for HTTP2 and long lived streaming connections
>>> were the key differentiators for gRPC.
>>>
>>> 1: https://beam.apache.org/roadmap/portability/
>>>
>>> On Wed, Feb 16, 2022 at 2:38 AM Jarek Potiuk <[email protected]> wrote:
>>>
>>>> Hello Beam friends,
>>>>
>>>> I have a question, we are preparing (as part of
>>>> https://cwiki.apache.org/confluence/display/AIRFLOW/AIP-44+Airflow+Internal+API)
>>>> to split Airflow into more components which will be communicating using
>>>> RPC.
>>>>
>>>> Basically we need to extract some of the internal methods into a
>>>> "remote procedure calls" which then we would like to be able to call
>>>> either "really remotely" (over HTTPS) or locally (via local TCP/Unix domain
>>>> sockets).
>>>>
>>>> I have narrowed down the options we have to Apache Thrift and gRPC. I
>>>> know that Apache Beam was (is ?) in a transition period Thrift -> GRPC and
>>>> I am sure you have some experiences to share and (following your mailing
>>>> lists) I am sure there was a deep analysis done for those two before
>>>> you decided to switch.
>>>>
>>>> Before I start searching through your mailing list, maybe someone knows
>>>> a document or some summary of the two that you could share with us - that
>>>> probably could save us a lot of effort deciding which of those two might be
>>>> better for our needs.
>>>>
>>>> Is there something that you know of easily that can be shared?
>>>>
>>>> J,
