There have been multiple cases where people changed Beam and ended up
breaking the Dataflow runner because that code lived in a private
repository. I believe that putting the Dataflow runner code in the public
repository will make it easier to make changes to Apache Beam.

Reuven

On Thu, Sep 13, 2018 at 10:38 AM Lukasz Cwik <lc...@google.com> wrote:

> At Google we have been importing the Apache Beam code base and integrating
> it with the Google portion of the codebase that supports the Dataflow
> worker. This process is painful as we regularly make breaking API
> changes to support libraries related to running portable pipelines (and
> sometimes in other places as well). This has sometimes made it difficult
> for PRs to land changes without either breaking something for Google
> or waiting for a Googler to make the change internally (e.g. dependency
> updates).
>
> This code is very similar to the other integrations that exist for runners
> such as Flink/Spark/Apex/Samza. It is an adaptation layer that sits on top
> of an execution engine. There is no super secret awesome stuff as this code
> was already publicly visible in the past when it was part of the Google
> Cloud Dataflow GitHub repo [1].
>
> Process-wise, the code will need approval from Google to be donated and
> will have to go through the code donation process, but before we attempt
> that, I was wondering whether the community would object to adding this
> code to the master branch?
>
> The upside is that people can make breaking changes and fix them for all
> runners. It will also help Googlers contribute more to the portability
> story, as it will remove the burden of doing the code import (wasted time)
> and allow people to develop in master (with the whole project loaded in a
> single IDE).
>
> The downside is that this will mean more code and unit tests to
> support.
>
> 1:
> https://github.com/GoogleCloudPlatform/DataflowJavaSDK/tree/hotfix_v1.2/sdk/src/main/java/com/google/cloud/dataflow/sdk/runners/worker
>
