+1 (non googler)

big help for transparency and for future runners.

Best,
Kai

On Thu, Sep 13, 2018, 11:45 Xinyu Liu <xinyuliu...@gmail.com> wrote:

> Big +1 (non-googler).
>
> From Samza Runner's perspective, we are very happy to see dataflow worker
> code so we can learn and compete :).
>
> Thanks,
> Xinyu
>
> On Thu, Sep 13, 2018 at 11:34 AM Suneel Marthi <suneel.mar...@gmail.com>
> wrote:
>
>> +1 (non-googler)
>>
>> This is a great 👍 move
>>
>> Sent from my iPhone
>>
>> On Sep 13, 2018, at 2:25 PM, Tim Robertson <timrobertson...@gmail.com>
>> wrote:
>>
>> +1 (non googler)
>> It sounds pragmatic, helps with transparency should issues arise and
>> enables more people to fix.
>>
>>
>> On Thu, Sep 13, 2018 at 8:15 PM Dan Halperin <dhalp...@apache.org> wrote:
>>
>>> From my perspective as a (non-Google) community member, huge +1.
>>>
>>> I don't see anything bad for the community about open sourcing more of
>>> the probably-most-used runner. While the DirectRunner is probably still the
>>> most referential implementation of Beam, can't hurt to see more working
>>> code. Other runners or runner implementors can refer to this code if they
>>> want, and ignore it if they don't.
>>>
>>> In terms of having more code and tests to support, well, that's par for
>>> the course. Will this change make the things that need to be done to
>>> support them more obvious? (E.g., "this PR is blocked because someone at
>>> Google on Dataflow team has to fix something" vs "this PR is blocked
>>> because the Apache Beam code in foo/bar/baz is failing, and anyone who can
>>> see the code can fix it"). The latter seems like a clear win for the
>>> community.
>>>
>>> (As long as the code donation is handled properly, but that's completely
>>> orthogonal and I have no reason to think it wouldn't be.)
>>>
>>> Thanks,
>>> Dan
>>>
>>> On Thu, Sep 13, 2018 at 11:06 AM Lukasz Cwik <lc...@google.com> wrote:
>>>
>>>> Yes, I'm specifically asking the community for opinions as to whether
>>>> it should be accepted or not.
>>>>
>>>> On Thu, Sep 13, 2018 at 10:51 AM Raghu Angadi <rang...@google.com>
>>>> wrote:
>>>>
>>>>> This is terrific!
>>>>>
>>>>> Is this thread asking for opinions from the community about whether it
>>>>> should be accepted? Assuming the Google-side decision is made to
>>>>> contribute, big +1 from me to include it next to the other runners.
>>>>>
>>>>> On Thu, Sep 13, 2018 at 10:38 AM Lukasz Cwik <lc...@google.com> wrote:
>>>>>
>>>>>> At Google we have been importing the Apache Beam code base and
>>>>>> integrating it with the Google portion of the codebase that supports the
>>>>>> Dataflow worker. This process is painful as we regularly are making
>>>>>> breaking API changes to support libraries related to running portable
>>>>>> pipelines (and sometimes in other places as well). This has made it
>>>>>> sometimes difficult for PRs to make changes without either breaking
>>>>>> something for Google or waiting for a Googler to make the change
>>>>>> internally (e.g. dependency updates).
>>>>>>
>>>>>> This code is very similar to the other integrations that exist for
>>>>>> runners such as Flink/Spark/Apex/Samza. It is an adaptation layer that sits
>>>>>> on top of an execution engine. There is no super secret awesome stuff as
>>>>>> this code was already publicly visible in the past when it was part of
>>>>>> the Google Cloud Dataflow github repo[1].
>>>>>>
>>>>>> Process wise the code will need to get approval from Google to be
>>>>>> donated and for it to go through the code donation process but before we
>>>>>> attempt to do that, I was wondering whether the community would object to
>>>>>> adding this code to the master branch?
>>>>>>
>>>>>> The upside is that people can make breaking changes and fix them for
>>>>>> all runners. It will also help Googlers contribute more to the
>>>>>> portability story, as it will remove the burden of doing the code
>>>>>> import (wasted time) and allow people to develop in master (with the
>>>>>> whole project loaded in a single IDE).
>>>>>>
>>>>>> The downsides are that this will represent more code and unit tests
>>>>>> to support.
>>>>>>
>>>>>> 1:
>>>>>> https://github.com/GoogleCloudPlatform/DataflowJavaSDK/tree/hotfix_v1.2/sdk/src/main/java/com/google/cloud/dataflow/sdk/runners/worker
>>>>>>
>>>>>
