If it is usable by itself without Google karma (can you use a worker
without Dataflow itself?), it sounds awesome; otherwise it sounds weird, IMHO.

On Thu, Sep 13, 2018, 21:36, Kai Jiang <[email protected]> wrote:

> +1 (non googler)
>
> big help for transparency and for future runners.
>
> Best,
> Kai
>
> On Thu, Sep 13, 2018, 11:45 Xinyu Liu <[email protected]> wrote:
>
>> Big +1 (non-googler).
>>
>> From the Samza Runner's perspective, we are very happy to see the Dataflow
>> worker code so we can learn and compete :).
>>
>> Thanks,
>> Xinyu
>>
>> On Thu, Sep 13, 2018 at 11:34 AM Suneel Marthi <[email protected]>
>> wrote:
>>
>>> +1 (non-googler)
>>>
>>> This is a great 👍 move
>>>
>>> Sent from my iPhone
>>>
>>> On Sep 13, 2018, at 2:25 PM, Tim Robertson <[email protected]>
>>> wrote:
>>>
>>> +1 (non googler)
>>> It sounds pragmatic, helps with transparency should issues arise, and
>>> enables more people to fix them.
>>>
>>>
>>> On Thu, Sep 13, 2018 at 8:15 PM Dan Halperin <[email protected]>
>>> wrote:
>>>
>>>> From my perspective as a (non-Google) community member, huge +1.
>>>>
>>>> I don't see anything bad for the community about open sourcing more of
>>>> the probably-most-used runner. While the DirectRunner is probably still the
>>>> most referential implementation of Beam, can't hurt to see more working
>>>> code. Other runners or runner implementors can refer to this code if they
>>>> want, and ignore it if they don't.
>>>>
>>>> In terms of having more code and tests to support, well, that's par for
>>>> the course. Will this change make the things that need to be done to
>>>> support them more obvious? (E.g., "this PR is blocked because someone at
>>>> Google on Dataflow team has to fix something" vs "this PR is blocked
>>>> because the Apache Beam code in foo/bar/baz is failing, and anyone who can
>>>> see the code can fix it"). The latter seems like a clear win for the
>>>> community.
>>>>
>>>> (As long as the code donation is handled properly, but that's
>>>> completely orthogonal and I have no reason to think it wouldn't be.)
>>>>
>>>> Thanks,
>>>> Dan
>>>>
>>>> On Thu, Sep 13, 2018 at 11:06 AM Lukasz Cwik <[email protected]> wrote:
>>>>
>>>>> Yes, I'm specifically asking the community for opinions as to whether
>>>>> it should be accepted or not.
>>>>>
>>>>> On Thu, Sep 13, 2018 at 10:51 AM Raghu Angadi <[email protected]>
>>>>> wrote:
>>>>>
>>>>>> This is terrific!
>>>>>>
>>>>>> Is this thread asking for opinions from the community about whether it
>>>>>> should be accepted? Assuming the Google-side decision is made to
>>>>>> contribute, big +1 from me to include it next to the other runners.
>>>>>>
>>>>>> On Thu, Sep 13, 2018 at 10:38 AM Lukasz Cwik <[email protected]>
>>>>>> wrote:
>>>>>>
>>>>>>> At Google we have been importing the Apache Beam codebase and
>>>>>>> integrating it with the Google portion of the codebase that supports the
>>>>>>> Dataflow worker. This process is painful because we regularly make
>>>>>>> breaking API changes to support libraries related to running portable
>>>>>>> pipelines (and sometimes in other places as well). This has sometimes
>>>>>>> made it difficult for PRs to land changes without either breaking
>>>>>>> something for Google or waiting for a Googler to make the change
>>>>>>> internally (e.g. dependency updates).
>>>>>>>
>>>>>>> This code is very similar to the other integrations that exist for
>>>>>>> runners such as Flink/Spark/Apex/Samza. It is an adaptation layer that
>>>>>>> sits on top of an execution engine. There is no super secret awesome
>>>>>>> stuff, as this code was already publicly visible in the past when it was
>>>>>>> part of the Google Cloud Dataflow GitHub repo [1].
>>>>>>>
>>>>>>> Process-wise, the code will need approval from Google to be donated
>>>>>>> and will have to go through the code donation process. Before we
>>>>>>> attempt that, I was wondering whether the community would object to
>>>>>>> adding this code to the master branch.
>>>>>>>
>>>>>>> The upside is that people can make breaking changes and fix them for
>>>>>>> all runners. It will also help Googlers contribute more to the
>>>>>>> portability story, as it will remove the burden of doing the code import
>>>>>>> (wasted time), and it will allow people to develop in master (with the
>>>>>>> whole project loaded in a single IDE).
>>>>>>>
>>>>>>> The downside is that this will represent more code and unit tests
>>>>>>> to support.
>>>>>>>
>>>>>>> 1:
>>>>>>> https://github.com/GoogleCloudPlatform/DataflowJavaSDK/tree/hotfix_v1.2/sdk/src/main/java/com/google/cloud/dataflow/sdk/runners/worker
>>>>>>>
>>>>>>
