On Fri, May 18, 2018 at 12:21 PM Robert Bradshaw <rober...@google.com>
wrote:

> On Fri, May 18, 2018 at 11:46 AM Raghu Angadi <rang...@google.com> wrote:
>
>> Thanks Kenn.
>>
>> On Fri, May 18, 2018 at 11:02 AM Kenneth Knowles <k...@google.com> wrote:
>>
>>> The fact that its usage has grown probably indicates that we have a
>>> large number of transforms that can easily cause data loss / duplication.
>>>
>>
>> Is this specific to Reshuffle, or is it true for any GroupByKey? I see
>> Reshuffle as just a wrapper around GBK.
>>
> The issue is when it's used in such a way that data corruption can occur
> when the underlying GBK output is not stable.
>

Could you describe this breakage in a bit more detail or give an example?
Apologies in advance; I know this came up in multiple contexts in the past,
but I haven't grokked the issue well. Is it the window rewrite that
Reshuffle does that causes misuse of GBK?

Thanks.
