On Fri, May 18, 2018 at 11:46 AM Raghu Angadi <rang...@google.com> wrote:
> Thanks Kenn.
>
> On Fri, May 18, 2018 at 11:02 AM Kenneth Knowles <k...@google.com> wrote:
>
>> The fact that its usage has grown probably indicates that we have a large
>> number of transforms that can easily cause data loss / duplication.
>>
>
> Is this specific to Reshuffle or it is true for any GroupByKey? I see
> Reshuffle as just a wrapper around GBK.
>

The issue is when it's used in such a way that data corruption can occur when the underlying GBK output is not stable.
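
To make the failure mode concrete, here is a rough sketch in plain Python. It is not Beam code and does not model any particular runner. All names (`run_bundle`, `write_idempotent`, `simulate`) are hypothetical, and the "checkpoint" only stands in for what a stable GBK output would give you. The assumption is a non-deterministic step that tags elements with fresh ids, followed by a sink that deduplicates on those ids and is retried after a partial write.

```python
import itertools


def run_bundle(elements, id_source):
    """Non-deterministic step: tag each element with a fresh id (think random UUID)."""
    return [(next(id_source), element) for element in elements]


def write_idempotent(sink, records):
    """Sink that deduplicates on id. Only safe if a retry sees the same ids."""
    for record_id, value in records:
        sink.setdefault(record_id, value)


def simulate(stable_checkpoint):
    """Run one bundle, fail after a partial write, then retry.

    stable_checkpoint=True  -> the retry replays the persisted (id, value) pairs.
    stable_checkpoint=False -> the retry re-executes the upstream step, so ids change.
    """
    ids = itertools.count()
    sink = {}
    elements = ["a", "b", "c"]

    tagged = run_bundle(elements, ids)
    checkpoint = list(tagged)  # stand-in for a stable GBK output (assumption)

    # First attempt: two records land in the sink, then the worker "fails".
    write_idempotent(sink, tagged[:2])

    # Retry of the whole bundle.
    replay = checkpoint if stable_checkpoint else run_bundle(elements, ids)
    write_idempotent(sink, replay)

    return sorted(sink.values())


if __name__ == "__main__":
    print("stable:  ", simulate(stable_checkpoint=True))
    print("unstable:", simulate(stable_checkpoint=False))
```

With the stable replay the sink's deduplication works, because the retry presents the same ids. With the unstable replay the upstream step runs again, the ids differ, and the partially written records are written a second time.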