I think there is a difference:

- If a failure occurs after finishBundle() but before the consumption is
committed, the bundle may be reprocessed, which leads to duplicate
calls to processElement() and finishBundle().
- If a failure occurs after the consumption is committed but before
finishBundle() runs, then any elements that were buffered in the DoFn
are lost: their side effects were never carried out, because
finishBundle() was responsible for that, and the runner will not
replay them since it already considers them consumed.
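
To make the two cases concrete, here is a toy simulation in plain Python.
It is not Beam code and does not claim to reflect how any real runner
orders these steps; BufferingFn, run_bundle, commit_first and
crash_between are all hypothetical names invented for this sketch. It
only assumes a DoFn-like object that buffers in process_element() and
flushes its side effects in finish_bundle(), and a runner that retries a
bundle until it has been committed.

```python
class BufferingFn:
    """Toy stand-in for a DoFn that buffers elements and flushes on finish."""

    def __init__(self, sink):
        self.sink = sink      # external side-effect target, survives crashes
        self.buffer = []      # in-memory state, lost when the instance dies

    def process_element(self, element):
        self.buffer.append(element)

    def finish_bundle(self):
        self.sink.extend(self.buffer)
        self.buffer = []


class SimulatedCrash(Exception):
    """Hypothetical marker for a worker failure."""


def run_bundle(bundle, commit_first, crash_between):
    """Process one bundle, retrying until it is committed.

    commit_first:  commit consumption before finish_bundle() if True,
                   after it if False.
    crash_between: crash once, on the first attempt, between the two steps.
    Returns the contents of the external sink.
    """
    sink = []
    committed = False
    first_attempt = True
    while not committed:
        fn = BufferingFn(sink)  # fresh instance: buffered state does not survive
        crash_now = crash_between and first_attempt
        first_attempt = False
        try:
            for element in bundle:
                fn.process_element(element)
            if commit_first:
                committed = True
                if crash_now:
                    raise SimulatedCrash()
                fn.finish_bundle()
            else:
                fn.finish_bundle()
                if crash_now:
                    raise SimulatedCrash()
                committed = True
        except SimulatedCrash:
            continue  # retried only if the bundle was not committed
    return sink
```

With the bundle ["a", "b"], crashing after finish_bundle() but before the
commit leaves ["a", "b", "a", "b"] in the sink (duplicates), while
crashing after the commit but before finish_bundle() leaves the sink
empty (lost elements).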



On Wed, Jun 8, 2016 at 10:09 AM Raghu Angadi <rang...@google.com.invalid>
wrote:

> On Wed, Jun 8, 2016 at 10:05 AM, Raghu Angadi <rang...@google.com> wrote:
> >
> > I thought finishBundle() exists simply as a best-effort indication from
> > the runner to the user that some chunk of records has been processed.
>
> also to help with DoFn's own clean up if there is any.
>
