Your assumption is correct: if a write fails partway through a transaction, the retry can write some elements again, so duplicates will occur in a failure scenario.
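
To illustrate, here is a minimal sketch of how retrying a whole batch after a partial write leads to duplicates. It is a toy simulation in Python, not the actual sink code: the names FlakySink, deliver, fail_after and max_attempts are hypothetical, and a plain list stands in for the output file.

```python
class FlakySink:
    """Toy sink (hypothetical) that fails once, partway through a batch."""

    def __init__(self, fail_after):
        self.output = []              # stands in for the output file
        self.fail_after = fail_after  # writes that succeed before the failure
        self.failed = False

    def write(self, event):
        # Fail exactly once, after `fail_after` events have been written.
        if not self.failed and len(self.output) == self.fail_after:
            self.failed = True
            raise IOError("simulated write failure")
        self.output.append(event)


def deliver(batch, sink, max_attempts=3):
    """Retry the whole batch until every event is written (at-least-once)."""
    for _ in range(max_attempts):
        try:
            for event in batch:
                sink.write(event)
            return  # "commit": the batch is considered delivered
        except IOError:
            # "rollback": the batch is kept and retried from the start,
            # but events already written to the output are not undone.
            continue
    raise RuntimeError("batch could not be delivered")


sink = FlakySink(fail_after=2)
deliver(["a", "b", "c", "d"], sink)
print(sink.output)  # ['a', 'b', 'a', 'b', 'c', 'd']
```

Every element arrives at least once, but "a" and "b" appear twice because the first attempt had already written them before it failed. Getting rid of the duplicates would need something extra, such as deduplicating downstream or making the write idempotent.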

Thanks,
Rufus

On Tue, Sep 8, 2015 at 4:10 AM, Aljoscha Krettek <[email protected]>
wrote:

> Hi,
> as I understand it the HDFS sink uses the transaction system to verify
> that all the elements in a transaction are written. This is what I would
> call at-least-once semantics.
>
> My question is now what happens if the writing fails in the middle of
> writing the elements in the transaction. When the transaction is retried
> some of the elements might be written again, i.e. the output contains
> duplicates. Is this assumption correct or is there something in place that
> prevents this from happening?
>
> Thanks for your time,
> Aljoscha
>
