Re: Implementation for exactly-once streaming sink

2018-12-06 Thread Eric Wohlstadter
the Sink just needs to ignore the batch if a
> batch Id is present in the datastore.
>
> Also the discussion in the mail thread you posted assumes that each
> intermediate operation is idempotent; otherwise the data generated when the
> batch replays can be different.
>
> On Thu, 6
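
The batch-Id check described in this reply can be sketched roughly as follows. This is only an illustration under stated assumptions, not the DataSourceV2 API: `InMemoryDatastore` and `commit_batch` are hypothetical names, and the in-memory set stands in for a datastore that can record the batch Id atomically with the batch's rows.

```python
class InMemoryDatastore:
    """Hypothetical stand-in for an external datastore (assumption, not a Spark class)."""

    def __init__(self):
        self.rows = []
        self.committed_batch_ids = set()

    def commit_batch(self, batch_id, rows):
        """Write rows for batch_id unless that batch was already committed.

        Returns True if the batch was written, False if it was ignored as a replay.
        A real datastore would need to store the rows and the batch Id in one
        transaction so that a crash cannot leave one without the other.
        """
        if batch_id in self.committed_batch_ids:
            return False  # batch Id already present: ignore the replayed batch
        self.rows.extend(rows)
        self.committed_batch_ids.add(batch_id)
        return True


store = InMemoryDatastore()
first = store.commit_batch(7, ["a", "b"])   # initial attempt is written
replay = store.commit_batch(7, ["a", "b"])  # replay after restart is ignored
```

As the reply notes, skipping a replayed batch is only safe if the replay would have produced the same data, which requires the intermediate operations to be idempotent.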

Re: Implementation for exactly-once streaming sink

2018-12-06 Thread Eric Wohlstadter
f failed tasks. Any failure
> will lead to the query being stopped and it needs to be manually restarted
> from the checkpoint."
>
> BR,
> G
>
>
> On Wed, Dec 5, 2018 at 8:36 PM Eric Wohlstadter
> wrote:
>
>> Hi all,
>> We are working on implementing a str

Implementation for exactly-once streaming sink

2018-12-05 Thread Eric Wohlstadter
Hi all, We are working on implementing a streaming sink on 2.3.1 with the DataSourceV2 APIs. Can anyone help check if my understanding is correct, with respect to the failure modes which need to be covered? We are assuming that a Reliable Receiver (such as Kafka) is used as the stream source. An