Thank you Yi! Please also confirm that all this is done at the exact same 
interval - that is checkpoint interval.

Are the 3 operations you listed done in sequence or in parallel? If sequential, 
then what is the order? Can you also comment on server failure scenarios while 
this is being carried out - that is, a failure results in a subset of these 
operations not completing, in which case the newly spawned Samza container may 
have stale state. How likely is that, have you faced it?

-----Original Message-----
From: Yi Pan [mailto:nickpa...@gmail.com] 
Sent: Wednesday, July 06, 2016 1:10 PM
To: dev@samza.apache.org
Subject: Re: flushing changelog & checkpointing

Hi, Buvana,

Please see answers below.

On Tue, Jul 5, 2016 at 11:47 AM, Ramanan, Buvana (Nokia - US) < 
buvana.rama...@nokia-bell-labs.com> wrote:

>
> Does this mean that all writes to the disk for state store purposes 
> will be done at the checkpointing time (which is also the time Samza 
> checkpoints the incoming stream offsets)? Does this also mean new data 
> to the changelog stream will be emitted at checkpointing time?
>
>
Yes, the commit workflow in Samza guarantees that a) pending writes to on-disk 
KV-stores are flushed; b) all pending writes to output streams are flushed, 
including the writes to changelog streams; c) all input offsets are flushed to 
checkpoint.

Hope the above answers your questions.

-Yi

Reply via email to