Awesome!
I do not know whether one exists (or whether you would have time to explain
it here), but would you have a link that describes the design (or the code)
behind this, especially with multiple ackers?

In any case, many thanks for the explanation.

On Wed, May 18, 2016 at 11:45 AM, Arun Mahadevan <ar...@apache.org> wrote:

> Hi Olivier,
>
> > Are you talking about the $checkpoint spout or MySpout (with the
> offset)?
>
> I was referring to the user spout (MySpout in this case).
>
> > Does it mean all the emitted tuples are acked only when the
> $checkpoint.txId event is acked (and so $checkpoint.txId acts as a barrier)?
> Which means that when tuples are acked (in MySpout), I am sure a state has
> been checkpointed.
>
> Yes, that is right. So when tuples are acked in MySpout, you can move your
> offsets.
>
> > Does it mean my checkpoint interval must be lower than the tuple timeout
> (TOPOLOGY_MESSAGE_TIMEOUT_SECS)?
>
> Right, if you change the defaults, the checkpoint interval should be lower
> than the message timeout. The default checkpoint interval is 1s and the
> default message timeout is 30s.
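
A minimal sketch of the constraint described above, for anyone who changes the defaults. It does not use Storm itself: the class and method names are hypothetical, and the two key strings are assumed to be the values behind Storm's `Config.TOPOLOGY_STATE_CHECKPOINT_INTERVAL` and `Config.TOPOLOGY_MESSAGE_TIMEOUT_SECS`. The fallback values are the 1s and 30s defaults quoted in this thread.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper, not part of Storm: checks that the checkpoint
// interval is strictly lower than the tuple (message) timeout.
class CheckpointIntervalCheck {
    // Assumed key strings for the two Storm settings discussed in the thread.
    static final String CHECKPOINT_INTERVAL_MS = "topology.state.checkpoint.interval.ms";
    static final String MESSAGE_TIMEOUT_SECS = "topology.message.timeout.secs";

    static boolean isCheckpointIntervalSafe(Map<String, Object> conf) {
        // Fall back to the defaults mentioned in the thread: 1s and 30s.
        long intervalMs = ((Number) conf.getOrDefault(CHECKPOINT_INTERVAL_MS, 1000)).longValue();
        long timeoutMs = ((Number) conf.getOrDefault(MESSAGE_TIMEOUT_SECS, 30)).longValue() * 1000L;
        return intervalMs < timeoutMs;
    }

    public static void main(String[] args) {
        Map<String, Object> conf = new HashMap<>();
        conf.put(CHECKPOINT_INTERVAL_MS, 5000); // checkpoint every 5s
        conf.put(MESSAGE_TIMEOUT_SECS, 60);     // tuples time out after 60s
        System.out.println(isCheckpointIntervalSafe(conf));
    }
}
```

If the check fails, tuples could time out and be replayed before the checkpoint that acks them completes.
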
>
> Thanks,
> Arun
>
>
> From: Olivier Mallassi
> Reply-To: "user@storm.apache.org"
> Date: Wednesday, May 18, 2016 at 12:57 AM
> To: "user@storm.apache.org"
> Subject: Re: State Checkpointing & spout state
>
> Hi Arun,
>
> Thank you for your answer.
> I may be able to deal with "at least once" using idempotency and a stateful
> bolt (I still need to look at it in detail), but being able to checkpoint
> the state of the spout would be really helpful ;)
>
> Anyway, I may have missed something in the doc, but I just need to clarify
> your phrase "It checkpoints the states of all the bolts and once that’s
> successful, the tuples emitted by the spout are acked".
>
> Are you talking about the $checkpoint spout or MySpout (with the offset)?
> Does it mean all the emitted tuples are acked only when the
> $checkpoint.txId event is acked (and so $checkpoint.txId acts as a barrier)?
> Which means that when tuples are acked (in MySpout), I am sure a state has
> been checkpointed.
> Does it mean my checkpoint interval must be lower than the tuple timeout
> (TOPOLOGY_MESSAGE_TIMEOUT_SECS)?
>
> Many thanks for your help.
>
> Olivier.
>
> On Tue, May 17, 2016 at 2:12 PM, Arun Mahadevan <ar...@apache.org> wrote:
>
>> Hi Olivier,
>>
>> State checkpointing currently does not checkpoint the state of the
>> spout. It checkpoints the states of all the bolts and, once that’s
>> successful, the tuples emitted by the spout are acked. So currently it
>> provides an at-least-once guarantee.
>>
>> In the ack method of the spout, you can update your offsets.
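
One way to sketch the "update your offsets in ack" advice above. It is a plain-Java illustration with no Storm dependency: `OffsetTracker` and its methods are hypothetical names, and the sketch assumes acks may arrive in any order, which this thread does not confirm either way. The tracker would be called from the spout's `nextTuple` and `ack` callbacks, and it never advances the committable offset past a tuple that is still unacked.

```java
import java.util.TreeSet;

// Hypothetical helper, not part of Storm: tracks emitted and acked offsets
// and exposes the offset from which a restarted spout can safely resume.
class OffsetTracker {
    private final TreeSet<Long> pending = new TreeSet<>(); // emitted, not yet acked
    private long highestEmitted;
    private long committable; // every offset below this one has been acked

    OffsetTracker(long startOffset) {
        this.highestEmitted = startOffset - 1;
        this.committable = startOffset;
    }

    // Call when the spout emits the tuple read at this offset.
    void emitted(long offset) {
        pending.add(offset);
        highestEmitted = Math.max(highestEmitted, offset);
    }

    // Call from the spout's ack callback. Failed tuples are simply left
    // pending, so they keep holding the committable offset back until replayed.
    void acked(long offset) {
        pending.remove(offset);
        committable = pending.isEmpty() ? highestEmitted + 1 : pending.first();
    }

    long committableOffset() {
        return committable;
    }

    public static void main(String[] args) {
        OffsetTracker tracker = new OffsetTracker(0);
        tracker.emitted(0);
        tracker.emitted(1);
        tracker.emitted(2);
        tracker.acked(1); // out-of-order ack: offset 0 is still pending
        System.out.println(tracker.committableOffset());
        tracker.acked(0);
        System.out.println(tracker.committableOffset());
    }
}
```

Because acks only happen after a successful checkpoint, resuming from the committable offset replays at most the tuples whose effects were not yet checkpointed, which matches the at-least-once guarantee described above.
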
>>
>> In the future we will extend state checkpointing to checkpoint the state
>> of the spout.
>>
>> Thanks,
>> Arun
>>
>>
>> From: Olivier Mallassi
>> Reply-To: "user@storm.apache.org"
>> Date: Tuesday, May 17, 2016 at 5:29 PM
>> To: "user@storm.apache.org"
>> Subject: State Checkpointing & spout state
>>
>> Hello
>>
>> I need to use state checkpointing for recovery (btw, a very useful
>> feature). I am facing an issue regarding how to checkpoint the state of
>> my spout (not the checkpoint spout) as part of the "transaction".
>>
>> My spout reads from Kafka (or equivalent) and so keeps an offset of
>> the last read events.
>> It keeps track of
>> - the last read offset
>> - the emitted and acknowledged events (with their associated offsets)
>> - the emitted and unacked events (so they can be replayed)
>>
>> With state checkpointing, the bolt states will be kept, but how can I keep
>> the state of the source? How can I ensure the spout replays events from
>> the offset that matches the checkpoint (or txid)?
>> Are there any guarantees in Storm that the acks are received in the order
>> they are sent?
>>
>> Cheers.
>>
>> olivier.
>>
>
>
