>1- I am unable to get how it guarantees that "the state saved at A and B
>represents the state that's the result of all the tuples that arrived from the
>spout before C1" because ordering is not guaranteed while transfer from
>upstream bolt to downstream bolt.

Ordering is preserved between a pair of tasks. If a task emits tuples and then emits a checkpoint tuple to another task (via the checkpoint stream), the destination task receives them in the same order. Check this thread - https://groups.google.com/forum/?fromgroups=#!searchin/storm-user/tuple$20order/storm-user/4ptD6NG3RIY/BPuyDOK7sU4J

>2- I have one more query: is there any way to decide the time difference
>between the PREPARE and COMMIT actions in the $CHKPT message? As per my
>understanding, for now, it depends on the call to the next-tuple logic.
>(Please correct me if I am wrong.)

I am not sure I understand your requirement correctly. You can only set the interval at which the checkpoints are taken (via topology.state.checkpoint.interval.ms). The commit is triggered once the prepare is successful (i.e. all tasks have successfully prepared their state for committing).

>3- Also, does the checkpoint spout have all grouping so that every task of
>the downstream one gets a copy of the $CHKPT message? (Nothing like that is
>observed in the TopologyBuilder code.)

Yes, we use all grouping to send tuples via the checkpoint stream. Here's the piece of code: https://github.com/apache/storm/blob/master/storm-core/src/jvm/org/apache/storm/topology/TopologyBuilder.java#L425

Arun

From: anshu shukla <anshushuk...@gmail.com>
Reply-To: "user@storm.apache.org" <user@storm.apache.org>
Date: Tuesday, March 28, 2017 at 3:14 PM
To: "user@storm.apache.org" <user@storm.apache.org>
Subject: Re: Delay in CHKPT message for stateful task

Hello Arun,

Thanks for the nice explanation. But I have a few doubts:

1- I am unable to get how it guarantees that "the state saved at A and B represents the state that's the result of all the tuples that arrived from the spout before C1", because ordering is not guaranteed while transferring from an upstream bolt to a downstream bolt.
Also, I am unable to find the code logic that confirms the above (in the CheckpointSpout and CheckpointTupleForwarder code).

2- I have one more query: is there any way to decide the time difference between the PREPARE and COMMIT actions in the $CHKPT message? As per my understanding, for now, it depends on the call to the next-tuple logic. (Please correct me if I am wrong.)

3- Also, does the checkpoint spout have all grouping so that every task of the downstream one gets a copy of the $CHKPT message? (Nothing like that is observed in the TopologyBuilder code.)

On Sat, Mar 25, 2017 at 2:42 PM, Arun Mahadevan <ar...@apache.org> wrote:

The checkpoint tuples have to go through the same queue, following the tuples emitted before them, to make the state consistent across the bolts.

When bolt A receives a checkpoint (say C1 from the spout), it saves its state (the result of processing the tuples up to C1) and emits C1 to the next bolt, say bolt B. Now bolt B processes all the previous tuples emitted by A before it processes C1, so that the state saved at A and B represents the state that's the result of all the tuples that arrived from the spout before C1.

Thanks,
Arun

From: anshu shukla <anshushuk...@gmail.com>
Reply-To: "user@storm.apache.org" <user@storm.apache.org>
Date: Saturday, March 25, 2017 at 1:39 PM
To: "user@storm.apache.org" <user@storm.apache.org>
Subject: Delay in CHKPT message for stateful task

Hello,

I was worried that, since $CHKPT messages go through the same (disruptor) queues as the actual tuples, in a large topology with many in/out queues it will take a long time for the $CHKPT tuple to reach the last stateful bolt in the topology. So is there any way to give priority to the $CHKPT message so that it passes through quickly, and so that even in case of congestion we can have safe checkpointing?

--
Thanks & Regards,
Anshu Shukla

--
Thanks & Regards,
Anshu Shukla
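To make the ordering argument from this thread concrete, here is a minimal stand-alone simulation (plain Python, not Storm code; the tuple values, the summing "state", and the `$chkpt` marker are all illustrative) of a spout feeding bolt A feeding bolt B over per-link FIFO queues. Because each link delivers messages in order, the checkpoint barrier C1 only reaches each bolt after every pre-C1 tuple, so the snapshots taken at A and B describe the same prefix of the stream:

```python
CHECKPOINT = "$chkpt"  # illustrative stand-in for Storm's checkpoint tuple


def run_bolt(inbox, outbox):
    """Drain a FIFO inbox; snapshot state when the checkpoint barrier arrives."""
    state, saved = 0, None
    for msg in inbox:
        if msg == CHECKPOINT:
            saved = state              # snapshot covers all tuples received so far
            outbox.append(CHECKPOINT)  # forward the barrier downstream, after them
        else:
            state += msg               # "process" the tuple (a running sum as toy state)
            outbox.append(msg * 2)     # emit a derived tuple downstream
    return saved


# The spout emits three tuples, then the checkpoint C1, over one FIFO link.
spout_out = [1, 2, 3, CHECKPOINT]

a_out = []
saved_at_a = run_bolt(spout_out, a_out)  # A snapshots after seeing 1, 2, 3

b_out = []
saved_at_b = run_bolt(a_out, b_out)      # B sees all of A's tuples before C1

# A's snapshot reflects exactly the pre-C1 spout tuples ...
assert saved_at_a == 1 + 2 + 3
# ... and B's snapshot reflects exactly the tuples A derived from them.
assert saved_at_b == 2 + 4 + 6
print("states at A and B are consistent with everything before C1")
```

If the links could reorder messages (say, tuple 3 arriving at B after C1), B's snapshot would miss work that A's snapshot already included, which is exactly the inconsistency the FIFO guarantee rules out.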