[ 
https://issues.apache.org/jira/browse/STORM-2359?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16045442#comment-16045442
 ] 

Stig Rohde Døssing commented on STORM-2359:
-------------------------------------------

I think the reason it would be tough to move timeouts entirely to the ackers is 
that we'd need to figure out how to deal with message loss between the spout 
and acker when the acker sends a timeout message to the spout. The current 
implementation errs on the side of caution by always reemitting if it can't 
positively say that a tuple has been acked. I'm not sure how we could do the 
same when the acker has to notify the spout to reemit after the timeout, 
because that message could be lost.
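
To make the "errs on the side of caution" behaviour concrete, here is a minimal sketch of spout-side tracking. PendingTracker and its methods are hypothetical names for illustration, not Storm's actual executor code. The point is that no message ever has to arrive for a replay to happen: the absence of an ack by the deadline is enough.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch of spout-side timeout tracking; not Storm's real implementation. */
public class PendingTracker {
    // message id -> deadline (ms) after which the tuple is treated as failed
    private final Map<String, Long> deadlines = new HashMap<>();
    private final long timeoutMs;

    public PendingTracker(long timeoutMs) {
        this.timeoutMs = timeoutMs;
    }

    /** Record a freshly emitted tuple and start its timeout clock. */
    public void emitted(String msgId, long nowMs) {
        deadlines.put(msgId, nowMs + timeoutMs);
    }

    /** Only a positive ack removes a tuple from the pending set. */
    public void acked(String msgId) {
        deadlines.remove(msgId);
    }

    /** Returns the ids to re-emit: every tuple whose deadline passed without an ack. */
    public List<String> expire(long nowMs) {
        List<String> replay = new ArrayList<>();
        deadlines.entrySet().removeIf(e -> {
            if (e.getValue() <= nowMs) {
                replay.add(e.getKey());
                return true; // drop the entry; the caller re-emits and re-registers it
            }
            return false;
        });
        return replay;
    }
}
```

If the timeout lived only in the acker, the replay would depend on a fail message reaching the spout, and there is no equivalent of expire() to fall back on when that message is lost.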

It might be a good idea, as you mention, to instead have two timeouts: a short 
one for the acker and a much longer one for the spout. It would probably mean 
that tuples whose acker init message is lost take much longer to retry than 
tuples lost elsewhere, but it might allow us to keep timeout resets out of 
the spout.
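
A rough sketch of that trade-off follows. The class, enum, and method names are hypothetical, and the 30 s / 300 s values in the usage note are assumptions picked only for illustration. It also assumes the acker's fail notification actually reaches the spout; if that message is lost, the spout's long timeout is still the fallback.

```java
/** Hypothetical model of the two-timeout idea; names and behaviour are assumptions. */
public class TwoTimeouts {
    /** Where the tuple tree lost a message. */
    public enum Loss { ACKER_INIT, DOWNSTREAM }

    /**
     * Approximate delay before the spout retries a tuple.
     * Assumes the acker's fail notification itself is delivered.
     */
    public static long retryDelayMs(Loss loss, long ackerTimeoutMs, long spoutTimeoutMs) {
        if (loss == Loss.ACKER_INIT) {
            // The acker never learned about the tuple, so it cannot time it out.
            // Only the spout's own (long) timeout can trigger the retry.
            return spoutTimeoutMs;
        }
        // The acker is tracking the tuple, so its short timeout fires first
        // and it notifies the spout to fail the tuple.
        return Math.min(ackerTimeoutMs, spoutTimeoutMs);
    }
}
```

With an acker timeout of 30 s and a spout timeout of 300 s, a tuple lost downstream would be retried after about 30 s, while one whose acker init message was lost would wait the full 300 s.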

Tuples can be lost if a worker dies, but what if there's a network issue? Can't 
messages also be lost then?

> Revising Message Timeouts
> -------------------------
>
>                 Key: STORM-2359
>                 URL: https://issues.apache.org/jira/browse/STORM-2359
>             Project: Apache Storm
>          Issue Type: Sub-task
>          Components: storm-core
>    Affects Versions: 2.0.0
>            Reporter: Roshan Naik
>
> A revised strategy for message timeouts is proposed here.
> Design Doc:
>  
> https://docs.google.com/document/d/1am1kO7Wmf17U_Vz5_uyBB2OuSsc4TZQWRvbRhX52n5w/edit?usp=sharing



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
