Thank you for your reply Stig, it was very helpful!

Thomas Cooper
PhD Student
Newcastle University, School of Computer Science
W: http://www.tomcooper.org.uk | Twitter: @tomncooper <https://twitter.com/tomncooper>
________________________________
From: Stig Rohde Døssing <s...@apache.org>
Sent: 15 October 2017 19:50:03
To: user@storm.apache.org
Subject: Re: Finding which acker executor a bolt acks to

Hi,

The ackers listen on a few streams to receive "start tracking tuple tree", "ack 
tuple" and "fail tuple" messages.

The bolts and spouts are hooked up to emit to these streams here 
https://github.com/apache/storm/blob/05a74c73db9b9017d3672d3cc055e8f1ada1d69e/storm-client/src/jvm/org/apache/storm/daemon/StormCommon.java#L242.

Messages are grouped by the "id" field, which is declared here
https://github.com/apache/storm/blob/05a74c73db9b9017d3672d3cc055e8f1ada1d69e/storm-client/src/jvm/org/apache/storm/daemon/StormCommon.java#L300
and used here
https://github.com/apache/storm/blob/90ca7fa0c8e73a1884c70e2d3da3388b24d13db0/storm-client/src/jvm/org/apache/storm/executor/spout/SpoutOutputCollectorImpl.java#L133
(similar for bolts). The ID in question is a random number generated for the
tuple tree, ensuring that all acks/fails for the tuple tree go to the same
acker that received the init message.
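
To illustrate why grouping on the "id" field keeps a tuple tree on one acker,
here is a minimal sketch. It assumes the grouping behaves like a hash of the
grouped values taken modulo the number of acker tasks; that detail is an
assumption and is not taken from the linked code. The class name, the helper
chooseTaskIndex and the sample id are all hypothetical.

```java
import java.util.Arrays;
import java.util.List;

public class AckerRoutingSketch {

    // Hypothetical helper: hash the grouping values and reduce the hash
    // to a task index in [0, numTasks). This is an assumed stand-in for
    // whatever the real grouping does, not Storm's actual method.
    static int chooseTaskIndex(List<?> groupingValues, int numTasks) {
        int hash = Arrays.deepHashCode(groupingValues.toArray());
        return Math.floorMod(hash, numTasks);
    }

    public static void main(String[] args) {
        int numAckers = 4;

        // Stands in for the random id generated for the tuple tree.
        long rootId = 8675309L;

        // The init message from the spout and a later ack from a bolt
        // both carry the same id, so they hash to the same index.
        int onInit = chooseTaskIndex(Arrays.asList(rootId), numAckers);
        int onAck = chooseTaskIndex(Arrays.asList(rootId), numAckers);

        System.out.println("init -> acker " + onInit);
        System.out.println("ack  -> acker " + onAck);
    }
}
```

Under that assumption the target acker depends only on the id and the number
of acker tasks, not on which worker the sending executor runs in.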

It doesn't look to me like we try to keep acks local to the worker, but I 
didn't look too hard. I hope this at least gives you an idea where you might 
start looking.

2017-10-10 19:32 GMT+02:00 Thomas Cooper (PGR) <t.coo...@newcastle.ac.uk>:

Hi,


I am modelling the performance of Storm topologies. To model complete latency
(CL) I need to account for the latency from the final "sink" component in the
topology to the Acker (where the clock is stopped on the CL measurement).


I have a way to track tuple routing within a topology; however, tuples sent to
system components are not covered by this.


I was wondering if someone could explain how the acker is connected to the
topology components (I am having trouble identifying where in the code base
this takes place) and what type of grouping they use?


Is there any intelligence to this routing, i.e. will an executor always ack to
an Acker in the same worker process (WP), or failing that to one in a WP on
the same host, or is it more like a shuffle?


Any help with the above would be appreciated. I really want to avoid 
re-implementing the Acker bolt myself.


Cheers,


Thomas Cooper
PhD Student
Newcastle University, School of Computer Science
W: http://www.tomcooper.org.uk | Twitter: @tomncooper <https://twitter.com/tomncooper>
