I highly recommend these videos for learning Storm:

- https://www.youtube.com/watch?v=iLZrYPbNypg - InfoChimps, who use Storm quite a bit; covers Storm networking / buffering
- https://www.youtube.com/watch?v=bdps8tE0gYo - definitely a must-see for any new Storm user; Nathan Marz on Storm
- http://www.michael-noll.com/blog/2013/06/21/understanding-storm-internal-message-buffers/ - internal message buffering
- http://demo.ooyala.com/player.html?width=640&height=360&embedCode=Q1eXg5NzpKqUUzBm5WTIb6bXuiWHrRMi&videoPcode=9waHc6zKpbJKt9byfS7l4O4sn7Qn - Nathan Marz, father of Storm, on Storm performance tuning. The audio isn't great; turn it up and be patient, it gets better.

Thank you for your time!

+++++++++++++++++++++
Jeff Maass <maas...@gmail.com>
linkedin.com/in/jeffmaass
stackoverflow.com/users/373418/maassql
+++++++++++++++++++++

On Thu, May 21, 2015 at 7:53 AM, Johannes Hugo Kitschke <johannes.kitsc...@rwth-aachen.de> wrote:

> Thanks. That explains stuff... Some follow-up questions:
>
> Is there a way to be made aware of the following:
> - (When) does throttling occur (max pending reached)?
> - How can I tell whether the topology runs in real time? One could check
> whether 'Executed' and 'Acked' are balanced, but only at runtime, and
> that is only meaningful if no throttling occurs.
>
> Maybe the execute latency is what I'm looking for, since in my case the
> work for each tuple is done once the tuple is processed (no caching).
>
> So to recap: the process latency is the time until the tuple is acked.
> This might be considerably larger than the execute latency, since e.g.
> acking might be slow (!?). Are there other reasons?
>
> Is the following thought valid? Say a component has an execute latency
> of 1 ms, and I know that 500 new tuples arrive each second. This is safe,
> since 500 * 1 ms = 0.5 s < 1 s.
> I could then increase the input flow to at most 1000 new tuples per
> second before the component is guaranteed to fall behind and a queue
> develops.
>
> J.
>
>
> On 05/21/2015 02:21 PM, Matthias J. Sax wrote:
>
>> Hi,
>>
>> Storm processes multiple tuples in an "overlapping" manner, i.e.,
>> emitting from the spout, network transfer, and processing at the bolt
>> are fully pipelined, and multiple tuples are in the pipeline at the
>> same time. Additionally, this pipeline contains multiple buffers and
>> threads transferring tuples from buffer to buffer (or over the
>> network). Thus, the measured latencies "overlap" and you cannot simply
>> sum them up.
>>
>> -Matthias
>>
>>
>> On 05/21/2015 02:08 PM, Johannes Hugo Kitschke wrote:
>>
>>> Hi,
>>>
>>> I don't get the following: the UI reports a process latency of 152 ms,
>>> and a total of 835200 tuples are acked / processed.
>>> Why can't I compute the total runtime as # of acked * process latency?
>>> The value is far too large; the topology was only running for 10
>>> minutes or so.
>>> 835200 * 152 ms = 126950 seconds!? Where is my mistake? The bolt runs
>>> as a single task.
>>>
>>> Id: processor_15
>>>   <http://localhost:8080/component.html?id=processor_15&topology_id=b4-29-1432208535>
>>> Executors: 1
>>> Tasks: 1
>>> Emitted: 417600
>>> Transferred: 417600
>>> Capacity (last 10m): 0.006
>>> Execute latency (ms): 0.007
>>> Executed: 835200
>>> Process latency (ms): 152.308
>>> Acked: 835200
>>> Failed: 0
>>> Last error: (none)
>>>
>>> Best,
>>> J.
>>>
>>
>
> --
> Johannes Hugo Kitschke
> Am Bollet 6
> D-52078 Aachen
> Mobile: 0177 / 3233 941
>
>
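To make Matthias's point about overlapping latencies concrete, here is a minimal sketch using the numbers from the thread. The "tuples in flight" figure is derived via Little's law (L = lambda * W) and is an illustrative assumption; the Storm UI does not report it directly:

```python
# Why (acked tuples x process latency) far exceeds wall-clock runtime.
# Figures below are the ones reported in the thread.

acked = 835_200            # tuples acked in the window
process_latency_s = 0.152  # per-tuple process latency (152 ms)
runtime_s = 600            # topology ran roughly 10 minutes

# Naive serial model: assumes tuples are handled strictly one after another.
serial_estimate_s = acked * process_latency_s
print(f"serial estimate: {serial_estimate_s:.0f} s")  # -> ~126950 s, far too big

# Little's law: average tuples in flight = throughput x latency.
# Because the pipeline keeps many tuples pending at once, their individual
# latencies overlap instead of adding up.
throughput = acked / runtime_s                 # tuples per second
in_flight = throughput * process_latency_s
print(f"avg tuples in flight: {in_flight:.0f}")  # roughly 200 pending at once
```

Dividing the serial estimate by the in-flight count recovers the observed ~10-minute runtime, which is exactly the "overlap" Matthias describes.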
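Johannes's headroom reasoning (1 ms execute latency, 500 tuples/s) can also be sketched as a utilization check: an executor saturates when input rate times execute latency reaches one second of work per second. The helper name below is hypothetical, not a Storm API:

```python
# Headroom check for a single-task bolt, using the thread's hypothetical
# 1 ms execute latency. utilization() is an illustrative helper, not Storm API.

execute_latency_ms = 1.0  # time spent in execute() per tuple

def utilization(rate_per_s: float) -> float:
    """Fraction of each second the executor spends inside execute()."""
    return rate_per_s * execute_latency_ms / 1000.0

print(utilization(500))   # -> 0.5 : safe, half of each second is idle
print(utilization(1000))  # -> 1.0 : saturation point; beyond this a queue grows
```

This is the same quantity the Storm UI exposes as "Capacity": executed tuples times execute latency divided by the measurement window, with values approaching 1.0 signaling a bottleneck.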