GitHub user abhishekagarwal87 commented on the pull request:
https://github.com/apache/storm/pull/1406#issuecomment-217775942
Regarding A - That's a good point, and I'm glad to know that this issue is
being addressed. The current executor stats take up a good amount of space
(0.5 MB in my topologies). I earlier ran into an issue where I had packed too
many executors into a single worker and the write to ZooKeeper failed. In fact,
I was quite surprised to learn that such a heavy write is being done to
ZooKeeper :) This PR can wait if we want to solve that problem first.
Regarding B - Having queue depth can give users very good visibility,
though I think it would have been more useful before backpressure was
available. Now, if you look at the code, the (queue + overflow buffer) is
actually unbounded. Previously, a slow bolt would stall the whole topology, and
queue depth would have helped in zeroing in on that bolt.
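To make the "(queue + overflow buffer) is unbounded" point concrete, here is a minimal sketch. It assumes a bounded queue backed by an unbounded overflow list. The class `OverflowQueueSketch` and its methods are hypothetical names for illustration only, not Storm's actual queue implementation.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.concurrent.ArrayBlockingQueue;

/** Hypothetical illustration only; this is not Storm's queue class. */
public class OverflowQueueSketch {
    private final ArrayBlockingQueue<Object> bounded;            // fixed-capacity part
    private final Deque<Object> overflow = new ArrayDeque<>();   // grows without limit

    public OverflowQueueSketch(int capacity) {
        this.bounded = new ArrayBlockingQueue<>(capacity);
    }

    /** Never blocks: items that do not fit in the bounded part spill into overflow. */
    public synchronized void publish(Object item) {
        // Once anything is in overflow, keep appending there to preserve FIFO order.
        if (!overflow.isEmpty() || !bounded.offer(item)) {
            overflow.addLast(item);
        }
    }

    /** Takes the oldest item, then moves one overflow item into the freed slot. */
    public synchronized Object poll() {
        Object item = bounded.poll();
        Object next = overflow.pollFirst();
        if (next != null) {
            bounded.offer(next);
        }
        return item;
    }

    /** Queue depth as a metric: bounded population plus overflow population. */
    public synchronized int depth() {
        return bounded.size() + overflow.size();
    }
}
```

With a capacity of 2, publishing five items leaves `depth()` at 5: the producer is never stalled, so depth alone no longer caps out at the queue size the way it would with a blocking queue.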
complete latency - I haven't gone through the discussion closely so I
can't comment right now.
sojournTime - Looks like a good metric. I missed it completely. It still
needs to be shown in the UI, though.
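For reference, one common way to estimate a sojourn time is Little's law (time in queue = population / arrival rate), which assumes the queue is in steady state. The sketch below shows that estimate. The class and method names are hypothetical, and this is not necessarily how the metric is computed in the code under discussion.

```java
/** Hypothetical helper; assumes a steady-state queue (arrival rate == departure rate). */
public class SojournTimeSketch {
    /**
     * Little's law estimate of the time an item spends in the queue.
     *
     * @param population        number of items currently in the queue
     * @param arrivalRatePerSec observed arrival rate, in items per second
     * @return estimated sojourn time in milliseconds
     */
    public static double estimateSojournMillis(long population, double arrivalRatePerSec) {
        // Guard against division by zero when nothing has arrived yet.
        double rate = Math.max(arrivalRatePerSec, 1e-5);
        return population / rate * 1000.0;
    }
}
```

For example, 200 queued items at 100 items per second gives an estimate of 2000 ms.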
So to summarize, there are four major points -
1. A more efficient way to periodically update executor stats (Metric
Producer)
2. Zeroing in on which queue metrics are useful
3. Packing queue metrics into the executor stats (simple enough)
4. Moving Nimbus to read the metrics from the new metric store or some other
place