Github user HeartSaVioR commented on the pull request:

    https://github.com/apache/storm/pull/1406#issuecomment-217765582
  
    @abhishekagarwal87 
    Thanks for the improvement. :)
    
    Btw, I have some thoughts on this change.
    
    1. Some concerns about adding payload to the task heartbeat
    
    As you may know, the reason Storm needs the Pacemaker daemon on large 
clusters is that Storm includes task metrics in the heartbeat message and 
writes it to ZK at a short interval (task.heartbeat.frequency.secs, default 3), 
which puts heavy pressure on ZK.
    
    So we should discuss whether to keep expanding the heartbeat message the 
way it works now, or to change the way metrics are sent to Nimbus (as JStorm 
does). If we can make more room for metrics, we can brainstorm additional 
metrics to enrich it. For example, spout tasks could have optional metrics, 
such as partition information and lag for KafkaSpout.
    
    2. Metrics for the queue
    
    I guess queue sojourn time is one of the most wanted queue metrics, since 
many users have said that they see very short execute/process latency for each 
task but also see very high complete latency.
    (@wangli1426 added sojourn time for the disruptor queue, but [as he stated 
in a code 
comment](https://github.com/apache/storm/blob/1.x-branch/storm-core/src/jvm/org/apache/storm/utils/DisruptorQueue.java#L324),
 it's based on a precondition which sometimes does not hold for a problematic 
task. If we can make it stable, it would be really helpful.)
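    
    To make the precondition concrete, here is a minimal sketch of how such an 
estimate could be computed. It assumes the estimate follows Little's law 
(queue population divided by arrival rate) and that the precondition is a 
stable queue, i.e. arrival rate roughly equal to consumption rate. That is my 
reading, not a copy of DisruptorQueue; the class, method, and constant names 
are hypothetical.

```java
// Hypothetical helper, not part of Storm: estimates queue sojourn time
// with Little's law, W = L / lambda.
class SojournEstimate {
    // Assumed floor for the arrival rate so an idle queue does not
    // cause a division by zero.
    static final double MIN_ARRIVAL_RATE_PER_SEC = 1e-5;

    /**
     * @param population        number of tuples currently in the queue (L)
     * @param arrivalRatePerSec measured arrival rate in tuples/sec (lambda)
     * @return estimated sojourn time in milliseconds
     *
     * Only meaningful when the queue is stable. For a task that cannot keep
     * up, the arrival rate exceeds the consumption rate, the population keeps
     * growing, and this value underestimates the real waiting time.
     */
    static double estimateSojournMs(long population, double arrivalRatePerSec) {
        double rate = Math.max(arrivalRatePerSec, MIN_ARRIVAL_RATE_PER_SEC);
        return population / rate * 1000.0;
    }

    public static void main(String[] args) {
        // 500 queued tuples arriving at 1000 tuples/sec
        System.out.println(estimateSojournMs(500, 1000.0));
    }
}
```

    For a backed-up task, dividing by the consumption rate instead of the 
arrival rate would be closer to the real wait, but that needs a second meter on 
the consumer side.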
    
    STORM-1742 covers the accuracy of 'complete latency', but many parts of 
the tuple lifecycle are still hidden, for example average queue sojourn time, 
serde latency, transfer latency, etc. I don't think we want to add measurements 
that hurt overall performance, but these are meaningful information, so I 
would like to address them if they don't hurt at all.

