[ https://issues.apache.org/jira/browse/ZOOKEEPER-3098?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16551364#comment-16551364 ]
Michael Han commented on ZOOKEEPER-3098: ---------------------------------------- [~eolivelli] My thought is we still need a metric interface to hook with external reporters - and also we need metrics type definition more than counter which is the only type presented in the patch. ZOOKEEPER-3092 is more about the general metrics framework infrastructure and the work in this Jira is more about actual instrumentation and metrics collection. > Add additional server metrics > ----------------------------- > > Key: ZOOKEEPER-3098 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-3098 > Project: ZooKeeper > Issue Type: Improvement > Components: server > Affects Versions: 3.6.0 > Reporter: Joseph Blomstedt > Priority: Major > Labels: pull-request-available > Time Spent: 40m > Remaining Estimate: 0h > > This patch adds several new server-side metrics as well as makes it easier to > add new metrics in the future. This patch also includes a handful of other > minor metrics-related changes. > Here's a high-level summary of the changes. > # This patch extends the request latency tracked in {{ServerStats}} to track > {{read}} and {{update}} latency separately. Updates are any request that must > be voted on and can change data, reads are all requests that can be handled > locally and don't change data. > # This patch adds the {{ServerMetrics}} logic and the related > {{AvgMinMaxCounter}} and {{SimpleCounter}} classes. This code is designed to > make it incredibly easy to add new metrics. To add a new metric you just add > one line to {{ServerMetrics}} and then directly reference that new metric > anywhere in the code base. The {{ServerMetrics}} logic handles creating the > metric, properly adding the metric to the JSON output of the {{/monitor}} > admin command, and properly resetting the metric when necessary. The > motivation behind {{ServerMetrics}} is to make things easy enough that it > encourages new metrics to be added liberally. Lack of in-depth > metrics/visibility is a long-standing ZooKeeper weakness. At Facebook, most > of our internal changes build on {{ServerMetrics}} and we have nearly 100 > internal metrics at this time – all of which we'll be upstreaming in the > coming months as we publish more internal patches. > # This patch adds 20 new metrics, 14 which are handled by {{ServerMetrics}}. > # This patch replaces some uses of {{synchronized}} in {{ServerStats}} with > atomic operations. > Here's a list of new metrics added in this patch: > - {{uptime}}: time that a peer has been in a stable > leading/following/observing state > - {{leader_uptime}}: uptime for peer in leading state > - {{global_sessions}}: count of global sessions > - {{local_sessions}}: count of local sessions > - {{quorum_size}}: configured ensemble size > - {{synced_observers}}: similar to existing `synced_followers` but for > observers > - {{fsynctime}}: time to fsync transaction log (avg/min/max) > - {{snapshottime}}: time to write a snapshot (avg/min/max) > - {{dbinittime}}: time to reload database – read snapshot + apply > transactions (avg/min/max) > - {{readlatency}}: read request latency (avg/min/max) > - {{updatelatency}}: update request latency (avg/min/max) > - {{propagation_latency}}: end-to-end latency for updates, from proposal on > leader to committed-to-datatree on a given host (avg/min/max) > - {{follower_sync_time}}: time for follower to sync with leader (avg/min/max) > - {{election_time}}: time between entering and leaving election (avg/min/max) > - {{looking_count}}: number of transitions into looking state > - {{diff_count}}: number of diff syncs performed > - {{snap_count}}: number of snap syncs performed > - {{commit_count}}: number of commits performed on leader > - {{connection_request_count}}: number of incoming client connection requests > - {{bytes_received_count}}: similar to existing `packets_received` but > tracks bytes -- This message was sent by Atlassian JIRA (v7.6.3#76005)