[ https://issues.apache.org/jira/browse/KAFKA-203?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jun Rao updated KAFKA-203:
--------------------------

    Attachment: kafka-203_v2.patch

Attached patch v2. Overview of additional changes:

1. Rebased.
2. DefaultEventHandler: Added a metric for serializationErrorRate. Also changed the serialization error handling a bit, depending on whether the send is sync or async.
3. Used the following convention to distinguish the AllTopics metrics from the per-topic ones: for a metric X, AllTopics => AllTopicsX; topic Y => Y-X.
4. Made a pass over the metric names and tried to keep them consistent.
5. Updated metrics.json with the new metrics. Right now we start a separate jmx tool for each jmx bean, and too many beans will slow down the system test, so I only exposed a subset of the metrics. Once the jmx tool issue is resolved, we can add more beans for collection and graphing.

Review comments:

Joel:
4) Consumer lag is the most useful metric, and the one on which an alert can be set. LogEndOffset and ConsumerOffset are less useful and can be obtained from tools.
5) The problem is that we only know the size of a message after it is serialized, since the message itself can be of any type.
8) If a follower is slow, it will eventually be dropped out of the ISR, and we have metrics on both the ISRShrink rate and under-replicated partitions to track this.

Neha:
1. For that, the broker needs to know the replication factor of a partition. This information needs to be sent from broker on LeaderAndISRRequest. I will leave that to kafka-340.
4.3 LeaderCount is useful for seeing whether client loads are balanced among brokers, and a global "under replicated partition count" is convenient for setting up an alert (otherwise, one has to do that on each partition).

The rest of the review comments are all addressed.
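
For illustration only, the naming convention in item 3 could be sketched roughly as below. This is not code from the patch: the class name MetricNames, its method names, and the sample metric and topic names in the usage note are all hypothetical.

```java
// Minimal sketch of the naming convention described in item 3.
// MetricNames and its methods are hypothetical; they are not taken from the patch.
final class MetricNames {
    private MetricNames() {}

    // Aggregate across all topics: metric X becomes "AllTopicsX".
    static String allTopics(String metric) {
        return "AllTopics" + metric;
    }

    // Per-topic variant: metric X for topic Y becomes "Y-X".
    static String perTopic(String topic, String metric) {
        return topic + "-" + metric;
    }
}
```

With a made-up metric "BytesInRate" and a made-up topic "orders", MetricNames.allTopics("BytesInRate") gives "AllTopicsBytesInRate" and MetricNames.perTopic("orders", "BytesInRate") gives "orders-BytesInRate".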

> Improve Kafka internal metrics
> ------------------------------
>
>                 Key: KAFKA-203
>                 URL: https://issues.apache.org/jira/browse/KAFKA-203
>             Project: Kafka
>          Issue Type: New Feature
>          Components: core
>    Affects Versions: 0.8
>            Reporter: Jay Kreps
>            Assignee: Jun Rao
>              Labels: tools
>         Attachments: kafka-203_v1.patch, kafka-203_v2.patch
>
>
> Currently metrics in kafka are using old-school JMX directly. This makes
> adding metrics a pain. It would be good to do one of the following:
> 1. Convert to Coda Hale's metrics package
> (https://github.com/codahale/metrics)
> 2. Write a simple metrics package
> The new metrics package should make metrics easier to add and work with and
> package up the common logic of keeping windowed gauges, histograms, counters,
> etc. JMX should be just one output of this.
> The advantage of the Coda Hale package is that it exists so we don't need to
> write it. The downsides are (1) introduces another client dependency which
> causes conflicts, and (2) seems a bit heavy on design. The good news is that
> the metrics-core package doesn't seem to bring in a lot of dependencies which
> is nice, though the scala wrapper seems to want scala 2.9. I am also a little
> skeptical of the approach for histograms--it does sampling instead of
> bucketing though that may be okay.