[
https://issues.apache.org/jira/browse/KAFKA-1100?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13803057#comment-13803057
]
Joel Koshy commented on KAFKA-1100:
-----------------------------------
That's a good point - we don't need it to be that way. The metric names that
you referred to are derived from the consumer's registration in zookeeper.
There are a couple of cleanup tasks we need to do for mbeans especially wrt
consumers:
- The names need not include timestamps. The reason we have timestamps and a
hash in there is that if you were to bring up two consumers under the same
group on the same host at nearly the same time, their registrations would
otherwise collide in zookeeper. Realistically this only happens in system
tests, so it should be fine to drop the timestamp and hash for metrics
registration.
- Metrics are not de-registered on a rebalance/shutdown. I think there is
already a jira for the shutdown case, but I'm compiling a list of other
shortcomings and will file an umbrella jira to cover most of these issues.
- I think the deregistration issues affect replica fetchers as well (need to
check), i.e., if a broker transitions from follower to leader for a partition,
the follower metrics for that partition need to be de-registered.
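
To illustrate the first point, here is a rough sketch (not Kafka code) of
dropping the timestamp and hash from a name before using it for metrics. The
pattern is an assumption inferred only from the example names in the issue
quoted below (a 13-digit millisecond timestamp followed by 8 hex characters);
the function name is hypothetical.

```python
import re

# Assumed shape of the generation-specific part, inferred from the example
# names in the issue: "-<13-digit epoch millis>-<8 lowercase hex chars>".
# The lookahead keeps us from matching in the middle of a longer token.
_GENERATION_PART = re.compile(r"-\d{13}-[0-9a-f]{8}(?=-|$)")


def stable_metric_name(name: str) -> str:
    """Return `name` with any timestamp-and-hash part removed (hypothetical helper)."""
    return _GENERATION_PART.sub("", name)
```

With this, a name such as square-1373476779391-78aa2e83-0-FetchQueueSize would
become square-0-FetchQueueSize and stay the same across restarts. The full
consumer id would still be used for the zookeeper registration itself, where
the uniqueness matters.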
> metrics shouldn't have generation/timestamp specific names
> ----------------------------------------------------------
>
> Key: KAFKA-1100
> URL: https://issues.apache.org/jira/browse/KAFKA-1100
> Project: Kafka
> Issue Type: Bug
> Reporter: Jason Rosenberg
>
> I've noticed that there are several metrics that seem useful for monitoring
> over time, but which contain generational timestamps in the metric name.
> We are using yammer metrics libraries to send metrics data in a background
> thread every 10 seconds (to kafka actually), and then they eventually end up
> in a metrics database (graphite, opentsdb). The metrics then get graphed via
> UI, and we can see metrics going way back, etc.
> Unfortunately, many of the metrics coming from kafka seem to have metric
> names that change any time the server or consumer is restarted, which makes
> it hard to easily create graphs over long periods of time (spanning app
> restarts).
> For example, names like:
> kafka.consumer.FetchRequestAndResponseMetrics....square-1371718712833-e9bb4d10-0-508818741-AllBrokersFetchRequestRateAndTimeMs
> or:
> kafka.consumer.ZookeeperConsumerConnector...topicName.....square-1373476779391-78aa2e83-0-FetchQueueSize
> In our staging environment, we have our servers on regular auto-deploy cycles
> (they restart every few hours), so metric names that constantly change like
> this are just not usable longitudinally.
> Is there something that can easily be done? Is it really necessary to have
> so much cryptic info in the metric name?