[
https://issues.apache.org/jira/browse/KAFKA-1251?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13940149#comment-13940149
]
Jay Kreps commented on KAFKA-1251:
----------------------------------
I posted a draft patch. This patch adds a variety of metrics. I haven't changed
the histogram instrumentation so for now it is just avg, max, rate, etc. We can
add that fairly easily.
Several things to discuss:
1. The list of metrics
2. The naming
3. Which metrics should be captured at the broker or topic level
4. Performance
5. JMX reporting
Okay the list of metrics is below, check it out. We can discuss the names and
doc strings for various metrics, perhaps they can be improved (if it isn't
clear what a metric does from the doc string then it can definitely be
improved!). Our goal should be that the doc strings fully document the metrics
so that we don't have to keep separate HTML docs up-to-date.
Currently I give each metric an un-namespaced name such as message-send-rate.
In JMX I prefix everything with "kafka.producer." [+ clientId + "."] for
uniqueness. This means all the metrics below show up as attributes under the
same mbean (kafka.producer.<client-id>). I think this is a lot more
straightforward to look at in jconsole and other tools.
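To make the naming scheme concrete, here is a minimal sketch of how the prefixing could work. The class and method names (ProducerMetricNames, qualify, mbeanOf, attributeOf) are hypothetical and not taken from the patch, and splitting at the last period into mbean and attribute is an assumption about how the grouping might be done.

```java
// Hypothetical helper, not the patch's code: illustrates the
// "kafka.producer." [+ clientId + "."] + metric-name scheme.
final class ProducerMetricNames {
    private static final String PREFIX = "kafka.producer.";

    private ProducerMetricNames() {}

    // Builds the fully qualified name; the client id segment is optional.
    static String qualify(String clientId, String metricName) {
        StringBuilder sb = new StringBuilder(PREFIX);
        if (clientId != null && !clientId.isEmpty()) {
            sb.append(clientId).append('.');
        }
        return sb.append(metricName).toString();
    }

    // Assumption: everything before the last '.' names the mbean.
    static String mbeanOf(String qualified) {
        return qualified.substring(0, qualified.lastIndexOf('.'));
    }

    // Assumption: everything after the last '.' is the attribute name.
    static String attributeOf(String qualified) {
        return qualified.substring(qualified.lastIndexOf('.') + 1);
    }
}
```

With a client id of "client-1", message-send-rate would become kafka.producer.client-1.message-send-rate, which under these assumptions appears as the attribute message-send-rate on the mbean kafka.producer.client-1.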
Performance: there is a really significant performance impact from metrics
(perhaps surprisingly). As a result I removed all the metrics from
KafkaProducer.send() and moved them into the background thread so that they are
all per batch or per request rather than per-message. At first I thought this
was some mistake on my part, so I ran some performance comparisons against the
yammer metrics package. It is pretty similar. But basically if you do 500k
calls/sec the overhead adds up significantly. So if you are wondering why
things like maxMessageSize are calculated in a weird way that is why. Even
after that fix metrics performance is still a big deal, so I may see if I can
optimize a bit more in the metrics package.
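A rough sketch of the per-batch idea, for illustration only. BatchMetrics and its methods are hypothetical names, not the patch's classes, and the sketch assumes a single background thread calls recordBatch, so no synchronization is shown. The point is that the per-message work is plain integer arithmetic and the metric state is touched once per batch.

```java
// Hypothetical illustration of recording per batch instead of per message.
final class BatchMetrics {
    private long messageCount;  // total messages across all recorded batches
    private long recordCalls;   // number of metric updates (one per batch)
    private int maxMessageSize; // largest single message seen so far

    // Assumed to be called once per batch from the background sender thread,
    // rather than once per message from send().
    void recordBatch(int[] messageSizes) {
        int batchMax = 0;
        for (int size : messageSizes) {
            batchMax = Math.max(batchMax, size); // cheap local computation
        }
        // Single update of the shared metric state for the whole batch.
        messageCount += messageSizes.length;
        maxMessageSize = Math.max(maxMessageSize, batchMax);
        recordCalls++;
    }

    long messageCount() { return messageCount; }
    long recordCalls() { return recordCalls; }
    int maxMessageSize() { return maxMessageSize; }
}
```

Recording a batch of three messages and then a batch of one results in two metric updates rather than four, which is where the savings would come from at 500k calls/sec.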
My thought was to only break-out a few metrics per-topic or per-broker. I
haven't done that yet, so let's discuss what we want.
Per-topic:
message-send-rate, message-error-rate, message-retry-rate, bytes-per-second
Per-broker:
message-send-rate, message-error-rate, message-retry-rate,
bytes-sent-per-second, bytes-received-per-second, requests-sent-per-second,
requests-received-per-second, request-latency
Here is the current list of metrics:
"message-error-rate", "The average number of errors per second returned to the
client."
"message-retry-rate", "The average per-second number of retries"
"message-send-rate", "The average number of messages sent per second."
"waiting-threads", "The number of user threads blocked waiting for buffer
memory to enqueue their records"
"buffer-total-bytes", "The maximum amount of buffer memory the client can use
(whether or not it is currently used)."
"buffer-available-bytes", "The total amount of buffer memory that is not being
used (either unallocated or in the free list)."
"ready-partitions", "The number of topic-partitions with buffered data that is
ready to be sent."
"batch-size-avg", "The average number of bytes per partition sent in requests."
"request-latency-avg", "The average request latency in ms"
"request-latency-max", "The maximum request latency in ms"
"messages-per-request-avg", "The average number of messages per request"
"message-size-max", "The maximum message size"
"requests-in-flight", "The current number of in-flight requests awaiting a
response."
"metadata-age", "The age in seconds of the current producer metadata being
used."
"network-ops-per-second", "The average number of network operations (reads or
writes) on all connections per second."
"bytes-sent-per-second", "The average number of outgoing bytes sent per second
to all servers."
"requests-sent-per-second", "The average number of requests sent per second."
"request-size-avg", "The average size of all requests in the window."
"request-size-max", "The maximum size of any request sent in the window."
"bytes-received-per-second", "Bytes/second read off all sockets"
"responses-received-per-second", "Responses received per second."
"connections-created-per-second", "New connections established per second in
the window."
"connections-closed-per-second", "Connections closed per second in the window."
"select-calls-per-second", "Number of times the I/O layer checked for new I/O
to perform per second."
"select-time-avg-ns", "The average length of time per select call in
nanoseconds."
"select-percentage", "The fraction of time the I/O thread spent waiting."
"io-time-avg-ns", "The average length of time for I/O per select call in
nanoseconds."
"io-percentage", "The fraction of time spent doing I/O"
"connection-count", "The current number of active connections."
> Add metrics to the producer
> ---------------------------
>
> Key: KAFKA-1251
> URL: https://issues.apache.org/jira/browse/KAFKA-1251
> Project: Kafka
> Issue Type: Sub-task
> Components: producer
> Reporter: Jay Kreps
> Assignee: Jay Kreps
> Attachments: KAFKA-1251.patch
>
>
> Currently there are no metrics.