伍学明 created KAFKA-19805:
---------------------------
Summary: Add acks dimension to BrokerTopicMetrics for produce
requests
Key: KAFKA-19805
URL: https://issues.apache.org/jira/browse/KAFKA-19805
Project: Kafka
Issue Type: Improvement
Components: core
Affects Versions: 4.1.0
Reporter: 伍学明
Assignee: 伍学明
Fix For: 4.1.0
h3. *Title*
Add {{acks}} dimension to {{BrokerTopicMetrics}} to measure per-acks produce
performance impact
----
h3. *Summary*
Currently, Kafka's {{BrokerTopicMetrics}} only tracks produce metrics such as
{{BytesInPerSec}}, {{MessagesInPerSec}}, and {{TotalProduceRequestsPerSec}}
at the topic level. These metrics do not distinguish between different
producer acknowledgment ({{acks}}) configurations ({{acks=0}},
{{acks=1}}, {{acks=-1}}).
In high-throughput environments, different {{acks}} levels have significantly
different impacts on broker CPU, I/O, and network utilization.
This proposal introduces an additional {{acks}} label to existing topic-level
metrics, enabling more granular visibility into broker performance under
various producer reliability modes.
----
h3. *Motivation*
The current aggregated produce metrics make it difficult to assess the
performance and stability implications of different {{acks}} settings on
brokers.
For example, fire-and-forget ({{acks=0}}) and fully acknowledged
({{acks=-1}}) produce requests can have very different effects on disk I/O,
request queues, and replication latency, but these effects are hidden in the
current metrics.
By introducing an {{acks}} dimension, operators and performance engineers can:
* Quantify the resource cost of different producer acknowledgment strategies.
* Analyze how {{acks}} configuration affects cluster throughput, replication
load, and latency.
* Perform fine-grained benchmarking and capacity planning.
----
h3. *Proposed Changes*
# *Extend {{BrokerTopicStats}}*
Add a new {{perTopicAcksStats}} structure to track metrics per {{(topic,
acks)}} combination:
{{val perTopicAcksStats = new Pool[(String, Short), BrokerTopicMetrics](
Some((key) => new BrokerTopicMetrics(Some(s"${key._1},ack=${key._2}")))
)}}
# *Instrument Produce Handling*
In {{KafkaApis.handleProduceRequest}}, extract the producer {{acks}} value
and record metrics accordingly, where {{topicAcksStats(topic, acks)}} is a new
accessor over {{perTopicAcksStats}}:
{{val ackVal = produceRequest.acks()
brokerTopicStats.topicStats(topic).bytesInRate.mark(bytes)
brokerTopicStats.topicAcksStats(topic, ackVal).bytesInRate.mark(bytes)}}
The same logic applies to:
** {{messagesInRate}}
** {{totalProduceRequestRate}}
# *Automatic Metric Naming*
Since {{BrokerTopicMetrics}} extends {{KafkaMetricsGroup}}, the new label
will automatically generate JMX metrics like:
{{kafka.server:type=BrokerTopicMetrics,name=BytesInPerSec,topic=perf-test,ack=-1
kafka.server:type=BrokerTopicMetrics,name=BytesInPerSec,topic=perf-test,ack=1
kafka.server:type=BrokerTopicMetrics,name=BytesInPerSec,topic=perf-test,ack=0}}
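# *Performance Considerations*
** {{perTopicAcksStats}} uses lazy initialization and caching via {{Pool}}, so
a metric object is created only for {{(topic, acks)}} combinations that
actually receive produce requests.
** Because the only valid {{acks}} values are 0, 1, and -1, each topic adds at
most three extra metric groups.
** Cleanup logic can be added so that metrics for inactive {{(topic, acks)}}
combinations are removed, for example together with the topic's own metrics.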
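The bookkeeping described in the steps above can be sketched without any Kafka internals. This is only an illustration under stated assumptions: {{ProduceCounters}} and {{AcksAwareStats}} are hypothetical stand-ins for {{BrokerTopicMetrics}} and {{BrokerTopicStats}}, a {{ConcurrentHashMap}} replaces {{Pool}}, and plain counters replace the real rate meters.

```scala
import java.util.concurrent.ConcurrentHashMap
import java.util.concurrent.atomic.LongAdder

// Hypothetical stand-in for BrokerTopicMetrics: plain counters, not rate meters.
final class ProduceCounters {
  val bytesIn = new LongAdder
  val messagesIn = new LongAdder
  val produceRequests = new LongAdder
}

// Hypothetical stand-in for BrokerTopicStats with an extra (topic, acks) pool.
final class AcksAwareStats {
  private val perTopic = new ConcurrentHashMap[String, ProduceCounters]()
  private val perTopicAcks = new ConcurrentHashMap[(String, Short), ProduceCounters]()

  // Lazily creates the per-topic counters on first use.
  def topicStats(topic: String): ProduceCounters =
    perTopic.computeIfAbsent(topic, _ => new ProduceCounters)

  // Lazily creates the per-(topic, acks) counters on first use.
  def topicAcksStats(topic: String, acks: Short): ProduceCounters =
    perTopicAcks.computeIfAbsent((topic, acks), _ => new ProduceCounters)

  // Records one produce request in both the topic-level and the acks-level view.
  def recordProduce(topic: String, acks: Short, bytes: Long, messages: Long): Unit =
    for (c <- Seq(topicStats(topic), topicAcksStats(topic, acks))) {
      c.bytesIn.add(bytes)
      c.messagesIn.add(messages)
      c.produceRequests.increment()
    }

  // Cleanup hook: drops the topic-level entry and every acks-level entry for it.
  def removeTopic(topic: String): Unit = {
    perTopic.remove(topic)
    perTopicAcks.keySet().removeIf(key => key._1 == topic)
  }

  // Number of (topic, acks) combinations currently tracked.
  def trackedAcksSeries: Int = perTopicAcks.size
}
```

Recording into both maps keeps the existing topic-level totals unchanged while the acks-level entries give the breakdown. In this sketch, {{removeTopic}} is one possible shape for the cleanup logic mentioned under Performance Considerations.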
----
h3. *Example Metrics Output*
{{kafka.server:type=BrokerTopicMetrics,name=BytesInPerSec,topic=perf-test,ack=0
kafka.server:type=BrokerTopicMetrics,name=BytesInPerSec,topic=perf-test,ack=1
kafka.server:type=BrokerTopicMetrics,name=BytesInPerSec,topic=perf-test,ack=-1}}
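For tooling that consumes these names, the {{ack}} value can be read back as a regular JMX key property. The sketch below uses the standard {{javax.management.ObjectName}} class; the {{AcksMetricNames}} helper object and its methods are hypothetical and only assume the name shape shown above.

```scala
import javax.management.ObjectName

// Hypothetical helper, assuming the MBean name shape from the example output.
object AcksMetricNames {
  // Builds an MBean name with separate `topic` and `ack` key properties.
  def mbeanName(metric: String, topic: String, acks: Short): ObjectName =
    new ObjectName(
      s"kafka.server:type=BrokerTopicMetrics,name=$metric,topic=$topic,ack=$acks")

  // Reads the acks value back; None for names without an `ack` key,
  // such as the existing topic-level metrics.
  def acksOf(name: ObjectName): Option[Short] =
    Option(name.getKeyProperty("ack")).map(_.toShort)
}
```

A monitoring script could use {{acksOf}} to group the MBeans of one topic by acknowledgment level while treating names without the key as the existing aggregate.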
----
h3. *Compatibility & Impact*
* No breaking changes to existing metrics.
* Existing metric names and topic-level aggregation remain unaffected.
* New metrics are additive and optional.