伍学明 created KAFKA-19805:
---------------------------

             Summary: Add acks dimension to BrokerTopicMetrics for produce 
requests
                 Key: KAFKA-19805
                 URL: https://issues.apache.org/jira/browse/KAFKA-19805
             Project: Kafka
          Issue Type: Improvement
          Components: core
    Affects Versions: 4.1.0
            Reporter: 伍学明
            Assignee: 伍学明
             Fix For: 4.1.0


h3. *Title*

Add {{acks}} dimension to {{BrokerTopicMetrics}} to measure per-acks produce 
performance impact
----
h3. *Summary*

Currently, Kafka's {{BrokerTopicMetrics}} tracks produce metrics such as 
{{BytesInPerSec}}, {{MessagesInPerSec}}, and {{TotalProduceRequestsPerSec}} 
only at the topic level. These metrics do not distinguish between the 
producer acknowledgment ({{acks}}) configurations ({{acks=0}}, {{acks=1}}, 
{{acks=-1}}).
In high-throughput environments, different {{acks}} levels have significantly 
different impacts on broker CPU, I/O, and network utilization.
This proposal introduces an additional {{acks}} label to existing topic-level 
metrics, enabling more granular visibility into broker performance under 
various producer reliability modes.
----
h3. *Motivation*

The current aggregated produce metrics make it difficult to assess how 
different {{acks}} settings affect broker performance and stability.
For example, fire-and-forget ({{acks=0}}) and fully acknowledged 
({{acks=-1}}) produce requests can have very different effects on disk I/O, 
request queues, and replication latency, but the current metrics hide these 
differences.

By introducing an {{acks}} dimension, operators and performance engineers can:
 * Quantify the resource cost of different producer acknowledgment strategies.

 * Analyze how {{acks}} configuration affects cluster throughput, replication 
load, and latency.

 * Perform fine-grained benchmarking and capacity planning.

----
h3. *Proposed Changes*
 # *Extend {{BrokerTopicStats}}*
Add a new {{perTopicAcksStats}} structure to track metrics per {{(topic, 
acks)}} combination:

{{val perTopicAcksStats = new Pool[(String, Short), BrokerTopicMetrics](
  Some((key) => new BrokerTopicMetrics(Some(s"${key._1},ack=${key._2}")))
)}}
 # *Instrument Produce Handling*
In {{KafkaApis.handleProduceRequest}}, extract the producer {{acks}} value 
and record metrics accordingly:

{{val ackVal = produceRequest.acks()
brokerTopicStats.topicStats(topic).bytesInRate.mark(bytes)
brokerTopicStats.topicAcksStats(topic, ackVal).bytesInRate.mark(bytes)}}
The same logic applies to:

 ** {{messagesInRate}}

 ** {{totalProduceRequestRate}}

 # *Automatic Metric Naming*
Since {{BrokerTopicMetrics}} extends {{KafkaMetricsGroup}}, the new label 
yields JMX metric names such as:

{{kafka.server:type=BrokerTopicMetrics,name=BytesInPerSec,topic=perf-test,ack=-1
kafka.server:type=BrokerTopicMetrics,name=BytesInPerSec,topic=perf-test,ack=1
kafka.server:type=BrokerTopicMetrics,name=BytesInPerSec,topic=perf-test,ack=0}}
 # *Performance Considerations*

 ** {{perTopicAcksStats}} uses lazy initialization and caching via {{Pool}} to 
avoid excessive metric object creation.

 ** Expiration or cleanup logic can be added for inactive metrics.
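
The recording path in steps 1, 2 and 4 can be sketched without any Kafka dependency. This is only an illustration under stated assumptions: {{ProduceCounters}} and {{ProduceStats}} are hypothetical stand-ins for {{BrokerTopicMetrics}} and {{BrokerTopicStats}}, a {{ConcurrentHashMap}} replaces {{Pool}}, {{LongAdder}} replaces the meters, and {{removeTopic}} is one possible shape for the cleanup hook, not an existing API.

```scala
import java.util.concurrent.ConcurrentHashMap
import java.util.concurrent.atomic.LongAdder

// Hypothetical stand-in for BrokerTopicMetrics: plain counters plus the tags
// that would identify the metric (topic only, or topic and ack).
final class ProduceCounters(val tags: Map[String, String]) {
  val bytesIn = new LongAdder
  val messagesIn = new LongAdder
  val produceRequests = new LongAdder
}

// Hypothetical stand-in for BrokerTopicStats with the proposed (topic, acks) pool.
final class ProduceStats {
  private val perTopic = new ConcurrentHashMap[String, ProduceCounters]()
  private val perTopicAcks = new ConcurrentHashMap[(String, Short), ProduceCounters]()

  // Counters are created lazily on first use and cached afterwards.
  def topicStats(topic: String): ProduceCounters =
    perTopic.computeIfAbsent(topic,
      (t: String) => new ProduceCounters(Map("topic" -> t)))

  def topicAcksStats(topic: String, acks: Short): ProduceCounters =
    perTopicAcks.computeIfAbsent((topic, acks),
      (k: (String, Short)) =>
        new ProduceCounters(Map("topic" -> k._1, "ack" -> k._2.toString)))

  // Marks the existing topic-level counters and the new per-acks counters,
  // so the topic-level totals stay unchanged by the extra dimension.
  def recordProduce(topic: String, acks: Short, bytes: Long, messages: Long): Unit =
    Seq(topicStats(topic), topicAcksStats(topic, acks)).foreach { c =>
      c.bytesIn.add(bytes)
      c.messagesIn.add(messages)
      c.produceRequests.increment()
    }

  // One possible cleanup hook: drop every counter that belongs to a topic.
  def removeTopic(topic: String): Unit = {
    perTopic.remove(topic)
    perTopicAcks.keySet().removeIf((k: (String, Short)) => k._1 == topic)
  }
}
```

Because every produce request updates both maps, the per-acks values for a topic always sum to the topic-level value. The number of extra entries is bounded by three per topic (one for each of {{acks=0}}, {{acks=1}}, {{acks=-1}}).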

----
h3. *Example Metrics Output*


{{kafka.server:type=BrokerTopicMetrics,name=BytesInPerSec,topic=perf-test,ack=0
kafka.server:type=BrokerTopicMetrics,name=BytesInPerSec,topic=perf-test,ack=1
kafka.server:type=BrokerTopicMetrics,name=BytesInPerSec,topic=perf-test,ack=-1}}
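
On the consuming side, a monitoring tool could read the new dimension back out of names shaped like the examples above. The following is a minimal sketch using only the JDK's {{javax.management.ObjectName}}; the helper object {{AcksMetricNames}} is hypothetical, and the {{ack}} key is assumed to appear exactly as in the example names.

```scala
import javax.management.ObjectName

// Hypothetical helper for tooling that reads the proposed metric names.
object AcksMetricNames {
  // Returns the acks value carried by a metric name, or None for the
  // existing topic-level names that have no "ack" key.
  def acksOf(mbeanName: String): Option[Short] =
    Option(new ObjectName(mbeanName).getKeyProperty("ack")).map(_.toShort)

  // Groups metric names by their acks value; topic-level names land under None.
  def groupByAcks(mbeanNames: Seq[String]): Map[Option[Short], Seq[String]] =
    mbeanNames.groupBy(acksOf)
}
```

Names without the {{ack}} key map to {{None}}, so existing topic-level metrics and the new per-acks metrics can be handled by the same code path.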
----
h3. *Compatibility & Impact*
 * No breaking changes to existing metrics.

 * Existing metric names and topic-level aggregation remain unaffected.

 * New metrics are additive and optional.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
