Hello Kafka Community,

I would like to start a discussion on KIP-1344, which proposes adding a few broker-side metrics to improve the observability of transactions.
KIP: https://cwiki.apache.org/confluence/display/KAFKA/KIP-1344%3A+Add+Transaction+Coordinator+Metrics+for+Transaction+State+Log+Appends+and+Marker+Completion

Today, when transactions are slow to complete, it is difficult to tell whether the bottleneck is appending transaction state transitions to __transaction_state, sending transaction markers to data partition leaders, or retrying final transaction state log appends.

This KIP proposes adding metrics for:
- transaction state log append latency
- transaction state log append errors
- transactions waiting for marker completion
- retriable partition-level marker failures

The goal is to make transaction finalization issues easier to diagnose without adding per-transactional-id metrics or scan-based gauges.

Looking forward to the community's feedback!

Best regards,
Sanghyeok An
