Hi Jose,

Thanks for the KIP.

The approach sounds reasonable. By the way, I think one of the gaps we have today is when the leader gets partitioned from the remaining voters. I believe it continues acting as a leader indefinitely. I was considering whether this periodic write can address the issue. Basically it can be used to force a leader to prove it is still the leader by committing some data. Say, for example, that the leader fails to commit the record after the fetch timeout expires, then perhaps it could start a new election. What do you think?

A couple additional questions:

- What is the default value for `metadata.monitor.write.interval.ms`? Also, I'm wondering if `controller` would be a more suitable prefix?
- Could we avoid letting BrokerMetadataPublisher escape into the metric name? Letting the classnames leak into the metrics tends to cause compatibility issues over time.

Best,
Jason

On Fri, May 6, 2022 at 12:02 PM José Armando García Sancio <jsan...@confluent.io.invalid> wrote:

> Hi all,
>
> I created a KIP for adding a mechanism to monitor the health of the
> KRaft Controller quorum through metrics. See KIP-835:
> https://cwiki.apache.org/confluence/x/0xShD
>
> Thanks for your feedback,
> -José
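
P.S. To make the leader check suggested above a bit more concrete, here is a rough sketch of the idea only. Nothing in it is existing KRaft code: the class `LeaderLivenessCheck`, its methods, and the 2000 ms timeout are all hypothetical names and values chosen for illustration. The leader notes the time whenever a record commits, and if no commit has landed within the fetch timeout it would give up leadership and start a new election.

```python
from dataclasses import dataclass


@dataclass
class LeaderLivenessCheck:
    """Hypothetical helper (not a KRaft class): decides whether a leader
    has committed recently enough to keep acting as leader."""

    fetch_timeout_ms: int    # assumed to mirror the quorum fetch timeout
    last_commit_ms: int = 0  # time of the most recent successful commit

    def on_commit(self, now_ms: int) -> None:
        # Called when the periodic monitoring record (or any record) commits.
        self.last_commit_ms = now_ms

    def should_resign(self, now_ms: int) -> bool:
        # True once more than fetch_timeout_ms has passed without a commit,
        # i.e. the leader could not prove it still reaches a majority.
        return now_ms - self.last_commit_ms > self.fetch_timeout_ms


# Illustrative values only: commit at t=1000 ms with a 2000 ms timeout.
check = LeaderLivenessCheck(fetch_timeout_ms=2000)
check.on_commit(now_ms=1000)
still_leader_at_2500 = not check.should_resign(now_ms=2500)
resign_at_3001 = check.should_resign(now_ms=3001)
```

Whether the right reaction is to resign, step down to candidate, or something else is exactly the open question; the sketch only shows where the periodic write would plug in.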