Hello, we have been running a 4-broker cluster for more than a year. Over the past few months, one of our brokers has tended to slip into a state where it is no longer the leader for any partition. This has started happening alarmingly often.

- We've been on version 2.10-0.8.2.1 for a year and 2 months now
- When leadership drifts away from the broker, we run the preferred leader election and it comes back to normal. Then it happens again
- Even in the "not leader" state, the broker is in the ISR for some partitions
- We've even restarted the broker on some occasions

Has anyone seen this happen on the version we are on? We did not have any issues like this for a very long time.

Message:
[ERROR] [ReplicaFetcherThread-3-3] [kafka.server.ReplicaFetcherThread] [ReplicaFetcherThread-3-3], Error for partition [prd-xxxxxx,19] to broker 3:class kafka.common.NotLeaderForPartitionException

Thanks,
Ashwin Jayaprakash.