Hello all,

I got the following exception in my Kafka consumer:

2019-03-06 10:53:34,416 WARN [LogContext.java:246] [Consumer clientId=ContractComputerConsumer_prod-8, groupId=consumer_2] Error while fetching metadata with correlation id 22904 : {topic1_prod=UNKNOWN_TOPIC_OR_PARTITION}

I checked: all my ZooKeeper instances are available, and so are my Kafka brokers. However, this is what I see in the topic description:

Topic:contract_prod PartitionCount:1 ReplicationFactor:3 Configs:
    Topic: contract_prod Partition: 0 Leader: 1 Replicas: 4,1,2 Isr: 1,2

Shouldn't replica 4 be in the ISR?

I can see this in the logs of Kafka broker 4:

[2019-03-07 13:38:11,486] WARN [Replica Manager on Broker 4]: While recording the replica LEO, the partition topic4_prod-0 hasn't been created. (kafka.server.ReplicaManager)
[2019-03-07 13:38:11,486] WARN [Replica Manager on Broker 4]: While recording the replica LEO, the partition topic3_prod-0 hasn't been created. (kafka.server.ReplicaManager)
[2019-03-07 13:38:11,486] WARN [Replica Manager on Broker 4]: While recording the replica LEO, the partition topic2_prod-0 hasn't been created. (kafka.server.ReplicaManager)
[2019-03-07 13:38:11,464] ERROR [ReplicaFetcherThread-0-1], Error for partition [__consumer_offsets,5] to broker 1:org.apache.kafka.common.errors.NotLeaderForPartitionException: This server is not the leader for that topic-partition. (kafka.server.ReplicaFetcherThread)

There are also several errors like the following on the other Kafka brokers:

[2019-03-07 13:39:55,826] ERROR [ReplicaFetcherThread-0-2], Error for partition [__consumer_offsets,46] to broker 2:org.apache.kafka.common.errors.NotLeaderForPartitionException: This server is not the leader for that topic-partition. (kafka.server.ReplicaFetcherThread)

I also found the following log trace in :

[2018-12-07 14:13:27,722] WARN [Controller-2-to-broker-2-send-thread], Controller 2's connection to broker broker2:9092 (id: 2 rack: null) was unsuccessful (kafka.controller.RequestSendThread)
java.io.IOException: Connection to broker2:9092 (id: 2 rack: null) failed
    at kafka.utils.NetworkClientBlockingOps$.awaitReady$1(NetworkClientBlockingOps.scala:8

Do you have any idea what the problem is? How can I restore the situation? This is a production environment.

Thanks in advance to anyone who can help.
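
To make the ISR question concrete, here is a minimal sketch in Python that compares the Replicas and Isr fields of a partition line from the topic description above. The function name replicas_out_of_sync is made up for illustration and is not part of any Kafka tooling; it only parses text in the format pasted in this post.

```python
import re
from typing import List


def replicas_out_of_sync(describe_line: str) -> List[int]:
    """Return broker ids listed under Replicas but missing from Isr.

    Hypothetical helper: it assumes a partition line in the same
    format as the topic description pasted in this post.
    """
    replicas = re.search(r"Replicas:\s*([\d,]+)", describe_line)
    isr = re.search(r"Isr:\s*([\d,]+)", describe_line)
    if not replicas or not isr:
        raise ValueError("line has no Replicas/Isr fields")
    replica_ids = [int(x) for x in replicas.group(1).split(",")]
    isr_ids = {int(x) for x in isr.group(1).split(",")}
    # Keep the order of the Replicas field so the preferred leader comes first.
    return [r for r in replica_ids if r not in isr_ids]


line = "Topic: contract_prod Partition: 0 Leader: 1 Replicas: 4,1,2 Isr: 1,2"
print(replicas_out_of_sync(line))  # prints [4]
```

For the contract_prod line this reports broker 4 as the only assigned replica that is not in sync, which matches what the description shows.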

Best regards,
Adrien