Hi All, I have an 8-node Kafka cluster (broker.id 1..8). On this cluster I have a topic "wordcount", which has 8 partitions with a replication factor of 3.
A describe of the topic wordcount shows:

# bin/kafka-topics.sh --describe --zookeeper tr-pan-hclstr-08.amers1b.ciscloud:2181/kafka/kafka-clstr-01 --topic wordcount
Topic:wordcount PartitionCount:8 ReplicationFactor:3 Configs:
    Topic: wordcount Partition: 0 Leader: 6 Replicas: 7,6,8 Isr: 6,7,8
    Topic: wordcount Partition: 1 Leader: 7 Replicas: 8,7,1 Isr: 7
    Topic: wordcount Partition: 2 Leader: 8 Replicas: 1,8,2 Isr: 8
    Topic: wordcount Partition: 3 Leader: 3 Replicas: 2,1,3 Isr: 3
    Topic: wordcount Partition: 4 Leader: 3 Replicas: 3,2,4 Isr: 3,2,4
    Topic: wordcount Partition: 5 Leader: 3 Replicas: 4,3,5 Isr: 3,5
    Topic: wordcount Partition: 6 Leader: 6 Replicas: 5,4,6 Isr: 6,5
    Topic: wordcount Partition: 7 Leader: 6 Replicas: 6,5,7 Isr: 6,5,7

I wrote a simple producer to write to this topic. However, when I run it, I get these messages in the logs:

2014-10-13 19:02:32,459 INFO [main] kafka.client.ClientUtils$: Fetching metadata from broker id:0,host:tr-pan-hclstr-11.amers1b.ciscloud,port:9092 with correlation id 0 for 1 topic(s) Set(wordcount)
2014-10-13 19:02:32,464 INFO [main] kafka.producer.SyncProducer: Connected to tr-pan-hclstr-11.amers1b.ciscloud:9092 for producing
2014-10-13 19:02:32,551 INFO [main] kafka.producer.SyncProducer: Disconnecting from tr-pan-hclstr-11.amers1b.ciscloud:9092
2014-10-13 19:02:32,611 WARN [main] kafka.producer.BrokerPartitionInfo: Error while fetching metadata partition 4 leader: none replicas: 3 (tr-pan-hclstr-13.amers1b.ciscloud:9092),2 (tr-pan-hclstr-12.amers1b.ciscloud:9092),4 (tr-pan-hclstr-14.amers1b.ciscloud:9092) isr: isUnderReplicated: true for topic partition [wordcount,4]: [class kafka.common.LeaderNotAvailableException]
2014-10-13 19:02:33,505 INFO [main] kafka.producer.SyncProducer: Connected to tr-pan-hclstr-15.amers1b.ciscloud:9092 for producing
2014-10-13 19:02:33,543 WARN [main] kafka.producer.async.DefaultEventHandler: Produce request with correlation id 20611 failed due to [wordcount,5]: kafka.common.NotLeaderForPartitionException,[wordcount,6]: kafka.common.NotLeaderForPartitionException,[wordcount,7]: kafka.common.NotLeaderForPartitionException
2014-10-13 19:02:33,694 INFO [main] kafka.producer.SyncProducer: Connected to tr-pan-hclstr-18.amers1b.ciscloud:9092 for producing
2014-10-13 19:02:33,725 WARN [main] kafka.producer.async.DefaultEventHandler: Produce request with correlation id 20612 failed due to [wordcount,0]: kafka.common.NotLeaderForPartitionException
2014-10-13 19:02:33,861 INFO [main] kafka.producer.SyncProducer: Connected to tr-pan-hclstr-11.amers1b.ciscloud:9092 for producing
2014-10-13 19:02:33,983 WARN [main] kafka.producer.async.DefaultEventHandler: Failed to send data since partitions [wordcount,4] don't have a leader

Obviously something is terribly wrong. I am quite new to Kafka, so these messages don't make much sense to me, beyond telling me that some of the partitions don't have a leader. Could somebody be kind enough to explain the above messages?

A few more questions and notes:

(1) How does one get into this state?
(2) How can I get out of this state?
(3) I have set auto.leader.rebalance.enable=true on all brokers. Shouldn't the partitions be balanced across all the brokers?
(4) I can see that the Kafka service is running on all 8 nodes. (I used ps ax -o "pid pgid args" and I can see the Kafka Java process.)
(5) Is there a way I can force a re-balance?

Regards,
Jacob
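
For readers following along, the describe output above can be summarized mechanically. What follows is only a minimal sketch in Python, assuming the partition rows are copied verbatim from that output. The names DESCRIBE_OUTPUT, ROW and parse_rows are made up for illustration and are not part of Kafka. It relies on the Kafka convention that the first broker in a partition's Replicas list is its preferred leader.

```python
import re
from collections import Counter

# Partition rows copied from the kafka-topics.sh --describe output above.
DESCRIBE_OUTPUT = """
Topic: wordcount Partition: 0 Leader: 6 Replicas: 7,6,8 Isr: 6,7,8
Topic: wordcount Partition: 1 Leader: 7 Replicas: 8,7,1 Isr: 7
Topic: wordcount Partition: 2 Leader: 8 Replicas: 1,8,2 Isr: 8
Topic: wordcount Partition: 3 Leader: 3 Replicas: 2,1,3 Isr: 3
Topic: wordcount Partition: 4 Leader: 3 Replicas: 3,2,4 Isr: 3,2,4
Topic: wordcount Partition: 5 Leader: 3 Replicas: 4,3,5 Isr: 3,5
Topic: wordcount Partition: 6 Leader: 6 Replicas: 5,4,6 Isr: 6,5
Topic: wordcount Partition: 7 Leader: 6 Replicas: 6,5,7 Isr: 6,5,7
"""

# Hypothetical helper pattern: matches one partition row of the output.
ROW = re.compile(
    r"Partition:\s*(\d+)\s+Leader:\s*(-?\d+)\s+Replicas:\s*([\d,]+)\s+Isr:\s*([\d,]*)"
)


def parse_rows(text):
    """Return one dict per partition row found in the describe output."""
    rows = []
    for m in ROW.finditer(text):
        rows.append({
            "partition": int(m.group(1)),
            "leader": int(m.group(2)),
            "replicas": [int(x) for x in m.group(3).split(",")],
            "isr": [int(x) for x in m.group(4).split(",") if x],
        })
    return rows


rows = parse_rows(DESCRIBE_OUTPUT)

# Partitions whose current leader is not the first (preferred) replica.
not_preferred = [r["partition"] for r in rows if r["leader"] != r["replicas"][0]]

# Partitions whose ISR is missing at least one assigned replica.
under_replicated = [
    r["partition"] for r in rows if set(r["isr"]) != set(r["replicas"])
]

# How many partitions each broker currently leads.
leader_counts = Counter(r["leader"] for r in rows)

print("leader is not the preferred replica:", not_preferred)
print("ISR smaller than replica list:", under_replicated)
print("partitions led per broker:", dict(leader_counts))
```

On the output above, this lists partitions 0, 1, 2, 3, 5 and 6 as led by a non-preferred replica and partitions 1, 2, 3, 5 and 6 as having a shrunken ISR. All leadership sits on brokers 3, 6, 7 and 8.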