Also, which version of Kafka are you using?

Thanks,

Jun

On Tue, Oct 14, 2014 at 5:31 PM, Jun Rao <jun...@gmail.com> wrote:

> The following is a bit weird. It indicates no leader for partition 4,
> which is inconsistent with what describe-topic shows.
>
> 2014-10-13 19:02:32,611 WARN [main] kafka.producer.BrokerPartitionInfo: Error while fetching metadata partition 4 leader: none replicas: 3 (tr-pan-hclstr-13.amers1b.ciscloud:9092),2 (tr-pan-hclstr-12.amers1b.ciscloud:9092),4 (tr-pan-hclstr-14.amers1b.ciscloud:9092) isr: isUnderReplicated: true for topic partition [wordcount,4]: [class kafka.common.LeaderNotAvailableException]
>
> Any error in the controller and the state-change log? Do you see broker 3
> marked as dead in the controller log? Also, could you check if the broker
> registration in ZK
> (https://cwiki.apache.org/confluence/display/KAFKA/Kafka+data+structures+in+Zookeeper)
> has the correct host/port?
>
> Thanks,
>
> Jun
>
> On Mon, Oct 13, 2014 at 5:35 PM, Abraham Jacob <abe.jac...@gmail.com>
> wrote:
>
>> Hi All,
>>
>> I have an 8-node Kafka cluster (broker.id - 1..8). On this cluster I have
>> a topic "wordcount", which has 8 partitions with a replication factor of 3.
>>
>> So a describe of topic wordcount shows:
>>
>> # bin/kafka-topics.sh --describe --zookeeper tr-pan-hclstr-08.amers1b.ciscloud:2181/kafka/kafka-clstr-01 --topic wordcount
>>
>> Topic:wordcount  PartitionCount:8  ReplicationFactor:3  Configs:
>>   Topic: wordcount  Partition: 0  Leader: 6  Replicas: 7,6,8  Isr: 6,7,8
>>   Topic: wordcount  Partition: 1  Leader: 7  Replicas: 8,7,1  Isr: 7
>>   Topic: wordcount  Partition: 2  Leader: 8  Replicas: 1,8,2  Isr: 8
>>   Topic: wordcount  Partition: 3  Leader: 3  Replicas: 2,1,3  Isr: 3
>>   Topic: wordcount  Partition: 4  Leader: 3  Replicas: 3,2,4  Isr: 3,2,4
>>   Topic: wordcount  Partition: 5  Leader: 3  Replicas: 4,3,5  Isr: 3,5
>>   Topic: wordcount  Partition: 6  Leader: 6  Replicas: 5,4,6  Isr: 6,5
>>   Topic: wordcount  Partition: 7  Leader: 6  Replicas: 6,5,7  Isr: 6,5,7
>>
>> I wrote a simple producer to write to this topic. However, when running
>> it, I get these messages in the logs -
>>
>> 2014-10-13 19:02:32,459 INFO [main] kafka.client.ClientUtils$: Fetching metadata from broker id:0,host:tr-pan-hclstr-11.amers1b.ciscloud,port:9092 with correlation id 0 for 1 topic(s) Set(wordcount)
>> 2014-10-13 19:02:32,464 INFO [main] kafka.producer.SyncProducer: Connected to tr-pan-hclstr-11.amers1b.ciscloud:9092 for producing
>> 2014-10-13 19:02:32,551 INFO [main] kafka.producer.SyncProducer: Disconnecting from tr-pan-hclstr-11.amers1b.ciscloud:9092
>> 2014-10-13 19:02:32,611 WARN [main] kafka.producer.BrokerPartitionInfo: Error while fetching metadata partition 4 leader: none replicas: 3 (tr-pan-hclstr-13.amers1b.ciscloud:9092),2 (tr-pan-hclstr-12.amers1b.ciscloud:9092),4 (tr-pan-hclstr-14.amers1b.ciscloud:9092) isr: isUnderReplicated: true for topic partition [wordcount,4]: [class kafka.common.LeaderNotAvailableException]
>> 2014-10-13 19:02:33,505 INFO [main] kafka.producer.SyncProducer: Connected to tr-pan-hclstr-15.amers1b.ciscloud:9092 for producing
>> 2014-10-13 19:02:33,543 WARN [main] kafka.producer.async.DefaultEventHandler: Produce request with correlation id 20611 failed due to [wordcount,5]: kafka.common.NotLeaderForPartitionException,[wordcount,6]: kafka.common.NotLeaderForPartitionException,[wordcount,7]: kafka.common.NotLeaderForPartitionException
>> 2014-10-13 19:02:33,694 INFO [main] kafka.producer.SyncProducer: Connected to tr-pan-hclstr-18.amers1b.ciscloud:9092 for producing
>> 2014-10-13 19:02:33,725 WARN [main] kafka.producer.async.DefaultEventHandler: Produce request with correlation id 20612 failed due to [wordcount,0]: kafka.common.NotLeaderForPartitionException
>> 2014-10-13 19:02:33,861 INFO [main] kafka.producer.SyncProducer: Connected to tr-pan-hclstr-11.amers1b.ciscloud:9092 for producing
>> 2014-10-13 19:02:33,983 WARN [main] kafka.producer.async.DefaultEventHandler: Failed to send data since partitions [wordcount,4] don't have a leader
>>
>> Obviously something is terribly wrong... I am quite new to Kafka, hence
>> these messages don't make any sense to me, except for the fact that they
>> are telling me that some of the partitions don't have any leader.
>>
>> Could somebody be kind enough to explain the above messages?
>>
>> A few more questions -
>>
>> (1) How does one get into this state?
>> (2) How can I get out of this state?
>> (3) I have set auto.leader.rebalance.enable=true on all brokers. Shouldn't
>> the partitions be balanced across all the brokers?
>> (4) I can see that the Kafka service is running on all 8 nodes. (I used
>> ps ax -o "pid pgid args" and I can see the Kafka Java process.)
>> (5) Is there a way I can force a re-balance?
>>
>> Regards,
>> Jacob
>>
>> --
>> ~
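
For anyone reading this thread later: the describe output quoted above can be checked mechanically for the two conditions that questions (3) and (5) are about, namely partitions whose leader is not the preferred replica (the first broker in the Replicas list) and partitions whose ISR is smaller than the replica set. The following is only a rough sketch in Python, not anything that ships with Kafka. The names DESCRIBE_OUTPUT, parse_describe and summarize are made up for illustration, the sample rows are copied from the output quoted in this thread, and treating a leader of "none" or "-1" as "no leader" is an assumption about how the tool prints that case.

```python
import re

# Rows copied from the kafka-topics.sh --describe output quoted in this thread.
DESCRIBE_OUTPUT = """\
Topic:wordcount PartitionCount:8 ReplicationFactor:3 Configs:
Topic: wordcount Partition: 0 Leader: 6 Replicas: 7,6,8 Isr: 6,7,8
Topic: wordcount Partition: 1 Leader: 7 Replicas: 8,7,1 Isr: 7
Topic: wordcount Partition: 2 Leader: 8 Replicas: 1,8,2 Isr: 8
Topic: wordcount Partition: 3 Leader: 3 Replicas: 2,1,3 Isr: 3
Topic: wordcount Partition: 4 Leader: 3 Replicas: 3,2,4 Isr: 3,2,4
Topic: wordcount Partition: 5 Leader: 3 Replicas: 4,3,5 Isr: 3,5
Topic: wordcount Partition: 6 Leader: 6 Replicas: 5,4,6 Isr: 6,5
Topic: wordcount Partition: 7 Leader: 6 Replicas: 6,5,7 Isr: 6,5,7
"""

# Matches one per-partition row; the summary header row does not match
# because it contains "PartitionCount:" rather than "Partition:".
ROW = re.compile(
    r"Partition:\s*(\d+)\s+Leader:\s*(-?\d+|none)\s+"
    r"Replicas:\s*([\d,]+)\s+Isr:\s*([\d,]*)"
)


def parse_describe(text):
    """Turn describe output into a list of per-partition dicts."""
    rows = []
    for line in text.splitlines():
        match = ROW.search(line)
        if not match:
            continue
        partition, leader, replicas, isr = match.groups()
        rows.append({
            "partition": int(partition),
            # Assumption: "none" or "-1" both mean the partition has no leader.
            "leader": None if leader in ("none", "-1") else int(leader),
            "replicas": [int(b) for b in replicas.split(",")],
            "isr": [int(b) for b in isr.split(",") if b],
        })
    return rows


def summarize(rows):
    """Return (partitions not led by the preferred replica,
    partitions whose ISR is missing at least one replica)."""
    not_preferred = [r["partition"] for r in rows
                     if r["leader"] != r["replicas"][0]]
    under_replicated = [r["partition"] for r in rows
                        if set(r["isr"]) != set(r["replicas"])]
    return not_preferred, under_replicated


not_preferred, under_replicated = summarize(parse_describe(DESCRIBE_OUTPUT))
print(not_preferred)     # [0, 1, 2, 3, 5, 6]
print(under_replicated)  # [1, 2, 3, 5, 6]
```

On the output in this thread, six of the eight partitions are led by a broker other than the preferred replica and five have a shrunken ISR. As far as I know, leadership can only move back to a preferred replica that is in the ISR, so the under-replicated partitions would need their missing replicas to catch up first; the bin/kafka-preferred-replica-election.sh tool bundled with 0.8.x is the usual way to trigger the election by hand.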