[ https://issues.apache.org/jira/browse/KAFKA-2011?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14353681#comment-14353681 ]
Jiangjie Qin commented on KAFKA-2011:
-------------------------------------

[~kzakee] The log you pasted shows several controller fail-overs. On each controller migration, the new controller checks a ZooKeeper path to see whether any partition is undergoing preferred leader election. The log might be a little confusing: even when it says "starting preferred replica leader election ...", it might be doing preferred leader election for no partitions at all. The really interesting part here is why there were several controller migrations. If you can upload the complete controller.log file, we can dig into it further.

> Rebalance with auto.leader.rebalance.enable=false
> --------------------------------------------------
>
>                 Key: KAFKA-2011
>                 URL: https://issues.apache.org/jira/browse/KAFKA-2011
>             Project: Kafka
>          Issue Type: Bug
>          Components: core
>    Affects Versions: 0.8.2.0
>         Environment: 5 Hosts of below config:
> "x86_64" "32-bit, 64-bit" "Little Endian" "24 GenuineIntel CPUs Model 44 1600.000MHz" "RAM 189 GB" GNU/Linux
>            Reporter: K Zakee
>            Priority: Blocker
>
> Started with a clean 0.8.2 cluster with 5 brokers.
> Setting the properties as below:
> auto.leader.rebalance.enable=false
> controlled.shutdown.enable=true
> controlled.shutdown.max.retries=1
> controlled.shutdown.retry.backoff.ms=5000
> default.replication.factor=3
> log.cleaner.enable=true
> log.cleaner.threads=5
> log.cleanup.policy=delete
> log.flush.scheduler.interval.ms=3000
> log.retention.minutes=1440
> log.segment.bytes=1073741824
> message.max.bytes=1000000
> num.io.threads=14
> num.network.threads=14
> num.partitions=10
> queued.max.requests=500
> num.replica.fetchers=4
> replica.fetch.max.bytes=1048576
> replica.fetch.min.bytes=51200
> replica.lag.max.messages=5000
> replica.lag.time.max.ms=30000
> replica.fetch.wait.max.ms=1000
> fetch.purgatory.purge.interval.requests=5000
> producer.purgatory.purge.interval.requests=5000
> delete.topic.enable=true
>
> Logs show rebalance happening well up to 24 hours after the start:
>
> [2015-03-07 16:52:48,969] INFO [Controller 2]: Resuming preferred replica election for partitions: (kafka.controller.KafkaController)
> [2015-03-07 16:52:48,969] INFO [Controller 2]: Partitions that completed preferred replica election: (kafka.controller.KafkaController)
> …
> [2015-03-07 12:07:06,783] INFO [Controller 4]: Resuming preferred replica election for partitions: (kafka.controller.KafkaController)
> ...
> [2015-03-07 09:10:41,850] INFO [Controller 3]: Resuming preferred replica election for partitions: (kafka.controller.KafkaController)
> ...
> [2015-03-07 08:26:56,396] INFO [Controller 1]: Starting preferred replica leader election for partitions (kafka.controller.KafkaController)
> ...
> [2015-03-06 16:52:59,506] INFO [Controller 2]: Partitions undergoing preferred replica election: (kafka.controller.KafkaController)

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
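The ZooKeeper check the comment describes can be reproduced by hand. A minimal sketch, assuming a ZooKeeper ensemble reachable at localhost:2181 and the standard scripts shipped in the Kafka 0.8.x distribution (the znode names below are the ones the 0.8.x controller uses):

```shell
# Inspect the znode the new controller consults on failover; if it exists,
# the partitions listed in it are still undergoing preferred replica election.
bin/zookeeper-shell.sh localhost:2181 get /admin/preferred_replica_election

# See which broker currently holds the controller role.
bin/zookeeper-shell.sh localhost:2181 get /controller

# The controller epoch increments on every controller migration, so a value
# well above 1 on a young cluster confirms repeated fail-overs.
bin/zookeeper-shell.sh localhost:2181 get /controller_epoch
```

These commands are read-only, so they are safe to run against a live cluster while correlating with the controller.log timestamps above.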