Hi guys,
today I observed some very strange behavior of the auto leader rebalance
feature after using the reassign partitions tool.
For some reason, only the first two of my six brokers are now used as
leaders.
Example:
# ./kafka-topics.sh --zookeeper xxx --describe --topic Search
Topic:Search PartitionCount:10 ReplicationFactor:3 Configs:
Topic: Search Partition: 0 Leader: 1 Replicas: 1,3,5 Isr: 5,3,1
Topic: Search Partition: 1 Leader: 2 Replicas: 2,4,6 Isr: 6,4,2
Topic: Search Partition: 2 Leader: 1 Replicas: 1,3,5 Isr: 5,3,1
Topic: Search Partition: 3 Leader: 2 Replicas: 2,4,6 Isr: 2,6,4
Topic: Search Partition: 4 Leader: 1 Replicas: 1,3,5 Isr: 3,5,1
Topic: Search Partition: 5 Leader: 2 Replicas: 2,4,6 Isr: 4,2,6
Topic: Search Partition: 6 Leader: 1 Replicas: 1,3,5 Isr: 5,3,1
Topic: Search Partition: 7 Leader: 2 Replicas: 2,4,6 Isr: 6,2,4
Topic: Search Partition: 8 Leader: 1 Replicas: 1,3,5 Isr: 5,3,1
Topic: Search Partition: 9 Leader: 2 Replicas: 2,4,6 Isr: 6,2,4
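To make the skew easier to see, the describe output can be tallied per broker. The following is a minimal Python sketch, not part of the Kafka tooling: the names `parse_describe`, `leader_counts` and `led_by_preferred` are made up for illustration, and the regular expression assumes the output format shown above (it tolerates an Isr list that wraps onto the next line).

```python
import re
from collections import Counter

# Matches one partition entry of `kafka-topics.sh --describe` output.
# \s* also spans line breaks, so a wrapped Isr list is still picked up.
ROW = re.compile(
    r"Partition:\s*(\d+)\s+Leader:\s*(-?\d+)\s+"
    r"Replicas:\s*([\d,]+)\s+Isr:\s*([\d,]*)"
)

def parse_describe(text):
    """Return a list of (partition, leader, replicas, isr) tuples."""
    rows = []
    for part, leader, replicas, isr in ROW.findall(text):
        rows.append((
            int(part),
            int(leader),
            [int(b) for b in replicas.split(",")],
            [int(b) for b in isr.split(",") if b],
        ))
    return rows

def leader_counts(rows):
    """Count how many partitions each broker currently leads."""
    return Counter(leader for _, leader, _, _ in rows)

def led_by_preferred(rows):
    """Count partitions whose leader is the first broker in Replicas."""
    return sum(1 for _, leader, replicas, _ in rows if leader == replicas[0])

# Two rows copied from the output above, one of them with a wrapped Isr list.
sample = """
Topic: Search Partition: 0 Leader: 1 Replicas: 1,3,5 Isr: 5,3,1
Topic: Search Partition: 1 Leader: 2 Replicas: 2,4,6 Isr:
6,4,2
"""
rows = parse_describe(sample)
print(leader_counts(rows))
print(led_by_preferred(rows), "of", len(rows), "partitions led by the first replica")
```

Fed the full output above, this counts five partitions each for brokers 1 and 2 and none for the others. In every row the leader is the first broker in the Replicas list, which is the one Kafka treats as the preferred replica.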
Prior to the partition reassignment, that topic looked like this (multiple
topics were updated with a single partition reassignment call):
Topic:Search PartitionCount:10 ReplicationFactor:3 Configs:
Topic: Search Partition: 0 Leader: 5 Replicas: 1,3,5 Isr: 5,3,1
Topic: Search Partition: 1 Leader: 6 Replicas: 2,4,6 Isr: 6,4,2
Topic: Search Partition: 2 Leader: 1 Replicas: 1,3,5 Isr: 1,5,3
Topic: Search Partition: 3 Leader: 2 Replicas: 2,4,6 Isr: 2,6,4
Topic: Search Partition: 4 Leader: 3 Replicas: 1,3,5 Isr: 1,3,5
Topic: Search Partition: 5 Leader: 4 Replicas: 2,4,6 Isr: 4,2,6
Topic: Search Partition: 6 Leader: 5 Replicas: 1,3,5 Isr: 5,1,3
Topic: Search Partition: 7 Leader: 6 Replicas: 2,4,6 Isr: 6,2,4
Topic: Search Partition: 8 Leader: 1 Replicas: 1,3,5 Isr: 5,1,3
Topic: Search Partition: 9 Leader: 2 Replicas: 2,4,6 Isr: 6,2,4
I would expect to see a similar distribution now.
But even if I manually shut down broker 1 and thus force a new leader
election, the situation only changes temporarily:
Topic:Search PartitionCount:10 ReplicationFactor:3 Configs:
Topic: Search Partition: 0 Leader: 5 Replicas: 1,3,5 Isr: 5,3
Topic: Search Partition: 1 Leader: 2 Replicas: 2,4,6 Isr: 6,4,2
Topic: Search Partition: 2 Leader: 5 Replicas: 1,3,5 Isr: 5,3
Topic: Search Partition: 3 Leader: 2 Replicas: 2,4,6 Isr: 2,6,4
Topic: Search Partition: 4 Leader: 3 Replicas: 1,3,5 Isr: 3,5
Topic: Search Partition: 5 Leader: 2 Replicas: 2,4,6 Isr: 4,2,6
Topic: Search Partition: 6 Leader: 5 Replicas: 1,3,5 Isr: 5,3
Topic: Search Partition: 7 Leader: 2 Replicas: 2,4,6 Isr: 6,2,4
Topic: Search Partition: 8 Leader: 5 Replicas: 1,3,5 Isr: 5,3
Topic: Search Partition: 9 Leader: 2 Replicas: 2,4,6 Isr: 6,2,4
As soon as I start broker 1 again, I see the same picture as in the
beginning (only brokers 1 and 2 are leaders for any of my partitions).
Even if I wait an hour, the picture still looks the same.
If I stop both broker 1 and broker 2, brokers 5 and 6 get most of the
leader roles in the cluster (together they then lead 51 of my 70
partitions), so even then it looks bad. Once I start brokers 1 and 2
again, they take over the leader roles for all partitions again.
Any ideas?
Configuration excerpt:
auto.leader.rebalance.enable=true
leader.imbalance.check.interval.seconds=300
leader.imbalance.per.broker.percentage=10
unclean.leader.election.enable=false
default.replication.factor=3
num.partitions=10
...
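For reference, here is a rough Python sketch of how the two leader.imbalance settings are commonly understood to work. It is an assumption about the controller's logic, not code taken from Kafka. For each broker it takes the partitions that list the broker first in Replicas (the preferred replica) and computes the share of those currently led by some other broker. A preferred replica election would be expected once that share exceeds leader.imbalance.per.broker.percentage. The function names are hypothetical, and the sample data covers only the Search topic, whereas the real check spans all topics.

```python
from collections import defaultdict

def imbalance_ratios(rows):
    """rows: iterable of (leader, replicas) pairs, one per partition.

    Returns {preferred_broker: share of its preferred partitions that
    are currently led by a different broker}.
    """
    total = defaultdict(int)
    not_led = defaultdict(int)
    for leader, replicas in rows:
        preferred = replicas[0]  # first replica in the list is the preferred one
        total[preferred] += 1
        if leader != preferred:
            not_led[preferred] += 1
    return {broker: not_led[broker] / total[broker] for broker in total}

def brokers_over_threshold(rows, percentage=10):
    """Brokers whose imbalance share exceeds the configured percentage."""
    ratios = imbalance_ratios(rows)
    return sorted(b for b, r in ratios.items() if r > percentage / 100)

# (leader, replicas) for topic Search, partitions 0..9, before the reassignment
before = [(5, [1, 3, 5]), (6, [2, 4, 6]), (1, [1, 3, 5]), (2, [2, 4, 6]),
          (3, [1, 3, 5]), (4, [2, 4, 6]), (5, [1, 3, 5]), (6, [2, 4, 6]),
          (1, [1, 3, 5]), (2, [2, 4, 6])]
# ... and after it: brokers 1 and 2 lead every partition they are preferred for
after = [(1, [1, 3, 5]), (2, [2, 4, 6])] * 5

print(imbalance_ratios(before), brokers_over_threshold(before))
print(imbalance_ratios(after), brokers_over_threshold(after))
```

Under this reading, the state after the reassignment counts as perfectly balanced, because only brokers 1 and 2 ever appear first in a replica list, so the rebalance check has nothing to correct.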
I am using Kafka 0.8.2.1 on RHEL 6.6 boxes with 7 topics of 10 partitions
each, 6 brokers, and 3 ZooKeeper servers.
Greetings
Valentin