Hi,
My Kafka version is 0.8.2.2. The replication factor is 2, and
auto.leader.rebalance.enable=true.
I stopped a broker in my cluster and started it again after a few minutes.
The broker was busy catching up on a huge lag and hit its 120MB/s disk
write limit. In addition, there are 23 partitions whose only live replica
is on this broker, so shortly after it came up it became the leader of
those 23 partitions. About 10 minutes after this broker came up, network
traffic and the messages-in rate on all brokers started to fall. Messages
in dropped to close to zero after another 5 minutes. Only when I stopped
this broker again were producers able to send messages to the cluster.
1. When a broker reaches its 120MB/s disk write limit, how does that affect
other brokers and producers?
2. What caused the problems on the other brokers and on the producers?
3. If the restarted broker (broker 3) had become the leader of NO
partitions after I restarted it, would it still cause problems if it
reached the disk write limit while catching up as described above?
4. If the answer to question 3 is 'yes', what should I do to safely restart
a broker that carries heavy traffic?
5. Does auto.leader.rebalance.enable=true do anything harmful?
Thanks