[ https://issues.apache.org/jira/browse/KAFKA-1120?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16113645#comment-16113645 ]

Onur Karaman commented on KAFKA-1120:
-------------------------------------

Alright I might know what's happening. Here's the red flag:
{code}
> grep -r "Newly added brokers" .
./kafka_2.11-0.11.0.0/logs/controller.log:[2017-08-03 13:40:09,121] INFO 
[Controller 1]: Newly added brokers: 1, deleted brokers: , all live brokers: 1 
(kafka.controller.KafkaController)
./kafka_2.11-0.11.0.0/logs/controller.log:[2017-08-03 13:40:27,172] INFO 
[Controller 1]: Newly added brokers: 2, deleted brokers: , all live brokers: 
1,2 (kafka.controller.KafkaController)
./kafka_2.11-0.11.0.0/logs/controller.log:[2017-08-03 13:47:15,215] INFO 
[Controller 1]: Newly added brokers: , deleted brokers: , all live brokers: 1,2 
(kafka.controller.KafkaController)
./kafka_2.11-0.11.0.0/logs/controller.log:[2017-08-03 13:47:17,927] INFO 
[Controller 1]: Newly added brokers: , deleted brokers: , all live brokers: 1,2 
(kafka.controller.KafkaController)
{code}

Here's the relevant code in BrokerChange.process:
{code}
val curBrokers = zkUtils.getAllBrokersInCluster().toSet
val curBrokerIds = curBrokers.map(_.id)
val liveOrShuttingDownBrokerIds = controllerContext.liveOrShuttingDownBrokerIds
val newBrokerIds = curBrokerIds -- liveOrShuttingDownBrokerIds
val deadBrokerIds = liveOrShuttingDownBrokerIds -- curBrokerIds
{code}
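Plugging the state from the logs above into that set arithmetic (a hypothetical illustration, not Kafka code), both differences come out empty:

```python
# Hypothetical illustration of the set arithmetic in BrokerChange.process.
# By the time the queued BrokerChange runs, the bounced broker (id 2) has
# already re-registered in zookeeper, so both views agree:
cur_broker_ids = {1, 2}            # brokers registered in zookeeper right now
live_or_shutting_down = {1, 2}     # controller's cached liveOrShuttingDownBrokerIds

new_broker_ids = cur_broker_ids - live_or_shutting_down
dead_broker_ids = live_or_shutting_down - cur_broker_ids

print(new_broker_ids, dead_broker_ids)  # set() set()
```

Empty on both sides matches the "Newly added brokers: , deleted brokers: " lines in the controller log.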

Basically the ControlledShutdown event took so long to process that the 
BrokerChange event for the killed broker (the 3rd BrokerChange in the above 
snippet) and the BrokerChange event for the restarted broker (the 4th 
BrokerChange in the above snippet) were queued up waiting for 
ControlledShutdown's completion. By the time these BrokerChange events get 
processed, the restarted broker is already re-registered in zookeeper, so the 
broker appears both in controllerContext.liveOrShuttingDownBrokerIds and among 
the brokers listed in zookeeper. As a result, the controller executes neither 
onBrokerFailure for the 3rd BrokerChange nor onBrokerJoin for the 4th.
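The race is easy to reproduce in miniature (a hypothetical sketch of the single-threaded event queue, not Kafka code): handlers read the *current* zookeeper state at processing time, not the state at enqueue time, so a death and restart that both happen while the queue is blocked cancel each other out.

```python
from collections import deque

# Hypothetical sketch of the single-threaded controller event queue.
zk_brokers = {1, 2}              # zookeeper's current broker registrations
ctx_live = {1, 2}                # controller's cached view
transitions = []                 # onBrokerFailure / onBrokerJoin calls made

def broker_change():
    new = zk_brokers - ctx_live
    dead = ctx_live - zk_brokers
    if new:
        transitions.append(("onBrokerJoin", sorted(new)))
    if dead:
        transitions.append(("onBrokerFailure", sorted(dead)))
    ctx_live.clear()
    ctx_live.update(zk_brokers)

queue = deque()

# While a slow ControlledShutdown holds the processor, broker 2 dies...
zk_brokers.discard(2)
queue.append(broker_change)      # 3rd BrokerChange enqueued
# ...and restarts before the queue drains:
zk_brokers.add(2)
queue.append(broker_change)      # 4th BrokerChange enqueued

while queue:                     # ControlledShutdown done; drain the queue
    queue.popleft()()

print(transitions)               # [] -- neither transition ever fires
```

Both queued BrokerChange events observe the same post-restart state, so the death and the rejoin are both invisible.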

I'm not sure of the fix. Broker generations as defined in the redesign doc in 
KAFKA-5027 would work but I'm not sure if it's strictly required.
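To sketch why broker generations would work (names and mechanism are my assumption here, not the KAFKA-5027 design itself): if the controller caches a per-broker generation, e.g. the czxid of the broker's ephemeral registration znode, a bounce is detectable even when the set of broker ids is unchanged.

```python
# Hypothetical sketch: detect a bounced broker via a per-broker generation
# (e.g. the czxid of its ephemeral znode -- assumed mechanism, not Kafka code).
cached = {1: 100, 2: 205}    # broker id -> generation the controller last saw
current = {1: 100, 2: 317}   # broker 2 re-registered, so its generation changed

new_or_bounced = {b for b, g in current.items() if cached.get(b) != g}
dead = set(cached) - set(current)

print(sorted(new_or_bounced), sorted(dead))  # [2] [] -- the bounce is visible
```

Comparing generations instead of bare ids turns the invisible restart into an ordinary join-after-failure.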

> Controller could miss a broker state change 
> --------------------------------------------
>
>                 Key: KAFKA-1120
>                 URL: https://issues.apache.org/jira/browse/KAFKA-1120
>             Project: Kafka
>          Issue Type: Bug
>          Components: core
>    Affects Versions: 0.8.1
>            Reporter: Jun Rao
>              Labels: reliability
>             Fix For: 1.0.0
>
>
> When the controller is in the middle of processing a task (e.g., preferred 
> leader election, broker change), it holds a controller lock. During this 
> time, a broker could have de-registered and re-registered itself in ZK. After 
> the controller finishes processing the current task, it will start processing 
> the logic in the broker change listener. However, it will see no broker 
> change and therefore won't do anything to the restarted broker. This broker 
> will be in a weird state since the controller doesn't inform it to become the 
> leader of any partition. Yet, the cached metadata in other brokers could 
> still list that broker as the leader for some partitions. Client requests 
> routed to that broker will then get a TopicOrPartitionNotExistException. This 
> broker will continue to be in this bad state until it's restarted again.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
