[ 
https://issues.apache.org/jira/browse/KAFKA-4442?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15726975#comment-15726975
 ] 

ASF GitHub Bot commented on KAFKA-4442:
---------------------------------------

Github user lindong28 closed the pull request at:

    https://github.com/apache/kafka/pull/2167


> Controller should grab lock when it is being initialized to avoid race 
> condition
> --------------------------------------------------------------------------------
>
>                 Key: KAFKA-4442
>                 URL: https://issues.apache.org/jira/browse/KAFKA-4442
>             Project: Kafka
>          Issue Type: Bug
>            Reporter: Dong Lin
>            Assignee: Dong Lin
>
> Currently the controller registers the broker change listener before sending 
> LeaderAndIsrRequests to live replicas. The call path looks like this:
> - onControllerFailover()
>   - partitionStateMachine.startup()
>     - triggerOnlinePartitionStateChange()
>       - handleStateChange(partition, OnlinePartition)
>         - electLeaderForPartition(partition)
>           - determines live replicas for this partition (step a)
>           - adds the partition to controllerContext.partitionLeadershipInfo (step b)
>           - sends LeaderAndIsrRequest to those live replicas for this partition
> However, if a broker registers itself in ZooKeeper between step (a) and 
> step (b), onBrokerStartup() will not send a LeaderAndIsrRequest to this 
> broker for this partition, because the partition is not yet in 
> controllerContext.partitionLeadershipInfo. onControllerFailover() will 
> not send a LeaderAndIsrRequest to this broker for this partition either, 
> because the broker was not considered live in step (a).
> The root cause is that onBrokerStartup() should only be executed after the 
> controller has finished onControllerFailover() and initialized its state. 
> Therefore the controller should hold controllerContext.controllerLock 
> for the duration of onControllerFailover().
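
The race and the proposed fix in the description above can be illustrated with a small, self-contained Java model. This is only a sketch under stated assumptions, not the Kafka controller code: the class ControllerSketch, the betweenStepAAndB hook, the broker and partition names, and the "broker/partition" bookkeeping are all hypothetical. It also assumes that the broker change callback already runs under the controller lock. The hook forces a broker to register between step (a) and step (b) so that the outcome is deterministic.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical, heavily simplified model of the controller state in the issue.
class ControllerSketch {
    private final ReentrantLock controllerLock = new ReentrantLock();
    private final Set<String> liveBrokers = new HashSet<>();
    private final Set<String> partitionLeadershipInfo = new HashSet<>();
    private final Set<String> sentRequests = new HashSet<>(); // "broker/partition"
    private final boolean lockDuringFailover;

    ControllerSketch(boolean lockDuringFailover, String initialBroker) {
        this.lockDuringFailover = lockDuringFailover;
        liveBrokers.add(initialBroker);
    }

    // Broker change callback. Assumption: it always takes the controller lock.
    void onBrokerStartup(String broker) {
        controllerLock.lock();
        try {
            liveBrokers.add(broker);
            // Only partitions already in partitionLeadershipInfo are announced.
            for (String partition : partitionLeadershipInfo) {
                sentRequests.add(broker + "/" + partition);
            }
        } finally {
            controllerLock.unlock();
        }
    }

    // Simplified failover; the hook runs between step (a) and step (b).
    void onControllerFailover(String partition, Runnable betweenStepAAndB) {
        if (lockDuringFailover) controllerLock.lock();
        try {
            List<String> live = new ArrayList<>(liveBrokers); // step (a)
            betweenStepAAndB.run();
            partitionLeadershipInfo.add(partition);           // step (b)
            for (String broker : live) {
                sentRequests.add(broker + "/" + partition);
            }
        } finally {
            if (lockDuringFailover) controllerLock.unlock();
        }
    }

    boolean wasSent(String broker, String partition) {
        return sentRequests.contains(broker + "/" + partition);
    }

    // Returns true if broker-2, which registers mid-failover, gets the request.
    static boolean run(boolean lockDuringFailover) {
        ControllerSketch c = new ControllerSketch(lockDuringFailover, "broker-1");
        Thread listener = new Thread(() -> c.onBrokerStartup("broker-2"));
        try {
            c.onControllerFailover("topic-0", () -> {
                listener.start();
                if (!lockDuringFailover) {
                    // No lock held: let the callback finish inside the window.
                    try {
                        listener.join();
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    }
                }
                // With the lock held, the callback blocks until failover ends.
            });
            listener.join();
        } catch (InterruptedException e) {
            throw new IllegalStateException(e);
        }
        return c.wasSent("broker-2", "topic-0");
    }

    public static void main(String[] args) {
        System.out.println("without lock: broker-2 notified = " + run(false));
        System.out.println("with lock:    broker-2 notified = " + run(true));
    }
}
```

In this model the unlocked run leaves broker-2 without a request for topic-0, because the callback sees an empty partitionLeadershipInfo and the failover path uses the stale live list from step (a). When failover holds the lock, the callback can only run after step (b), so it finds the partition and sends the request.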



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
