[
https://issues.apache.org/jira/browse/KAFKA-911?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13665258#comment-13665258
]
Jun Rao commented on KAFKA-911:
-------------------------------
If we just stop the replica to be shut down without sending a reduced ISR to
the leader, it will take replicaLagTimeMaxMs (defaults to 10s) before the
leader realizes that the follower is gone. Until then, no new messages can be
committed. The idea of letting the controller send a reduced ISR to the leader
is to allow the leader to commit new messages sooner. I'm not sure whether the
existing logic does this effectively, though. It seems to me that it would be
better to stop the replicas being shut down one at a time, after the leader is
moved. Maybe Joel can comment?
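
To illustrate the point about commits stalling, here is a minimal toy model,
not Kafka's actual code. It assumes a message is committed once every ISR
member has it, so the committed offset is the minimum log end offset across
the ISR. The class and method names (LeaderPartition, shrink_isr) are
hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class LeaderPartition:
    """Toy model of a partition leader; not Kafka's actual classes."""
    isr: set                                      # broker ids in the in-sync replica set
    log_end: dict = field(default_factory=dict)   # broker id -> log end offset

    def high_watermark(self) -> int:
        # Assumption: a message is committed once every ISR member has it,
        # so the committed offset is the minimum log end offset in the ISR.
        return min(self.log_end[b] for b in self.isr)

    def shrink_isr(self, broker_id: int) -> None:
        # Models the leader applying a reduced ISR, e.g. one sent by the controller.
        self.isr.discard(broker_id)


leader = LeaderPartition(isr={1, 2, 3}, log_end={1: 100, 2: 100, 3: 100})

# Broker 3 is stopped; brokers 1 and 2 keep replicating new messages.
leader.log_end[1] = 120
leader.log_end[2] = 120
stalled = leader.high_watermark()     # broker 3 is still in the ISR

# The leader learns of the reduced ISR right away instead of waiting for
# the lag timeout to expire.
leader.shrink_isr(3)
advanced = leader.high_watermark()
```

In this model the committed offset stays at 100 while broker 3 remains in the
ISR and moves to 120 as soon as the ISR is reduced, which is the window the
reduced-ISR request is meant to close.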
> Bug in controlled shutdown logic in controller leads to controller not
> sending out some state change request
> -------------------------------------------------------------------------------------------------------------
>
> Key: KAFKA-911
> URL: https://issues.apache.org/jira/browse/KAFKA-911
> Project: Kafka
> Issue Type: Bug
> Components: controller
> Affects Versions: 0.8
> Reporter: Neha Narkhede
> Assignee: Neha Narkhede
> Priority: Blocker
> Labels: kafka-0.8, p1
> Attachments: kafka-911-v1.patch
>
>
> The controlled shutdown logic in the controller first tries to move the
> leaders from the broker being shut down. Then it tries to remove the broker
> from the ISR list. During that operation, it does not synchronize on the
> controllerLock. This causes a race condition while dispatching data using the
> controller's channel manager.
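
A rough sketch of the kind of fix the description implies: every path that
dispatches requests takes the same lock. The description only says that the
ISR removal step does not synchronize on controllerLock; the shared request
batch, the class, and the request names below are assumptions made for
illustration and are not taken from the patch.

```python
import threading


class ToyController:
    """Hypothetical stand-in for the controller; all names are illustrative."""

    def __init__(self):
        self.controller_lock = threading.Lock()
        self.pending = []   # shared request batch (assumed shared state)
        self.sent = []      # requests handed to the channel manager

    def _dispatch(self, requests):
        # Build, send, and clear the shared batch as one critical section,
        # so a concurrent caller cannot clear requests it did not send.
        with self.controller_lock:
            self.pending.extend(requests)
            self.sent.extend(self.pending)
            self.pending.clear()

    def move_leaders(self, broker_id):
        self._dispatch([("LeaderAndIsr", broker_id, p) for p in range(50)])

    def remove_from_isr(self, broker_id):
        self._dispatch([("ShrinkIsr", broker_id, p) for p in range(50)])


controller = ToyController()
threads = [
    threading.Thread(target=controller.move_leaders, args=(3,)),
    threading.Thread(target=controller.remove_from_isr, args=(3,)),
]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

With both paths going through the locked section, all 100 requests reach
`sent` and none are left in or dropped from `pending`, whichever thread runs
first.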