[ 
https://issues.apache.org/jira/browse/KAFKA-532?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13491566#comment-13491566
 ] 

Jun Rao commented on KAFKA-532:
-------------------------------

Thanks for patch v4. A few more comments:

40. PartitionStateInfo: It seems that we need to send the controllerEpoc 
associated with this partition. Note that this epoc is different from the 
controllerEpoc in LeaderAndIsrRequest. The former is the epoc of the controller 
that last changed the leader or isr and will be used when broker updates the 
isr. The latter is the epoc of the controller that sends the request and will 
be used in ReplicaManager to decide which controller's decision to follow. We 
will need to change the controllerEpoc passed to makeLeader and makeFollower in 
ReplicaManager accordingly.

41. ReplicaManager: In stopReplicas() and becomeLeaderOrFollower(), it would be 
better to only update controllerEpoch when it's truly necessary, i.e., the new 
controllerEpoch is larger than the cached one (not equal). This is because 
updating a volatile variable is a bit expensive than updating a local variable 
since the update has to be exposed to other threads.

42. KafkaController: The approach in the new patch works. There are a few 
corner cases that we need to cover.
42.1. incrementControllerEpoch(): If the controllerEpoc path doesn't exist, we 
create the path using the initial epoc version without using conditional 
update. It is possible for 2 controllers to execute this logic simultaneously 
and both get the initial epoc version. One solution is to make sure the 
controller epoc path exists during context initialization. Then we can always 
use conditional update here.
42.2. ControllerContext: We need to initialize controllerEpoc by reading from 
ZK. We also need to make sure that we subscribe to the controllerEpoc path 
first and then read its value from ZK for initialization.
42.3. ControllerEpochListener: It's safer to set both the epoc and the ZK 
version using the value from ZkUtils.readData.

43. ControllerMovedException is missing in the patch
                
> Multiple controllers can co-exist during soft failures
> ------------------------------------------------------
>
>                 Key: KAFKA-532
>                 URL: https://issues.apache.org/jira/browse/KAFKA-532
>             Project: Kafka
>          Issue Type: Bug
>    Affects Versions: 0.8
>            Reporter: Neha Narkhede
>            Assignee: Neha Narkhede
>            Priority: Blocker
>              Labels: bugs
>         Attachments: kafka-532-v1.patch, kafka-532-v2.patch, 
> kafka-532-v3.patch, kafka-532-v4.patch
>
>   Original Estimate: 48h
>  Remaining Estimate: 48h
>
> If the current controller experiences an intermittent soft failure (GC pause) 
> in the middle of leader election or partition reassignment, a new controller 
> might get elected and start communicating new state change decisions to the 
> brokers. After recovering from the soft failure, the old controller might 
> continue sending some stale state change decisions to the brokers, resulting 
> in unexpected failures. We need to introduce a controller generation id that 
> increments with controller election. The brokers should reject any state 
> change requests by a controller with an older generation id.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

Reply via email to