[ https://issues.apache.org/jira/browse/KAFKA-532?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Neha Narkhede updated KAFKA-532:
--------------------------------

    Attachment: kafka-532-v1.patch

Introduced a controller generation (epoch) that increments after each 
successful controller election.

Changes include -
1. Guard zookeeper writes by the controller with the controller epoch. This 
includes initializing a leader/isr path, electing leader for a partition and 
shrinking the isr for a partition
2. Include controller epoch in the state change requests sent to the broker
3. Include logic to discard state change requests with a stale controller epoch 
on the brokers
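
A rough sketch of the idea behind change 1, for illustration only. It uses an 
in-memory map as a stand-in for zookeeper, and EpochGuardedStore, 
electController and guardedWrite are hypothetical names, not taken from the 
patch. The assumption is that a write is applied only when the caller's epoch 
matches the current controller epoch; the patch may implement the guard 
differently.

```java
import java.util.HashMap;
import java.util.Map;

// In-memory stand-in for the coordination store (assumption: the real
// patch performs these writes against zookeeper, not a local map).
final class EpochGuardedStore {
    private int controllerEpoch = 0;
    private final Map<String, String> data = new HashMap<>();

    // Called after a successful controller election; bumps the epoch
    // and returns the value the newly elected controller should use.
    synchronized int electController() {
        controllerEpoch += 1;
        return controllerEpoch;
    }

    // Applies the write only if the caller holds the current epoch.
    // A controller that was paused and superseded gets false back.
    synchronized boolean guardedWrite(int callerEpoch, String path, String value) {
        if (callerEpoch != controllerEpoch) {
            return false;
        }
        data.put(path, value);
        return true;
    }

    synchronized String read(String path) {
        return data.get(path);
    }
}
```

With this shape, a controller that resumes after a GC pause still holds its 
old epoch, so its leader/isr writes are refused once a newer controller has 
been elected.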

Testing
LeaderElectionTest: Added a unit test to send leader/isr request with a stale 
controller epoch and check if the broker discards the request and sends back 
the appropriate error code (StaleControllerEpochCode)
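
The scenario this test exercises can be sketched roughly as below. This is 
not the code from the patch: BrokerEpochCheck, handleLeaderAndIsrRequest and 
the NoError value are hypothetical names chosen for illustration; only 
StaleControllerEpochCode appears in the description above. The sketch assumes 
the broker remembers the highest controller epoch it has seen and rejects 
anything older.

```java
// Hypothetical broker-side check; names are illustrative only.
final class BrokerEpochCheck {
    // NoError is an assumed name for the success case.
    enum ErrorCode { NoError, StaleControllerEpochCode }

    // Highest controller epoch this broker has accepted so far.
    private int highestEpochSeen = 0;

    // Discards the request if it carries an epoch older than one
    // already seen; otherwise records the epoch and accepts it.
    synchronized ErrorCode handleLeaderAndIsrRequest(int controllerEpoch) {
        if (controllerEpoch < highestEpochSeen) {
            return ErrorCode.StaleControllerEpochCode;
        }
        highestEpochSeen = controllerEpoch;
        return ErrorCode.NoError;
    }
}
```

A request from epoch 2 followed by one from epoch 1 would return NoError and 
then StaleControllerEpochCode, which is the behaviour the unit test checks.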

                
> Multiple controllers can co-exist during soft failures
> ------------------------------------------------------
>
>                 Key: KAFKA-532
>                 URL: https://issues.apache.org/jira/browse/KAFKA-532
>             Project: Kafka
>          Issue Type: Bug
>    Affects Versions: 0.8
>            Reporter: Neha Narkhede
>            Assignee: Neha Narkhede
>            Priority: Blocker
>              Labels: bugs
>         Attachments: kafka-532-v1.patch
>
>   Original Estimate: 48h
>  Remaining Estimate: 48h
>
> If the current controller experiences an intermittent soft failure (GC pause) 
> in the middle of leader election or partition reassignment, a new controller 
> might get elected and start communicating new state change decisions to the 
> brokers. After recovering from the soft failure, the old controller might 
> continue sending some stale state change decisions to the brokers, resulting 
> in unexpected failures. We need to introduce a controller generation id that 
> increments with controller election. The brokers should reject any state 
> change requests by a controller with an older generation id.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
