[ 
https://issues.apache.org/jira/browse/KAFKA-574?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prashanth Menon updated KAFKA-574:
----------------------------------

    Attachment: KAFKA-574-v1.patch

I've attached a v1 patch for this guy - it's a relatively small change.

KafkaController:
- Removed updateLeaderAndIsrCache from onBrokerStartup because the partition 
state machine already reads ZK when issuing leader and isr requests.  The extra 
read is also unnecessary because there's no guarantee that the leader won't 
change between issuing the request and all brokers receiving it.  Each broker's 
local partition checks leaderEpoch when following/leading, so reading ZK in 
onBrokerStartup doesn't buy us anything.
- Removed updateLeaderAndIsrCache from onBrokerFailure.  After bringing all 
partitions with dead leaders offline, triggering the online partition state 
change reads ZK for each of those partitions, so reading ZK up front for all 
partitions isn't necessary here.
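
To make the leaderEpoch check concrete, here is a rough sketch of the 
broker-side rule as described in the issue: a request is ignored when the 
broker's epoch is already at or past the request's epoch.  This is not the 
code in the patch.  It is written in Python for brevity, and every name in it 
(LeaderAndIsrRequest, BrokerPartitionState, Broker, handle_leader_and_isr) is 
made up for illustration, as is the assumption that an unseen partition starts 
at epoch -1.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class LeaderAndIsrRequest:
    # Hypothetical stand-in for the controller's request, not Kafka's class.
    topic: str
    partition: int
    leader: int
    leader_epoch: int
    isr: List[int]


@dataclass
class BrokerPartitionState:
    # Hypothetical per-partition state kept locally by a broker.
    # Assumption: -1 means "no leader decision seen yet".
    leader: int = -1
    leader_epoch: int = -1
    isr: List[int] = field(default_factory=list)


class Broker:
    def __init__(self) -> None:
        self.partitions: Dict[Tuple[str, int], BrokerPartitionState] = {}

    def handle_leader_and_isr(self, req: LeaderAndIsrRequest) -> bool:
        """Apply the request unless the local epoch is at or past its epoch."""
        key = (req.topic, req.partition)
        state = self.partitions.setdefault(key, BrokerPartitionState())
        if state.leader_epoch >= req.leader_epoch:
            # Stale or duplicate request: keep the local state untouched.
            return False
        # Newer epoch: adopt the leader and isr chosen by the controller.
        state.leader = req.leader
        state.leader_epoch = req.leader_epoch
        state.isr = list(req.isr)
        return True
```

With a rule like this on the broker, a request built from slightly stale 
controller state is harmless: it either carries a newer epoch and is applied, 
or it is dropped.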

No tests were added since this effectively removes duplicate code; making sure 
the existing tests pass should be good enough.  Beyond that, there are some 
generic cleanups and small optimizations here and there.  Let me know what you 
think.
                
> KafkaController unnecessarily reads leaderAndIsr info from ZK
> -------------------------------------------------------------
>
>                 Key: KAFKA-574
>                 URL: https://issues.apache.org/jira/browse/KAFKA-574
>             Project: Kafka
>          Issue Type: Bug
>          Components: core
>    Affects Versions: 0.8
>            Reporter: Jun Rao
>            Assignee: Prashanth Menon
>            Priority: Blocker
>              Labels: bugs
>         Attachments: KAFKA-574-v1.patch
>
>   Original Estimate: 48h
>  Remaining Estimate: 48h
>
> KafkaController calls updateLeaderAndIsrCache() in onBrokerFailure(). This is 
> unnecessary since in onBrokerFailure(), we will make leader and isr change 
> anyway so there is no need to first read that information from ZK. Latency is 
> critical in onBrokerFailure() since it determines how quickly a leader can be 
> made online.
> Similarly, updateLeaderAndIsrCache() is called in onBrokerStartup() 
> unnecessarily. In this case, the controller does not change the leader or the 
> isr. It just needs to send the current leader and the isr info to the newly 
> started broker. We already cache the leader in the controller. The isr in 
> theory could be changed at any time by the leader. So, reading from ZK 
> doesn't guarantee that we get the latest isr anyway. Instead, we just need 
> the isr last selected by the controller (which can be cached together with 
> the leader in the controller). If the leader epoch in a broker is equal to or 
> larger than the epoch in the leaderAndIsr request, the broker can just ignore 
> it. Otherwise, the leader and the isr selected by the controller should be used. 

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
