[
https://issues.apache.org/jira/browse/KAFKA-3038?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15090765#comment-15090765
]
ASF GitHub Bot commented on KAFKA-3038:
---------------------------------------
GitHub user enothereska opened a pull request:
https://github.com/apache/kafka/pull/750
KAFKA-3038: use async ZK calls to speed up leader reassignment
Updated the failure code path to deal specifically with the issue identified as
affecting latency most.
@fpj could you have a look please?
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/enothereska/kafka kafka-3038
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/kafka/pull/750.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #750
----
commit 3be8bb68c6ccb37b77ed527cf4ff05bc80ee8e99
Author: Eno Thereska <[email protected]>
Date: 2016-01-08T16:09:38Z
Asynchronous implementation of failure path when updating Zookeeper
commit e288c5e35d151e6e8ce06eaa1076ebb2ceb2db13
Author: Eno Thereska <[email protected]>
Date: 2016-01-08T16:10:07Z
Merge remote-tracking branch 'apache-kafka/trunk' into kafka-3038
commit 3913ab76707a6ad125b4252d88bc3cdf091702ee
Author: Eno Thereska <[email protected]>
Date: 2016-01-09T18:23:33Z
Implemented top method using a CountDownLatch. Minor code cleanup
commit a40ad4e768f1c626fc6c818c28d22f0a91d33eaf
Author: Eno Thereska <[email protected]>
Date: 2016-01-09T18:24:25Z
Merge remote-tracking branch 'apache-kafka/trunk' into kafka-3038
----
> Speeding up partition reassignment after broker failure
> -------------------------------------------------------
>
> Key: KAFKA-3038
> URL: https://issues.apache.org/jira/browse/KAFKA-3038
> Project: Kafka
> Issue Type: Improvement
> Components: controller, core
> Affects Versions: 0.9.0.0
> Reporter: Eno Thereska
> Assignee: Eno Thereska
> Fix For: 0.9.0.0
>
>
> After a broker failure, the controller performs several writes to Zookeeper for
> each partition on the failed broker. The writes are issued one at a time, in a
> closed loop, which is slow, especially over high-latency networks. Zookeeper
> supports batching operations (the "multi" API). Substituting batched writes for
> the serial ones is expected to reduce failure handling time by an order of
> magnitude.
> This is identified as an issue in
> https://cwiki.apache.org/confluence/display/KAFKA/kafka+Detailed+Replication+Design+V3
> (section End-to-end latency during a broker failure)
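To illustrate why the closed loop described above is slow, here is a minimal sketch that compares serial blocking writes with asynchronous writes awaited on a single CountDownLatch (the approach named in the commit list). It is not the code from the pull request and does not use the ZooKeeper client: SimulatedStore, writeAsync, writeSync, partitionPaths and the path strings are hypothetical stand-ins, and the round-trip delay is simulated with a timer.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

class AsyncWriteSketch {

    // Hypothetical stand-in for a remote store. This is NOT the ZooKeeper API;
    // it only simulates a fixed network round trip per write.
    static final class SimulatedStore implements AutoCloseable {
        private final ScheduledExecutorService timer =
                Executors.newSingleThreadScheduledExecutor();
        private final List<String> written =
                Collections.synchronizedList(new ArrayList<>());
        private final long roundTripMillis;

        SimulatedStore(long roundTripMillis) {
            this.roundTripMillis = roundTripMillis;
        }

        // Returns immediately; the callback runs once the simulated round trip ends.
        void writeAsync(String path, Runnable onDone) {
            timer.schedule(() -> {
                written.add(path);
                onDone.run();
            }, roundTripMillis, TimeUnit.MILLISECONDS);
        }

        // Blocks the caller for one full round trip.
        void writeSync(String path) {
            CountDownLatch done = new CountDownLatch(1);
            writeAsync(path, done::countDown);
            await(done);
        }

        int writeCount() {
            return written.size();
        }

        @Override
        public void close() {
            timer.shutdownNow();
        }
    }

    private static void await(CountDownLatch latch) {
        try {
            latch.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for writes", e);
        }
    }

    // Illustrative per-partition paths; the topic name "demo" is made up.
    static List<String> partitionPaths(int count) {
        List<String> paths = new ArrayList<>();
        for (int i = 0; i < count; i++) {
            paths.add("/brokers/topics/demo/partitions/" + i + "/state");
        }
        return paths;
    }

    // Closed loop: each write waits for the previous one, so the total time
    // grows as (number of writes) x (round trip).
    static long serialMillis(SimulatedStore store, List<String> paths) {
        long start = System.nanoTime();
        for (String path : paths) {
            store.writeSync(path);
        }
        return TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start);
    }

    // Issue every write up front, then wait once for all callbacks.
    static long pipelinedMillis(SimulatedStore store, List<String> paths) {
        long start = System.nanoTime();
        CountDownLatch all = new CountDownLatch(paths.size());
        for (String path : paths) {
            store.writeAsync(path, all::countDown);
        }
        await(all);
        return TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start);
    }

    public static void main(String[] args) {
        List<String> paths = partitionPaths(50);
        try (SimulatedStore store = new SimulatedStore(5)) {
            System.out.println("serial:    " + serialMillis(store, paths) + " ms");
            System.out.println("pipelined: " + pipelinedMillis(store, paths) + " ms");
        }
    }
}
```

In this simulation the pipelined variant pays roughly one round trip in total, while the serial variant pays one per partition. A real client would also have to handle per-write failures and retries inside the callback, which this sketch leaves out.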