GitHub user uncleGen opened a pull request:

    https://github.com/apache/kafka/pull/3894

    KAFKA-5928: Avoid redundant requests to zookeeper when reassign topic partition

    We mistakenly request topic-level information once for every partition
    listed in the assignment JSON file. For example, see
    https://github.com/apache/kafka/blob/trunk/core/src/main/scala/kafka/admin/ReassignPartitionsCommand.scala#L550:
    ```
    val validPartitions = proposedPartitionAssignment.filter { case (p, _) =>
      validatePartition(zkUtils, p.topic, p.partition)
    }
    ```
    When reassigning 1000 partitions (across 10 topics), this code sends 1000
    requests to ZooKeeper, although only 10 requests (one per topic) are
    needed. We tested a large-scale reassignment of about 10K partitions, and
    it took tens of minutes. With this optimization, it takes less than
    1 minute.
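
The idea of one lookup per topic rather than one per partition can be sketched as follows. This is only an illustration and not the code in the patch: `TopicPartitionKey`, `validPartitions`, and `fetchPartitionsForTopic` are hypothetical names, and the lookup function stands in for whatever ZooKeeper read returns the existing partition ids of a topic.

```scala
// Stand-in for Kafka's topic/partition key, defined locally so the sketch
// compiles on its own (hypothetical name, not Kafka's class).
final case class TopicPartitionKey(topic: String, partition: Int)

object ReassignValidationSketch {
  // Hypothetical helper: `fetchPartitionsForTopic` represents a single
  // ZooKeeper read returning the existing partition ids of one topic.
  def validPartitions[A](
      proposed: Map[TopicPartitionKey, A],
      fetchPartitionsForTopic: String => Set[Int]): Map[TopicPartitionKey, A] = {
    // Collect the distinct topics first, so each topic is looked up once.
    val topics: Set[String] = proposed.keys.map(_.topic).toSet
    val existing: Map[String, Set[Int]] =
      topics.map(t => t -> fetchPartitionsForTopic(t)).toMap
    // Validate every partition against the cached per-topic result.
    proposed.filter { case (p, _) =>
      existing.getOrElse(p.topic, Set.empty[Int]).contains(p.partition)
    }
  }
}

object ReassignValidationDemo {
  def main(args: Array[String]): Unit = {
    var lookups = 0
    // Fake lookup that counts calls; every topic has partitions 0..99.
    val fakeLookup: String => Set[Int] = { _ =>
      lookups += 1
      (0 until 100).toSet
    }
    // 10 topics x 100 partitions = 1000 proposed partitions.
    val proposed = (for {
      t <- 0 until 10
      p <- 0 until 100
    } yield TopicPartitionKey(s"topic-$t", p) -> Seq(1, 2, 3)).toMap

    val valid = ReassignValidationSketch.validPartitions(proposed, fakeLookup)
    println(s"valid partitions = ${valid.size}, lookups = $lookups")
    // prints "valid partitions = 1000, lookups = 10"
  }
}
```

With the per-partition filter shown earlier, the same input would call the lookup 1000 times. Grouping by topic first keeps the number of lookups equal to the number of distinct topics.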

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/uncleGen/kafka KAFKA-5928

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/kafka/pull/3894.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #3894
    
----
commit f6c30e81c7110f72e254bb9dfa81a25f951b70a1
Author: 木艮 <genmao....@alibaba-inc.com>
Date:   2017-09-19T03:01:20Z

    Avoid redundant requests to zookeeper when reassign topic partition

----

