[ https://issues.apache.org/jira/browse/KAFKA-9352?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ning Zhang updated KAFKA-9352:
------------------------------
    Description: 
Originally, when MirrorMaker replicates a group of topics, the assignment of 
topic-partitions to tasks is fairly static: partitions from the same topic 
tend to be grouped together on the same task as much as possible. 
For example, take 3 tasks mirroring 3 topics with 8, 2 and 2 partitions 
respectively, where 't1' denotes 'task 1' and 't0p5' denotes 'topic 0, 
partition 5'.

The original assignment will look like:

t1 -> [t0p0, t0p1, t0p2, t0p3]
t2 -> [t0p4, t0p5, t0p6, t0p7]
t3 -> [t1p0, t1p1, t2p0, t2p1]
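
As a rough illustration of the grouping above, the following Java sketch splits an ordered partition list into contiguous chunks, one per task. It is not the MirrorMaker code itself: the class name ContiguousAssignment and the assign() helper are hypothetical, partition names are plain strings, and topic 1's two partitions are assumed to be numbered p0 and p1.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: contiguous grouping of partitions into tasks.
public class ContiguousAssignment {

    // Split the ordered list into numTasks contiguous chunks;
    // the first (size % numTasks) chunks receive one extra element.
    public static List<List<String>> assign(List<String> partitions, int numTasks) {
        List<List<String>> tasks = new ArrayList<>();
        int base = partitions.size() / numTasks;
        int extra = partitions.size() % numTasks;
        int start = 0;
        for (int t = 0; t < numTasks; t++) {
            int size = base + (t < extra ? 1 : 0);
            tasks.add(new ArrayList<>(partitions.subList(start, start + size)));
            start += size;
        }
        return tasks;
    }

    public static void main(String[] args) {
        // 3 topics with 8, 2 and 2 partitions, listed topic by topic.
        int[] partitionCounts = {8, 2, 2};
        List<String> partitions = new ArrayList<>();
        for (int topic = 0; topic < partitionCounts.length; topic++) {
            for (int p = 0; p < partitionCounts[topic]; p++) {
                partitions.add("t" + topic + "p" + p);
            }
        }
        List<List<String>> tasks = assign(partitions, 3);
        for (int t = 0; t < tasks.size(); t++) {
            System.out.println("t" + (t + 1) + " -> " + tasks.get(t));
        }
    }
}
```

With this grouping, all 8 partitions of topic 0 land on the first two tasks, which is the source of the skew discussed next.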

The potential issue with the above assignment is that if topic 0 carries more 
traffic than the other topics (topic 1, topic 2), t1 and t2 will handle more 
traffic than t3. When the tasks are mapped to the MirrorMaker instances 
(workers) and launched, this creates an unbalanced load on the workers. The 
picture below shows an unbalanced example with 2 MirrorMaker instances:

!Screen Shot 2019-12-19 at 12.16.02 PM.png!

Because each mirrored topic has a different traffic volume and number of 
partitions, a 'roundrobin' assignment helps balance the load across all 
MirrorMaker instances (workers): it assigns all topic-partitions evenly to the 
tasks, and the tasks are then further distributed to workers by calling 
'ConnectorUtils.groupPartitions()'. For the same example (3 tasks mirroring 3 
topics with 8, 2 and 2 partitions respectively, with 't1' denoting 'task 1' 
and 't0p5' denoting 'topic 0, partition 5'), the new assignment looks like:

t1 -> [t0p0, t0p3, t0p6, t1p1]
t2 -> [t0p1, t0p4, t0p7, t2p0]
t3 -> [t0p2, t0p5, t1p0, t2p1]
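
The round-robin idea can be sketched in a few lines of Java. This is a minimal illustration under stated assumptions, not the code from the PR: the class name RoundRobinAssignment and the assign() helper are hypothetical, and partitions are represented as plain strings listed topic by topic.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: round-robin distribution of partitions over tasks.
public class RoundRobinAssignment {

    // The partition at index i in the ordered list goes to task (i % numTasks).
    public static List<List<String>> assign(List<String> partitions, int numTasks) {
        List<List<String>> tasks = new ArrayList<>();
        for (int t = 0; t < numTasks; t++) {
            tasks.add(new ArrayList<>());
        }
        for (int i = 0; i < partitions.size(); i++) {
            tasks.get(i % numTasks).add(partitions.get(i));
        }
        return tasks;
    }

    public static void main(String[] args) {
        // 3 topics with 8, 2 and 2 partitions, listed topic by topic.
        int[] partitionCounts = {8, 2, 2};
        List<String> partitions = new ArrayList<>();
        for (int topic = 0; topic < partitionCounts.length; topic++) {
            for (int p = 0; p < partitionCounts[topic]; p++) {
                partitions.add("t" + topic + "p" + p);
            }
        }
        List<List<String>> tasks = assign(partitions, 3);
        for (int t = 0; t < tasks.size(); t++) {
            System.out.println("t" + (t + 1) + " -> " + tasks.get(t));
        }
    }
}
```

For the 8/2/2 example this yields the three lists shown above. Note that the sketch balances partition counts only; it assumes nothing about per-partition traffic.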

The improvement of the new assignment over the original is that the partitions 
of topic 0, topic 1 and topic 2 are spread across all tasks, which creates a 
relatively even load on all workers once the tasks are mapped to the workers 
and launched.
The picture below shows a balanced example with 4 MirrorMaker instances:

!Screen Shot 2019-12-19 at 8.22.17 AM.png!

PR link is: https://github.com/apache/kafka/pull/7880

 



> unbalanced assignment of topic-partition to tasks
> -------------------------------------------------
>
>                 Key: KAFKA-9352
>                 URL: https://issues.apache.org/jira/browse/KAFKA-9352
>             Project: Kafka
>          Issue Type: Improvement
>          Components: mirrormaker
>    Affects Versions: 2.4.0
>            Reporter: Ning Zhang
>            Priority: Major
>             Fix For: 2.5.0
>
>         Attachments: Screen Shot 2019-12-19 at 12.16.02 PM.png, Screen Shot 
> 2019-12-19 at 8.22.17 AM.png
>



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
