[ 
https://issues.apache.org/jira/browse/KAFKA-6794?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Gustafson updated KAFKA-6794:
-----------------------------------
    Description: 
Say you have a replication factor of 4 and you trigger a reassignment which 
moves all replicas to new brokers. Now 8 replicas are fetching at the same time 
which means you need to account for 8 times the current producer load plus the 
catch-up replication. To make matters worse, the replicas won't all become 
in-sync at the same time; in the worst case, you could have 7 replicas in-sync 
while one is still catching up. Currently, the old replicas won't be disabled 
until all new replicas are in-sync. This makes configuring the throttle tricky 
since ISR traffic is not subject to it.

Rather than trying to bring all 4 new replicas online at the same time, a 
friendlier approach would be to do it incrementally: bring one replica online, 
bring it in-sync, then remove one of the old replicas. Repeat until all 
replicas have been changed. This would reduce the impact of a reassignment and 
make configuring the throttle easier at the cost of a slower overall 
reassignment.

  was:
Say you have a replication factor of 4 and you trigger a reassignment which 
moves all replicas to new brokers. Now 8 replicas are fetching at the same time 
which means you need to account for 8 times the current load plus the catch-up 
replication. To make matters worse, the replicas won't all become in-sync at 
the same time; in the worst case, you could have 7 replicas in-sync while one 
is still catching up. Currently, the old replicas won't be disabled until all 
new replicas are in-sync. This makes configuring the throttle tricky since ISR 
traffic is not subject to it.

Rather than trying to bring all 4 new replicas online at the same time, a 
friendlier approach would be to do it incrementally: bring one replica online, 
bring it in-sync, then remove one of the old replicas. Repeat until all 
replicas have been changed. This would reduce the impact of a reassignment and 
make configuring the throttle easier at the cost of a slower overall 
reassignment.


> Support for incremental replica reassignment
> --------------------------------------------
>
>                 Key: KAFKA-6794
>                 URL: https://issues.apache.org/jira/browse/KAFKA-6794
>             Project: Kafka
>          Issue Type: Improvement
>            Reporter: Jason Gustafson
>            Priority: Major
>
> Say you have a replication factor of 4 and you trigger a reassignment which 
> moves all replicas to new brokers. Now 8 replicas are fetching at the same 
> time which means you need to account for 8 times the current producer load 
> plus the catch-up replication. To make matters worse, the replicas won't all 
> become in-sync at the same time; in the worst case, you could have 7 replicas 
> in-sync while one is still catching up. Currently, the old replicas won't be 
> disabled until all new replicas are in-sync. This makes configuring the 
> throttle tricky since ISR traffic is not subject to it.
> Rather than trying to bring all 4 new replicas online at the same time, a 
> friendlier approach would be to do it incrementally: bring one replica 
> online, bring it in-sync, then remove one of the old replicas. Repeat until 
> all replicas have been changed. This would reduce the impact of a 
> reassignment and make configuring the throttle easier at the cost of a slower 
> overall reassignment.
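
The incremental procedure described above can be sketched roughly as follows. This is only an illustration of the ordering of steps, not Kafka's controller logic or any existing Kafka API: the function name incremental_steps and the broker ids are hypothetical, and the wait for a new replica to become in-sync is not modeled (it is only marked by a comment).

```python
from typing import List


def incremental_steps(old: List[int], new: List[int]) -> List[List[int]]:
    """Return the successive replica sets for a one-at-a-time reassignment.

    Hypothetical helper for illustration only; broker ids are plain ints.
    """
    if len(old) != len(new):
        raise ValueError("sketch assumes the replication factor is unchanged")

    to_add = [b for b in new if b not in old]      # brokers gaining a replica
    to_remove = [b for b in old if b not in new]   # brokers losing a replica

    current = list(old)
    steps = [list(current)]
    for add, remove in zip(to_add, to_remove):
        current.append(add)        # bring one new replica online
        steps.append(list(current))
        # (a real implementation would wait here until `add` is in-sync)
        current.remove(remove)     # then remove one of the old replicas
        steps.append(list(current))
    return steps


# Replication factor 4, all replicas move to new brokers (ids are made up).
steps = incremental_steps([1, 2, 3, 4], [5, 6, 7, 8])
peak = max(len(s) for s in steps)
print(peak)  # largest number of replicas assigned at any one time
```

For the replication factor 4 case from the description, the replica set in this sketch never grows beyond 5, compared with 8 when all new replicas are added at once, at the cost of four sequential catch-up phases instead of one.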



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
