[ https://issues.apache.org/jira/browse/KAFKA-8663?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16884526#comment-16884526 ]
GEORGE LI commented on KAFKA-8663:
----------------------------------

As we can see from the original comment in the code:
{code}
//1. Update AR in ZK with OAR + RAR.
{code}
But the actual implementation does RAR + OAR instead (a different ordering).

> partition assignment would be better original_assignment + new_reassignment
> during reassignments
> ------------------------------------------------------------------------------------------------
>
>                 Key: KAFKA-8663
>                 URL: https://issues.apache.org/jira/browse/KAFKA-8663
>             Project: Kafka
>          Issue Type: Improvement
>          Components: controller, core
>    Affects Versions: 1.1.1, 2.3.0
>            Reporter: GEORGE LI
>            Priority: Minor
>
> From my observation/experience during reassignment, the partition assignment
> replica ordering gets changed, because it is an OAR + RAR (original replicas
> + reassignment replicas) set union.
> However, it seems that the preferred leader changes during the
> reassignment. Normally, if there is no cluster preferred leader election,
> the leader is still the old leader. But if there is a leader election
> during the reassignment, the leadership changes. This caused some side
> effects. Let's look at this example.
> {code}
> Topic:georgeli_test PartitionCount:8 ReplicationFactor:3 Configs:
> Topic: georgeli_test Partition: 0 Leader: 1026 Replicas:
> 1026,1028,1025 Isr: 1026,1028,1025
> {code}
> reassignment (1026,1028,1025) => (1027,1025,1028)
> {code}
> Topic:georgeli_test PartitionCount:8 ReplicationFactor:4
> Configs:leader.replication.throttled.replicas=0:1026,0:1028,0:1025,follower.replication.throttled.replicas=0:1027
> Topic: georgeli_test Partition: 0 Leader: 1026 Replicas:
> 1027,1025,1028,1026 Isr: 1026,1028,1025
> {code}
> Notice in the above that the leader remains 1026, but Replicas is 1027,1025,1028,1026.
> If we run a preferred leader election, it will try 1027 first, then 1025.
> After 1027 is in the ISR, the final assignment will be (1027,1025,1028).
>
> My proposal for a minor improvement is to keep the original replica ordering
> during the reassignment (which could take a long time for large
> topics/partitions), and only after all replicas are in the ISR set the
> partition assignment to the new reassignment.
> {code}
> val newAndOldReplicas = (reassignedPartitionContext.newReplicas ++
> controllerContext.partitionReplicaAssignment(topicPartition)).toSet
> //1. Update AR in ZK with OAR + RAR.
> updateAssignedReplicasForPartition(topicPartition,
> newAndOldReplicas.toSeq)
> {code}
> The code above would change to the following, so that the original ordering
> comes first during reassignment:
> {code}
> val newAndOldReplicas =
> (controllerContext.partitionReplicaAssignment(topicPartition) ++
> reassignedPartitionContext.newReplicas).toSet
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)
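
To make the ordering difference concrete, here is a minimal standalone Scala sketch using the replica IDs from the example in the description. It is not the controller code: `ReassignmentOrdering`, `union` and `firstInSyncReplica` are hypothetical names introduced only for illustration. Two assumptions to note: the sketch uses `distinct` on a `Seq` instead of `toSet` so that first-occurrence order is guaranteed for any number of replicas, and `firstInSyncReplica` simply models the behavior described in the report (walk the assignment in order and take the first replica that is in the ISR), which may differ from Kafka's actual election logic.

```scala
object ReassignmentOrdering {
  // Hypothetical helper: union of two replica lists, keeping first-occurrence order.
  // Uses distinct rather than toSet, because a Set makes no ordering promise.
  def union(first: Seq[Int], second: Seq[Int]): Seq[Int] =
    (first ++ second).distinct

  // Hypothetical helper modelling the behavior described in the report:
  // try replicas in assignment order and pick the first one that is in the ISR.
  def firstInSyncReplica(assignment: Seq[Int], isr: Set[Int]): Option[Int] =
    assignment.find(isr.contains)

  def main(args: Array[String]): Unit = {
    val oar = Seq(1026, 1028, 1025) // original assigned replicas
    val rar = Seq(1027, 1025, 1028) // reassignment target replicas
    val isr = Set(1026, 1028, 1025) // 1027 has not caught up yet

    val rarFirst = union(rar, oar) // ordering seen in the example output
    val oarFirst = union(oar, rar) // ordering proposed in this issue

    println(rarFirst.mkString(","))            // 1027,1025,1028,1026
    println(oarFirst.mkString(","))            // 1026,1028,1025,1027
    println(firstInSyncReplica(rarFirst, isr)) // Some(1025): leadership would move
    println(firstInSyncReplica(oarFirst, isr)) // Some(1026): current leader is kept
  }
}
```

With RAR first, the first in-sync candidate is 1025 rather than the current leader 1026, which matches the side effect described above. With OAR first, the candidate stays 1026 until the reassignment completes.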