mumrah commented on code in PR #14593:
URL: https://github.com/apache/kafka/pull/14593#discussion_r1368626043


##########
metadata/src/main/java/org/apache/kafka/controller/PartitionChangeBuilder.java:
##########
@@ -93,6 +93,9 @@ public enum Election {
     private boolean eligibleLeaderReplicasEnabled;
     private int minISR;
 
+    // Whether allow electing last known leader when ELR is enabled. Note, the 
last known leader will be stored in the
+    // lastKnownElr field.
+    private boolean lastKnownLeaderEnabled = true;

Review Comment:
   When will this be set to false? Will it be based on the different types of 
unclean recovery?



##########
metadata/src/main/java/org/apache/kafka/controller/PartitionChangeBuilder.java:
##########
@@ -253,17 +269,30 @@ private ElectionResult electAnyLeader() {
         return new ElectionResult(NO_LEADER, false);
     }
 
+    private boolean canElectLastKnownLeader() {
+        return eligibleLeaderReplicasEnabled && lastKnownLeaderEnabled && 
targetElr.isEmpty() && targetIsr.isEmpty()
+            && partition.lastKnownElr.length == 1 && 
isAcceptableLeader.test(partition.lastKnownElr[0]);
+    }
+
     private boolean isValidNewLeader(int replica) {
-        return targetIsr.contains(replica) && isAcceptableLeader.test(replica);
+        // The valid new leader should be in either ISR or in ELR when ISR is 
empty.
+        return (targetIsr.contains(replica) || (targetIsr.isEmpty() && 
targetElr.contains(replica))) && isAcceptableLeader.test(replica);
     }
 
     private void tryElection(PartitionChangeRecord record) {
         ElectionResult electionResult = electLeader();
         if (electionResult.node != partition.leader) {
             // generating log messages for partition elections can get 
expensive on large clusters,
             // so only log clean elections at TRACE level; log unclean 
elections at INFO level
-            // to ensure the message is emitted since an unclean election can 
lead to data loss.
-            if (electionResult.unclean) {
+            // to ensure the message is emitted since an unclean election can 
lead to data loss;
+            if (targetElr.contains(electionResult.node)) {
+                targetIsr = Collections.singletonList(electionResult.node);
+                targetElr = targetElr.stream().filter(replica -> replica != 
electionResult.node).collect(Collectors.toList());
+                targetLastKnownElr = targetLastKnownElr.stream()
+                    .filter(replica -> replica != 
electionResult.node).collect(Collectors.toList());

Review Comment:
   Are these used anywhere? 



##########
metadata/src/main/java/org/apache/kafka/controller/PartitionChangeBuilder.java:
##########
@@ -253,17 +269,30 @@ private ElectionResult electAnyLeader() {
         return new ElectionResult(NO_LEADER, false);
     }
 
+    private boolean canElectLastKnownLeader() {

Review Comment:
   Not necessarily for this PR, but I wonder how we can surface more of our 
logic results to the operator. In this case, we're putting a lot of logic 
behind one boolean. It might be nice if we could attach more information to 
ElectionResult throughout the `tryElection` code.
   
   For example, if a user expects the last known leader to be elected based on 
their configs, but a LKL is _not_ elected -- how can they debug the situation? 



##########
metadata/src/main/java/org/apache/kafka/controller/PartitionChangeBuilder.java:
##########
@@ -253,17 +269,30 @@ private ElectionResult electAnyLeader() {
         return new ElectionResult(NO_LEADER, false);
     }
 
+    private boolean canElectLastKnownLeader() {
+        return eligibleLeaderReplicasEnabled && lastKnownLeaderEnabled && 
targetElr.isEmpty() && targetIsr.isEmpty()
+            && partition.lastKnownElr.length == 1 && 
isAcceptableLeader.test(partition.lastKnownElr[0]);

Review Comment:
   Making sure I follow this. We can elect the last known leader if:
   
   * ELR is enabled
   * Last known leader election is enabled (how does this get set?)
   * target ELR and ISR is empty (meaning the last replica with the HWM has 
gone away?) 
   * previous ELR had just a single member
   
   Do we have an invariant that the ELR can only change by a single replica in 
one PartitionChangeRecord? For example, could we have an empty ISR and empty 
ELR, but have two replicas in `lastKnownElr`?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: jira-unsubscr...@kafka.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

Reply via email to