[jira] [Resolved] (KAFKA-15149) Fix not sending UMR and LISR RPCs in dual-write mode when there are new partitions
[ https://issues.apache.org/jira/browse/KAFKA-15149?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Andrew Grant resolved KAFKA-15149.
----------------------------------
    Resolution: Fixed

> Fix not sending UMR and LISR RPCs in dual-write mode when there are new partitions
> ----------------------------------------------------------------------------------
>
>            Key: KAFKA-15149
>            URL: https://issues.apache.org/jira/browse/KAFKA-15149
>        Project: Kafka
>     Issue Type: Bug
>       Reporter: Andrew Grant
>       Assignee: Andrew Grant
>       Priority: Major
>        Fix For: 3.5.1
>
> In AK, in {{KRaftMigrationZkWriter}} [here|https://github.com/apache/kafka/blame/trunk/metadata/src/main/java/org/apache/kafka/metadata/migration/KRaftMigrationZkWriter.java#L294] we keep references to both the new and changed partitions maps from the {{TopicsDelta}} instance. We mutate {{changedPartitions}}, possibly mutating the {{TopicsDelta}} instance that is provided as input to the method. After making the ZK writes, when we try to figure out the UMR and LISR requests we need to make in {{MigrationPropagator.sendRPCsToBrokersFromMetadataDelta}}, the {{TopicsDelta}} has lost the changed-partitions metadata. As a result, we might not send the expected UMR and LISR requests.

--
This message was sent by Atlassian Jira
(v8.20.10#820010)
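A minimal sketch of the shape of the fix, using simplified stand-in types (partition ID mapped to a state string) rather than the real {{KRaftMigrationZkWriter}} signatures: mutate a defensive copy of the delta's map, so the shared {{TopicsDelta}} state is still intact when the UMR and LISR requests are computed afterwards.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

public class DeltaCopyExample {
    // Stand-in for the changed-partitions map held by the delta. The ZK writer
    // consumes entries from a copy, leaving the caller's map untouched.
    public static Map<Integer, String> remainingAfterZkWrites(Map<Integer, String> changedPartitions,
                                                              Set<Integer> writtenToZk) {
        Map<Integer, String> copy = new HashMap<>(changedPartitions);
        copy.keySet().removeAll(writtenToZk); // the mutation, now on the copy only
        return copy;
    }
}
```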
[jira] [Created] (KAFKA-15149) Fix not sending RPCs in dual-write mode when there are new partitions
Andrew Grant created KAFKA-15149:
------------------------------------

         Summary: Fix not sending RPCs in dual-write mode when there are new partitions
             Key: KAFKA-15149
             URL: https://issues.apache.org/jira/browse/KAFKA-15149
         Project: Kafka
      Issue Type: Bug
        Reporter: Andrew Grant
[jira] [Resolved] (KAFKA-14791) Create a builder class for PartitionRegistration
[ https://issues.apache.org/jira/browse/KAFKA-14791?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Andrew Grant resolved KAFKA-14791.
----------------------------------
    Resolution: Fixed

> Create a builder class for PartitionRegistration
> ------------------------------------------------
>
>            Key: KAFKA-14791
>            URL: https://issues.apache.org/jira/browse/KAFKA-14791
>        Project: Kafka
>     Issue Type: Improvement
>       Reporter: Andrew Grant
>       Assignee: Andrew Grant
>       Priority: Minor
[jira] [Created] (KAFKA-14829) Consolidate reassignment logic in PartitionReassignmentReplicas
Andrew Grant created KAFKA-14829:
------------------------------------

         Summary: Consolidate reassignment logic in PartitionReassignmentReplicas
             Key: KAFKA-14829
             URL: https://issues.apache.org/jira/browse/KAFKA-14829
         Project: Kafka
      Issue Type: Improvement
        Reporter: Andrew Grant
[jira] [Created] (KAFKA-14791) Create a builder class for PartitionRegistration
Andrew Grant created KAFKA-14791:
------------------------------------

         Summary: Create a builder class for PartitionRegistration
             Key: KAFKA-14791
             URL: https://issues.apache.org/jira/browse/KAFKA-14791
         Project: Kafka
      Issue Type: Improvement
        Reporter: Andrew Grant
[jira] [Created] (KAFKA-14608) Make sure reassignment does not cause under min ISR
Andrew Grant created KAFKA-14608:
------------------------------------

         Summary: Make sure reassignment does not cause under min ISR
             Key: KAFKA-14608
             URL: https://issues.apache.org/jira/browse/KAFKA-14608
         Project: Kafka
      Issue Type: Improvement
        Reporter: Andrew Grant
[jira] [Resolved] (KAFKA-14386) Change ReplicaPlacer place method to return a class instead of list of list of integers
[ https://issues.apache.org/jira/browse/KAFKA-14386?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Andrew Grant resolved KAFKA-14386.
----------------------------------
    Resolution: Fixed

> Change ReplicaPlacer place method to return a class instead of list of list of integers
> ---------------------------------------------------------------------------------------
>
>            Key: KAFKA-14386
>            URL: https://issues.apache.org/jira/browse/KAFKA-14386
>        Project: Kafka
>     Issue Type: Improvement
>       Reporter: Andrew Grant
>       Assignee: Andrew Grant
>       Priority: Major
>
> In KRaft mode, a new interface was introduced, [ReplicaPlacer|https://github.com/apache/kafka/blob/trunk/metadata/src/main/java/org/apache/kafka/metadata/placement/ReplicaPlacer.java], that is used by ReplicationControlManager to create partition assignments during both CreateTopics and CreatePartitions RPCs. Right now it has one implementation, StripedReplicaPlacer.
> Currently, it has a method called place that returns a list of list of integers:
> {code:java}
> List<List<Integer>> place(
>     PlacementSpec placement,
>     ClusterDescriber cluster
> ) throws InvalidReplicationFactorException;{code}
> The index corresponds to the partition ID and the integers are the replicas of the assignment. The suggestion is to update the interface so that it models topics and partitions more explicitly. I'm thinking something like:
> {code:java}
> TopicAssignment place(
>     PlacementSpec placement,
>     ClusterDescriber cluster
> ) throws InvalidReplicationFactorException;{code}
> where we have
> {code:java}
> public class TopicAssignment {
>     private List<PartitionAssignment> assignments;
>
>     public TopicAssignment(List<PartitionAssignment> assignments) {
>         this.assignments = assignments;
>     }
>
>     public List<PartitionAssignment> assignments() {
>         return assignments;
>     }
> }{code}
> and
> {code:java}
> public class PartitionAssignment {
>     private List<Integer> replicas;
>
>     public PartitionAssignment(List<Integer> replicas) {
>         this.replicas = replicas;
>     }
>
>     public List<Integer> replicas() {
>         return replicas;
>     }
> }{code}
> There are two reasons for the suggestion. First, as mentioned above, it will make the interface, arguably, a bit more readable and understandable by explicitly modeling the idea of topic and partition. Second, and more importantly, it makes the interface more extendable in the future. Right now it would be challenging to add more metadata to the response. By having classes, we can easily add fields to them without breaking/changing the interface. For example, in the CreatePartitions RPC we are adding partitions to an existing topic, and we might want to add some metadata to the response making it clear which partition ID the assignment starts at, which could look something like:
> {code:java}
> public class TopicAssignment {
>     private List<PartitionAssignment> assignments;
>     private Integer firstPartitionId;
>
>     public TopicAssignment(List<PartitionAssignment> assignments, Integer firstPartitionId) {
>         this.assignments = assignments;
>         this.firstPartitionId = firstPartitionId;
>     }
>
>     public List<PartitionAssignment> assignments() {
>         return assignments;
>     }
>     ...
> }
> {code}
> Curious to hear other folks' thoughts on this.
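To make the proposal above concrete, here is a runnable minimal version of the two proposed classes plus a call site. This is a sketch of the shape of the idea, not the final API: field finality, the `firstPartitionId` extension, and the PlacementSpec/ClusterDescriber plumbing are all omitted.

```java
import java.util.List;

class PartitionAssignment {
    private final List<Integer> replicas;

    PartitionAssignment(List<Integer> replicas) {
        this.replicas = replicas;
    }

    List<Integer> replicas() {
        return replicas;
    }
}

class TopicAssignment {
    private final List<PartitionAssignment> assignments;

    TopicAssignment(List<PartitionAssignment> assignments) {
        this.assignments = assignments;
    }

    List<PartitionAssignment> assignments() {
        return assignments;
    }
}

public class PlacementExample {
    public static void main(String[] args) {
        // The list index is still the partition ID, but the types now say so.
        TopicAssignment topic = new TopicAssignment(List.of(
                new PartitionAssignment(List.of(0, 1, 2)),   // partition 0
                new PartitionAssignment(List.of(1, 2, 3)))); // partition 1
        System.out.println(topic.assignments().get(1).replicas()); // [1, 2, 3]
    }
}
```

Compared with `List<List<Integer>>`, adding a field such as `firstPartitionId` later only touches `TopicAssignment`, not every caller of `place`.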
[jira] [Created] (KAFKA-14456) Fix AdminUtils startIndex for rack aware partition creations
Andrew Grant created KAFKA-14456:
------------------------------------

         Summary: Fix AdminUtils startIndex for rack aware partition creations
             Key: KAFKA-14456
             URL: https://issues.apache.org/jira/browse/KAFKA-14456
         Project: Kafka
      Issue Type: Improvement
        Reporter: Andrew Grant

When new partitions are added/created we calculate a start index based off all the brokers here: [https://github.com/apache/kafka/blob/trunk/core/src/main/scala/kafka/zk/AdminZkClient.scala#L270]. That start index is passed through to AdminUtils and is used to find a starting position in the list of brokers for making assignments. However, when we make rack-aware assignments we use that index into a rack-alternating list here: [https://github.com/apache/kafka/blob/trunk/core/src/main/scala/kafka/admin/AdminUtils.scala#L160]. The meaning of the index gets lost: an index into the full list of brokers doesn't have the same meaning as an index into a rack-alternating list. I discovered this when I published [https://github.com/apache/kafka/pull/12943/files]. In that PR I added a test, testRackAwarePartitionAssignment, which does not work in ZK mode.
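To illustrate why the two indices disagree, here is a toy round-robin construction of a rack-alternating broker list. This is not AdminUtils' actual code, just a sketch under the assumption of one-broker-per-rack round-robin ordering: with racks A = {0, 1} and B = {2, 3}, index 2 points at broker 2 in the sorted broker list but at broker 1 in the alternating list.

```java
import java.util.ArrayList;
import java.util.List;

public class RackAlternatingExample {
    // Take the i-th broker from each rack in turn until every rack is exhausted.
    public static List<Integer> rackAlternating(List<List<Integer>> brokersByRack) {
        List<Integer> out = new ArrayList<>();
        for (int i = 0; ; i++) {
            boolean added = false;
            for (List<Integer> rack : brokersByRack) {
                if (i < rack.size()) {
                    out.add(rack.get(i));
                    added = true;
                }
            }
            if (!added) {
                return out;
            }
        }
    }

    public static void main(String[] args) {
        // Racks: A = {0, 1}, B = {2, 3}. Sorted broker list: [0, 1, 2, 3].
        List<Integer> alternating = rackAlternating(List.of(List.of(0, 1), List.of(2, 3)));
        System.out.println(alternating); // [0, 2, 1, 3]
    }
}
```

A start index computed against `[0, 1, 2, 3]` therefore lands on a different broker when reused against `[0, 2, 1, 3]`, which is the mismatch the report describes.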
[jira] [Created] (KAFKA-14386) Change ReplicaPlacer to return a class instead of list of list of integers
Andrew Grant created KAFKA-14386:
------------------------------------

         Summary: Change ReplicaPlacer to return a class instead of list of list of integers
             Key: KAFKA-14386
             URL: https://issues.apache.org/jira/browse/KAFKA-14386
         Project: Kafka
      Issue Type: Improvement
        Reporter: Andrew Grant
        Assignee: Andrew Grant

In KRaft mode, a new interface was introduced, [ReplicaPlacer|https://github.com/apache/kafka/blob/trunk/metadata/src/main/java/org/apache/kafka/metadata/placement/ReplicaPlacer.java], that is used by ReplicationControlManager to create partition assignments during both CreateTopics and CreatePartitions RPCs. Right now it has one implementation, StripedReplicaPlacer.

Currently, it has a method called place that returns a list of list of integers:
{code:java}
List<List<Integer>> place(
    PlacementSpec placement,
    ClusterDescriber cluster
) throws InvalidReplicationFactorException;{code}
The index corresponds to the partition ID and the integers are the replicas of the assignment. The suggestion is to update the interface so that it models topics and partitions more explicitly. I'm thinking something like:
{code:java}
TopicAssignment place(
    PlacementSpec placement,
    ClusterDescriber cluster
) throws InvalidReplicationFactorException;{code}
where we have
{code:java}
public class TopicAssignment {
    private List<PartitionAssignment> assignments;

    public TopicAssignment(List<PartitionAssignment> assignments) {
        this.assignments = assignments;
    }

    public List<PartitionAssignment> assignments() {
        return assignments;
    }
}{code}
and
{code:java}
public class PartitionAssignment {
    private List<Integer> replicas;

    public PartitionAssignment(List<Integer> replicas) {
        this.replicas = replicas;
    }

    public List<Integer> replicas() {
        return replicas;
    }
}{code}
There are two reasons for the suggestion. First, as mentioned above, it will make the interface, arguably, a bit more readable and understandable by explicitly modeling the idea of topic and partition. Second, and more importantly, it makes the interface more extendable in the future. Right now it would be challenging to add more metadata to the response. By having classes, we can easily add fields to them without breaking/changing the interface. Curious to hear other folks' thoughts on this.
[jira] [Resolved] (KAFKA-13892) Dedupe RemoveAccessControlEntryRecord in deleteAcls of AclControlManager
[ https://issues.apache.org/jira/browse/KAFKA-13892?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Andrew Grant resolved KAFKA-13892.
----------------------------------
    Resolution: Fixed

> Dedupe RemoveAccessControlEntryRecord in deleteAcls of AclControlManager
> ------------------------------------------------------------------------
>
>            Key: KAFKA-13892
>            URL: https://issues.apache.org/jira/browse/KAFKA-13892
>        Project: Kafka
>     Issue Type: Bug
>       Reporter: Andrew Grant
>       Assignee: Andrew Grant
>       Priority: Major
>
> In [https://github.com/apache/kafka/blob/trunk/metadata/src/main/java/org/apache/kafka/controller/AclControlManager.java#L143] we loop through the ACL filters and add RemoveAccessControlEntryRecord records to the response list for each matching ACL. I think there's a bug where, if two filters match the same ACL, we create two RemoveAccessControlEntryRecord records for that same ACL. This is an issue because upon replay we throw an exception (https://github.com/apache/kafka/blob/trunk/metadata/src/main/java/org/apache/kafka/controller/AclControlManager.java#L195) if the ACL is not in the in-memory data structures, which will happen for the second RemoveAccessControlEntryRecord.
> Maybe we can just de-dupe both List<AclDeleteResult> and List<ApiMessageAndVersion>? I think something like (just showing code for ApiMessageAndVersion):
> {code:java}
> private List<ApiMessageAndVersion> deDupeApiMessageAndVersion(List<ApiMessageAndVersion> messages) {
>     return new HashSet<>(messages).stream().collect(Collectors.toList());
> }{code}
> should suffice, as I don't think the ordering matters within the list of response objects.

--
This message was sent by Atlassian Jira
(v8.20.7#820007)
[jira] [Created] (KAFKA-13892) Dedupe response objects in deleteAcls of AclControlManager
Andrew Grant created KAFKA-13892:
------------------------------------

         Summary: Dedupe response objects in deleteAcls of AclControlManager
             Key: KAFKA-13892
             URL: https://issues.apache.org/jira/browse/KAFKA-13892
         Project: Kafka
      Issue Type: Bug
        Reporter: Andrew Grant

In [https://github.com/apache/kafka/blob/trunk/metadata/src/main/java/org/apache/kafka/controller/AclControlManager.java#L143] we loop through the ACL filters and add RemoveAccessControlEntryRecord records to the response list for each matching ACL. I think there's a bug where, if two filters match the same ACL, we create two RemoveAccessControlEntryRecord records for that same ACL. This is an issue because upon replay we throw an exception (https://github.com/apache/kafka/blob/trunk/metadata/src/main/java/org/apache/kafka/controller/AclControlManager.java#L195) if the ACL is not in the in-memory data structures, which will happen for the second RemoveAccessControlEntryRecord.

Maybe we can just de-dupe both List<AclDeleteResult> and List<ApiMessageAndVersion>? I think something like (just showing code for ApiMessageAndVersion):
{code:java}
private List<ApiMessageAndVersion> deDupeApiMessageAndVersion(List<ApiMessageAndVersion> messages) {
    return new HashSet<>(messages).stream().collect(Collectors.toList());
}{code}
should suffice, as I don't think the ordering matters within the list of response objects.
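A self-contained sketch of the de-duplication idea, with a generic `T` standing in for ApiMessageAndVersion. It relies on the element type having value-based equals/hashCode, as the generated record classes do. Using LinkedHashSet rather than HashSet keeps first-seen order, which is strictly safer than the snippet above even if ordering turns out not to matter.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;

public class Dedupe {
    // Drops duplicates while keeping first-seen order. Relies on the element
    // type having value-based equals/hashCode.
    public static <T> List<T> dedupe(List<T> messages) {
        return new ArrayList<>(new LinkedHashSet<>(messages));
    }

    public static void main(String[] args) {
        // Two filters matching the same ACL would yield a duplicate record.
        List<String> records = List.of("remove-acl-1", "remove-acl-2", "remove-acl-1");
        System.out.println(Dedupe.dedupe(records)); // [remove-acl-1, remove-acl-2]
    }
}
```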
[jira] [Created] (KAFKA-13889) Broker can't handle ACCESS_CONTROL_ENTRY_RECORD quickly followed by REMOVE_ACCESS_CONTROL_ENTRY_RECORD for same ACL
Andrew Grant created KAFKA-13889:
------------------------------------

         Summary: Broker can't handle ACCESS_CONTROL_ENTRY_RECORD quickly followed by REMOVE_ACCESS_CONTROL_ENTRY_RECORD for same ACL
             Key: KAFKA-13889
             URL: https://issues.apache.org/jira/browse/KAFKA-13889
         Project: Kafka
      Issue Type: Bug
        Reporter: Andrew Grant

In https://github.com/apache/kafka/blob/trunk/metadata/src/main/java/org/apache/kafka/image/AclsDelta.java#L64 we store the pending deletion in the changes map. This could override a creation that might have just happened. This is an issue because in BrokerMetadataPublisher it results in us making a removeAcl call, which finally results in [https://github.com/apache/kafka/blob/trunk/metadata/src/main/java/org/apache/kafka/metadata/authorizer/StandardAuthorizerData.java#L203] being executed, and this code throws an exception if the ACL isn't in the Map yet. If the ACCESS_CONTROL_ENTRY_RECORD event never got processed by BrokerMetadataPublisher, then the ACL won't be in the Map yet.

My feeling is we might want to make removeAcl idempotent, in that it returns success if the ACL doesn't exist: no matter how many times removeAcl is called, it returns success once the ACL is deleted. Maybe we'd just log a warning or something?

Note: I don't think AclControlManager has this issue, because it doesn't batch the events like AclsDelta does. However, we still throw a RuntimeException here: [https://github.com/apache/kafka/blob/trunk/metadata/src/main/java/org/apache/kafka/controller/AclControlManager.java#L197] - maybe we should follow the same logic (if we make the fix suggested above) and just log a warning if the ACL doesn't exist in the Map?
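A hypothetical sketch of the idempotent-remove suggestion. The class, method names, and String keys below are stand-ins, not the StandardAuthorizerData API: the point is only that removing an absent ACL logs a warning instead of throwing, so a REMOVE record arriving before its matching ADD has been observed does not crash the broker.

```java
import java.util.HashMap;
import java.util.Map;

public class IdempotentRemover {
    private final Map<String, String> aclsById = new HashMap<>();

    public void addAcl(String id, String acl) {
        aclsById.put(id, acl);
    }

    // Idempotent: removing a missing ACL is a warned no-op, not an exception.
    public void removeAcl(String id) {
        if (aclsById.remove(id) == null) {
            System.err.println("WARN: removeAcl(" + id + "): ACL not found, treating as no-op");
        }
    }

    public boolean contains(String id) {
        return aclsById.containsKey(id);
    }
}
```

With this shape, calling removeAcl any number of times converges on the same state (ACL absent), which is the property the report asks for.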