[ https://issues.apache.org/jira/browse/KAFKA-13007?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Jeff Kim updated KAFKA-13007:
-----------------------------
    Description:
From [KafkaAdminClient#getListOffsetsCalls|https://github.com/apache/kafka/blob/trunk/clients/src/main/java/org/apache/kafka/clients/admin/KafkaAdminClient.java#L4215]:

```
for (Map.Entry<TopicPartition, OffsetSpec> entry : topicPartitionOffsets.entrySet()) {
    ...
    Node node = mr.cluster().leaderFor(tp);
```

Here, mr.cluster() builds the cluster snapshot once for every topic partition in the loop. Instead, we should build the snapshot once and reuse it. This reduces the time complexity from O(n^2) to O(n).

For manual testing (using AK 2.8), I passed a map of 6K topic partitions to listOffsets.

Without snapshot reuse:
duration of building futures from metadata response: *15582* milliseconds
total duration of listOffsets: 15743 milliseconds

With reuse:
duration of building futures from metadata response: *24* milliseconds
total duration of listOffsets: 235 milliseconds

Affects all versions since Admin & KafkaAdminClient introduced listOffsets (original PR: [https://github.com/apache/kafka/pull/7296]).

> KafkaAdminClient getListOffsetsCalls builds cluster snapshot for every topic partition
> --------------------------------------------------------------------------------------
>
>                 Key: KAFKA-13007
>                 URL: https://issues.apache.org/jira/browse/KAFKA-13007
>             Project: Kafka
>          Issue Type: Bug
>          Components: clients
>    Affects Versions: 2.8.0
>            Reporter: Jeff Kim
>            Assignee: Jeff Kim
>            Priority: Blocker

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
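For illustration, the difference between rebuilding the snapshot inside the loop and reusing one snapshot can be sketched outside Kafka. Everything below is a hypothetical stand-in (`SnapshotReuseSketch`, `FakeMetadataResponse`, `buildSnapshot`, the three-broker leader assignment); it is not KafkaAdminClient code and not the actual patch. It only assumes that building a snapshot costs time proportional to the number of partitions.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-ins for illustration; these are NOT the Kafka classes.
final class SnapshotReuseSketch {

    /** Stand-in for a metadata response: building a snapshot walks every partition. */
    static final class FakeMetadataResponse {
        private final List<String> partitions;
        int snapshotsBuilt = 0; // counts how many times the O(n) build ran

        FakeMetadataResponse(List<String> partitions) {
            this.partitions = partitions;
        }

        /** O(n): constructs a fresh partition-to-leader map on every call. */
        Map<String, Integer> buildSnapshot() {
            snapshotsBuilt++;
            Map<String, Integer> leaders = new HashMap<>();
            for (int i = 0; i < partitions.size(); i++) {
                leaders.put(partitions.get(i), i % 3); // assumption: 3 brokers
            }
            return leaders;
        }
    }

    /** Pattern from the report: a new snapshot per partition, O(n^2) overall. */
    static Map<String, Integer> leadersRebuildingEachTime(FakeMetadataResponse mr,
                                                          List<String> requested) {
        Map<String, Integer> result = new HashMap<>();
        for (String tp : requested) {
            result.put(tp, mr.buildSnapshot().get(tp));
        }
        return result;
    }

    /** Proposed pattern: build the snapshot once before the loop, O(n) overall. */
    static Map<String, Integer> leadersReusingSnapshot(FakeMetadataResponse mr,
                                                       List<String> requested) {
        Map<String, Integer> snapshot = mr.buildSnapshot();
        Map<String, Integer> result = new HashMap<>();
        for (String tp : requested) {
            result.put(tp, snapshot.get(tp));
        }
        return result;
    }

    /** Generates n made-up partition names such as "topic-0". */
    static List<String> partitionNames(int n) {
        List<String> names = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            names.add("topic-" + i);
        }
        return names;
    }

    public static void main(String[] args) {
        List<String> tps = partitionNames(6000);
        FakeMetadataResponse slow = new FakeMetadataResponse(tps);
        FakeMetadataResponse fast = new FakeMetadataResponse(tps);

        Map<String, Integer> a = leadersRebuildingEachTime(slow, tps);
        Map<String, Integer> b = leadersReusingSnapshot(fast, tps);

        System.out.println("same result: " + a.equals(b));
        System.out.println("snapshots built without reuse: " + slow.snapshotsBuilt);
        System.out.println("snapshots built with reuse: " + fast.snapshotsBuilt);
    }
}
```

Both methods return the same leader map; only the number of snapshot builds differs (one per requested partition versus one in total). The timings quoted in the description come from the real client and are not reproduced by this sketch.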