[ https://issues.apache.org/jira/browse/KAFKA-12339?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Randall Hauch resolved KAFKA-12339. ----------------------------------- Reviewer: Randall Hauch Resolution: Fixed Merged to `trunk`, and cherry-picked to: * `2.8` for inclusion in 2.8.0 (with release manager approval) * `2.7` for inclusion in 2.7.1 * `2.6` for inclusion in 2.6.2 (with release manager approval) * `2.5` for inclusion in 2.5.2 > Add retry to admin client's listOffsets > --------------------------------------- > > Key: KAFKA-12339 > URL: https://issues.apache.org/jira/browse/KAFKA-12339 > Project: Kafka > Issue Type: Bug > Affects Versions: 2.5.2, 2.8.0, 2.7.1, 2.6.2 > Reporter: Chia-Ping Tsai > Assignee: Chia-Ping Tsai > Priority: Blocker > Fix For: 2.5.2, 2.8.0, 2.7.1, 2.6.2 > > > After upgrading our connector env to 2.9.0-SNAPSHOT, sometimes the connect > cluster encounters following error. > {quote}Uncaught exception in herder work thread, exiting: > (org.apache.kafka.connect.runtime.distributed.DistributedHerder:324) > org.apache.kafka.connect.errors.ConnectException: Error while getting end > offsets for topic 'connect-storage-topic-connect-cluster-1' > at org.apache.kafka.connect.util.TopicAdmin.endOffsets(TopicAdmin.java:689) > at > org.apache.kafka.connect.util.KafkaBasedLog.readToLogEnd(KafkaBasedLog.java:338) > at org.apache.kafka.connect.util.KafkaBasedLog.start(KafkaBasedLog.java:195) > at > org.apache.kafka.connect.storage.KafkaStatusBackingStore.start(KafkaStatusBackingStore.java:216) > at > org.apache.kafka.connect.runtime.AbstractHerder.startServices(AbstractHerder.java:129) > at > org.apache.kafka.connect.runtime.distributed.DistributedHerder.run(DistributedHerder.java:310) > at > java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515) > at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264) > at > java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) > at > java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) > at java.base/java.lang.Thread.run(Thread.java:834) > Caused by: java.util.concurrent.ExecutionException: > org.apache.kafka.common.errors.UnknownTopicOrPartitionException: This server > does not host this topic-partition. > at > org.apache.kafka.common.internals.KafkaFutureImpl.wrapAndThrow(KafkaFutureImpl.java:45) > at > org.apache.kafka.common.internals.KafkaFutureImpl.access$000(KafkaFutureImpl.java:32) > at > org.apache.kafka.common.internals.KafkaFutureImpl$SingleWaiter.await(KafkaFutureImpl.java:89) > at > org.apache.kafka.common.internals.KafkaFutureImpl.get(KafkaFutureImpl.java:260) > at org.apache.kafka.connect.util.TopicAdmin.endOffsets(TopicAdmin.java:668) > ... 10 more > {quote} > [https://github.com/apache/kafka/pull/9780] added shared admin to get end > offsets. KafkaAdmin#listOffsets does not handle topic-level error, hence the > UnknownTopicOrPartitionException on topic-level can obstruct worker from > running when the new internal topic is NOT synced to all brokers. -- This message was sent by Atlassian Jira (v8.3.4#803005)