[jira] [Updated] (KAFKA-7878) Connect Task already exists in this worker when failed to create consumer
[ https://issues.apache.org/jira/browse/KAFKA-7878?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

John Hawkins updated KAFKA-7878:
    Affects Version/s: 3.7.0

> Connect Task already exists in this worker when failed to create consumer
> -------------------------------------------------------------------------
>
>                 Key: KAFKA-7878
>                 URL: https://issues.apache.org/jira/browse/KAFKA-7878
>             Project: Kafka
>          Issue Type: Bug
>          Components: connect
>    Affects Versions: 1.0.1, 2.0.1, 3.7.0
>            Reporter: Loïc Monney
>            Priority: Major
>
> *Assumption*
> 1. DNS is unavailable for a few minutes
> 2. Consumer group rebalances
> 3. Client can no longer resolve DNS entries and fails
> 4. The task appears to remain registered, so at the next rebalance it fails with *Task already exists in this worker*, and the only way to recover is to restart the Connect process
>
> *Real log entries*
> * Distributed cluster running one connector on top of Kubernetes
> * Connect 2.0.1
> * kafka-connect-hdfs 5.0.1
> {noformat}
> [2019-01-28 13:31:25,914] WARN Removing server kafka.xxx.net:9093 from bootstrap.servers as DNS resolution failed for kafka.xxx.net (org.apache.kafka.clients.ClientUtils:56)
> [2019-01-28 13:31:25,915] ERROR WorkerSinkTask\{id=xxx-22} Task failed initialization and will not be started. (org.apache.kafka.connect.runtime.WorkerSinkTask:142)
> org.apache.kafka.connect.errors.ConnectException: Failed to create consumer
>     at org.apache.kafka.connect.runtime.WorkerSinkTask.createConsumer(WorkerSinkTask.java:476)
>     at org.apache.kafka.connect.runtime.WorkerSinkTask.initialize(WorkerSinkTask.java:139)
>     at org.apache.kafka.connect.runtime.Worker.startTask(Worker.java:452)
>     at org.apache.kafka.connect.runtime.distributed.DistributedHerder.startTask(DistributedHerder.java:873)
>     at org.apache.kafka.connect.runtime.distributed.DistributedHerder.access$1600(DistributedHerder.java:111)
>     at org.apache.kafka.connect.runtime.distributed.DistributedHerder$13.call(DistributedHerder.java:888)
>     at org.apache.kafka.connect.runtime.distributed.DistributedHerder$13.call(DistributedHerder.java:884)
>     at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>     at java.lang.Thread.run(Thread.java:748)
> Caused by: org.apache.kafka.common.KafkaException: Failed to construct kafka consumer
>     at org.apache.kafka.clients.consumer.KafkaConsumer.<init>(KafkaConsumer.java:799)
>     at org.apache.kafka.clients.consumer.KafkaConsumer.<init>(KafkaConsumer.java:615)
>     at org.apache.kafka.clients.consumer.KafkaConsumer.<init>(KafkaConsumer.java:596)
>     at org.apache.kafka.connect.runtime.WorkerSinkTask.createConsumer(WorkerSinkTask.java:474)
>     ... 10 more
> Caused by: org.apache.kafka.common.config.ConfigException: No resolvable bootstrap urls given in bootstrap.servers
>     at org.apache.kafka.clients.ClientUtils.parseAndValidateAddresses(ClientUtils.java:66)
>     at org.apache.kafka.clients.consumer.KafkaConsumer.<init>(KafkaConsumer.java:709)
>     ... 13 more
> [2019-01-28 13:31:25,925] INFO Finished starting connectors and tasks (org.apache.kafka.connect.runtime.distributed.DistributedHerder:868)
> [2019-01-28 13:31:25,926] INFO Rebalance started (org.apache.kafka.connect.runtime.distributed.DistributedHerder:1239)
> [2019-01-28 13:31:25,927] INFO Stopping task xxx-22 (org.apache.kafka.connect.runtime.Worker:555)
> [2019-01-28 13:31:26,021] INFO Finished stopping tasks in preparation for rebalance (org.apache.kafka.connect.runtime.distributed.DistributedHerder:1269)
> [2019-01-28 13:31:26,021] INFO [Worker clientId=connect-1, groupId=xxx-cluster] (Re-)joining group (org.apache.kafka.clients.consumer.internals.AbstractCoordinator:509)
> [2019-01-28 13:31:30,746] INFO [Worker clientId=connect-1, groupId=xxx-cluster] Successfully joined group with generation 29 (org.apache.kafka.clients.consumer.internals.AbstractCoordinator:473)
> [2019-01-28 13:31:30,746] INFO Joined group and got assignment: Assignment\{error=0, leader='connect-1-05961f03-52a7-4c02-acc2-0f1fb021692e', leaderUrl='http://192.168.46.59:8083/', offset=32, connectorIds=[], taskIds=[xxx-22]} (org.apache.kafka.connect.runtime.distributed.DistributedHerder:1217)
> [2019-01-28 13:31:30,747] INFO Starting connectors and tasks using config offset 32 (org.apache.kafka.connect.runtime.distributed.DistributedHerder:858)
> [2019-01-28 13:31:30,747] INFO Starting task xxx-22 (org.apache.kafka.connect.runtime.distributed.DistributedHerder:872)
> [2019-01-28 13:31:30,747] INFO Creating task xxx-22 (org.apache.kafka.connect.runtime.Worker:396)
> [2019-01-28 13:31:30,748] ERROR
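
The reporter's assumption is that a task whose initialization failed still seems registered in the worker, so the next start attempt is rejected with "Task already exists in this worker". The Java sketch below illustrates that general failure pattern only; it is not Kafka Connect source code and does not claim to be the confirmed root cause. `TaskRegistrySketch`, `Task`, `startTaskLeaky`, `startTaskWithCleanup`, and `isRegistered` are hypothetical names introduced for illustration.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Hypothetical sketch, NOT Kafka Connect source code: every name here is illustrative.
final class TaskRegistrySketch {

    // Hypothetical task: initialize() may throw, e.g. when bootstrap servers cannot be resolved.
    interface Task {
        void initialize();
    }

    private final ConcurrentMap<String, Task> tasks = new ConcurrentHashMap<>();

    // Registers the task, rejecting an id that is already present.
    private void register(String id, Task task) {
        if (tasks.putIfAbsent(id, task) != null) {
            throw new IllegalStateException("Task already exists in this worker: " + id);
        }
    }

    // Leaky variant: a failed initialization leaves the registration behind.
    boolean startTaskLeaky(String id, Task task) {
        register(id, task);
        try {
            task.initialize();
            return true;
        } catch (RuntimeException e) {
            return false; // the entry for id stays in the map
        }
    }

    // Defensive variant: a failed initialization undoes the registration.
    boolean startTaskWithCleanup(String id, Task task) {
        register(id, task);
        try {
            task.initialize();
            return true;
        } catch (RuntimeException e) {
            tasks.remove(id, task); // remove only the entry this call added
            return false;
        }
    }

    boolean isRegistered(String id) {
        return tasks.containsKey(id);
    }

    public static void main(String[] args) {
        Task failing = () -> { throw new RuntimeException("No resolvable bootstrap urls"); };
        Task healthy = () -> { };

        TaskRegistrySketch leaky = new TaskRegistrySketch();
        leaky.startTaskLeaky("xxx-22", failing);
        try {
            leaky.startTaskLeaky("xxx-22", healthy); // simulated start after the next rebalance
        } catch (IllegalStateException e) {
            System.out.println("leaky: " + e.getMessage());
        }

        TaskRegistrySketch safe = new TaskRegistrySketch();
        safe.startTaskWithCleanup("xxx-22", failing);
        System.out.println("with cleanup, restart ok: "
                + safe.startTaskWithCleanup("xxx-22", healthy));
    }
}
```

In the leaky variant the second start attempt for the same id is rejected even though the first one never ran, which matches the symptom described in the assumption. In the cleanup variant the same id can be started again once the transient failure has passed.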
[jira] [Updated] (KAFKA-7878) Connect Task already exists in this worker when failed to create consumer
[ https://issues.apache.org/jira/browse/KAFKA-7878?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Greg Harris updated KAFKA-7878:
---
    Component/s: KafkaConnect

> Connect Task already exists in this worker when failed to create consumer
> -------------------------------------------------------------------------
>
>                 Key: KAFKA-7878
>                 URL: https://issues.apache.org/jira/browse/KAFKA-7878
>             Project: Kafka
>          Issue Type: Bug
>          Components: KafkaConnect
>    Affects Versions: 1.0.1, 2.0.1
>            Reporter: Loïc Monney
>            Priority: Major
>
> *Assumption*
> 1. DNS is unavailable for a few minutes
> 2. Consumer group rebalances
> 3. Client can no longer resolve DNS entries and fails
> 4. The task appears to remain registered, so at the next rebalance it fails with *Task already exists in this worker*, and the only way to recover is to restart the Connect process
>
> *Real log entries*
> * Distributed cluster running one connector on top of Kubernetes
> * Connect 2.0.1
> * kafka-connect-hdfs 5.0.1
> {noformat}
> [2019-01-28 13:31:25,914] WARN Removing server kafka.xxx.net:9093 from bootstrap.servers as DNS resolution failed for kafka.xxx.net (org.apache.kafka.clients.ClientUtils:56)
> [2019-01-28 13:31:25,915] ERROR WorkerSinkTask\{id=xxx-22} Task failed initialization and will not be started. (org.apache.kafka.connect.runtime.WorkerSinkTask:142)
> org.apache.kafka.connect.errors.ConnectException: Failed to create consumer
>     at org.apache.kafka.connect.runtime.WorkerSinkTask.createConsumer(WorkerSinkTask.java:476)
>     at org.apache.kafka.connect.runtime.WorkerSinkTask.initialize(WorkerSinkTask.java:139)
>     at org.apache.kafka.connect.runtime.Worker.startTask(Worker.java:452)
>     at org.apache.kafka.connect.runtime.distributed.DistributedHerder.startTask(DistributedHerder.java:873)
>     at org.apache.kafka.connect.runtime.distributed.DistributedHerder.access$1600(DistributedHerder.java:111)
>     at org.apache.kafka.connect.runtime.distributed.DistributedHerder$13.call(DistributedHerder.java:888)
>     at org.apache.kafka.connect.runtime.distributed.DistributedHerder$13.call(DistributedHerder.java:884)
>     at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>     at java.lang.Thread.run(Thread.java:748)
> Caused by: org.apache.kafka.common.KafkaException: Failed to construct kafka consumer
>     at org.apache.kafka.clients.consumer.KafkaConsumer.<init>(KafkaConsumer.java:799)
>     at org.apache.kafka.clients.consumer.KafkaConsumer.<init>(KafkaConsumer.java:615)
>     at org.apache.kafka.clients.consumer.KafkaConsumer.<init>(KafkaConsumer.java:596)
>     at org.apache.kafka.connect.runtime.WorkerSinkTask.createConsumer(WorkerSinkTask.java:474)
>     ... 10 more
> Caused by: org.apache.kafka.common.config.ConfigException: No resolvable bootstrap urls given in bootstrap.servers
>     at org.apache.kafka.clients.ClientUtils.parseAndValidateAddresses(ClientUtils.java:66)
>     at org.apache.kafka.clients.consumer.KafkaConsumer.<init>(KafkaConsumer.java:709)
>     ... 13 more
> [2019-01-28 13:31:25,925] INFO Finished starting connectors and tasks (org.apache.kafka.connect.runtime.distributed.DistributedHerder:868)
> [2019-01-28 13:31:25,926] INFO Rebalance started (org.apache.kafka.connect.runtime.distributed.DistributedHerder:1239)
> [2019-01-28 13:31:25,927] INFO Stopping task xxx-22 (org.apache.kafka.connect.runtime.Worker:555)
> [2019-01-28 13:31:26,021] INFO Finished stopping tasks in preparation for rebalance (org.apache.kafka.connect.runtime.distributed.DistributedHerder:1269)
> [2019-01-28 13:31:26,021] INFO [Worker clientId=connect-1, groupId=xxx-cluster] (Re-)joining group (org.apache.kafka.clients.consumer.internals.AbstractCoordinator:509)
> [2019-01-28 13:31:30,746] INFO [Worker clientId=connect-1, groupId=xxx-cluster] Successfully joined group with generation 29 (org.apache.kafka.clients.consumer.internals.AbstractCoordinator:473)
> [2019-01-28 13:31:30,746] INFO Joined group and got assignment: Assignment\{error=0, leader='connect-1-05961f03-52a7-4c02-acc2-0f1fb021692e', leaderUrl='http://192.168.46.59:8083/', offset=32, connectorIds=[], taskIds=[xxx-22]} (org.apache.kafka.connect.runtime.distributed.DistributedHerder:1217)
> [2019-01-28 13:31:30,747] INFO Starting connectors and tasks using config offset 32 (org.apache.kafka.connect.runtime.distributed.DistributedHerder:858)
> [2019-01-28 13:31:30,747] INFO Starting task xxx-22 (org.apache.kafka.connect.runtime.distributed.DistributedHerder:872)
> [2019-01-28 13:31:30,747] INFO Creating task xxx-22 (org.apache.kafka.connect.runtime.Worker:396)
> [2019-01-28 13:31:30,748] ERROR Cou