Loïc Monney created KAFKA-7878:
----------------------------------
Summary: Connect Task already exists in this worker when failed to
create consumer
Key: KAFKA-7878
URL: https://issues.apache.org/jira/browse/KAFKA-7878
Project: Kafka
Issue Type: Bug
Affects Versions: 2.0.1, 1.0.1
Reporter: Loïc Monney
*Assumption*
1. DNS is not available during a few minutes
2. Consumer group rebalances
3. Client is not able to resolve DNS entries anymore and fails
4. Task seems already registered, so at next rebalance the task will fail due
to *Task already exists in this worker* and the only way to recover is to
restart the connect process
*Real log entries*
* Distributed cluster running one connector on top of Kubernetes
* Connect 2.0.1
* kafka-connect-hdfs 5.0.1
{noformat}
[2019-01-28 13:31:25,914] WARN Removing server kafka.xxx.net:9093 from
bootstrap.servers as DNS resolution failed for kafka.xxx.net
(org.apache.kafka.clients.ClientUtils:56)
[2019-01-28 13:31:25,915] ERROR WorkerSinkTask\{id=xxx-22} Task failed
initialization and will not be started.
(org.apache.kafka.connect.runtime.WorkerSinkTask:142)
org.apache.kafka.connect.errors.ConnectException: Failed to create consumer
at
org.apache.kafka.connect.runtime.WorkerSinkTask.createConsumer(WorkerSinkTask.java:476)
at
org.apache.kafka.connect.runtime.WorkerSinkTask.initialize(WorkerSinkTask.java:139)
at org.apache.kafka.connect.runtime.Worker.startTask(Worker.java:452)
at
org.apache.kafka.connect.runtime.distributed.DistributedHerder.startTask(DistributedHerder.java:873)
at
org.apache.kafka.connect.runtime.distributed.DistributedHerder.access$1600(DistributedHerder.java:111)
at
org.apache.kafka.connect.runtime.distributed.DistributedHerder$13.call(DistributedHerder.java:888)
at
org.apache.kafka.connect.runtime.distributed.DistributedHerder$13.call(DistributedHerder.java:884)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Caused by: org.apache.kafka.common.KafkaException: Failed to construct kafka
consumer
at
org.apache.kafka.clients.consumer.KafkaConsumer.<init>(KafkaConsumer.java:799)
at
org.apache.kafka.clients.consumer.KafkaConsumer.<init>(KafkaConsumer.java:615)
at
org.apache.kafka.clients.consumer.KafkaConsumer.<init>(KafkaConsumer.java:596)
at
org.apache.kafka.connect.runtime.WorkerSinkTask.createConsumer(WorkerSinkTask.java:474)
... 10 more
Caused by: org.apache.kafka.common.config.ConfigException: No resolvable
bootstrap urls given in bootstrap.servers
at
org.apache.kafka.clients.ClientUtils.parseAndValidateAddresses(ClientUtils.java:66)
at
org.apache.kafka.clients.consumer.KafkaConsumer.<init>(KafkaConsumer.java:709)
... 13 more
[2019-01-28 13:31:25,925] INFO Finished starting connectors and tasks
(org.apache.kafka.connect.runtime.distributed.DistributedHerder:868)
[2019-01-28 13:31:25,926] INFO Rebalance started
(org.apache.kafka.connect.runtime.distributed.DistributedHerder:1239)
[2019-01-28 13:31:25,927] INFO Stopping task xxx-22
(org.apache.kafka.connect.runtime.Worker:555)
[2019-01-28 13:31:26,021] INFO Finished stopping tasks in preparation for
rebalance (org.apache.kafka.connect.runtime.distributed.DistributedHerder:1269)
[2019-01-28 13:31:26,021] INFO [Worker clientId=connect-1, groupId=xxx-cluster]
(Re-)joining group
(org.apache.kafka.clients.consumer.internals.AbstractCoordinator:509)
[2019-01-28 13:31:30,746] INFO [Worker clientId=connect-1, groupId=xxx-cluster]
Successfully joined group with generation 29
(org.apache.kafka.clients.consumer.internals.AbstractCoordinator:473)
[2019-01-28 13:31:30,746] INFO Joined group and got assignment:
Assignment\{error=0, leader='connect-1-05961f03-52a7-4c02-acc2-0f1fb021692e',
leaderUrl='http://192.168.46.59:8083/', offset=32, connectorIds=[],
taskIds=[xxx-22]}
(org.apache.kafka.connect.runtime.distributed.DistributedHerder:1217)
[2019-01-28 13:31:30,747] INFO Starting connectors and tasks using config
offset 32 (org.apache.kafka.connect.runtime.distributed.DistributedHerder:858)
[2019-01-28 13:31:30,747] INFO Starting task xxx-22
(org.apache.kafka.connect.runtime.distributed.DistributedHerder:872)
[2019-01-28 13:31:30,747] INFO Creating task xxx-22
(org.apache.kafka.connect.runtime.Worker:396)
[2019-01-28 13:31:30,748] ERROR Couldn't instantiate task xxx-22 because it has
an invalid task configuration. This task will not execute until reconfigured.
(org.apache.kafka.connect.runtime.distributed.DistributedHerder:890)
org.apache.kafka.connect.errors.ConnectException: Task already exists in this
worker: xxx-22
at org.apache.kafka.connect.runtime.Worker.startTask(Worker.java:399)
at
org.apache.kafka.connect.runtime.distributed.DistributedHerder.startTask(DistributedHerder.java:873)
at
org.apache.kafka.connect.runtime.distributed.DistributedHerder.access$1600(DistributedHerder.java:111)
at
org.apache.kafka.connect.runtime.distributed.DistributedHerder$13.call(DistributedHerder.java:888)
at
org.apache.kafka.connect.runtime.distributed.DistributedHerder$13.call(DistributedHerder.java:884)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
[2019-01-28 13:31:30,749] INFO Finished starting connectors and tasks
(org.apache.kafka.connect.runtime.distributed.DistributedHerder:868)
{noformat}
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)