[jira] [Commented] (KAFKA-7878) Connect Task already exists in this worker when failed to create consumer

2024-04-22 Thread John Hawkins (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-7878?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17839784#comment-17839784
 ] 

John Hawkins commented on KAFKA-7878:
-

I see this quite often when I'm loading the JDBC connector. V. Annoying !

> Connect Task already exists in this worker when failed to create consumer
> -
>
> Key: KAFKA-7878
> URL: https://issues.apache.org/jira/browse/KAFKA-7878
> Project: Kafka
>  Issue Type: Bug
>  Components: connect
>Affects Versions: 1.0.1, 2.0.1
>Reporter: Loïc Monney
>Priority: Major
>
> *Assumption*
> 1. DNS is not available during a few minutes
> 2. Consumer group rebalances
> 3. Client is not able to resolve DNS entries anymore and fails
> 4. Task seems already registered, so at next rebalance the task will fail due 
> to *Task already exists in this worker* and the only way to recover is to 
> restart the connect process
> *Real log entries*
> * Distributed cluster running one connector on top of Kubernetes
> * Connect 2.0.1
> * kafka-connect-hdfs 5.0.1
> {noformat}
> [2019-01-28 13:31:25,914] WARN Removing server kafka.xxx.net:9093 from 
> bootstrap.servers as DNS resolution failed for kafka.xxx.net 
> (org.apache.kafka.clients.ClientUtils:56)
> [2019-01-28 13:31:25,915] ERROR WorkerSinkTask\{id=xxx-22} Task failed 
> initialization and will not be started. 
> (org.apache.kafka.connect.runtime.WorkerSinkTask:142)
> org.apache.kafka.connect.errors.ConnectException: Failed to create consumer
>  at 
> org.apache.kafka.connect.runtime.WorkerSinkTask.createConsumer(WorkerSinkTask.java:476)
>  at 
> org.apache.kafka.connect.runtime.WorkerSinkTask.initialize(WorkerSinkTask.java:139)
>  at org.apache.kafka.connect.runtime.Worker.startTask(Worker.java:452)
>  at 
> org.apache.kafka.connect.runtime.distributed.DistributedHerder.startTask(DistributedHerder.java:873)
>  at 
> org.apache.kafka.connect.runtime.distributed.DistributedHerder.access$1600(DistributedHerder.java:111)
>  at 
> org.apache.kafka.connect.runtime.distributed.DistributedHerder$13.call(DistributedHerder.java:888)
>  at 
> org.apache.kafka.connect.runtime.distributed.DistributedHerder$13.call(DistributedHerder.java:884)
>  at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>  at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>  at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>  at java.lang.Thread.run(Thread.java:748)
> Caused by: org.apache.kafka.common.KafkaException: Failed to construct kafka 
> consumer
>  at 
> org.apache.kafka.clients.consumer.KafkaConsumer.(KafkaConsumer.java:799)
>  at 
> org.apache.kafka.clients.consumer.KafkaConsumer.(KafkaConsumer.java:615)
>  at 
> org.apache.kafka.clients.consumer.KafkaConsumer.(KafkaConsumer.java:596)
>  at 
> org.apache.kafka.connect.runtime.WorkerSinkTask.createConsumer(WorkerSinkTask.java:474)
>  ... 10 more
> Caused by: org.apache.kafka.common.config.ConfigException: No resolvable 
> bootstrap urls given in bootstrap.servers
>  at 
> org.apache.kafka.clients.ClientUtils.parseAndValidateAddresses(ClientUtils.java:66)
>  at 
> org.apache.kafka.clients.consumer.KafkaConsumer.(KafkaConsumer.java:709)
>  ... 13 more
> [2019-01-28 13:31:25,925] INFO Finished starting connectors and tasks 
> (org.apache.kafka.connect.runtime.distributed.DistributedHerder:868)
> [2019-01-28 13:31:25,926] INFO Rebalance started 
> (org.apache.kafka.connect.runtime.distributed.DistributedHerder:1239)
> [2019-01-28 13:31:25,927] INFO Stopping task xxx-22 
> (org.apache.kafka.connect.runtime.Worker:555)
> [2019-01-28 13:31:26,021] INFO Finished stopping tasks in preparation for 
> rebalance 
> (org.apache.kafka.connect.runtime.distributed.DistributedHerder:1269)
> [2019-01-28 13:31:26,021] INFO [Worker clientId=connect-1, 
> groupId=xxx-cluster] (Re-)joining group 
> (org.apache.kafka.clients.consumer.internals.AbstractCoordinator:509)
> [2019-01-28 13:31:30,746] INFO [Worker clientId=connect-1, 
> groupId=xxx-cluster] Successfully joined group with generation 29 
> (org.apache.kafka.clients.consumer.internals.AbstractCoordinator:473)
> [2019-01-28 13:31:30,746] INFO Joined group and got assignment: 
> Assignment\{error=0, leader='connect-1-05961f03-52a7-4c02-acc2-0f1fb021692e', 
> leaderUrl='http://192.168.46.59:8083/', offset=32, connectorIds=[], 
> taskIds=[xxx-22]} 
> (org.apache.kafka.connect.runtime.distributed.DistributedHerder:1217)
> [2019-01-28 13:31:30,747] INFO Starting connectors and tasks using config 
> offset 32 (org.apache.kafka.connect.runtime.distributed.DistributedHerder:858)
> [2019-01-28 13:31:30,747] INFO Starting task xxx-22 
> (org.apache.kafka.connect.runtime.distributed.DistributedHerder:872)
> [2019-01-28 13:31:30,747] INFO Creating

[jira] [Commented] (KAFKA-7878) Connect Task already exists in this worker when failed to create consumer

2021-04-19 Thread Daniel Huang (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-7878?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17325411#comment-17325411
 ] 

Daniel Huang commented on KAFKA-7878:
-

Also seeing a similar "Task already exists in this worker" error in 2.6.0. Have 
observed it in both sink and source connectors.

> Connect Task already exists in this worker when failed to create consumer
> -
>
> Key: KAFKA-7878
> URL: https://issues.apache.org/jira/browse/KAFKA-7878
> Project: Kafka
>  Issue Type: Bug
>  Components: KafkaConnect
>Affects Versions: 1.0.1, 2.0.1
>Reporter: Loïc Monney
>Priority: Major
>
> *Assumption*
> 1. DNS is not available during a few minutes
> 2. Consumer group rebalances
> 3. Client is not able to resolve DNS entries anymore and fails
> 4. Task seems already registered, so at next rebalance the task will fail due 
> to *Task already exists in this worker* and the only way to recover is to 
> restart the connect process
> *Real log entries*
> * Distributed cluster running one connector on top of Kubernetes
> * Connect 2.0.1
> * kafka-connect-hdfs 5.0.1
> {noformat}
> [2019-01-28 13:31:25,914] WARN Removing server kafka.xxx.net:9093 from 
> bootstrap.servers as DNS resolution failed for kafka.xxx.net 
> (org.apache.kafka.clients.ClientUtils:56)
> [2019-01-28 13:31:25,915] ERROR WorkerSinkTask\{id=xxx-22} Task failed 
> initialization and will not be started. 
> (org.apache.kafka.connect.runtime.WorkerSinkTask:142)
> org.apache.kafka.connect.errors.ConnectException: Failed to create consumer
>  at 
> org.apache.kafka.connect.runtime.WorkerSinkTask.createConsumer(WorkerSinkTask.java:476)
>  at 
> org.apache.kafka.connect.runtime.WorkerSinkTask.initialize(WorkerSinkTask.java:139)
>  at org.apache.kafka.connect.runtime.Worker.startTask(Worker.java:452)
>  at 
> org.apache.kafka.connect.runtime.distributed.DistributedHerder.startTask(DistributedHerder.java:873)
>  at 
> org.apache.kafka.connect.runtime.distributed.DistributedHerder.access$1600(DistributedHerder.java:111)
>  at 
> org.apache.kafka.connect.runtime.distributed.DistributedHerder$13.call(DistributedHerder.java:888)
>  at 
> org.apache.kafka.connect.runtime.distributed.DistributedHerder$13.call(DistributedHerder.java:884)
>  at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>  at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>  at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>  at java.lang.Thread.run(Thread.java:748)
> Caused by: org.apache.kafka.common.KafkaException: Failed to construct kafka 
> consumer
>  at 
> org.apache.kafka.clients.consumer.KafkaConsumer.(KafkaConsumer.java:799)
>  at 
> org.apache.kafka.clients.consumer.KafkaConsumer.(KafkaConsumer.java:615)
>  at 
> org.apache.kafka.clients.consumer.KafkaConsumer.(KafkaConsumer.java:596)
>  at 
> org.apache.kafka.connect.runtime.WorkerSinkTask.createConsumer(WorkerSinkTask.java:474)
>  ... 10 more
> Caused by: org.apache.kafka.common.config.ConfigException: No resolvable 
> bootstrap urls given in bootstrap.servers
>  at 
> org.apache.kafka.clients.ClientUtils.parseAndValidateAddresses(ClientUtils.java:66)
>  at 
> org.apache.kafka.clients.consumer.KafkaConsumer.(KafkaConsumer.java:709)
>  ... 13 more
> [2019-01-28 13:31:25,925] INFO Finished starting connectors and tasks 
> (org.apache.kafka.connect.runtime.distributed.DistributedHerder:868)
> [2019-01-28 13:31:25,926] INFO Rebalance started 
> (org.apache.kafka.connect.runtime.distributed.DistributedHerder:1239)
> [2019-01-28 13:31:25,927] INFO Stopping task xxx-22 
> (org.apache.kafka.connect.runtime.Worker:555)
> [2019-01-28 13:31:26,021] INFO Finished stopping tasks in preparation for 
> rebalance 
> (org.apache.kafka.connect.runtime.distributed.DistributedHerder:1269)
> [2019-01-28 13:31:26,021] INFO [Worker clientId=connect-1, 
> groupId=xxx-cluster] (Re-)joining group 
> (org.apache.kafka.clients.consumer.internals.AbstractCoordinator:509)
> [2019-01-28 13:31:30,746] INFO [Worker clientId=connect-1, 
> groupId=xxx-cluster] Successfully joined group with generation 29 
> (org.apache.kafka.clients.consumer.internals.AbstractCoordinator:473)
> [2019-01-28 13:31:30,746] INFO Joined group and got assignment: 
> Assignment\{error=0, leader='connect-1-05961f03-52a7-4c02-acc2-0f1fb021692e', 
> leaderUrl='http://192.168.46.59:8083/', offset=32, connectorIds=[], 
> taskIds=[xxx-22]} 
> (org.apache.kafka.connect.runtime.distributed.DistributedHerder:1217)
> [2019-01-28 13:31:30,747] INFO Starting connectors and tasks using config 
> offset 32 (org.apache.kafka.connect.runtime.distributed.DistributedHerder:858)
> [2019-01-28 13:31:30,747] INFO Starting task xxx-22 
> (org.apache.kafka.connect.runtime.distributed.Dist

[jira] [Commented] (KAFKA-7878) Connect Task already exists in this worker when failed to create consumer

2020-12-07 Thread Loy Larvin Rao (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-7878?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17245659#comment-17245659
 ] 

Loy Larvin Rao commented on KAFKA-7878:
---

Seeing the same issue in 2.6.0.

> Connect Task already exists in this worker when failed to create consumer
> -
>
> Key: KAFKA-7878
> URL: https://issues.apache.org/jira/browse/KAFKA-7878
> Project: Kafka
>  Issue Type: Bug
>  Components: KafkaConnect
>Affects Versions: 1.0.1, 2.0.1
>Reporter: Loïc Monney
>Priority: Major
>
> *Assumption*
> 1. DNS is not available during a few minutes
> 2. Consumer group rebalances
> 3. Client is not able to resolve DNS entries anymore and fails
> 4. Task seems already registered, so at next rebalance the task will fail due 
> to *Task already exists in this worker* and the only way to recover is to 
> restart the connect process
> *Real log entries*
> * Distributed cluster running one connector on top of Kubernetes
> * Connect 2.0.1
> * kafka-connect-hdfs 5.0.1
> {noformat}
> [2019-01-28 13:31:25,914] WARN Removing server kafka.xxx.net:9093 from 
> bootstrap.servers as DNS resolution failed for kafka.xxx.net 
> (org.apache.kafka.clients.ClientUtils:56)
> [2019-01-28 13:31:25,915] ERROR WorkerSinkTask\{id=xxx-22} Task failed 
> initialization and will not be started. 
> (org.apache.kafka.connect.runtime.WorkerSinkTask:142)
> org.apache.kafka.connect.errors.ConnectException: Failed to create consumer
>  at 
> org.apache.kafka.connect.runtime.WorkerSinkTask.createConsumer(WorkerSinkTask.java:476)
>  at 
> org.apache.kafka.connect.runtime.WorkerSinkTask.initialize(WorkerSinkTask.java:139)
>  at org.apache.kafka.connect.runtime.Worker.startTask(Worker.java:452)
>  at 
> org.apache.kafka.connect.runtime.distributed.DistributedHerder.startTask(DistributedHerder.java:873)
>  at 
> org.apache.kafka.connect.runtime.distributed.DistributedHerder.access$1600(DistributedHerder.java:111)
>  at 
> org.apache.kafka.connect.runtime.distributed.DistributedHerder$13.call(DistributedHerder.java:888)
>  at 
> org.apache.kafka.connect.runtime.distributed.DistributedHerder$13.call(DistributedHerder.java:884)
>  at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>  at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>  at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>  at java.lang.Thread.run(Thread.java:748)
> Caused by: org.apache.kafka.common.KafkaException: Failed to construct kafka 
> consumer
>  at 
> org.apache.kafka.clients.consumer.KafkaConsumer.(KafkaConsumer.java:799)
>  at 
> org.apache.kafka.clients.consumer.KafkaConsumer.(KafkaConsumer.java:615)
>  at 
> org.apache.kafka.clients.consumer.KafkaConsumer.(KafkaConsumer.java:596)
>  at 
> org.apache.kafka.connect.runtime.WorkerSinkTask.createConsumer(WorkerSinkTask.java:474)
>  ... 10 more
> Caused by: org.apache.kafka.common.config.ConfigException: No resolvable 
> bootstrap urls given in bootstrap.servers
>  at 
> org.apache.kafka.clients.ClientUtils.parseAndValidateAddresses(ClientUtils.java:66)
>  at 
> org.apache.kafka.clients.consumer.KafkaConsumer.(KafkaConsumer.java:709)
>  ... 13 more
> [2019-01-28 13:31:25,925] INFO Finished starting connectors and tasks 
> (org.apache.kafka.connect.runtime.distributed.DistributedHerder:868)
> [2019-01-28 13:31:25,926] INFO Rebalance started 
> (org.apache.kafka.connect.runtime.distributed.DistributedHerder:1239)
> [2019-01-28 13:31:25,927] INFO Stopping task xxx-22 
> (org.apache.kafka.connect.runtime.Worker:555)
> [2019-01-28 13:31:26,021] INFO Finished stopping tasks in preparation for 
> rebalance 
> (org.apache.kafka.connect.runtime.distributed.DistributedHerder:1269)
> [2019-01-28 13:31:26,021] INFO [Worker clientId=connect-1, 
> groupId=xxx-cluster] (Re-)joining group 
> (org.apache.kafka.clients.consumer.internals.AbstractCoordinator:509)
> [2019-01-28 13:31:30,746] INFO [Worker clientId=connect-1, 
> groupId=xxx-cluster] Successfully joined group with generation 29 
> (org.apache.kafka.clients.consumer.internals.AbstractCoordinator:473)
> [2019-01-28 13:31:30,746] INFO Joined group and got assignment: 
> Assignment\{error=0, leader='connect-1-05961f03-52a7-4c02-acc2-0f1fb021692e', 
> leaderUrl='http://192.168.46.59:8083/', offset=32, connectorIds=[], 
> taskIds=[xxx-22]} 
> (org.apache.kafka.connect.runtime.distributed.DistributedHerder:1217)
> [2019-01-28 13:31:30,747] INFO Starting connectors and tasks using config 
> offset 32 (org.apache.kafka.connect.runtime.distributed.DistributedHerder:858)
> [2019-01-28 13:31:30,747] INFO Starting task xxx-22 
> (org.apache.kafka.connect.runtime.distributed.DistributedHerder:872)
> [2019-01-28 13:31:30,747] INFO Creating task xxx-22 
> (org.apache.kafka

[jira] [Commented] (KAFKA-7878) Connect Task already exists in this worker when failed to create consumer

2020-07-27 Thread Nitika Raj (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-7878?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17166089#comment-17166089
 ] 

Nitika Raj commented on KAFKA-7878:
---

We faced similar problems but then tracking back in logs we found the actual 
issue to be as https://issues.apache.org/jira/browse/KAFKA-9385 &; thus 
https://issues.apache.org/jira/browse/KAFKA-9184
which has been resolved in AK 2.3.2/2.4.

> Connect Task already exists in this worker when failed to create consumer
> -
>
> Key: KAFKA-7878
> URL: https://issues.apache.org/jira/browse/KAFKA-7878
> Project: Kafka
>  Issue Type: Bug
>  Components: KafkaConnect
>Affects Versions: 1.0.1, 2.0.1
>Reporter: Loïc Monney
>Priority: Major
>
> *Assumption*
> 1. DNS is not available during a few minutes
> 2. Consumer group rebalances
> 3. Client is not able to resolve DNS entries anymore and fails
> 4. Task seems already registered, so at next rebalance the task will fail due 
> to *Task already exists in this worker* and the only way to recover is to 
> restart the connect process
> *Real log entries*
> * Distributed cluster running one connector on top of Kubernetes
> * Connect 2.0.1
> * kafka-connect-hdfs 5.0.1
> {noformat}
> [2019-01-28 13:31:25,914] WARN Removing server kafka.xxx.net:9093 from 
> bootstrap.servers as DNS resolution failed for kafka.xxx.net 
> (org.apache.kafka.clients.ClientUtils:56)
> [2019-01-28 13:31:25,915] ERROR WorkerSinkTask\{id=xxx-22} Task failed 
> initialization and will not be started. 
> (org.apache.kafka.connect.runtime.WorkerSinkTask:142)
> org.apache.kafka.connect.errors.ConnectException: Failed to create consumer
>  at 
> org.apache.kafka.connect.runtime.WorkerSinkTask.createConsumer(WorkerSinkTask.java:476)
>  at 
> org.apache.kafka.connect.runtime.WorkerSinkTask.initialize(WorkerSinkTask.java:139)
>  at org.apache.kafka.connect.runtime.Worker.startTask(Worker.java:452)
>  at 
> org.apache.kafka.connect.runtime.distributed.DistributedHerder.startTask(DistributedHerder.java:873)
>  at 
> org.apache.kafka.connect.runtime.distributed.DistributedHerder.access$1600(DistributedHerder.java:111)
>  at 
> org.apache.kafka.connect.runtime.distributed.DistributedHerder$13.call(DistributedHerder.java:888)
>  at 
> org.apache.kafka.connect.runtime.distributed.DistributedHerder$13.call(DistributedHerder.java:884)
>  at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>  at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>  at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>  at java.lang.Thread.run(Thread.java:748)
> Caused by: org.apache.kafka.common.KafkaException: Failed to construct kafka 
> consumer
>  at 
> org.apache.kafka.clients.consumer.KafkaConsumer.(KafkaConsumer.java:799)
>  at 
> org.apache.kafka.clients.consumer.KafkaConsumer.(KafkaConsumer.java:615)
>  at 
> org.apache.kafka.clients.consumer.KafkaConsumer.(KafkaConsumer.java:596)
>  at 
> org.apache.kafka.connect.runtime.WorkerSinkTask.createConsumer(WorkerSinkTask.java:474)
>  ... 10 more
> Caused by: org.apache.kafka.common.config.ConfigException: No resolvable 
> bootstrap urls given in bootstrap.servers
>  at 
> org.apache.kafka.clients.ClientUtils.parseAndValidateAddresses(ClientUtils.java:66)
>  at 
> org.apache.kafka.clients.consumer.KafkaConsumer.(KafkaConsumer.java:709)
>  ... 13 more
> [2019-01-28 13:31:25,925] INFO Finished starting connectors and tasks 
> (org.apache.kafka.connect.runtime.distributed.DistributedHerder:868)
> [2019-01-28 13:31:25,926] INFO Rebalance started 
> (org.apache.kafka.connect.runtime.distributed.DistributedHerder:1239)
> [2019-01-28 13:31:25,927] INFO Stopping task xxx-22 
> (org.apache.kafka.connect.runtime.Worker:555)
> [2019-01-28 13:31:26,021] INFO Finished stopping tasks in preparation for 
> rebalance 
> (org.apache.kafka.connect.runtime.distributed.DistributedHerder:1269)
> [2019-01-28 13:31:26,021] INFO [Worker clientId=connect-1, 
> groupId=xxx-cluster] (Re-)joining group 
> (org.apache.kafka.clients.consumer.internals.AbstractCoordinator:509)
> [2019-01-28 13:31:30,746] INFO [Worker clientId=connect-1, 
> groupId=xxx-cluster] Successfully joined group with generation 29 
> (org.apache.kafka.clients.consumer.internals.AbstractCoordinator:473)
> [2019-01-28 13:31:30,746] INFO Joined group and got assignment: 
> Assignment\{error=0, leader='connect-1-05961f03-52a7-4c02-acc2-0f1fb021692e', 
> leaderUrl='http://192.168.46.59:8083/', offset=32, connectorIds=[], 
> taskIds=[xxx-22]} 
> (org.apache.kafka.connect.runtime.distributed.DistributedHerder:1217)
> [2019-01-28 13:31:30,747] INFO Starting connectors and tasks using config 
> offset 32 (org.apache.kafka.connect.runtime.distributed.DistributedHerder:858)

[jira] [Commented] (KAFKA-7878) Connect Task already exists in this worker when failed to create consumer

2019-09-09 Thread Goltseva Taisiia (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-7878?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16925725#comment-16925725
 ] 

Goltseva Taisiia commented on KAFKA-7878:
-

We have the similar problem. Sometimes after Kafka had problems for some time, 
error occurs during tasks restarting - *Task already exists in this worker.*

It's also similar to:

[https://stackoverflow.com/questions/43802156/inconsistent-connector-state-connectexception-task-already-exists-in-this-work]

 

Only process restarting fixes the issue.

> Connect Task already exists in this worker when failed to create consumer
> -
>
> Key: KAFKA-7878
> URL: https://issues.apache.org/jira/browse/KAFKA-7878
> Project: Kafka
>  Issue Type: Bug
>Affects Versions: 1.0.1, 2.0.1
>Reporter: Loïc Monney
>Priority: Major
>
> *Assumption*
> 1. DNS is not available during a few minutes
> 2. Consumer group rebalances
> 3. Client is not able to resolve DNS entries anymore and fails
> 4. Task seems already registered, so at next rebalance the task will fail due 
> to *Task already exists in this worker* and the only way to recover is to 
> restart the connect process
> *Real log entries*
> * Distributed cluster running one connector on top of Kubernetes
> * Connect 2.0.1
> * kafka-connect-hdfs 5.0.1
> {noformat}
> [2019-01-28 13:31:25,914] WARN Removing server kafka.xxx.net:9093 from 
> bootstrap.servers as DNS resolution failed for kafka.xxx.net 
> (org.apache.kafka.clients.ClientUtils:56)
> [2019-01-28 13:31:25,915] ERROR WorkerSinkTask\{id=xxx-22} Task failed 
> initialization and will not be started. 
> (org.apache.kafka.connect.runtime.WorkerSinkTask:142)
> org.apache.kafka.connect.errors.ConnectException: Failed to create consumer
>  at 
> org.apache.kafka.connect.runtime.WorkerSinkTask.createConsumer(WorkerSinkTask.java:476)
>  at 
> org.apache.kafka.connect.runtime.WorkerSinkTask.initialize(WorkerSinkTask.java:139)
>  at org.apache.kafka.connect.runtime.Worker.startTask(Worker.java:452)
>  at 
> org.apache.kafka.connect.runtime.distributed.DistributedHerder.startTask(DistributedHerder.java:873)
>  at 
> org.apache.kafka.connect.runtime.distributed.DistributedHerder.access$1600(DistributedHerder.java:111)
>  at 
> org.apache.kafka.connect.runtime.distributed.DistributedHerder$13.call(DistributedHerder.java:888)
>  at 
> org.apache.kafka.connect.runtime.distributed.DistributedHerder$13.call(DistributedHerder.java:884)
>  at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>  at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>  at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>  at java.lang.Thread.run(Thread.java:748)
> Caused by: org.apache.kafka.common.KafkaException: Failed to construct kafka 
> consumer
>  at 
> org.apache.kafka.clients.consumer.KafkaConsumer.(KafkaConsumer.java:799)
>  at 
> org.apache.kafka.clients.consumer.KafkaConsumer.(KafkaConsumer.java:615)
>  at 
> org.apache.kafka.clients.consumer.KafkaConsumer.(KafkaConsumer.java:596)
>  at 
> org.apache.kafka.connect.runtime.WorkerSinkTask.createConsumer(WorkerSinkTask.java:474)
>  ... 10 more
> Caused by: org.apache.kafka.common.config.ConfigException: No resolvable 
> bootstrap urls given in bootstrap.servers
>  at 
> org.apache.kafka.clients.ClientUtils.parseAndValidateAddresses(ClientUtils.java:66)
>  at 
> org.apache.kafka.clients.consumer.KafkaConsumer.(KafkaConsumer.java:709)
>  ... 13 more
> [2019-01-28 13:31:25,925] INFO Finished starting connectors and tasks 
> (org.apache.kafka.connect.runtime.distributed.DistributedHerder:868)
> [2019-01-28 13:31:25,926] INFO Rebalance started 
> (org.apache.kafka.connect.runtime.distributed.DistributedHerder:1239)
> [2019-01-28 13:31:25,927] INFO Stopping task xxx-22 
> (org.apache.kafka.connect.runtime.Worker:555)
> [2019-01-28 13:31:26,021] INFO Finished stopping tasks in preparation for 
> rebalance 
> (org.apache.kafka.connect.runtime.distributed.DistributedHerder:1269)
> [2019-01-28 13:31:26,021] INFO [Worker clientId=connect-1, 
> groupId=xxx-cluster] (Re-)joining group 
> (org.apache.kafka.clients.consumer.internals.AbstractCoordinator:509)
> [2019-01-28 13:31:30,746] INFO [Worker clientId=connect-1, 
> groupId=xxx-cluster] Successfully joined group with generation 29 
> (org.apache.kafka.clients.consumer.internals.AbstractCoordinator:473)
> [2019-01-28 13:31:30,746] INFO Joined group and got assignment: 
> Assignment\{error=0, leader='connect-1-05961f03-52a7-4c02-acc2-0f1fb021692e', 
> leaderUrl='http://192.168.46.59:8083/', offset=32, connectorIds=[], 
> taskIds=[xxx-22]} 
> (org.apache.kafka.connect.runtime.distributed.DistributedHerder:1217)
> [2019-01-28 13:31:30,747] INFO Starting connectors and tasks using c

[jira] [Commented] (KAFKA-7878) Connect Task already exists in this worker when failed to create consumer

2019-08-27 Thread Echo Xu (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-7878?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16916779#comment-16916779
 ] 

Echo Xu commented on KAFKA-7878:


We encountered the exact same issue.

> Connect Task already exists in this worker when failed to create consumer
> -
>
> Key: KAFKA-7878
> URL: https://issues.apache.org/jira/browse/KAFKA-7878
> Project: Kafka
>  Issue Type: Bug
>Affects Versions: 1.0.1, 2.0.1
>Reporter: Loïc Monney
>Priority: Major
>
> *Assumption*
> 1. DNS is not available during a few minutes
> 2. Consumer group rebalances
> 3. Client is not able to resolve DNS entries anymore and fails
> 4. Task seems already registered, so at next rebalance the task will fail due 
> to *Task already exists in this worker* and the only way to recover is to 
> restart the connect process
> *Real log entries*
> * Distributed cluster running one connector on top of Kubernetes
> * Connect 2.0.1
> * kafka-connect-hdfs 5.0.1
> {noformat}
> [2019-01-28 13:31:25,914] WARN Removing server kafka.xxx.net:9093 from 
> bootstrap.servers as DNS resolution failed for kafka.xxx.net 
> (org.apache.kafka.clients.ClientUtils:56)
> [2019-01-28 13:31:25,915] ERROR WorkerSinkTask\{id=xxx-22} Task failed 
> initialization and will not be started. 
> (org.apache.kafka.connect.runtime.WorkerSinkTask:142)
> org.apache.kafka.connect.errors.ConnectException: Failed to create consumer
>  at 
> org.apache.kafka.connect.runtime.WorkerSinkTask.createConsumer(WorkerSinkTask.java:476)
>  at 
> org.apache.kafka.connect.runtime.WorkerSinkTask.initialize(WorkerSinkTask.java:139)
>  at org.apache.kafka.connect.runtime.Worker.startTask(Worker.java:452)
>  at 
> org.apache.kafka.connect.runtime.distributed.DistributedHerder.startTask(DistributedHerder.java:873)
>  at 
> org.apache.kafka.connect.runtime.distributed.DistributedHerder.access$1600(DistributedHerder.java:111)
>  at 
> org.apache.kafka.connect.runtime.distributed.DistributedHerder$13.call(DistributedHerder.java:888)
>  at 
> org.apache.kafka.connect.runtime.distributed.DistributedHerder$13.call(DistributedHerder.java:884)
>  at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>  at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>  at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>  at java.lang.Thread.run(Thread.java:748)
> Caused by: org.apache.kafka.common.KafkaException: Failed to construct kafka 
> consumer
>  at 
> org.apache.kafka.clients.consumer.KafkaConsumer.(KafkaConsumer.java:799)
>  at 
> org.apache.kafka.clients.consumer.KafkaConsumer.(KafkaConsumer.java:615)
>  at 
> org.apache.kafka.clients.consumer.KafkaConsumer.(KafkaConsumer.java:596)
>  at 
> org.apache.kafka.connect.runtime.WorkerSinkTask.createConsumer(WorkerSinkTask.java:474)
>  ... 10 more
> Caused by: org.apache.kafka.common.config.ConfigException: No resolvable 
> bootstrap urls given in bootstrap.servers
>  at 
> org.apache.kafka.clients.ClientUtils.parseAndValidateAddresses(ClientUtils.java:66)
>  at 
> org.apache.kafka.clients.consumer.KafkaConsumer.(KafkaConsumer.java:709)
>  ... 13 more
> [2019-01-28 13:31:25,925] INFO Finished starting connectors and tasks 
> (org.apache.kafka.connect.runtime.distributed.DistributedHerder:868)
> [2019-01-28 13:31:25,926] INFO Rebalance started 
> (org.apache.kafka.connect.runtime.distributed.DistributedHerder:1239)
> [2019-01-28 13:31:25,927] INFO Stopping task xxx-22 
> (org.apache.kafka.connect.runtime.Worker:555)
> [2019-01-28 13:31:26,021] INFO Finished stopping tasks in preparation for 
> rebalance 
> (org.apache.kafka.connect.runtime.distributed.DistributedHerder:1269)
> [2019-01-28 13:31:26,021] INFO [Worker clientId=connect-1, 
> groupId=xxx-cluster] (Re-)joining group 
> (org.apache.kafka.clients.consumer.internals.AbstractCoordinator:509)
> [2019-01-28 13:31:30,746] INFO [Worker clientId=connect-1, 
> groupId=xxx-cluster] Successfully joined group with generation 29 
> (org.apache.kafka.clients.consumer.internals.AbstractCoordinator:473)
> [2019-01-28 13:31:30,746] INFO Joined group and got assignment: 
> Assignment\{error=0, leader='connect-1-05961f03-52a7-4c02-acc2-0f1fb021692e', 
> leaderUrl='http://192.168.46.59:8083/', offset=32, connectorIds=[], 
> taskIds=[xxx-22]} 
> (org.apache.kafka.connect.runtime.distributed.DistributedHerder:1217)
> [2019-01-28 13:31:30,747] INFO Starting connectors and tasks using config 
> offset 32 (org.apache.kafka.connect.runtime.distributed.DistributedHerder:858)
> [2019-01-28 13:31:30,747] INFO Starting task xxx-22 
> (org.apache.kafka.connect.runtime.distributed.DistributedHerder:872)
> [2019-01-28 13:31:30,747] INFO Creating task xxx-22 
> (org.apache.kafka.connect.runtime.Worker:396)
> [2019-01-28 13