[ 
https://issues.apache.org/jira/browse/KAFKA-8485?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16858366#comment-16858366
 ] 

kaushik srinivas commented on KAFKA-8485:
-----------------------------------------

Hi [~kkonstantine]

I see weird behavior across different runs.

One of the suggestions from the Kafka dev group pointed to this bug:

[https://issues.apache.org/jira/plugins/servlet/mobile#issue/KAFKA-7941]

So we patched our Connect build with the pull request opened for that 
issue. The patch did not help; we still see stability issues with 
Connect when a Kafka broker goes down.

 

We have now seen the stack trace below a couple of times. After the Kafka 
brokers restart, GET requests work fine, but adding or deleting a connector 
sometimes fails with this trace:
{code:java}
org.apache.kafka.connect.errors.ConnectException: Error writing connector configuration to Kafka
at org.apache.kafka.connect.storage.KafkaConfigBackingStore.updateConnectorConfig(KafkaConfigBackingStore.java:334)
at org.apache.kafka.connect.storage.KafkaConfigBackingStore.putConnectorConfig(KafkaConfigBackingStore.java:303)
at org.apache.kafka.connect.runtime.distributed.DistributedHerder$6.call(DistributedHerder.java:555)
at org.apache.kafka.connect.runtime.distributed.DistributedHerder$6.call(DistributedHerder.java:535)
at org.apache.kafka.connect.runtime.distributed.DistributedHerder.tick(DistributedHerder.java:271)
at org.apache.kafka.connect.runtime.distributed.DistributedHerder.run(DistributedHerder.java:220)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Caused by: java.util.concurrent.TimeoutException: Timed out waiting for future
at org.apache.kafka.connect.util.ConvertingFutureCallback.get(ConvertingFutureCallback.java:73)
at org.apache.kafka.connect.storage.KafkaConfigBackingStore.updateConnectorConfig(KafkaConfigBackingStore.java:331)
... 10 more
{code}
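
For readers unfamiliar with the shape of this trace: the "Caused by" part is what a bounded wait on a future looks like when nothing completes it in time. The sketch below reproduces only that pattern with plain java.util.concurrent. It is not the Kafka Connect source; the class name BoundedWriteSketch, the method awaitAck, and the 200 ms timeout are made up for illustration.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical illustration only; names and timeout values are assumptions.
class BoundedWriteSketch {

    // Waits at most timeoutMs for an acknowledgement and wraps a timeout
    // in an outer exception, giving a "Caused by: TimeoutException" chain.
    static String awaitAck(CompletableFuture<String> ack, long timeoutMs) {
        try {
            return ack.get(timeoutMs, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            throw new IllegalStateException("Error writing connector configuration to Kafka", e);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt flag
            throw new IllegalStateException("Interrupted while waiting for acknowledgement", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("Write failed", e.getCause());
        }
    }

    public static void main(String[] args) {
        // A future that is never completed stands in for a write that is never acknowledged.
        CompletableFuture<String> neverAcked = new CompletableFuture<>();
        try {
            awaitAck(neverAcked, 200);
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage() + " / cause: " + e.getCause().getClass().getSimpleName());
        }
        // An already completed future returns without waiting.
        System.out.println(awaitAck(CompletableFuture.completedFuture("acked"), 200));
    }
}
```

The point of the sketch is only that the caller gives up after the timeout while the underlying write may still be pending, which would be consistent with GET requests working while add and delete requests fail.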
 

Apart from the above trace, there is very little activity in the Connect 
logs, even in debug mode.

The issue is very consistent: once the Kafka brokers are restarted (2 out 
of 3 brokers), we see this behavior even without data activity.
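
To make the unresponsive state easier to detect from outside, a read-only probe with explicit timeouts can be run against the worker. The sketch below is one way to do that with the JDK's HttpURLConnection. GET /connectors is a standard Connect REST endpoint and 8083 is the default REST port; the class name ConnectRestProbe, the method probe, the result strings, and the 5 second timeout are assumptions for illustration. Because it only issues a GET, it would catch a fully hung REST API but not the add/delete failure shown in the trace.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.SocketTimeoutException;
import java.net.URL;

// Hypothetical helper; names, result strings and timeouts are assumptions.
class ConnectRestProbe {

    // Returns "HTTP <code>" for any response, "TIMED_OUT" if connecting or
    // reading exceeds timeoutMs, and "UNREACHABLE" for other I/O failures.
    static String probe(String url, int timeoutMs) {
        HttpURLConnection conn = null;
        try {
            conn = (HttpURLConnection) new URL(url).openConnection();
            conn.setRequestMethod("GET");
            conn.setConnectTimeout(timeoutMs); // bound the TCP connect
            conn.setReadTimeout(timeoutMs);    // bound the wait for a response
            return "HTTP " + conn.getResponseCode();
        } catch (SocketTimeoutException e) {
            return "TIMED_OUT";
        } catch (IOException e) {
            return "UNREACHABLE";
        } finally {
            if (conn != null) {
                conn.disconnect();
            }
        }
    }

    public static void main(String[] args) {
        // Base URL of the Connect worker; pass a different one as the first argument.
        String base = args.length > 0 ? args[0] : "http://localhost:8083";
        System.out.println(probe(base + "/connectors", 5000));
    }
}
```

Running this in a loop while the brokers are restarted would show when the worker stops answering and whether it recovers without a restart.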

> Kafka connect worker does not respond when kafka broker goes down with data 
> streaming in progress
> -------------------------------------------------------------------------------------------------
>
>                 Key: KAFKA-8485
>                 URL: https://issues.apache.org/jira/browse/KAFKA-8485
>             Project: Kafka
>          Issue Type: Bug
>          Components: KafkaConnect
>    Affects Versions: 2.2.1
>            Reporter: kaushik srinivas
>            Priority: Blocker
>              Labels: performance
>
> Below is the scenario:
> 3 Kafka brokers are up and running.
> A Kafka Connect worker is installed and an HDFS sink connector is added.
> Data streaming is started, with data being flushed out of Kafka into HDFS.
> The topic is created with 3 partitions, with one leader on each of the three brokers.
> Now, 2 Kafka brokers are restarted. Partition rebalance happens.
> Now we observe that Kafka Connect does not respond. The REST API keeps timing out. 
> Nothing useful is logged in the Connect logs either.
> The only way out of this situation currently is to restart the Kafka 
> Connect worker, after which things return to normal.
>  
> The same scenario, when tried without data streaming in progress, works fine: 
> the REST API does not get into the timing-out state. 
> Marking this issue a blocker because of the impact of a Kafka broker 
> restart.
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
