Re: Kafka Connect tasks consumers issue

2016-06-14 Thread Marcos Juarez
Liquan, We're constantly hitting this problem in our prod cluster. Do you have a JIRA issue that relates to this, and when will this bugfix be backported to the 0.9.x branch? We're not planning on upgrading to 0.10 for a while, since the assumption was that the 0.9.x line would be more stable.

Re: Kafka Connect tasks consumers issue

2016-05-16 Thread Liquan Pei
Hi Matteo, There was a bug in 0.9.1 such that task.close() can be invoked in both the Worker thread and the Herder thread. This creates a race condition in which consumer.close() is invoked from multiple threads at the same time. As the consumer is designed to be used in a single thread, the concurre
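The race described above can be illustrated with a minimal sketch. This is not the actual Kafka Connect fix. GuardedTask and CloseOnceDemo are hypothetical names, and a counter stands in for the real consumer.close() call. The sketch only shows one common way to make a close() that may be reached from two threads touch the single-threaded resource exactly once.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for a sink task that owns a single-threaded consumer.
class GuardedTask {
    private final AtomicBoolean closed = new AtomicBoolean(false);
    private final AtomicInteger underlyingCloseCalls = new AtomicInteger();

    // May be called from any thread; only the first caller reaches the resource.
    void close() {
        if (closed.compareAndSet(false, true)) {
            // Assumption: this increment stands in for consumer.close().
            underlyingCloseCalls.incrementAndGet();
        }
    }

    int underlyingCloseCalls() {
        return underlyingCloseCalls.get();
    }
}

class CloseOnceDemo {
    // Calls close() from two threads at once and reports how many calls got through.
    static int closeFromTwoThreads() throws InterruptedException {
        GuardedTask task = new GuardedTask();
        CountDownLatch start = new CountDownLatch(1);
        Runnable closer = () -> {
            try {
                start.await(); // release both threads together to provoke the race
                task.close();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        };
        Thread worker = new Thread(closer, "worker");
        Thread herder = new Thread(closer, "herder");
        worker.start();
        herder.start();
        start.countDown();
        worker.join();
        herder.join();
        return task.underlyingCloseCalls();
    }
}
```

The compareAndSet guard ensures a single close, but it does not control which thread performs it. A design that always closes the consumer on the thread that owns it would need additional signalling between the two threads.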

Re: Kafka Connect tasks consumers issue

2016-05-15 Thread Matteo Luzzi
Any other thoughts on this? Thanks, Matteo 2016-05-12 13:09 GMT+02:00 Matteo Luzzi : > I also found this suspicious log snippet that might be relevant. The task > executed by thread 134 is the one that won't receive messages > > INFO Attempt to heart beat failed since the group is rebalancing, tr

Re: Kafka Connect tasks consumers issue

2016-05-12 Thread Matteo Luzzi
I also found this suspicious log snippet that might be relevant. The task executed by thread 134 is the one that won't receive messages INFO Attempt to heart beat failed since the group is rebalancing, try to re-join group. (org.apache.kafka.clients.consumer.internals.AbstractCoordinator:633) [201

Re: Kafka Connect tasks consumers issue

2016-05-11 Thread Matteo Luzzi
Hi Liquan, I run the two workers inside Docker containers, with a connector that has 6 tasks. They read from a topic with 6 partitions. Then I kill one of the two containers using the docker kill or docker restart command. When the container is up again, a rebalance happens and sometimes a few tasks don't c

Re: Kafka Connect tasks consumers issue

2016-05-11 Thread Liquan Pei
Hi Matteo, I do not completely follow the steps. Can you share the exact commands to reproduce the issue? What commands did you use to restart the connector? Which version of Kafka are you using? Thanks, Liquan On Wed, May 11, 2016 at 4:40 AM, Matteo Luzzi wrote: > Hi again, I was able

Re: Kafka Connect tasks consumers issue

2016-05-11 Thread Matteo Luzzi
Hi again, I was able to reproduce the bug in the same scenario (two workers on separate machines) just by deleting the connector through the REST API and then restarting it. I also got this error on one of the workers: [2016-05-11 11:29:47,034] INFO 172.17.42.1 - - [11/May/2016:11:29:45 +]

Re: Kafka Connect tasks consumers issue

2016-05-11 Thread Matteo Luzzi
Hi Liquan, thanks for the fast response. I'm able to reproduce the error by having two workers running on two different machines. If I restart one of the two workers, the failover logic correctly detects the failure and shuts down the tasks on the healthy worker for rebalancing. When the failed worke

Re: Kafka Connect tasks consumers issue

2016-05-11 Thread Liquan Pei
Hi Matteo, Glad to hear that you are building a connector. To better understand the issue, can you provide the exact steps to reproduce it? One thing I am confused about is that when one worker is shut down, you don't need to restart the connector through the REST API; the failover logic should h

Kafka Connect tasks consumers issue

2016-05-11 Thread Matteo Luzzi
Hi, I'm working on a custom implementation of a sink connector for the Kafka Connect framework. I'm testing the connector for fault tolerance by killing the worker process and restarting the connector through the REST API, and occasionally I notice that some tasks no longer receive messages from th