Failing kafka consumer unable to cancel

2015-11-17 Thread Gyula Fóra
Hey guys, I ran into an issue with the Kafka consumers. I am reading from more than 50 topics with parallelism 1, and while running the job I got the following exception during the checkpoint notification (offset committing): java.lang.RuntimeException: Error while confirming checkpoint at org

Re: Failing kafka consumer unable to cancel

2015-11-17 Thread Ufuk Celebi
https://issues.apache.org/jira/browse/KAFKA-824 This has been fixed in Kafka 0.9.0. We should still investigate why the job gets stuck, though. Do you have a stack trace or any logs available? – Ufuk

Re: Failing kafka consumer unable to cancel

2015-11-17 Thread Ufuk Celebi
I had a quick chat with Robert and Stephan. The problem is that the StreamTask cancellation needs to acquire a lock, which is held by the Kafka client in an infinite loop. At this point, I’m not sure what our options are. Maybe Stephan or Robert can chime in… – Ufuk
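To make the diagnosis above concrete, here is a minimal sketch of the same shape of problem, not Flink or Kafka code. The class `LockedCancelSketch`, the `ReentrantLock`, and the method name are all hypothetical stand-ins: one thread takes a lock and then loops forever, and a second "cancel" path that needs the same lock cannot get it.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical model of the situation; names are illustrative, not Flink's.
class LockedCancelSketch {

    /** Returns true if a cancel attempt could take the lock while a looping thread holds it. */
    static boolean cancelCanAcquireLock(long waitMillis) {
        ReentrantLock lock = new ReentrantLock();          // stand-in for the task's lock
        CountDownLatch lockTaken = new CountDownLatch(1);  // signals that the client holds the lock

        Thread client = new Thread(() -> {
            lock.lock();
            lockTaken.countDown();
            while (true) {
                // Stand-in for the client retrying forever while holding the lock.
                try {
                    Thread.sleep(10);
                } catch (InterruptedException ignored) {
                    // Interrupt swallowed, loop continues.
                }
            }
        });
        client.setDaemon(true); // let the JVM exit despite the endless loop
        client.start();

        try {
            lockTaken.await();
            // The cancel path needs the same lock; a timed attempt shows it is unavailable.
            boolean acquired = lock.tryLock(waitMillis, TimeUnit.MILLISECONDS);
            if (acquired) {
                lock.unlock();
            }
            return acquired;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("cancel acquired lock: " + cancelCanAcquireLock(200));
    }
}
```

With a plain `synchronized` block instead of a timed `tryLock`, the cancel path would simply block for as long as the other thread keeps looping.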

Re: Failing kafka consumer unable to cancel

2015-11-17 Thread Stephan Ewen
Hey! The problem here is that there is no such thing as proper thread killing in Java (at least it makes everything unstable if you do). Threads need to exit cooperatively. The Kafka function calls are simply stuck uninterruptibly and never return (a pretty bad bug in their ZooKeeper client). As fa
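The point about cooperative exit can be sketched in a few lines of Java. This is only an illustration under assumptions: `CooperativeWorker`, `StuckWorker`, and `stopsWithin` are hypothetical names, and `sleep` stands in for a blocking client call. The first worker checks a flag and honours interrupts; the second swallows interrupts and retries, which models a call that is stuck uninterruptibly.

```java
import java.util.concurrent.TimeUnit;

// Hypothetical illustration; not Flink or Kafka code.
class CooperativeCancelSketch {

    /** A worker that checks a flag and honours interrupts, so it can be cancelled. */
    static final class CooperativeWorker implements Runnable {
        private volatile boolean running = true;

        @Override
        public void run() {
            while (running) {
                try {
                    // Stand-in for a blocking call that responds to interrupts.
                    TimeUnit.MILLISECONDS.sleep(50);
                } catch (InterruptedException e) {
                    // Restore the interrupt status and leave the loop.
                    Thread.currentThread().interrupt();
                    return;
                }
            }
        }

        void cancel() {
            running = false;
        }
    }

    /** A worker that swallows interrupts and checks no flag: it cannot be stopped from outside. */
    static final class StuckWorker implements Runnable {
        @Override
        public void run() {
            while (true) {
                try {
                    TimeUnit.MILLISECONDS.sleep(50);
                } catch (InterruptedException ignored) {
                    // Swallowing the interrupt and retrying models the stuck call.
                }
            }
        }
    }

    /** Starts the worker, requests cancellation, and reports whether the thread exited in time. */
    static boolean stopsWithin(Runnable worker, Runnable cancelAction, long timeoutMillis) {
        Thread t = new Thread(worker);
        t.setDaemon(true); // so a stuck worker does not keep the JVM alive
        t.start();
        cancelAction.run();
        t.interrupt();
        try {
            t.join(timeoutMillis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return !t.isAlive();
    }

    public static void main(String[] args) {
        CooperativeWorker ok = new CooperativeWorker();
        System.out.println("cooperative worker stopped: " + stopsWithin(ok, ok::cancel, 1000));
        System.out.println("stuck worker stopped: " + stopsWithin(new StuckWorker(), () -> {}, 300));
    }
}
```

The cooperative worker exits once it is cancelled and interrupted, while the stuck worker is still alive after the timeout. Nothing the caller does can make the second thread return, which is why the remedy has to come from the code inside the loop.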

Re: Failing kafka consumer unable to cancel

2015-11-17 Thread Stephan Ewen
Another idea: The bug is fixed in the ZooKeeper client that Kafka uses, so if we can bump the transitive dependency, that might fix it...

Re: Failing kafka consumer unable to cancel

2015-11-17 Thread Robert Metzger
I would try that approach first

Re: Failing kafka consumer unable to cancel

2015-11-17 Thread Gyula Fóra
Should I open a JIRA for this?

Re: Failing kafka consumer unable to cancel

2015-11-17 Thread Gyula Fóra
Thanks for the quick response and thorough explanation :) Gyula

Re: Failing kafka consumer unable to cancel

2015-11-17 Thread Stephan Ewen
sure