l#comment-13729537
From: Hargett, Phil
Sent: Friday, August 02, 2013 1:36 PM
To: Jun Rao
Cc: users@kafka.apache.org
Subject: RE: Fatal issue (was RE: 0.8 throwing exception "Failed to find
leader" and high-level consumer fails to make progress_
I
.com]
Sent: Wednesday, July 31, 2013 12:16 AM
To: Hargett, Phil
Cc: users@kafka.apache.org
Subject: Re: Fatal issue (was RE: 0.8 throwing exception "Failed to find
leader" and high-level consumer fails to make progress_
Hmm, that's a good theory. My understanding is that you have one thread
that first shuts down the consumer connector and then creates new streams
on the same connector. Is that right? If so, I don't think the race
condition can happen. When we shut down the consumer connector, it waits
until the lea
Hmmm...is there a reason that stopConnections in ConsumerFetcherManager does
not grab a lock before shutting down the leaderFinderThread?
I don't see what prevents startConnections/stopConnections from causing a race
in certain conditions and if called on separate threads.
Given there are no lo
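To make the locking question above concrete, here is a minimal sketch of start/stop calls serialized by one lock, so that two threads cannot interleave while the background thread is being created or shut down. It is an illustration only, not Kafka's ConsumerFetcherManager: the class GuardedFetcherManager, its methods, and the polling loop are all hypothetical names made up for this example.

```python
import threading


class GuardedFetcherManager:
    """Hypothetical stand-in for a manager that owns one background thread.

    Not Kafka code: it only shows start/stop guarded by the same lock.
    """

    def __init__(self):
        self._lock = threading.Lock()
        self._stop_event = None
        self._leader_finder = None

    def _find_leaders(self, stop_event):
        # Placeholder loop: wake every 10 ms until asked to stop.
        while not stop_event.wait(0.01):
            pass

    def start_connections(self):
        with self._lock:
            if self._leader_finder is not None:
                return  # already running, nothing to do
            self._stop_event = threading.Event()
            self._leader_finder = threading.Thread(
                target=self._find_leaders,
                args=(self._stop_event,),
                daemon=True,
            )
            self._leader_finder.start()

    def stop_connections(self):
        with self._lock:
            if self._leader_finder is None:
                return  # already stopped
            self._stop_event.set()
            # Joining under the lock is safe here only because the
            # background thread never tries to take this lock itself.
            self._leader_finder.join()
            self._leader_finder = None
            self._stop_event = None

    def is_running(self):
        with self._lock:
            return self._leader_finder is not None
```

With this shape, a stop that arrives during a start has to wait until the thread is fully created, and a start that arrives during a stop has to wait until the old thread has exited. Whether the real code needs the same guard depends on whether the two calls can actually come from different threads, which is the open question in this thread.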
Oh, we're building from source multiple times per week, either until 0.8 comes
out of beta or we ourselves slide towards production. :)
Depending on where the builds were done (Dev vs official), we have commits
76d3905 or b1891e7. Both are more recent than beta 1, I believe.
:)
On Jul 30, 201
What's the revision of the 0.8 branch that you used? If that's older than
the beta1 release, I recommend that you upgrade.
Thanks,
Jun
On Tue, Jul 30, 2013 at 3:09 AM, Hargett, Phil <
phil.harg...@mirror-image.com> wrote:
No, sorry, it didn't take 90 seconds to connect to ZK (at least I hope not). I
had my consumer open for 90 secs in this case before shutting it down and
disposing of it—hence any races caused by fast startup/shutdown should not have
been relevant.
I build from source off of the 0.8 branch, so i
Hmm, it takes 90 secs to connect to ZK? That seems way too long. Is your ZK
healthy?
Also, are you on the 0.8 beta1 release? If not, could you try that one? It
may not be related, but we did fix some consumer side deadlock issues there.
Thanks,
Jun
On Mon, Jul 29, 2013 at 9:02 AM, Hargett, Phi
Why would a consumer that has been shut down still be rebalancing?
Zookeeper session timeout (zookeeper.session.timeout.ms) is 1000 and sync time
(zookeeper.sync.timeout.ms) is 500.
Also, the timeout for the thread that looks for the leader is left at the
default 200 milliseconds (refresh.leader
Ok. So, it seems that the issue is there are lots of rebalances in the
consumer. How long did you set the zk session expiration time? A typical
reason for many rebalances is the consumer side GC. If so, you will see
Zookeeper session expirations in the consumer log (grep for Expired).
Occasional re
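The "grep for Expired" check suggested above can also be done with a small script, for example to count matches across several log files. This is a rough sketch: the function names are made up for illustration, and the only thing taken from the thread is the case-sensitive search string "Expired".

```python
from typing import Iterable


def count_session_expirations(lines: Iterable[str], marker: str = "Expired") -> int:
    """Count lines containing the marker, similar to `grep -c Expired`."""
    return sum(1 for line in lines if marker in line)


def count_in_file(path: str, marker: str = "Expired") -> int:
    """Apply the same count to a log file on disk."""
    # errors="replace" keeps the scan going past any undecodable bytes.
    with open(path, encoding="utf-8", errors="replace") as log:
        return count_session_expirations(log, marker)
```

A count that grows together with the number of rebalances would point toward session expirations as the trigger; a count of zero would suggest looking elsewhere.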