I don't know if this is your case, but in the past we have seen such issues
with ZooKeeper caused by GC pauses. I remember one case with HBase, and I think
it is this one:
https://issues.apache.org/jira/browse/HBASE-1316
We have seen ZooKeeper clusters serving thousands of clients, so ~100 shouldn't
be a problem. Still, session expiration is part of ZooKeeper, so we need to deal
with it here as well.
-Flavio
On May 1, 2012, at 3:14 PM, John Nagro wrote:
> Flavio -
>
> We're trying to get to the bottom of it. As I understand it, in a properly
> configured and operating ZK cluster we should never see a session expiration
> exception. Globally (including all systems) we have seen them perhaps once a
> week for the last month, and they cause some issues in our system. We saw one
> last night, and bookkeeper had an issue a couple of days ago.
>
> We do have a lot of nodes connecting to zookeeper for various things. We have
> a home-built configuration management tool that uses zk as the data store,
> the bookkeeper stuff obviously does, my coordination on top of the bookkeeper
> ledgers uses it, etc. So yes, lots of machines (dozens up to ~100) talk to
> this zk cluster in some fashion or another - we have other clusters too.
> Ultimately, more machines will talk to the configuration stuff in the long
> term. I could potentially move my zk stuff off that cluster if you think it
> would help.
>
> -John
>
> On Tue, May 1, 2012 at 8:44 AM, Flavio Junqueira <[email protected]> wrote:
> This is definitely not ideal. If you lose your zookeeper session, then you're
> not able to close your open ledgers, which will force ledger recovery. It is
> not a correctness issue, but certainly inconvenient. We need to fix it, and
> I'm glad that Uma is already looking into it.
>
> I'm curious about why you're getting session expirations, though. Is it
> frequent, or did you get it only once? Do you have many nodes connecting to
> your ZooKeeper instance?
>
> -Flavio
>
>
> On May 1, 2012, at 2:07 PM, John Nagro wrote:
>
>> Thanks Uma - that is exactly what I am looking for. The way I am handling it
>> now is to pass a bookkeeper client factory rather than an instance. When I
>> encounter zk session expiration, I create a new client and discard the old
>> one - getting a fresh set of connections to zk. Perhaps not ideal, but it
>> gets the job done.
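
[Editor's sketch] The factory workaround described above could look roughly
like the following. This is a minimal, language-agnostic illustration in
Python, not the BookKeeper client API: `SessionExpiredError`,
`ReconnectingClient`, `call`, and `max_rebuilds` are all hypothetical names
introduced here, and the factory is assumed to be any callable that returns a
fresh client.

```python
class SessionExpiredError(Exception):
    """Hypothetical stand-in for a ZooKeeper session-expiration error."""


class ReconnectingClient:
    """Holds a client factory and rebuilds the client when its session expires."""

    def __init__(self, factory, max_rebuilds=1):
        self._factory = factory            # callable returning a fresh client
        self._max_rebuilds = max_rebuilds  # rebuild attempts allowed per call
        self._client = factory()

    def call(self, operation):
        """Run operation(client); on expiration, swap in a new client and retry."""
        rebuilds = 0
        while True:
            try:
                return operation(self._client)
            except SessionExpiredError:
                if rebuilds >= self._max_rebuilds:
                    raise
                rebuilds += 1
                old = self._client
                self._client = self._factory()  # fresh set of connections
                close = getattr(old, "close", None)
                if close is not None:
                    try:
                        close()  # best-effort cleanup of the discarded client
                    except Exception:
                        pass
```

Note that this only replaces the client. As pointed out elsewhere in this
thread, ledgers left open by the discarded client cannot be closed once the
session is lost and will still go through ledger recovery.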
>>
>> thanks!
>>
>> -John
>>
>> On Tue, May 1, 2012 at 12:09 AM, Uma Maheswara Rao G <[email protected]>
>> wrote:
>> Hi John,
>>
>> The BK client needs to handle session expiration events from ZK. Here is
>> the issue for that: BOOKKEEPER-225.
>> We will implement it soon. I hope this is what you were asking about. Please
>> correct me if I have misinterpreted your question.
>>
>> Thanks a lot,
>> Uma
>> From: John Nagro [[email protected]]
>> Sent: Tuesday, May 01, 2012 1:20 AM
>> To: [email protected]
>> Subject: ZooKeeper Session Expiration
>>
>> Hello -
>>
>> If I start seeing ZKExceptions in the BK client, which appear to be due to
>> SessionExpiration errors, it seems that the BookKeeper client never
>> recovers from that. Is that correct?
>>
>> Thanks!
>>
>> -John Nagro
>>
>
>
>