I have seen session expiration events mainly when we unplug the network between
the node and the ZK servers (mostly while we were developing a failover
controller framework with ZK). This may not be a usual scenario, but it can
happen.



Coming to the BK case: when we lose the ZK handle's connectivity, simply
replacing the handle may not always be possible, because we cannot be sure when
exactly we will be able to create a new connection to ZK.

So maybe the fix could be that BK clients throw an exception, since they cannot
serve requests when ZK is not available, and let the application take action?
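
A minimal sketch of that fail-fast idea, in Java. All names here
(`FailFastClient`, `ZkUnavailableException`, `onSessionExpired`) are
hypothetical and are not part of the BookKeeper API. With the real ZooKeeper
client, the notification would presumably come from a watcher that receives a
`KeeperState.Expired` event.

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical exception surfaced to the application when ZK is unusable.
class ZkUnavailableException extends RuntimeException {
    ZkUnavailableException(String message) {
        super(message);
    }
}

// Hypothetical client wrapper: once told that the session has expired,
// every operation fails fast instead of hanging or silently retrying.
class FailFastClient {
    private final AtomicBoolean sessionExpired = new AtomicBoolean(false);
    private final AtomicLong nextLedgerId = new AtomicLong(0);

    // Assumed to be called from the session event callback.
    void onSessionExpired() {
        sessionExpired.set(true);
    }

    // Stand-in for an operation that needs ZK metadata access.
    long createLedger() {
        if (sessionExpired.get()) {
            throw new ZkUnavailableException("ZooKeeper session expired");
        }
        return nextLedgerId.getAndIncrement();
    }
}
```

The application can then catch `ZkUnavailableException` and decide for itself
whether to retry later, rebuild the client, or shut down.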



Regards,

Uma

________________________________
From: Flavio Junqueira [[email protected]]
Sent: Tuesday, May 01, 2012 6:59 PM
To: [email protected]
Subject: Re: ZooKeeper Session Expiration

I don't know if this is your case, but we have seen in the past with zookeeper 
such issues caused by GC pauses. I remember one case with hbase, and I think it 
is this one:

https://issues.apache.org/jira/browse/HBASE-1316

We have seen zookeeper clusters serving thousands of clients, so ~100 shouldn't 
be a problem. Still, session expiration is part of zookeeper, so we need to deal 
with it here as well.

-Flavio

On May 1, 2012, at 3:14 PM, John Nagro wrote:

Flavio -

We're trying to get to the bottom of it. As I understand it, in a properly 
configured and operating Zk Cluster we should never see a session expiration 
exception. Globally (including all systems) we see them perhaps once a week for 
the last month - and it causes some issues in our system. We saw one last 
night, and bookkeeper had an issue a couple days ago.

We do have a lot of nodes connecting to zookeeper for various things. We have a 
home-built configuration management tool that uses zk as the data store, the 
bookkeeper stuff obviously does, my coordination on top of the bookkeeper 
ledgers uses it, etc. So yes, lots of machines (dozens up to ~100) talk to this 
zk cluster in some fashion or another - we have other clusters too. Ultimately, 
more machines will talk to the configuration stuff in the long term. I could 
potentially move my zk stuff off that cluster if you think it would help.

-John

On Tue, May 1, 2012 at 8:44 AM, Flavio Junqueira <[email protected]> wrote:
This is definitely not ideal. If you lose your zookeeper session, then you're 
not able to close your open ledgers, which will force ledger recovery. It is 
not a correctness issue, but certainly inconvenient. We need to fix it, and I'm 
glad that Uma is already looking into it.

I'm curious about why you're getting session expirations, though. Is it 
frequent, or did you get it only once? Do you have many nodes connecting to 
your ZooKeeper instance?

-Flavio


On May 1, 2012, at 2:07 PM, John Nagro wrote:

Thanks Uma - that is exactly what I am looking for. The way I am handling it 
now is to pass a bookkeeper client factory rather than an instance. When I 
encounter zk session expiration, I create a new client and discard the old one 
- getting a fresh set of connections to zk. Perhaps not ideal, but it gets the 
job done.
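
A rough sketch of that factory approach, in Java. The types below
(`LedgerClient`, `SessionExpiredException`, `RecreatingClient`) are
hypothetical stand-ins and not the real BookKeeper API; the sketch only
assumes that the client signals session expiration with an exception.

```java
import java.util.function.Supplier;

// Hypothetical exception raised by a client whose zk session has expired.
class SessionExpiredException extends RuntimeException {
    SessionExpiredException(String message) {
        super(message);
    }
}

// Hypothetical minimal client interface; a real client exposes far more.
interface LedgerClient extends AutoCloseable {
    String readEntry(long entryId);

    @Override
    void close();
}

// Holds a factory rather than a single instance, so the client can be
// discarded and rebuilt when its session expires.
class RecreatingClient {
    private final Supplier<LedgerClient> factory;
    private LedgerClient current;

    RecreatingClient(Supplier<LedgerClient> factory) {
        this.factory = factory;
        this.current = factory.get();
    }

    synchronized String readEntry(long entryId) {
        try {
            return current.readEntry(entryId);
        } catch (SessionExpiredException expired) {
            current.close();          // discard the old client and its connections
            current = factory.get();  // fresh client, fresh zk session
            // Retry once; a second expiration propagates to the caller.
            return current.readEntry(entryId);
        }
    }
}
```

Retrying only once keeps the wrapper from looping forever if zk stays
unreachable; that limit is a choice made for this sketch.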

thanks!

-John

On Tue, May 1, 2012 at 12:09 AM, Uma Maheswara Rao G <[email protected]> wrote:
Hi John,

The BK client needs to handle session expiration events from ZK. Here is the 
issue for that: BOOKKEEPER-225<https://issues.apache.org/jira/browse/BOOKKEEPER-225>.
We will implement it soon. I hope this answers your question. Please correct me 
if I have misunderstood it.

Thanks a lot,
Uma
________________________________
From: John Nagro [[email protected]]
Sent: Tuesday, May 01, 2012 1:20 AM
To: [email protected]
Subject: ZooKeeper Session Expiration

Hello -

If I start seeing ZKExceptions in the Bk Client, which appear to be due to 
SessionExpiration errors... it seems that the BookKeeper client never recovers 
from that? Is that correct?

Thanks!

-John Nagro





