RE: Heartbeats not being received / responded to?

2015-01-21 Thread Rakesh R
Hi Ian,

I was referring to the mail you sent yesterday, which showed the
command-line execution path:

>>>>Exception in thread "main"
>>>>org.apache.zookeeper.KeeperException$ConnectionLossException:
>>>>KeeperErrorCode = ConnectionLoss for /solr
>>>>at org.apache.zookeeper.KeeperException.create(KeeperException.java:99)
>>>>at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
>>>>at org.apache.zookeeper.ZooKeeper.create(ZooKeeper.java:783)
>>>>at org.apache.zookeeper.ZooKeeperMain.processZKCmd(ZooKeeperMain.java:695)
>>>>at org.apache.zookeeper.ZooKeeperMain.processCmd(ZooKeeperMain.java:588)
>>>>at


>>>>>> Unless I am mistaken, that would only solve the problem when connecting
>>>>>> via zkCli, right?
Yes, the issue I pointed to applies only to zkCli.sh.

Going by the logs you shared, I can see that session 0x14b0dbf622c0007
expired, which suggests communication failures between the client and the
server. I'm not familiar with Go. Do you have the client logs?

2015-01-21 13:47:29,407 [myid:] - INFO  [ProcessThread(sid:0 cport:-1)::PrepRequestProcessor@645] - Got user-level KeeperException when processing sessionid:0x14b0dbf622c0007 type:create cxid:0x1 zxid:0x1657 txntype:-1 reqpath:n/a Error Path:/solr Error:KeeperErrorCode = NodeExists for /solr
2015-01-21 13:47:29,786 [myid:] - WARN  [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@357] - caught end of stream exception
EndOfStreamException: Unable to read additional data from client sessionid 0x14b0dbf622c0007, likely client has closed socket
        at org.apache.zookeeper.server.NIOServerCnxn.doIO(NIOServerCnxn.java:228)
        at org.apache.zookeeper.server.NIOServerCnxnFactory.run(NIOServerCnxnFactory.java:208)
        at java.lang.Thread.run(Thread.java:722)
.
.
2015-01-21 13:48:00,004 [myid:] - INFO  [SessionTracker:ZooKeeperServer@347] - Expiring session 0x14b0dbf622c0007, timeout of 3ms exceeded
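
To correlate an expired session with the client side, it can help to pull the session id out of the server log mechanically. A minimal sketch, assuming the "Expiring session" message has the shape shown in the log above; the `Expiry` type and `parseExpiry` helper are hypothetical names used only for illustration:

```go
package main

import (
	"fmt"
	"regexp"
)

// expiryPattern matches the "Expiring session" message as it appears in the
// server log quoted in this thread. Other ZooKeeper versions may word it
// differently; that is an assumption of this sketch.
var expiryPattern = regexp.MustCompile(`Expiring session (0x[0-9a-fA-F]+), timeout of (\d+)ms exceeded`)

// Expiry holds the fields extracted from one expiry log entry.
type Expiry struct {
	SessionID string
	TimeoutMs string // kept as text so the logged value is reported verbatim
}

// parseExpiry returns the session id and timeout from a log entry, or
// ok=false if the entry is not a session-expiry message.
func parseExpiry(entry string) (Expiry, bool) {
	m := expiryPattern.FindStringSubmatch(entry)
	if m == nil {
		return Expiry{}, false
	}
	return Expiry{SessionID: m[1], TimeoutMs: m[2]}, true
}

func main() {
	entry := "2015-01-21 13:48:00,004 [myid:] - INFO  [SessionTracker:ZooKeeperServer@347] - " +
		"Expiring session 0x14b0dbf622c0007, timeout of 3ms exceeded"
	if e, ok := parseExpiry(entry); ok {
		fmt.Printf("session %s expired (timeout %sms)\n", e.SessionID, e.TimeoutMs)
	}
}
```

The extracted id can then be searched for in the client log to line up the two sides of the same session.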


Regards,
Rakesh

Re: Heartbeats not being received / responded to?

2015-01-21 Thread Ian Rose
Camille -

We are running version 3.4.6 on Mac OSX.  Mostly Yosemite (10.10).  I've
attached the logs from 2 different sessions where I was working on my
coworker's machine this afternoon.  At the end of session 2 is when things
started to succeed (I assume those two "Established session XXX with
negotiated timeout 15000 for client" are successful connections).
Unfortunately I do not have exact timings recorded for our connection
attempts (to correlate with timestamps in the zk logs) but I'll try to do
that next time we experience these problems.


Rakesh -

Unless I am mistaken, that would only solve the problem when connecting via
zkCli, right?  We experience the same issue when connecting from python
(as in my original email on this thread) or from Go.  It is noteworthy that
in our Go code (below) we wait explicitly for a connection event before
attempting any operations yet sometimes still have problems with those
operations failing.

- Ian


(if you can read Go, here is our wait-until-connected code; for the client
we are using github.com/samuel/go-zookeeper/zk)

if ConnectTimeout > 0 {
    // Give up once ConnectTimeout has elapsed.
    c := time.After(ConnectTimeout)
    for {
        select {
        case <-c:
            conn.Close()
            return nil, ErrTimeout
        case evt := <-events:
            // Only return once the session is fully established.
            if evt.State == zk.StateHasSession {
                return conn, nil
            }
        }
    }
}




Re: Heartbeats not being received / responded to?

2015-01-21 Thread Rakesh Radhakrishnan
Hi,

> Frustratingly, this behavior is very inconsistent.  In fact, after 5+
> failures in a row on our UX lead's machine, now I just heard from him that
> he tried again and it worked.  Could it be that there is some kind of state
> on Zookeeper that needs to timeout and disappear before things will work?


In the meantime, could you take a look at this JIRA issue:
https://issues.apache.org/jira/browse/ZOOKEEPER-1871
I think it describes a similar problem. If so, can you try the patch and
check whether the command then succeeds consistently?


-Rakesh



Re: Heartbeats not being received / responded to?

2015-01-21 Thread Camille Fournier
Do you have the logs from the zk server? What type of machines are you
running this on? Which version of ZK?

Thanks,
C


Re: Heartbeats not being received / responded to?

2015-01-21 Thread Ian Rose
Hi all -

Reviving an old thread here.  We have been running Zookeeper for a few
months now and this problem has continued to dog us.  This is entirely a
localhost problem (when zookeeper and the client app are both running on a
developer laptop) - thankfully everything has been solid on our production
servers.  Nonetheless this causes major headaches in our development
process.  For example, I just now debugged a problem our UX lead was having
getting our dev stack up and running on his local machine; the problem was
that our startup script was failing when using zkCli.sh to create a "/solr"
node in ZK.  This would not be a problem if it was a rare/spurious error,
but he is getting this error every single time.

For reference, here is the complete stack trace (which appears after
several seconds - the failure is not immediate):

Exception in thread "main" org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /solr
    at org.apache.zookeeper.KeeperException.create(KeeperException.java:99)
    at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
    at org.apache.zookeeper.ZooKeeper.create(ZooKeeper.java:783)
    at org.apache.zookeeper.ZooKeeperMain.processZKCmd(ZooKeeperMain.java:695)
    at org.apache.zookeeper.ZooKeeperMain.processCmd(ZooKeeperMain.java:588)
    at org.apache.zookeeper.ZooKeeperMain.executeLine(ZooKeeperMain.java:360)
    at org.apache.zookeeper.ZooKeeperMain.run(ZooKeeperMain.java:323)
    at org.apache.zookeeper.ZooKeeperMain.main(ZooKeeperMain.java:282)

Frustratingly, this behavior is very inconsistent.  In fact, after 5+
failures in a row on our UX lead's machine, now I just heard from him that
he tried again and it worked.  Could it be that there is some kind of state
on Zookeeper that needs to time out and disappear before things will work?

More concretely, the next time this happens, what kind of info should I
capture to make this more debuggable?

Many thanks,
Ian



Re: Heartbeats not being received / responded to?

2014-09-29 Thread Ian Rose
Well, unfortunately, although I am able to repro this problem with a
completely new, clean ZK install on my laptop, I *cannot* repro it with the
same setup on my coworker's laptop.  So I am forced to conclude that
there is something strange going on locally.

Thanks for the help anyhow!

- Ian


Re: Heartbeats not being received / responded to?

2014-09-29 Thread Camille Fournier
No that is not expected. Odd that you get disconnected once and then
reconnect fine. Does the same thing happen in your kazoo clients, one
disconnect but then the second connect is ok?
Which version of ZK are you running? Are you running this with some sort of
auth or password to the zk server?

Thanks,
C

On Mon, Sep 29, 2014 at 9:24 AM, Ian Rose  wrote:

> Perhaps I'm simply misunderstanding what the expected behavior would be.
> Why would my client be disconnected?  Does zookeeper drop idle clients?
> And note that this isn't a spurious disconnect; my client is *always*
> dropped at that time.
>
> On Friday zkCli seemed to be working just fine for me, but now I am getting
> disconnections similar to my kazoo-based client.
>
> Here is a session showing `ls /' failing.  This behavior is reproducible
> for me currently.
>
> $ ./bin/zkCli.sh
> > Connecting to localhost:2181
> > Welcome to ZooKeeper!
> > JLine support is enabled
> > [zk: localhost:2181(CONNECTING) 0]
> > WATCHER::
> > WatchedEvent state:SyncConnected type:None path:null
> > [zk: localhost:2181(CONNECTED) 0]
> > [zk: localhost:2181(CONNECTED) 0] ls /
> > WATCHER::
> > WatchedEvent state:Disconnected type:None path:null
> > Exception in thread "main"
> > org.apache.zookeeper.KeeperException$ConnectionLossException:
> > KeeperErrorCode = ConnectionLoss for /
> > at org.apache.zookeeper.KeeperException.create(KeeperException.java:99)
> > at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
> > at org.apache.zookeeper.ZooKeeper.getChildren(ZooKeeper.java:1472)
> > at org.apache.zookeeper.ZooKeeper.getChildren(ZooKeeper.java:1500)
> > at org.apache.zookeeper.ZooKeeperMain.processZKCmd(ZooKeeperMain.java:720)
> > at org.apache.zookeeper.ZooKeeperMain.processCmd(ZooKeeperMain.java:588)
> > at org.apache.zookeeper.ZooKeeperMain.executeLine(ZooKeeperMain.java:360)
> > at org.apache.zookeeper.ZooKeeperMain.run(ZooKeeperMain.java:323)
> > at org.apache.zookeeper.ZooKeeperMain.main(ZooKeeperMain.java:282)
>
>
> And here is a session where after connecting I just sit and wait for a bit
> (without sending any commands), whereupon I get disconnected and then
> immediately reconnected.  Interestingly, after reconnecting I am able
> to issue multiple commands without any issues.  This behavior is
> reproducible for me currently.
>
> $ ./bin/zkCli.sh
> > Connecting to localhost:2181
> > Welcome to ZooKeeper!
> > JLine support is enabled
> > [zk: localhost:2181(CONNECTING) 0]
> > [zk: localhost:2181(CONNECTING) 0]
> > WATCHER::
> > WatchedEvent state:SyncConnected type:None path:null
> > [zk: localhost:2181(CONNECTED) 0]
> > WATCHER::
> > WatchedEvent state:Disconnected type:None path:null
> > WATCHER::
> > WatchedEvent state:SyncConnected type:None path:null
> > [zk: localhost:2181(CONNECTED) 0]
> > [zk: localhost:2181(CONNECTED) 0] ls /
> > [zookeeper]
> > [zk: localhost:2181(CONNECTED) 1]
> > [zk: localhost:2181(CONNECTED) 1]
> > [zk: localhost:2181(CONNECTED) 1] ls /zookeeper
> > [quota]
> > [zk: localhost:2181(CONNECTED) 2] ls /zookeeper/quota
> > []
>
>
>
> Thanks,
> - Ian
>
>
> On Sun, Sep 28, 2014 at 2:22 PM, Camille Fournier wrote:
>
> > Sorry but with what you've sent us I don't really see what the problem
> > is. It does look like you connect and then nothing happens for 20s and
> > then the connection is dropped. If you use the zkCli script to connect
> > via the command line do you see the same problem?
> >
> > C
> >
> > On Fri, Sep 26, 2014 at 12:48 PM, Ian Rose wrote:
> >
> > > Hi all -
> > >
> > > I've just gotten started using SolrCloud, which uses Zookeeper for
> > > coordination; I am otherwise completely new to Zookeeper.  Now I am
> > > trying to query Zookeeper directly for some simple information.  I am
> > > finding, however, that although my clients are able to connect they
> > > very frequently receive timeouts.  It almost seems like the server
> > > isn't receiving the heartbeat messages at all (or isn't responding to
> > > them).  I've seen similar behavior both when using the go-zookeeper
> > > and kazoo (python) client libraries (I wanted to try >1 to ensure
> > > that it wasn't a client lib problem).
> > >
> > > My config is very simple: I am running the client and a single
> > > Zookeeper node on the same machine (my laptop).  There are no other
> > > clients of the Zookeeper node while I am running these tests, so
> > > there is no practical possibility that the JVM is overloaded or GCing.
> > >
> > > Here is a very basic kazoo client that I am using.  Obviously it isn't
> > > doing any "real" work right now - this is just to demonstrate the
> > > disconnects.
> > >
> > > #!/usr/bin/env python2.7
> > >
> > > from kazoo.client import KazooClient, KazooState
> > > import logging
> > > import time
> > >
> > > def my_listener(state):
> > >   if state == KazooState.LOST:
> > >  

Re: Heartbeats not being received / responded to?

2014-09-29 Thread Ian Rose
Perhaps I'm simply misunderstanding what the expected behavior would be.
Why would my client be disconnected?  Does zookeeper drop idle clients?
And note that this isn't a spurious disconnect; my client is *always* dropped
at that time.

On Friday zkCli seemed to be working just fine for me, but now I am getting
disconnections similar to my kazoo-based client.

Here is a session showing `ls /' failing.  This behavior is reproducible
for me currently.

$ ./bin/zkCli.sh
> Connecting to localhost:2181
> Welcome to ZooKeeper!
> JLine support is enabled
> [zk: localhost:2181(CONNECTING) 0]
> WATCHER::
> WatchedEvent state:SyncConnected type:None path:null
> [zk: localhost:2181(CONNECTED) 0]
> [zk: localhost:2181(CONNECTED) 0] ls /
> WATCHER::
> WatchedEvent state:Disconnected type:None path:null
> Exception in thread "main"
> org.apache.zookeeper.KeeperException$ConnectionLossException:
> KeeperErrorCode = ConnectionLoss for /
> at org.apache.zookeeper.KeeperException.create(KeeperException.java:99)
> at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
> at org.apache.zookeeper.ZooKeeper.getChildren(ZooKeeper.java:1472)
> at org.apache.zookeeper.ZooKeeper.getChildren(ZooKeeper.java:1500)
> at org.apache.zookeeper.ZooKeeperMain.processZKCmd(ZooKeeperMain.java:720)
> at org.apache.zookeeper.ZooKeeperMain.processCmd(ZooKeeperMain.java:588)
> at org.apache.zookeeper.ZooKeeperMain.executeLine(ZooKeeperMain.java:360)
> at org.apache.zookeeper.ZooKeeperMain.run(ZooKeeperMain.java:323)
> at org.apache.zookeeper.ZooKeeperMain.main(ZooKeeperMain.java:282)


And here is a session where after connecting I just sit and wait for a bit
(without sending any commands), whereupon I get disconnected and then
immediately reconnected.  Interestingly, after reconnecting I am able
to issue multiple commands without any issues.  This behavior is
reproducible for me currently.

$ ./bin/zkCli.sh
> Connecting to localhost:2181
> Welcome to ZooKeeper!
> JLine support is enabled
> [zk: localhost:2181(CONNECTING) 0]
> [zk: localhost:2181(CONNECTING) 0]
> WATCHER::
> WatchedEvent state:SyncConnected type:None path:null
> [zk: localhost:2181(CONNECTED) 0]
> WATCHER::
> WatchedEvent state:Disconnected type:None path:null
> WATCHER::
> WatchedEvent state:SyncConnected type:None path:null
> [zk: localhost:2181(CONNECTED) 0]
> [zk: localhost:2181(CONNECTED) 0] ls /
> [zookeeper]
> [zk: localhost:2181(CONNECTED) 1]
> [zk: localhost:2181(CONNECTED) 1]
> [zk: localhost:2181(CONNECTED) 1] ls /zookeeper
> [quota]
> [zk: localhost:2181(CONNECTED) 2] ls /zookeeper/quota
> []



Thanks,
- Ian


On Sun, Sep 28, 2014 at 2:22 PM, Camille Fournier wrote:

> Sorry but with what you've sent us I don't really see what the problem is.
> It does look like you connect and then nothing happens for 20s and then the
> connection is dropped. If you use the zkCli script to connect via the
> command line do you see the same problem?
>
> C
>
> On Fri, Sep 26, 2014 at 12:48 PM, Ian Rose  wrote:
>
> > Hi all -
> >
> > I've just gotten started using SolrCloud, which uses Zookeeper for
> > coordination; I am otherwise completely new to Zookeeper.  Now I am
> > trying to query Zookeeper directly for some simple information.  I am
> > finding, however, that although my clients are able to connect they very
> > frequently receive timeouts.  It almost seems like the server isn't
> > receiving the heartbeat messages at all (or isn't responding to them).
> > I've seen similar behavior both when using the go-zookeeper and kazoo
> > (python) client libraries (I wanted to try >1 to ensure that it wasn't a
> > client lib problem).
> >
> > My config is very simple: I am running the client and a single Zookeeper
> > node on the same machine (my laptop).  There are no other clients of the
> > Zookeeper node while I am running these tests, so there is no practical
> > possibility that the JVM is overloaded or GCing.
> >
> > Here is a very basic kazoo client that I am using.  Obviously it isn't
> > doing any "real" work right now - this is just to demonstrate the
> > disconnects.
> >
> > #!/usr/bin/env python2.7
> >
> > from kazoo.client import KazooClient, KazooState
> > import logging
> > import time
> >
> > def my_listener(state):
> >   if state == KazooState.LOST:
> >     # Register somewhere that the session was lost
> >     logging.warning('handle lost')
> >   elif state == KazooState.SUSPENDED:
> >     # Handle being disconnected from Zookeeper
> >     logging.debug('handle being disconnected')
> >   else:
> >     # Handle being connected/reconnected to Zookeeper
> >     logging.debug('handle being (re)connected')
> >
> >
> > if __name__ == '__main__':
> >   logging.basicConfig(format='%(asctime)-15s %(levelname)s %(message)s',
> >                       level=logging.DEBUG)
> >
> >   logging.debug('starting...')
> >   zk = KazooClient(hosts='127.0.0.1:2181', timeout=30)
> >  

Re: Heartbeats not being received / responded to?

2014-09-28 Thread Camille Fournier
Sorry but with what you've sent us I don't really see what the problem is.
It does look like you connect and then nothing happens for 20s and then the
connection is dropped. If you use the zkCli script to connect via the
command line do you see the same problem?

C

On Fri, Sep 26, 2014 at 12:48 PM, Ian Rose  wrote:

> Hi all -
>
> I've just gotten started using SolrCloud, which uses Zookeeper for
> coordination; I am otherwise completely new to Zookeeper.  Now I am trying
> to query Zookeeper directly for some simple information.  I am finding,
> however, that although my clients are able to connect they very frequently
> receive timeouts.  It almost seems like the server isn't receiving the
> heartbeat messages at all (or isn't responding to them).  I've seen similar
> behavior both when using the go-zookeeper and kazoo (python) client
> libraries (I wanted to try >1 to ensure that it wasn't a client lib
> problem).
>
> My config is very simple: I am running the client and a single Zookeeper
> node on the same machine (my laptop).  There are no other clients of the
> Zookeeper node while I am running these tests, so there is no practical
> possibility that the JVM is overloaded or GCing.
>
> Here is a very basic kazoo client that I am using.  Obviously it isn't
> doing any "real" work right now - this is just to demonstrate the
> disconnects.
>
> #!/usr/bin/env python2.7
>
> from kazoo.client import KazooClient, KazooState
> import logging
> import time
>
> def my_listener(state):
>   if state == KazooState.LOST:
>     # Register somewhere that the session was lost
>     logging.warning('handle lost')
>   elif state == KazooState.SUSPENDED:
>     # Handle being disconnected from Zookeeper
>     logging.debug('handle being disconnected')
>   else:
>     # Handle being connected/reconnected to Zookeeper
>     logging.debug('handle being (re)connected')
>
>
> if __name__ == '__main__':
>   logging.basicConfig(format='%(asctime)-15s %(levelname)s %(message)s',
>                       level=logging.DEBUG)
>
>   logging.debug('starting...')
>   zk = KazooClient(hosts='127.0.0.1:2181', timeout=30)
>   zk.start()
>   zk.add_listener(my_listener)
>
>   time.sleep(35)
>
>   zk.stop()
>
> 
>
> Here is my zookeeper config:
>
> # The number of milliseconds of each tick
> tickTime=2000
>
> # The number of ticks that the initial
> # synchronization phase can take
> initLimit=10
>
> # The number of ticks that can pass between
> # sending a request and getting an acknowledgement
> syncLimit=5
>
> # the directory where the snapshot is stored.
> dataDir=/Users/ianrose/Code/zookeeper/var/data
>
> # the port at which the clients will connect
> clientPort=2181
>
> # The number of snapshots to retain in dataDir
> autopurge.snapRetainCount=5
>
> # Purge task interval in hours
> # Set to "0" to disable auto purge feature
> autopurge.purgeInterval=1
>
>
> 
>
>
> Here is the output I get on the client:
>
> 2014-09-26 12:43:20,603 DEBUG starting...
> 2014-09-26 12:43:20,604 INFO Connecting to 127.0.0.1:2181
> 2014-09-26 12:43:20,605 DEBUG Sending request(xid=None): Connect(protocol_version=0, last_zxid_seen=0, time_out=3, session_id=0, passwd='\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00', read_only=None)
> 2014-09-26 12:43:20,609 INFO Zookeeper connection established, state: CONNECTED
> 2014-09-26 12:43:40,222 WARNING Connection dropped: outstanding heartbeat ping not received
> 2014-09-26 12:43:40,222 WARNING Transition to CONNECTING
> 2014-09-26 12:43:40,222 INFO Zookeeper connection lost
> 2014-09-26 12:43:40,222 DEBUG handle being disconnected
> 2014-09-26 12:43:40,747 INFO Connecting to 127.0.0.1:2181
> 2014-09-26 12:43:40,748 DEBUG Sending request(xid=None): Connect(protocol_version=0, last_zxid_seen=0, time_out=3, session_id=92520388231233540, passwd='\xd1M\xb9\xb3\xae\xab\xa1!@x\x06nv\xb7\xe3*', read_only=None)
> 2014-09-26 12:43:40,750 INFO Zookeeper connection established, state: CONNECTED
> 2014-09-26 12:43:40,751 DEBUG handle being (re)connected
> 2014-09-26 12:43:55,611 DEBUG Sending request(xid=1): Close()
> 2014-09-26 12:43:55,614 INFO Closing connection to 127.0.0.1:2181
> 2014-09-26 12:43:55,615 INFO Zookeeper session lost, state: CLOSED
> 2014-09-26 12:43:55,615 WARNING handle lost
>
>
> And here is trace-level logging from the server side:
>
> 2014-09-26 12:43:20,605 [myid:] - INFO  [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxnFactory@197][] - Accepted socket connection from /127.0.0.1:58959
> 2014-09-26 12:43:20,605 [myid:] - DEBUG [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:ZooKeeperServer@810][] - Session establishment request from client /127.0.0.1:58959 client's lastZxid is 0x0
> 2014-09-26 12:43:20,605 [myid:] - INFO  [NIOServerCxn.Factory:
> 0.0.0.0/0.0.0.0:2181:ZooKeeperServer@868][] - Client attempting to
> establish new sess

Heartbeats not being received / responded to?

2014-09-26 Thread Ian Rose
Hi all -

I've just gotten started using SolrCloud, which uses Zookeeper for
coordination; I am otherwise completely new to Zookeeper.  Now I am trying
to query Zookeeper directly for some simple information.  I am finding,
however, that although my clients are able to connect they very frequently
receive timeouts.  It almost seems like the server isn't receiving the
heartbeat messages at all (or isn't responding to them).  I've seen similar
behavior both when using the go-zookeeper and kazoo (python) client
libraries (I wanted to try >1 to ensure that it wasn't a client lib problem).

My config is very simple: I am running the client and a single Zookeeper
node on the same machine (my laptop).  There are no other clients of the
Zookeeper node while I am running these tests, so there is no practical
possibility that the JVM is overloaded or GCing.

Here is a very basic kazoo client that I am using.  Obviously it isn't
doing any "real" work right now - this is just to demonstrate the
disconnects.

#!/usr/bin/env python2.7

from kazoo.client import KazooClient, KazooState
import logging
import time

def my_listener(state):
  if state == KazooState.LOST:
    # Register somewhere that the session was lost
    logging.warning('handle lost')
  elif state == KazooState.SUSPENDED:
    # Handle being disconnected from Zookeeper
    logging.debug('handle being disconnected')
  else:
    # Handle being connected/reconnected to Zookeeper
    logging.debug('handle being (re)connected')


if __name__ == '__main__':
  logging.basicConfig(format='%(asctime)-15s %(levelname)s %(message)s',
                      level=logging.DEBUG)

  logging.debug('starting...')
  zk = KazooClient(hosts='127.0.0.1:2181', timeout=30)
  zk.start()
  zk.add_listener(my_listener)

  time.sleep(35)

  zk.stop()



Here is my zookeeper config:

# The number of milliseconds of each tick
tickTime=2000

# The number of ticks that the initial
# synchronization phase can take
initLimit=10

# The number of ticks that can pass between
# sending a request and getting an acknowledgement
syncLimit=5

# the directory where the snapshot is stored.
dataDir=/Users/ianrose/Code/zookeeper/var/data

# the port at which the clients will connect
clientPort=2181

# The number of snapshots to retain in dataDir
autopurge.snapRetainCount=5

# Purge task interval in hours
# Set to "0" to disable auto purge feature
autopurge.purgeInterval=1
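
For reference, a ZooKeeper server does not have to accept the session timeout a client asks for: it clamps the request to the range between minSessionTimeout and maxSessionTimeout, which default to 2x and 20x tickTime when they are not set (they are not set in the config above). Below is a minimal sketch of that arithmetic. The helper name negotiated_timeout_ms is made up for illustration and is not part of ZooKeeper or kazoo.

```python
def negotiated_timeout_ms(requested_ms, tick_time_ms=2000,
                          min_ms=None, max_ms=None):
    """Clamp a requested session timeout to the server's allowed range.

    Hypothetical helper. Assumes the documented defaults of
    minSessionTimeout = 2 * tickTime and maxSessionTimeout = 20 * tickTime
    whenever explicit bounds are not supplied.
    """
    if min_ms is None:
        min_ms = 2 * tick_time_ms
    if max_ms is None:
        max_ms = 20 * tick_time_ms
    return max(min_ms, min(requested_ms, max_ms))


if __name__ == '__main__':
    # kazoo's timeout=30 is given in seconds, i.e. a 30000 ms request.
    for requested in (1000, 30000, 60000):
        print('%d ms requested -> %d ms granted'
              % (requested, negotiated_timeout_ms(requested)))
```

With tickTime=2000 the allowed range is 4000 to 40000 ms, so a 30 second request should pass through unchanged.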





Here is the output I get on the client:

2014-09-26 12:43:20,603 DEBUG starting...
2014-09-26 12:43:20,604 INFO Connecting to 127.0.0.1:2181
2014-09-26 12:43:20,605 DEBUG Sending request(xid=None): Connect(protocol_version=0, last_zxid_seen=0, time_out=3, session_id=0, passwd='\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00', read_only=None)
2014-09-26 12:43:20,609 INFO Zookeeper connection established, state: CONNECTED
2014-09-26 12:43:40,222 WARNING Connection dropped: outstanding heartbeat ping not received
2014-09-26 12:43:40,222 WARNING Transition to CONNECTING
2014-09-26 12:43:40,222 INFO Zookeeper connection lost
2014-09-26 12:43:40,222 DEBUG handle being disconnected
2014-09-26 12:43:40,747 INFO Connecting to 127.0.0.1:2181
2014-09-26 12:43:40,748 DEBUG Sending request(xid=None): Connect(protocol_version=0, last_zxid_seen=0, time_out=3, session_id=92520388231233540, passwd='\xd1M\xb9\xb3\xae\xab\xa1!@x\x06nv\xb7\xe3*', read_only=None)
2014-09-26 12:43:40,750 INFO Zookeeper connection established, state: CONNECTED
2014-09-26 12:43:40,751 DEBUG handle being (re)connected
2014-09-26 12:43:55,611 DEBUG Sending request(xid=1): Close()
2014-09-26 12:43:55,614 INFO Closing connection to 127.0.0.1:2181
2014-09-26 12:43:55,615 INFO Zookeeper session lost, state: CLOSED
2014-09-26 12:43:55,615 WARNING handle lost
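
The timestamps in this output can be turned into an elapsed time to see how long the connection sat idle before it was dropped. The sketch below does that with only the standard library. The comparison against two thirds of the session timeout is an assumption about how the client schedules its heartbeat checks, not something the log itself states, and seconds_between is a made-up helper name.

```python
from datetime import datetime

# Timestamp layout used by the client log lines above.
LOG_FORMAT = '%Y-%m-%d %H:%M:%S,%f'


def seconds_between(start, end):
    """Return the elapsed seconds between two log timestamps."""
    delta = (datetime.strptime(end, LOG_FORMAT)
             - datetime.strptime(start, LOG_FORMAT))
    return delta.total_seconds()


if __name__ == '__main__':
    session_timeout = 30.0  # seconds, as passed to KazooClient(timeout=30)
    idle = seconds_between('2014-09-26 12:43:20,609',   # connection established
                           '2014-09-26 12:43:40,222')   # connection dropped
    print('dropped after %.3f s' % idle)
    # Assumption: the client gives up on a ping after about 2/3 of the timeout.
    print('two thirds of the timeout: %.1f s' % (session_timeout * 2 / 3))
```

The drop comes about 19.6 seconds after the connection was established, which is close to two thirds of the 30 second timeout.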


And here is trace-level logging from the server side:

2014-09-26 12:43:20,605 [myid:] - INFO  [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxnFactory@197][] - Accepted socket connection from /127.0.0.1:58959
2014-09-26 12:43:20,605 [myid:] - DEBUG [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:ZooKeeperServer@810][] - Session establishment request from client /127.0.0.1:58959 client's lastZxid is 0x0
2014-09-26 12:43:20,605 [myid:] - INFO  [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:ZooKeeperServer@868][] - Client attempting to establish new session at /127.0.0.1:58959
2014-09-26 12:43:20,605 [myid:] - TRACE [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:ZooTrace@71][] - SessionTrackerImpl --- Adding session 0x148b2cd8b010004 3
2014-09-26 12:43:20,606 [myid:] - TRACE [ProcessThread(sid:0 cport:-1)::ZooTrace@90][] - :Psessionid:0x148b2cd8b010004 type:createSession cxid:0x0 zxid:0xfffe txntype:unknown reqpath:n/a
2014-09-26 12:43:20,606 [myid:] - TRACE [ProcessThread(sid:0 cport:-1)::ZooTrace@71][] - SessionTrackerImpl --- Existing session 0x148b2cd8b010004 3
2014-0