Play with the timings a bit. I consistently see RECONNECTED. Also, make sure you're using the latest Curator. There's a bug fix in 2.0.1 that can affect this.
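
As a rough illustration of what adjusting the timings could look like, here is a minimal sketch that derives the Curator timeouts from the length of the outage the test simulates, using the "half the test's sleep" ratio mentioned further down in this thread. The class `TimeoutSketch`, the method `halfOf`, and the 20-second sleep are assumptions made up for illustration; they are not part of Curator or of the attached test.

```java
/**
 * Hypothetical helper (not part of Curator): derives client timeouts from
 * the time a test sleeps while the ZooKeeper servers are down.
 */
public final class TimeoutSketch {
    private TimeoutSketch() {}

    /** Returns half of the test's sleep, the ratio described in the thread. */
    public static int halfOf(int testSleepMs) {
        if (testSleepMs < 2) {
            throw new IllegalArgumentException("sleep too short: " + testSleepMs);
        }
        return testSleepMs / 2;
    }

    public static void main(String[] args) {
        int testSleepMs = 20_000; // assumed value; use whatever your test sleeps

        int sessionTimeoutMs = halfOf(testSleepMs);
        int connectionTimeoutMs = halfOf(testSleepMs);

        System.out.println("sessionTimeoutMs=" + sessionTimeoutMs
                + " connectionTimeoutMs=" + connectionTimeoutMs);

        // With Curator on the classpath, the values would be handed to the
        // factory when the client is created, for example:
        //   CuratorFrameworkFactory.newClient(
        //       connectString, sessionTimeoutMs, connectionTimeoutMs, retryPolicy);
    }
}
```

The point is only that both timeouts stay well below the simulated outage, so Curator has time to notice the loss and re-create the ZooKeeper object before the test ends.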
-JZ

On Jun 14, 2013, at 12:09 PM, Evaristo José Camarero <[email protected]> wrote:

> Hi there,
>
> Thanks for the fast response.
>
> This is the output of the program with your modifications:
>
> 2013-06-14 21:05:01 INFO CuratorFrameworkImpl:221 - Starting
> 2013-06-14 21:05:01 INFO ConnectionStateManager:151 - State change: CONNECTED
> NEw state CONNECTED
> 2013-06-14 21:05:13 INFO ConnectionStateManager:151 - State change: SUSPENDED
> NEw state SUSPENDED
> 2013-06-14 21:05:16 ERROR CuratorFrameworkImpl:530 - Background operation retry gave up
> org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss
>     at org.apache.zookeeper.KeeperException.create(KeeperException.java:99)
>     at org.apache.curator.framework.imps.CuratorFrameworkImpl.processBackgroundOperation(CuratorFrameworkImpl.java:487)
>     at org.apache.curator.framework.imps.BackgroundSyncImpl$1.processResult(BackgroundSyncImpl.java:50)
>     at org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:606)
>     at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:495)
> 2013-06-14 21:05:16 INFO ConnectionStateManager:151 - State change: LOST
> NEw state LOST
>
> Notice that "NEW state RECONNECTED" never appears, which basically means
> that the CuratorFramework listener is not called, and therefore I assume
> that Curator does not reconnect.
>
> Interestingly, if you remove
> zkClient.create().forPath("/pedro", "juan".getBytes());
> from the program (so that you are not doing any TX), Curator is able to
> recover.
>
> Regards,
>
> Evaristo
>
> From: Jordan Zimmerman <[email protected]>
> To: [email protected]; Evaristo José Camarero <[email protected]>
> Sent: Friday, June 14, 2013 19:06
> Subject: Re: CuratorFramework is not recovering connecion with ZK
>
> Curator uses the timeouts given when the CuratorFramework object is created
> to manage the connection. In this instance, after LOST has been received
> Curator may wait until session timeout before trying to re-create the
> ZooKeeper object. I'm enclosing a modified version of your test that sets the
> Curator timeouts to half the time that the test's sleeps wait and you can see
> that Curator recovers correctly.
>
> -Jordan
>
> On Jun 14, 2013, at 7:22 AM, Evaristo José Camarero <[email protected]> wrote:
>
>> Hi there,
>>
>> I have found a case in which CuratorFramework does not recover its
>> connection with the ZK servers.
>>
>> The use case is the following:
>> - Start the ZK servers.
>> - Start the application with CuratorFramework.
>> - The ZK servers go down.
>> - The ZK servers start again.
>> - The CuratorFramework app is not notified that the connection has been
>>   re-established, nor that the connection cannot be recovered, so the
>>   application cannot recover. The only option is to restart the app.
>>
>> Notice that if the CuratorFramework client is not making any transaction
>> (just comment out the create() TX), the client is able to reconnect.
>>
>> I attach a program able to reproduce the problem.
>>
>> Regards,
>>
>> Evaristo
>>
>> <CuratorFails.java>
