[ https://issues.apache.org/jira/browse/ZOOKEEPER-1809?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Shaun Senecal updated ZOOKEEPER-1809: ------------------------------------- Attachment: testapp.log complete log file (client and server in the same file) showing a run of the attached testapp when the problem occurred. > ephemeral node node deleted (or recreated?) after session expiry > ---------------------------------------------------------------- > > Key: ZOOKEEPER-1809 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1809 > Project: ZooKeeper > Issue Type: Bug > Affects Versions: 3.4.5 > Reporter: Shaun Senecal > Priority: Minor > Attachments: SessionExpiryTest.java, testapp.log > > > We have been running into a situation where we attempt to recreate our > ephemeral nodes after a session expiry, only to find that the node already > exists. Admittedly, this is only happening when we are aggresively killing > and recreating sessions in a tight loop, but I thought it might point to a > larger issue which may need to be addressed. > Attached is a small app which demonstrates the issue, and a log file (client > and server in the same log) which shows the issue as it occured. Reproducing > the bug is a tedious process of rerunning the test over and over again, but I > have typically been able to reproduce it within 15mins of trying. The test > app is using Curator, however, I think the issue is occuring at the ZK level > since the logs clearly indicate the ephermal node is deleted after the > session expiry. > Interesting bits from the log > {noformat} > ... > 2013/11/06 13:46:03,065 INFO [ConnectionStateManager-0] Recreating node: > /test > ... > 2013/11/06 13:46:03,070 DEBUG [SyncThread:0] Processing request:: > sessionid:0x1422bbb36d10002 type:create cxid:0x2 zxid:0x8 txntype:1 > reqpath:n/a > ... > 2013/11/06 13:46:03,071 DEBUG [main] Closing client for session: > 0x1422bbb36d10002 > 2013/11/06 13:46:03,075 INFO [ProcessThread(sid:0 cport:-1):] Processed > session termination for sessionid: 0x1422bbb36d10002 > 2013/11/06 13:46:03,079 DEBUG [SyncThread:0] Processing request:: > sessionid:0x1422bbb36d10002 type:closeSession cxid:0x1 zxid:0x9 txntype:-11 > reqpath:n/a > 2013/11/06 13:46:03,080 DEBUG [SyncThread:0] Deleting ephemeral node /test > for session 0x1422bbb36d10002 > 2013/11/06 13:46:03,080 DEBUG [SyncThread:0] sessionid:0x1422bbb36d10002 > type:closeSession cxid:0x1 zxid:0x9 txntype:-11 reqpath:n/a > ... > 2013/11/06 13:46:04,459 INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:43462] > Client attempting to renew session 0x1422bbb36d10002 at /127.0.0.1:59559 > 2013/11/06 13:46:04,459 INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:43462] > Invalid session 0x1422bbb36d10002 for client /127.0.0.1:59559, probably > expired > 2013/11/06 13:46:04,460 INFO [main-SendThread(localhost:43462)] Unable to > reconnect to ZooKeeper service, session 0x1422bbb36d10002 has expired, > closing socket connection > 2013/11/06 13:46:04,460 DEBUG [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:43462] > Dropping request: No session with sessionid 0x1422bbb36d10002 exists, > probably expired and removed > ... > 2013/11/06 13:46:04,463 INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:43462] > Client attempting to establish new session at /127.0.0.1:59560 > 2013/11/06 13:46:04,466 DEBUG [SyncThread:0] Processing request:: > sessionid:0x1422bbb36d10003 type:createSession cxid:0x0 zxid:0xa txntype:-10 > reqpath:n/a > 2013/11/06 13:46:04,466 DEBUG [SyncThread:0] sessionid:0x1422bbb36d10003 > type:createSession cxid:0x0 zxid:0xa txntype:-10 reqpath:n/a > 2013/11/06 13:46:04,467 INFO [SyncThread:0] Established session > 0x1422bbb36d10003 with negotiated timeout 30000 for client /127.0.0.1:59560 > ... > 2013/11/06 13:46:04,473 INFO [ConnectionStateManager-0] Recreating node: > /test > 2013/11/06 13:46:04,474 DEBUG [SyncThread:0] Processing request:: > sessionid:0x1422bbb36d10003 type:exists cxid:0x2 zxid:0xfffffffffffffffe > txntype:unknown reqpath:/___CURATOR_KILL_SESSION___15970538640754 > 2013/11/06 13:46:04,474 DEBUG [SyncThread:0] sessionid:0x1422bbb36d10003 > type:exists cxid:0x2 zxid:0xfffffffffffffffe txntype:unknown > reqpath:/___CURATOR_KILL_SESSION___15970538640754 > 2013/11/06 13:46:04,475 INFO [ProcessThread(sid:0 cport:-1):] Got user-level > KeeperException when processing sessionid:0x1422bbb36d10003 type:create > cxid:0x3 zxid:0xc txntype:-1 reqpath:n/a Error Path:/test > Error:KeeperErrorCode = NodeExists for /test > 2013/11/06 13:46:04,475 DEBUG [main-SendThread(localhost:43462)] Reading > reply sessionid:0x1422bbb36d10003, packet:: clientPath:null serverPath:null > finished:false header:: 2,3 replyHeader:: 2,11,-101 request:: > '/___CURATOR_KILL_SESSION___15970538640754,T response:: > 2013/11/06 13:46:04,476 INFO [main] Initiating client connection, > connectString=127.0.0.1:43462 sessionTimeout=10000 > watcher=com.netflix.curator.test.KillSession$2@4067d00a > sessionId=1422bbb36d10003 sessionPasswd=<hidden> > ... > 2013/11/06 13:46:04,479 ERROR [ConnectionStateManager-0] Failed to recreate > ephemeral node > org.apache.zookeeper.KeeperException$NodeExistsException: KeeperErrorCode = > NodeExists for /test > at org.apache.zookeeper.KeeperException.create(KeeperException.java:119) > at org.apache.zookeeper.KeeperException.create(KeeperException.java:51) > at org.apache.zookeeper.ZooKeeper.create(ZooKeeper.java:783) > at > com.netflix.curator.framework.imps.CreateBuilderImpl$10.call(CreateBuilderImpl.java:625) > at > com.netflix.curator.framework.imps.CreateBuilderImpl$10.call(CreateBuilderImpl.java:609) > at com.netflix.curator.RetryLoop.callWithRetry(RetryLoop.java:106) > at > com.netflix.curator.framework.imps.CreateBuilderImpl.pathInForeground(CreateBuilderImpl.java:605) > at > com.netflix.curator.framework.imps.CreateBuilderImpl.forPath(CreateBuilderImpl.java:428) > at > com.netflix.curator.framework.imps.CreateBuilderImpl.forPath(CreateBuilderImpl.java:408) > at > com.netflix.curator.framework.imps.CreateBuilderImpl.forPath(CreateBuilderImpl.java:41) > at > com.rakuten.sandbox.sessionexpiry.nodeexists.SessionExpiryTest$2.stateChanged(SessionExpiryTest.java:72) > at > com.netflix.curator.framework.state.ConnectionStateManager$2.apply(ConnectionStateManager.java:184) > at > com.netflix.curator.framework.state.ConnectionStateManager$2.apply(ConnectionStateManager.java:180) > at > com.netflix.curator.framework.listen.ListenerContainer$1.run(ListenerContainer.java:92) > at > com.google.common.util.concurrent.MoreExecutors$SameThreadExecutorService.execute(MoreExecutors.java:262) > at > com.netflix.curator.framework.listen.ListenerContainer.forEach(ListenerContainer.java:83) > at > com.netflix.curator.framework.state.ConnectionStateManager.processEvents(ConnectionStateManager.java:177) > at > com.netflix.curator.framework.state.ConnectionStateManager.access$000(ConnectionStateManager.java:40) > at > com.netflix.curator.framework.state.ConnectionStateManager$1.call(ConnectionStateManager.java:104) > at java.util.concurrent.FutureTask.run(FutureTask.java:262) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) > at java.lang.Thread.run(Thread.java:744) > ... > {noformat} -- This message was sent by Atlassian JIRA (v6.1#6144)