[ https://issues.apache.org/jira/browse/CURATOR-498?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16730370#comment-16730370 ]
Shay Shimony edited comment on CURATOR-498 at 12/28/18 5:30 PM:
----------------------------------------------------------------

I made some progress investigating this issue. In LeaderLatch.checkLeadership I added the following log print:

{code:java}
else if ( ourIndex == 0 )
{
    log.info("LEADER WITH {}; all: {}", localOurPath, sortedChildren);
    setLeadership(true);
}
{code}

And in LeaderLatch.setNode I also added logs:

{code:java}
private void setNode(String newValue) throws Exception
{
    String oldPath = ourPath.getAndSet(newValue);
    log.info("CHANGED PATH: {} -> {}", oldPath, newValue);
    if ( oldPath != null )
    {
        log.info("DELETING (in bg) {}", oldPath);
        client.delete().guaranteed().inBackground().forPath(oldPath);
    }
}
{code}

These produced the prints below. Full logs are attached; transactionslog.txt is the readable version of the original log.180000006e. Both the HaWatcher logs and the ZK transaction logs come from the same machine, using the same clock.

This is the HaWatcher (Curator) log:

2018-12-28 16:32:16 INFO ClientCnxn:1303 - Session establishment complete on server 192.168.56.1/192.168.56.1:2181, sessionid = *0x1000ae5465c0007*, negotiated timeout = 10000
2018-12-28 16:32:16 INFO ConnectionStateManager:237 - State change: CONNECTED
...
2018-12-28 16:33:02 INFO ClientCnxn:1162 - Unable to read additional data from server sessionid 0x1000ae5465c0007, likely server has closed socket, closing socket connection and attempting reconnect
2018-12-28 16:33:02 INFO ConnectionStateManager:237 - State change: SUSPENDED
...
2018-12-28 16:33:02 INFO ClientCnxn:879 - Socket connection established to 192.168.56.11/192.168.56.11:2181, initiating session
2018-12-28 16:33:02 INFO ClientCnxn:1303 - Session establishment complete on server 192.168.56.11/192.168.56.11:2181, sessionid = *0x1000ae5465c0007*, negotiated timeout = 10000
2018-12-28 16:33:02 INFO ConnectionStateManager:237 - State change: RECONNECTED
2018-12-28 16:33:02 INFO LeaderLatch3:475 - RESET
2018-12-28 16:33:02 INFO LeaderLatch3:679 - CHANGED PATH: /sites/prod/leader-latch/_c_a04e5212-0b0c-4328-8d3e-0e6ccaedfc59-latch-0000000082 -> null
2018-12-28 16:33:02 INFO LeaderLatch3:682 - DELETING (in bg) /sites/prod/leader-latch/_c_a04e5212-0b0c-4328-8d3e-0e6ccaedfc59-latch-0000000082
2018-12-28 16:33:04 INFO ClientCnxn:1162 - Unable to read additional data from server sessionid 0x1000ae5465c0007, likely server has closed socket, closing socket connection and attempting reconnect
2018-12-28 16:33:04 INFO ConnectionStateManager:237 - State change: SUSPENDED
...
2018-12-28 16:33:14 INFO ClientCnxn:879 - Socket connection established to 192.168.56.11/192.168.56.11:2181, initiating session
2018-12-28 16:33:14 INFO ZooKeeper:693 - *Session*: *0x1000ae5465c0007* *closed*
2018-12-28 16:33:14 INFO ZooKeeper:442 - Initiating client connection, connectString=192.168.56.1:2181,192.168.56.11:2181,192.168.56.12:2181 sessionTimeout=10000 watcher=org.apache.curator.ConnectionState@6b143ee9
2018-12-28 16:33:14 INFO ConnectionStateManager:237 - State change: LOST
2018-12-28 16:33:14 INFO ClientCnxn:522 - EventThread *shut down for session: 0x1000ae5465c0007*
...
2018-12-28 16:33:17 INFO ClientCnxn:1303 - Session establishment complete on server 192.168.56.1/192.168.56.1:2181, sessionid = *0x1000c6bc82c0001*, negotiated timeout = 10000
2018-12-28 16:33:17 INFO ConnectionStateManager:237 - State change: RECONNECTED
2018-12-28 16:33:17 INFO LeaderLatch3:475 - RESET
2018-12-28 16:33:17 INFO LeaderLatch3:679 - CHANGED PATH: null -> null
2018-12-28 16:33:17 INFO LeaderLatch3:679 - CHANGED PATH: null -> /sites/prod/leader-latch/_c_d99020be-d50e-4f16-946d-e8dd6ba34a5b-latch-0000000085
2018-12-28 16:33:17 INFO LeaderLatch3:679 - CHANGED PATH: /sites/prod/leader-latch/_c_d99020be-d50e-4f16-946d-e8dd6ba34a5b-latch-*0000000085* -> /sites/prod/leader-latch/_c_0e156808-69bf-4932-b0fd-12d0fb407365-latch-*0000000083*
2018-12-28 16:33:17 INFO LeaderLatch3:682 - DELETING (in bg) /sites/prod/leader-latch/_c_d99020be-d50e-4f16-946d-e8dd6ba34a5b-latch-0000000085
2018-12-28 16:33:18 INFO LeaderLatch3:540 - *LEADER WITH* /sites/prod/leader-latch/_c_0e156808-69bf-4932-b0fd-12d0fb407365-latch-*0000000083*; all: [_c_0e156808-69bf-4932-b0fd-12d0fb407365-latch-0000000083, _c_1f76c543-363a-4b95-b722-2db2af047da5-latch-0000000084, _c_61e648df-73f7-4439-9c07-98df4167b1b7-latch-0000000086, _c_06126584-76bf-45ad-816b-58c93db44999-latch-0000000087, _c_33015b8e-3e7c-4ce5-9ae2-4c3b58240a89-latch-0000000088]
2018-12-28 16:33:18 INFO HaWatcher$:49 - GAINED LEADERSHIP in site prod
...

Then in the transaction log, only one line mentions 0000000083:

28/12/2018, 16:33:02 IST session *0x1000ae5465c0007* cxid 0x1b zxid 0x180000006f *create* '/failsafe/default/sites/prod/leader-latch/_c_0e156808-69bf-4932-b0fd-12d0fb407365-latch-*0000000083*,#7036,v{s{31,s{'world,'anyone}}},T,84
...
28/12/2018, 16:33:17 IST session *0x1000c6bc82c0001* cxid 0x0 zxid 0x1900000007 *createSession* 10000
28/12/2018, 16:33:17 IST session *0x1000c6bc82c0001* cxid 0x2 zxid 0x1900000009 create '/failsafe/default/sites/prod/leader-latch/_c_d99020be-d50e-4f16-946d-e8dd6ba34a5b-latch-*0000000085*,#7036,v{s{31,s{'world,'anyone}}},T,86
...
28/12/2018, 16:33:29 IST session *0x1000ae5465c0007* cxid 0x0 zxid 0x1900000026 *closeSession* null

Summarizing the events of HaWatcher and ZK, it went as follows:
# 16:32:16: Session 0x1000ae5465c0007 was created, as reported by this HaWatcher.
# 16:33:14: Session 0x1000ae5465c0007 was shut down, as reported by this HaWatcher.
# 16:33:17: Session 0x1000c6bc82c0001 was created, as reported by HW.
# 16:33:17: Ephemeral znode 0000000085 was created and set in ourPath.
# 16:33:17: Ephemeral znode 0000000083, which belongs to session 0x1000ae5465c0007 (the session Curator had decided was expired) and which ZK had actually created back at 16:33:02, replaced 0000000085, which belongs to the new session 0x1000c6bc82c0001, in ourPath.
# 16:33:18: HW became leader with ephemeral znode 0000000083.
# 16:33:29: Session 0x1000ae5465c0007 expired, as reported by ZK, and so 0000000083 was deleted.

So it seems to me that the bug is that znode 0000000083, which belongs to a session Curator had already abandoned, replaced znode 0000000085 of the living session in ourPath and got 0000000085 deleted in ZK. In step 6, which is method checkLeadership(), 0000000083 is what it finds in ourPath. Since ZK had not deleted this znode (its session had not expired yet, and no client had deleted it), sortedChildren.indexOf found it, and it was the smallest; thus LeaderLatch declared a leader with an ephemeral znode of a session that was already irrelevant to Curator (it had been replaced by another session) and was about to expire in ZK.
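To make the race and a possible guard concrete, here is a minimal, self-contained sketch (an illustration only, not the actual Curator patch; GenerationGuardSketch, beginReset and onCreated are hypothetical names). The idea is that every reset() bumps a generation counter, and a create() background callback issued under an older generation deletes its znode instead of installing it in ourPath:

{code:java}
import java.util.concurrent.atomic.AtomicLong;
import java.util.concurrent.atomic.AtomicReference;

public class GenerationGuardSketch
{
    private final AtomicLong generation = new AtomicLong();
    private final AtomicReference<String> ourPath = new AtomicReference<>();

    // Called on RECONNECTED/LOST before issuing the background create;
    // the returned generation travels with the create() callback.
    public long beginReset()
    {
        setNode(null);
        return generation.incrementAndGet();
    }

    // The background create callback reports the generation it was issued under.
    public void onCreated(long callbackGeneration, String createdPath)
    {
        if ( generation.get() == callbackGeneration )
        {
            setNode(createdPath);   // still the latest reset: adopt the new znode
        }
        else
        {
            // Stale callback from an earlier reset: delete its znode
            // instead of letting it replace the live one.
            System.out.println("discarding stale znode " + createdPath);
        }
    }

    private void setNode(String newValue)
    {
        String oldPath = ourPath.getAndSet(newValue);
        if ( oldPath != null )
        {
            System.out.println("deleting (in bg) " + oldPath);
        }
    }

    public static void main(String[] args)
    {
        GenerationGuardSketch latch = new GenerationGuardSketch();
        long gen1 = latch.beginReset();                 // reset under the old session
        long gen2 = latch.beginReset();                 // reset after RECONNECTED with the new session
        latch.onCreated(gen2, ".../latch-0000000085");  // new session's znode is adopted
        latch.onCreated(gen1, ".../latch-0000000083");  // stale callback is discarded, not adopted
    }
}
{code}

Under such a guard, the stale callback for 0000000083 (issued under the old session 0x1000ae5465c0007) would find the generation already advanced, delete its znode, and leave 0000000085 of the live session in place.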
> LeaderLatch deletes leader and leaves it hung besides a second leader
> ----------------------------------------------------------------------
>
>                 Key: CURATOR-498
>                 URL: https://issues.apache.org/jira/browse/CURATOR-498
>             Project: Apache Curator
>          Issue Type: Bug
>    Affects Versions: 4.0.1, 4.1.0
>         Environment: ZooKeeper 3.4.13, Curator 4.1.0 (selecting explicitly 3.4.13), Linux
>            Reporter: Shay Shimony
>            Assignee: Jordan Zimmerman
>            Priority: Major
>         Attachments: HaWatcher.log, LeaderLatch0.java, ha.tar.gz, logs.tar.gz
>
> The Curator app I am working on uses the LeaderLatch to select a leader out of 6 clients.
> While testing my app, I noticed that when I make ZK lose its quorum for a while and then restore it, then after Curator in my app restores its connection to ZK, sometimes not all 6 clients are found in the latch path (using zkCli.sh). That is, I have 5 instead of 6.
> After investigating a little, I suspect that LeaderLatch deleted the leader in method setNode.
> To investigate it, I copied the LeaderLatch code and added some log messages. From them it seems that a very old create() background callback was unexpectedly delivered and corrupted the current leader with its stale path name. That is, this old callback called setNode with its stale name, set itself in place of the leader, and deleted the leader's znode. This leaves that client running, thinking it is the leader, while another leader is selected.
> If my analysis is correct, then it seems we need to have this obsolete create callback cancelled (I think its session was suspended at 22:38:54 and then lost at 22:39:04, so on SUSPENDED we should cancel ongoing callbacks).
> Please see the attached log file and the modified LeaderLatch0.
>
> In the log, note that at 22:39:26 it shows that 0000000485 is replaced by 0000000480 and then probably deleted.
> Note also that at 22:38:52, 34 seconds earlier, we can see that it was in the reset() method ("RESET OUR PATH") and possibly triggered the creation of 0000000480 then.
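For the "on SUSPENDED cancel ongoing callbacks" idea above, a hedged sketch of one way to approximate it (ConnectionStateListener, ConnectionState, and CuratorFramework are real Curator APIs; the epoch bookkeeping around them is hypothetical). Rather than truly cancelling the background operations, it invalidates their results, so a callback issued before the suspension becomes a no-op when it finally fires:

{code:java}
import java.util.concurrent.atomic.AtomicLong;

import org.apache.curator.framework.CuratorFramework;
import org.apache.curator.framework.state.ConnectionState;
import org.apache.curator.framework.state.ConnectionStateListener;

// Sketch: bump an epoch on SUSPENDED/LOST; background callbacks capture the
// epoch when they are issued and re-check it before acting on their result.
public class SuspendInvalidator implements ConnectionStateListener
{
    private final AtomicLong epoch = new AtomicLong();

    @Override
    public void stateChanged(CuratorFramework client, ConnectionState newState)
    {
        if ( (newState == ConnectionState.SUSPENDED) || (newState == ConnectionState.LOST) )
        {
            epoch.incrementAndGet();   // everything issued before this point is now stale
        }
    }

    // Capture at issue time: long issued = invalidator.currentEpoch();
    public long currentEpoch()
    {
        return epoch.get();
    }

    // Re-check in the callback before adopting the created znode.
    public boolean isStillValid(long issuedEpoch)
    {
        return epoch.get() == issuedEpoch;
    }
}
{code}

It would be registered with client.getConnectionStateListenable().addListener(new SuspendInvalidator()). Whether to invalidate on SUSPENDED as well as on LOST is a policy choice, since a SUSPENDED session may still recover.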