[ https://issues.apache.org/jira/browse/ZOOKEEPER-1412?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Patrick Hunt updated ZOOKEEPER-1412:
------------------------------------

    Attachment: ZOOKEEPER-1412_trunk.patch

These three patches fix the issue on the respective branches. I verified that
the tests failed before the fix and pass after it. All tests are passing for
me on the three branches.

I believe accessing/returning the lastzxid as I do in FinalRequestProcessor is
valid, but it took me some time to convince myself. Please double check that
the lastzxid I send back to the client is the one corresponding to the zxid at
the time the read was performed.

> java client watches inconsistently triggered on reconnect
> ---------------------------------------------------------
>
>                 Key: ZOOKEEPER-1412
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1412
>             Project: ZooKeeper
>          Issue Type: Bug
>          Components: server
>    Affects Versions: 3.3.3, 3.3.4, 3.4.0, 3.4.1, 3.4.2, 3.4.3
>            Reporter: Botond Hejj
>            Assignee: Patrick Hunt
>            Priority: Blocker
>             Fix For: 3.3.5, 3.4.4, 3.5.0
>
>         Attachments: ZOOKEEPER-1412_br33.patch, ZOOKEEPER-1412_br34.patch,
> ZOOKEEPER-1412_trunk.patch
>
>
> I've observed inconsistent behavior in Java client watches. The
> inconsistency relates to the behavior after the client reconnects to the
> ZooKeeper ensemble.
> After the client reconnects to the ensemble, only those watches should
> trigger that would also have triggered had the connection not been lost.
> This means that if I watch for changes in node /foo and there is no change
> there, then my watch should not be triggered on reconnecting to the ensemble.
> This is not always the case in the Java client.
> I've debugged the issue and located a case in which the watch is always
> triggered on reconnect. This happens consistently if I connect to a
> follower in the ensemble and don't do any operation that goes through the
> leader.
> Looking at the code, I see that the client stores the lastzxid and sends it
> with its requests. This is 0 on startup and is updated every time from the
> server replies. This lastzxid is also sent to the server after reconnect,
> together with the watches. The server decides which watches to trigger based
> on this lastzxid, probably because it should represent the last state known
> to the client. If this lastzxid is 0, then all the watches are triggered.
> I've checked why this lastzxid is 0. I thought it shouldn't be, since there
> was already a request to the server to set the watch, and in the reply the
> server could have sent back the zxid, but it turns out that it sends just 0.
> Looking at the server code, I see that for requests that don't go through
> the leader, the follower server just sends back the same zxid that the
> client sent.
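
To make the reported mechanism concrete, here is a minimal sketch of the decision the description attributes to the server when a reconnecting client re-registers its watches. It is a simplified model, not the actual ZooKeeper code: the class WatchReplaySketch, the map lastModifiedZxid, and the method watchesToTrigger are hypothetical names introduced only for illustration. It assumes the server fires a watch when the watched node is missing or was modified after the lastzxid the client reports.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class WatchReplaySketch {
    // Hypothetical stand-in for server state: path -> zxid of the node's last modification.
    static final Map<String, Long> lastModifiedZxid = new HashMap<>();

    // Returns the watched paths that would fire when the client re-registers
    // its watches after a reconnect, given the lastzxid the client reports.
    static List<String> watchesToTrigger(List<String> watchedPaths, long clientLastZxid) {
        List<String> fired = new ArrayList<>();
        for (String path : watchedPaths) {
            Long mzxid = lastModifiedZxid.get(path);
            // Assumption: fire if the node is gone or changed after the client's last known zxid.
            if (mzxid == null || mzxid > clientLastZxid) {
                fired.add(path);
            }
        }
        return fired;
    }

    public static void main(String[] args) {
        lastModifiedZxid.put("/foo", 42L);
        List<String> watches = Collections.singletonList("/foo");

        // Client that only ever read through a follower: its lastzxid is still 0.
        System.out.println(watchesToTrigger(watches, 0L));  // prints [/foo]

        // Client whose lastzxid reflects the state it actually read.
        System.out.println(watchesToTrigger(watches, 42L)); // prints []
    }
}
```

With a lastzxid of 0, every existing node looks newer than the client's state, so the watch on /foo fires even though /foo never changed. Once the reply to a read carries the zxid at which the read was performed, the same comparison leaves the watch in place.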