(moving discussion to dev)

Hi Adam,
I can't see a problem with your description about the snapshot generation, but I would expect that replaying the transaction log would bring back the missing transactions. We replay from the zxid in the snapshot name, which is taken before the snapshot starts (FileTxnSnapLog.save(...)).

-Flavio

> On 16 Jul 2015, at 12:02, Adam Milne-Smith <[email protected]> wrote:
>
> I've created a jira ticket here:
> https://issues.apache.org/jira/browse/ZOOKEEPER-2234
>
> Thanks,
> Adam
>
> On 15 Jul 2015 16:07, Adam Milne-Smith <[email protected]> wrote:
>>
>> Whilst writing a patch for ZOOKEEPER-2141 (3.4.6 branch), we spotted an
>> ephemeral node that had not been deleted despite its session having expired.
>> Its ACL long did not exist in the ACL cache so any operation against this
>> node will fail.
>>
>> This could lead to things like curator locks never being deleted (even after
>> the timeout) and deadlocking applications.
>>
>> We inspected the code and are reasonably certain that there are no bugs in
>> updating the in-memory data tree that could cause this. However serialising
>> the snapshot happens asynchronously and follows these 4 steps:
>>
>> -copy the sessions map
>> -serialise the sessions map copy
>> -serialise the ACL map (synchronised)
>> -serialise the data tree (synchronised at the individual node level)
>>
>> We suspect the issue we are seeing is a new session and ephemeral node being
>> created during the data tree serialisation hence the corresponding session
>> and acl are missing from the snapshot but the node is present. This means
>> the snapshot contains a partial transaction.
>>
>> If we were to deserialise from this snapshot then the data in-memory would
>> be invalid. If one member of the quorum were to reboot and restore from this
>> snapshot, it would contain this node where the other hosts had removed it.
>> If this host were to become the leader and send its snapshot to other
>> members of the quorum, those would have the invalid data too.
>>
>> As far as we can see, the only way to delete this node when this happens in
>> production would be to perform manual surgery on the snapshot.
>>
>> Can anyone confirm that they agree this to be the case or let us know if
>> we've misunderstood something?
>>
>> Thanks,
>> Adam
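
To make the replay argument concrete, here is a rough toy model of a fuzzy snapshot followed by log replay. It is only a sketch and not ZooKeeper code: the State class, the apply function, the tuple layout of a transaction, and the paths are all made up for illustration, ACLs are left out, and it assumes every transaction after the recorded zxid is still in the log and can be re-applied idempotently. Under those assumptions, the snapshot on its own shows the node without its session, and replaying from the zxid recorded before the snapshot started brings the restored state back in line with the live one.

```python
from dataclasses import dataclass, field


@dataclass
class State:
    """Hypothetical stand-in for the in-memory state (ACLs omitted)."""
    sessions: set = field(default_factory=set)
    ephemerals: dict = field(default_factory=dict)  # path -> owning session id


def apply(state, txn):
    """Apply one (zxid, kind, *args) transaction; safe to apply twice."""
    kind = txn[1]
    if kind == "createSession":
        state.sessions.add(txn[2])
    elif kind == "create":
        _, _, path, owner = txn
        state.ephemerals[path] = owner
    elif kind == "closeSession":
        sid = txn[2]
        state.sessions.discard(sid)
        # Closing a session removes every ephemeral node it owns.
        for path in [p for p, o in state.ephemerals.items() if o == sid]:
            del state.ephemerals[path]


def fuzzy_snapshot_and_restore():
    live, log = State(), []

    def commit(kind, *args):
        txn = (len(log) + 1, kind, *args)  # zxid is just a counter here
        log.append(txn)
        apply(live, txn)

    commit("createSession", 1)
    commit("create", "/a", 1)

    # Snapshot begins: the zxid is recorded first, then the sessions are copied.
    snap_zxid = log[-1][0]
    snap_sessions = set(live.sessions)

    # Transactions that arrive while the snapshot is still being written.
    commit("createSession", 2)
    commit("create", "/lock", 2)

    # The tree is serialised later, so it already contains /lock.
    snap_ephemerals = dict(live.ephemerals)

    # The session expires after the snapshot has been written.
    commit("closeSession", 2)

    # On its own, the snapshot holds a node whose session is missing.
    inconsistent = any(o not in snap_sessions for o in snap_ephemerals.values())

    # Restore: load the snapshot, then replay everything after snap_zxid.
    restored = State(set(snap_sessions), dict(snap_ephemerals))
    for txn in log:
        if txn[0] > snap_zxid:
            apply(restored, txn)
    return inconsistent, restored, live
```

In this model the restored state ends up without /lock, because the replay re-creates session 2 and then applies its close. It says nothing about what happens if part of the log is missing or a transaction cannot be re-applied, which would be worth checking against the actual code paths.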
