[ https://issues.apache.org/jira/browse/ZOOKEEPER-3591?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16957565#comment-16957565 ]
Alex Mirgorodskiy commented on ZOOKEEPER-3591:
----------------------------------------------

Forgot to mention that after the exception, the leader immediately resynced Instance 6 once more, but now via DIFF rather than Snapshot, presumably since the first attempt advanced peerLastZxid from 0 to 0x6000002df.

First resync in the attached instance1.log.gz:

> 2019-10-13T16:14:19.271+0000 [.LearnerHandler-/10.80.209.138:58804] Follower sid: 6 : info : 10.80.209.138:2888:3888:observer;0.0.0.0:2181
> 2019-10-13T16:14:19.278+0000 [.LearnerHandler-/10.80.209.138:58804] On disk txn sync enabled with snapshotSizeFactor 0.33
> 2019-10-13T16:14:19.279+0000 [.LearnerHandler-/10.80.209.138:58804] Synchronizing with Follower sid: 6 maxCommittedLog=0x6000002dd minCommittedLog=0x6000000e9 lastProcessedZxid=0x6000002dd peerLastZxid=0x0
> 2019-10-13T16:14:19.283+0000 [.LearnerHandler-/10.80.209.138:58804] Unable to find proposals from txnlog for zxid: 0
> 2019-10-13T16:14:19.283+0000 [.LearnerHandler-/10.80.209.138:58804] Sending snapshot last zxid of peer is 0x0, zxid of leader is 0x6000002de, send zxid of db as 0x6000002dd, 1 concurrent snapshots, snapshot was not exempt from throttle

Reconfig:

> 2019-10-13T16:14:19.323+0000 [.ProcessThread(sid:1 cport:-1):] Incremental reconfig

Second resync:

> 2019-10-13T16:14:19.484+0000 [.LearnerHandler-/10.80.209.138:58814] Follower sid: 6 : info : 10.80.209.138:2888:3888:participant;0.0.0.0:2181
> 2019-10-13T16:14:19.485+0000 [.LearnerHandler-/10.80.209.138:58814] On disk txn sync enabled with snapshotSizeFactor 0.33
> 2019-10-13T16:14:19.486+0000 [.LearnerHandler-/10.80.209.138:58814] Synchronizing with Follower sid: 6 maxCommittedLog=0x6000002e4 minCommittedLog=0x6000000f0 lastProcessedZxid=0x6000002e4 peerLastZxid=0x6000002df
> 2019-10-13T16:14:19.486+0000 [.LearnerHandler-/10.80.209.138:58814] Using committedLog for peer sid: 6
> 2019-10-13T16:14:19.487+0000 [.LearnerHandler-/10.80.209.138:58814] Sending DIFF zxid=0x6000002e4 for peer sid: 6
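For context on why the two attempts differ: the leader picks a sync method based on where peerLastZxid falls relative to its committed-log window [minCommittedLog, maxCommittedLog]. The sketch below is my own condensed paraphrase of the LearnerHandler decision on the 3.5 branch, not the actual source; the class, enum, method, and parameter names are made up for illustration, and details such as forceSnapSync and the snapshotSizeFactor cap on txnlog-based syncs are omitted.

{code:java}
// Condensed, illustrative paraphrase of the leader-side sync-method
// decision (3.5 branch). Names here are mine, not ZooKeeper API.
public class SyncChoiceSketch {
    enum Method { EMPTY_DIFF, TRUNC, DIFF_FROM_COMMITTED_LOG, DIFF_FROM_TXN_LOG, SNAP }

    static Method choose(long peerLastZxid, long lastProcessedZxid,
                         long minCommittedLog, long maxCommittedLog,
                         boolean txnLogSyncEnabled, boolean peerZxidInTxnLog) {
        if (peerLastZxid == lastProcessedZxid) {
            return Method.EMPTY_DIFF;              // learner is already current
        } else if (peerLastZxid > maxCommittedLog) {
            return Method.TRUNC;                   // learner is ahead; roll it back
        } else if (peerLastZxid >= minCommittedLog) {
            return Method.DIFF_FROM_COMMITTED_LOG; // "Using committedLog for peer"
        } else if (txnLogSyncEnabled && peerZxidInTxnLog) {
            return Method.DIFF_FROM_TXN_LOG;       // replay from the on-disk txn log
        }
        return Method.SNAP;                        // "Sending snapshot ..."
    }

    public static void main(String[] args) {
        // First resync: peerLastZxid=0x0 is below minCommittedLog and not in
        // the txn log ("Unable to find proposals from txnlog") -> SNAP.
        System.out.println(choose(0x0L, 0x6000002ddL, 0x6000000e9L, 0x6000002ddL, true, false));
        // Second resync: peerLastZxid=0x6000002df now falls inside the
        // committed-log window -> DIFF.
        System.out.println(choose(0x6000002dfL, 0x6000002e4L, 0x6000000f0L, 0x6000002e4L, true, true));
    }
}
{code}

Plugging in the zxids from the two excerpts above yields SNAP and then DIFF_FROM_COMMITTED_LOG, which is exactly the behavior in the logs.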
> Inconsistent resync with dynamic reconfig
> -----------------------------------------
>
>                 Key: ZOOKEEPER-3591
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-3591
>             Project: ZooKeeper
>          Issue Type: Bug
>          Components: server
>    Affects Versions: 3.5.5
>            Reporter: Alex Mirgorodskiy
>            Priority: Major
>         Attachments: instance1.log.gz, instance6.log.gz
>
>
> We've run into a problem where one of the zookeeper instances lost most of its data after its zk process was restarted. We suspect an interaction between dynamic reconfiguration and snapshot-based resync of that instance. Details and some amateurish analysis are below. We can also upload transaction logs, if need be.
> We have a 6-instance ensemble running version 3.5.5, with 3 quorum members and 3 observers. One of the observers (Instance 6) saw its db shrink from 3162 znodes down to 10 after that instance restarted:
> > 2019-10-13T16:44:19.060+0000 [.zk-monitor-0] Monitor command mntr:
> > zk_version 3.5.5-afd10a8846b22a34c5a818034bb22e99dd44587b, built on 09/16/2019 18:31 GMT
> > zk_znode_count 3162
> > --
> > 2019-10-13T16:48:32.713+0000 [.zk-monitor-0] Monitor command mntr:
> > zk_version 3.5.5-afd10a8846b22a34c5a818034bb22e99dd44587b, built on 09/16/2019 18:31 GMT
> > zk_znode_count 10
> Contrast it with Instance 1, which was the leader at the time and whose znode_count remained stable around 3000:
> > 2019-10-13T16:44:48.625+0000 [.zk-monitor-0] Monitor command mntr:
> > zk_version 3.5.5-afd10a8846b22a34c5a818034bb22e99dd44587b, built on 09/16/2019 18:31 GMT
> > zk_znode_count 3178
> > --
> > ...
> > --
> > 2019-10-13T16:48:48.731+0000 [.zk-monitor-0] Monitor command mntr:
> > zk_version 3.5.5-afd10a8846b22a34c5a818034bb22e99dd44587b, built on 09/16/2019 18:31 GMT
> > zk_znode_count 3223
> It appears that the problem had happened 30 minutes earlier, when Instance 6 got resynced from the leader via the Snap method yet skipped creating an on-disk snapshot. The end result was that the in-memory state was fine, but there was only the primordial snapshot.0 on disk, and the transaction logs only started after the missing snapshot:
> $ ls -l version-2
> > total 1766
> > -rw-r--r-- 1 daautomation daautomation       1 Oct 13 09:14 acceptedEpoch
> > -rw-r--r-- 1 daautomation daautomation       1 Oct 13 10:12 currentEpoch
> > -rw-r--r-- 1 daautomation daautomation 2097168 Oct 13 09:44 log.6000002e0
> > -rw-r--r-- 1 daautomation daautomation 1048592 Oct 13 10:09 log.600001f1b
> > -rw-r--r-- 1 daautomation daautomation 4194320 Oct 13 12:16 log.600003310
> > -rw-r--r-- 1 daautomation daautomation     770 Oct 13 09:14 snapshot.0
> So the zk reboot wiped out most of the state.
> Dynamic reconfig might be relevant here. Instance 6 started as an observer, got removed, and was immediately re-added as a participant. Instance 2 went the other way, from participant to observer:
> > 2019-10-13T16:14:19.323+0000 ZK reconfig: removing node 6
> > 2019-10-13T16:14:19.359+0000 ZK reconfig: adding server.6=10.80.209.138:2888:3888:participant;0.0.0.0:2181
> > 2019-10-13T16:14:19.399+0000 ZK reconfig: adding server.2=10.80.209.131:2888:3888:observer;0.0.0.0:2181
> Looking at the logs, Instance 6 started and received a resync snapshot from the leader right before the dynamic reconfig:
> > 2019-10-13T16:14:19.284+0000 [.QuorumPeer[myid=6](plain=/0.0.0.0:2181)(secure=disabled)] Getting a snapshot from leader 0x6000002dd
> > ...
> > 2019-10-13T16:14:19.401+0000 [.QuorumPeer[myid=6](plain=/0.0.0.0:2181)(secure=disabled)] Got zxid 0x6000002de expected 0x1
> Had it processed the NEWLEADER packet afterwards, it would've persisted the snapshot locally. But there's no NEWLEADER message in the Instance 6 log.
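To make the persistence point concrete: on a SNAP sync the learner deserializes the leader's snapshot into its in-memory database only, and the on-disk snapshot is written later, when the NEWLEADER packet is handled in Learner.syncWithLeader (this is my reading of the 3.5.5 code). Here is a toy model of that flow, not ZooKeeper source; the packet names match the protocol, everything else is made up:

{code:java}
import java.util.List;

// Toy model of the learner-side sync flow (my reading of 3.5.5's
// Learner.syncWithLeader): SNAP fills memory only, NEWLEADER persists.
public class LearnerSyncToy {
    enum Packet { SNAP, PROPOSAL, COMMIT, NEWLEADER, UPTODATE }

    static String sync(List<Packet> packets) {
        boolean inMemory = false;   // db deserialized from the leader
        boolean onDisk = false;     // snapshot written locally
        for (Packet p : packets) {
            if (p == Packet.SNAP) {
                inMemory = true;    // deserializeSnapshot: memory only
            } else if (p == Packet.NEWLEADER && inMemory) {
                onDisk = true;      // zk.takeSnapshot() runs here
            }
        }
        return "inMemory=" + inMemory + ", onDisk=" + onDisk;
    }

    public static void main(String[] args) {
        // Normal SNAP resync: NEWLEADER arrives, snapshot is persisted.
        System.out.println(sync(List.of(Packet.SNAP, Packet.NEWLEADER, Packet.UPTODATE)));
        // Instance 6: sync aborted before NEWLEADER was read, so the
        // in-memory db was fine but nothing beyond snapshot.0 hit disk.
        System.out.println(sync(List.of(Packet.SNAP)));
    }
}
{code}

The second call is Instance 6's situation, matching both the healthy mntr counts before the restart and the lone snapshot.0 in the directory listing.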
> Instead of NEWLEADER, there's a "changes proposed in reconfig" exception, likely a result of the instance getting dynamically removed and re-added as a participant:
> > 2019-10-13T16:14:19.467+0000 [.QuorumPeer[myid=6](plain=/0.0.0.0:2181)(secure=disabled)] Becoming a non-voting participant
> > 2019-10-13T16:14:19.467+0000 [.QuorumPeer[myid=6](plain=/0.0.0.0:2181)(secure=disabled)] Exception when observing the leader
> > java.lang.Exception: changes proposed in reconfig
> >     at org.apache.zookeeper.server.quorum.Learner.syncWithLeader(Learner.java:506)
> >     at org.apache.zookeeper.server.quorum.Observer.observeLeader(Observer.java:74)
> >     at org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:1258)
> Perhaps the NEWLEADER packet was still in the socket, but sitting behind INFORMANDACTIVATE, whose exception prevented us from processing NEWLEADER?
> Also, it may or may not be related, but this area got changed recently as part of https://issues.apache.org/jira/browse/ZOOKEEPER-3104.
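If that hypothesis is right, the ordering alone explains the data loss: the reconfig's INFORMANDACTIVATE was queued in the socket ahead of NEWLEADER, and the throw (which, as far as I can tell from the 3.5 sources, happens in syncWithLeader when processReconfig reports a major change, such as our role flipping from observer to participant) unwound the sync loop with NEWLEADER still unread. A toy model of that ordering, not ZooKeeper source; all names are made up except the packet types and the exception message, which are from the log above:

{code:java}
import java.util.ArrayDeque;
import java.util.Queue;

// Toy model of the hypothesized packet ordering on Instance 6: the
// "changes proposed in reconfig" throw aborts the sync loop while
// NEWLEADER is still sitting unread in the socket.
public class PacketOrderToy {
    enum Packet { SNAP, INFORMANDACTIVATE, NEWLEADER }

    public static void main(String[] args) {
        Queue<Packet> socket = new ArrayDeque<>();
        socket.add(Packet.SNAP);              // in-memory db loaded, not persisted
        socket.add(Packet.INFORMANDACTIVATE); // reconfig: observer -> participant
        socket.add(Packet.NEWLEADER);         // would have persisted the snapshot

        try {
            for (Packet p; (p = socket.poll()) != null; ) {
                if (p == Packet.INFORMANDACTIVATE) {
                    // processReconfig(...) sees a major change for this
                    // peer, so syncWithLeader throws and never reads on.
                    throw new Exception("changes proposed in reconfig");
                }
                if (p == Packet.NEWLEADER) {
                    System.out.println("takeSnapshot(): state persisted");
                }
            }
        } catch (Exception e) {
            System.out.println("sync aborted: " + e.getMessage()
                    + "; unread: " + socket);
        }
    }
}
{code}

This prints "sync aborted: changes proposed in reconfig; unread: [NEWLEADER]", i.e. the snapshot transferred in step one never reaches disk, which is exactly the on-disk state in the listing above.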