hong created ZOOKEEPER-4352:
-------------------------------
Summary: When minCommittedLog=0, only diff data is sent, and
snapshots cannot be sent, resulting in inconsistent data
Key: ZOOKEEPER-4352
URL: https://issues.apache.org/jira/browse/ZOOKEEPER-4352
Project: ZooKeeper
Issue Type: Bug
Components: quorum
Affects Versions: 3.7.0, 3.6.3
Reporter: hong
Attachments: zk-2.log, zk-3.log
leader is zookeeper-0, it add zk-1and zk-2. when adding zk-3, zk-2 become
leader, but it minCommittedLog=0. After adding zk-3, zk-3 just revice the DIFF
message from zk-2, so resulting in inconsistent data.
Is there any error?
{code:java}
//代码占位符
else if ((maxCommittedLog >= peerLastZxid) && (minCommittedLog <=
peerLastZxid)) { // Follower is within commitLog range
LOG.info("Using committedLog for peer sid: {}", getSid());
Iterator<Proposal> itr = db.getCommittedLog().iterator();
currentZxid = queueCommittedProposals(itr, peerLastZxid, null,
maxCommittedLog); needSnap = false; }
{code}
there are logs:
###ZK-2
021-08-26 02:02:07,698 [myid:3] - INFO [NIOWorkerThread-1:NIOServerCnxn@507] -
Processing ruok command from /127.0.0.1:56458
2021-08-26 02:02:08,046 [myid:3] - INFO
[ListenerHandler-zookeeper-2.zookeeper-headless.default.svc.cluster.local/10.233.64.34:3888:QuorumCnxManager$Listener$ListenerHandler@1070]
- Received connection request from /10.233.64.36:50788
2021-08-26 02:02:08,072 [myid:3] - INFO
[LearnerHandler-/10.233.64.36:60310:LearnerHandler@504] - Follower sid: 4 :
info :
zookeeper-3.zookeeper-headless.default.svc.cluster.local:2888:3888:observer;0.0.0.0:2181
2021-08-26 02:02:08,073 [myid:3] - INFO
[LearnerHandler-/10.233.64.36:60310:ZKDatabase@345] - On disk txn sync enabled
with snapshotSizeFactor 0.33
2021-08-26 02:02:08,074 [myid:3] - INFO
[LearnerHandler-/10.233.64.36:60310:LearnerHandler@807] - Synchronizing with
Learner sid: 4 maxCommittedLog=0x200000003 minCommittedLog=0x0
lastProcessedZxid=0x200000003 peerLastZxid=0x0
2021-08-26 02:02:08,074 [myid:3] - INFO
[LearnerHandler-/10.233.64.36:60310:LearnerHandler@871] - Using committedLog
for peer sid: 4
2021-08-26 02:02:08,074 [myid:3] - INFO
[LearnerHandler-/10.233.64.36:60310:LearnerHandler@979] - Sending DIFF
zxid=0x200000003 for peer sid: 4
2021-08-26 02:02:09,250 [myid:3] - INFO
[WorkerReceiver[myid=3]:FastLeaderElection$Messenger$WorkerReceiver@471] -
WorkerReceiver is down
2021-08-26 02:02:09,250 [myid:3] - INFO
[WorkerSender[myid=3]:FastLeaderElection$Messenger$WorkerSender@505] -
WorkerSender is down
2021-08-26 02:02:12,898 [myid:3] - INFO [NIOWorkerThread-2:NIOServerCnxn@507] -
Processing ruok command from /127.0.0.1:56516
###ZK-3
2021-08-26 02:02:08,064 [myid:4] - INFO
[QuorumPeer[myid=4](plain=0.0.0.0:2181)(secure=disabled):ZooKeeperServer@1512]
- The large request threshold is set to -1
2021-08-26 02:02:08,064 [myid:4] - INFO
[QuorumPeer[myid=4](plain=0.0.0.0:2181)(secure=disabled):ZooKeeperServer@339] -
Created server with tickTime 2000 minSessionTimeout 4000 maxSessionTimeout
40000 clientPortListenBacklog -1 datadir /data/version-2 snapdir /data/version-2
2021-08-26 02:02:08,064 [myid:4] - INFO
[QuorumPeer[myid=4](plain=0.0.0.0:2181)(secure=disabled):ObserverZooKeeperServer@55]
- syncEnabled =true
2021-08-26 02:02:08,066 [myid:4] - INFO
[QuorumPeer[myid=4](plain=0.0.0.0:2181)(secure=disabled):QuorumPeer@864] - Peer
state changed: observing - discovery
2021-08-26 02:02:08,067 [myid:4] - INFO
[QuorumPeer[myid=4](plain=0.0.0.0:2181)(secure=disabled):Observer@163] -
Observing new leader sid=3
addr=zookeeper-2.zookeeper-headless.default.svc.cluster.local/10.233.64.34:2888
2021-08-26 02:02:08,068 [myid:4] - INFO
[LeaderConnector-zookeeper-2.zookeeper-headless.default.svc.cluster.local/10.233.64.34:2888:Learner$LeaderConnector@370]
- Successfully connected to leader, using address:
zookeeper-2.zookeeper-headless.default.svc.cluster.local/10.233.64.34:2888
2021-08-26 02:02:08,072 [myid:4] - INFO
[QuorumPeer[myid=4](plain=0.0.0.0:2181)(secure=disabled):QuorumPeer@864] - Peer
state changed: observing - synchronization
2021-08-26 02:02:08,073 [myid:4] - INFO
[QuorumPeer[myid=4](plain=0.0.0.0:2181)(secure=disabled):Learner@551] - Getting
a diff from the leader 0x200000003
2021-08-26 02:02:08,073 [myid:4] - INFO
[QuorumPeer[myid=4](plain=0.0.0.0:2181)(secure=disabled):QuorumPeer@869] - Peer
state changed: observing - synchronization - diff
2021-08-26 02:02:08,077 [myid:4] - WARN
[QuorumPeer[myid=4](plain=0.0.0.0:2181)(secure=disabled):Learner@617] - Got
zxid 0x100000023 expected 0x1
2021-08-26 02:02:08,078 [myid:4] - ERROR
[QuorumPeer[myid=4](plain=0.0.0.0:2181)(secure=disabled):QuorumPeer@1836] -
setLastSeenQuorumVerifier called with stale config 4294967332. Current version:
8589934595
2021-08-26 02:02:08,080 [myid:4] - WARN
[QuorumPeer[myid=4](plain=0.0.0.0:2181)(secure=disabled):Learner@617] - Got
zxid 0x200000001 expected 0x10000002e
2021-08-26 02:02:08,082 [myid:4] - INFO
[QuorumPeer[myid=4](plain=0.0.0.0:2181)(secure=disabled):Learner@717] - Learner
received NEWLEADER message
2021-08-26 02:02:08,083 [myid:4] - INFO
[QuorumPeer[myid=4](plain=0.0.0.0:2181)(secure=disabled):QuorumPeer@869] - Peer
state changed: observing - synchronization
2021-08-26 02:02:08,086 [myid:4] - INFO
[QuorumPeer[myid=4](plain=0.0.0.0:2181)(secure=disabled):CommitProcessor@476] -
Configuring CommitProcessor with readBatchSize -1 commitBatchSize 1
2021-08-26 02:02:08,086 [myid:4] - INFO
[QuorumPeer[myid=4](plain=0.0.0.0:2181)(secure=disabled):CommitProcessor@438] -
Configuring CommitProcessor with 1 worker threads.
2021-08-26 02:02:08,088 [myid:4] - INFO
[QuorumPeer[myid=4](plain=0.0.0.0:2181)(secure=disabled):RequestThrottler@74] -
zookeeper.request_throttler.shutdownTimeout = 10000
2021-08-26 02:02:08,098 [myid:4] - INFO
[QuorumPeer[myid=4](plain=0.0.0.0:2181)(secure=disabled):Learner@701] - Learner
received UPTODATE message
2021-08-26 02:02:08,099 [myid:4] - INFO [SyncThread:4:FileTxnLog@284] -
Creating new log file: log.100000023
2021-08-26 02:02:08,099 [myid:4] - INFO
[CommitProcessor:4:LearnerSessionTracker@116] - Committing global session
0x100001170e3000d
2021-08-26 02:02:08,099 [myid:4] - INFO
[QuorumPeer[myid=4](plain=0.0.0.0:2181)(secure=disabled):QuorumPeer@864] - Peer
state changed: observing - broadcast
2021-08-26 02:02:08,100 [myid:4] - WARN [CommitProcessor:4:RateLogger@86] -
Message:Digests are not matching. Value is Zxid. Value:4294967331
2021-08-26 02:02:08,101 [myid:4] - ERROR [CommitProcessor:4:DataTree@1832] -
First digest mismatch on txn: 72057668946296845,0,4294967331,1629943304746,-10
, 4000
, expected digest is 2,10124071835
, actual digest is 1371985504,
2021-08-26 02:02:08,102 [myid:4] - INFO
[CommitProcessor:4:LearnerSessionTracker@116] - Committing global session
0x200000b302e0000
--
This message was sent by Atlassian Jira
(v8.3.4#803005)