[ https://issues.apache.org/jira/browse/KUDU-1483?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
martha updated KUDU-1483:
-------------------------
    Attachment: 375.gz

> in some cases, followers cannot promote to leader.
> --------------------------------------------------
>
>                 Key: KUDU-1483
>                 URL: https://issues.apache.org/jira/browse/KUDU-1483
>             Project: Kudu
>          Issue Type: Bug
>          Components: consensus
>            Reporter: zhangsong
>            Priority: Major
>         Attachments: 375.gz
>
>
> In my env, a tablet has only two followers on the master's web UI, and that situation lasts forever.
> Some logs about the tablet from the two followers:
> follower1:
> I0613 11:16:33.244365 26846 leader_election.cc:223] T 87588b06c65d4898a5b8c29d08b3528d P eded59517b14432ab9022cd50d160b8e [CANDIDATE]: Term 31717 election: Requesting vote from peer 8cf59ddd6d154ae99d3b23da840169e0
> W0613 11:16:33.247150 26016 leader_election.cc:281] T 87588b06c65d4898a5b8c29d08b3528d P eded59517b14432ab9022cd50d160b8e [CANDIDATE]: Term 31717 election: Tablet error from VoteRequest() call to peer 8cf59ddd6d154ae99d3b23da840169e0: Illegal state: Tablet not RUNNING: FAILED: Not found: Can't find block: 1363326557009763249
> I0613 11:16:33.247463 26016 leader_election.cc:248] T 87588b06c65d4898a5b8c29d08b3528d P eded59517b14432ab9022cd50d160b8e [CANDIDATE]: Term 31717 election: Election decided. Result: candidate lost.
> I0613 11:16:33.248205 17534 raft_consensus.cc:1942] T 87588b06c65d4898a5b8c29d08b3528d P eded59517b14432ab9022cd50d160b8e [term 31717 FOLLOWER]: Snoozing failure detection for election timeout plus an additional 15.536s
> I0613 11:16:33.248245 17534 raft_consensus.cc:1795] T 87588b06c65d4898a5b8c29d08b3528d P eded59517b14432ab9022cd50d160b8e [term 31717 FOLLOWER]: Leader election lost for term 31717. Reason: None given
> I0613 11:16:34.288436 26137 raft_consensus.cc:1298] T 87588b06c65d4898a5b8c29d08b3528d P eded59517b14432ab9022cd50d160b8e [term 31717 FOLLOWER]: Handling vote request from an unknown peer 95bc8f3637ed4a52b53a984052ba6114
> I0613 11:16:34.288633 26137 raft_consensus.cc:1558] T 87588b06c65d4898a5b8c29d08b3528d P eded59517b14432ab9022cd50d160b8e [term 31717 FOLLOWER]: Leader election vote request: Denying vote to candidate 95bc8f3637ed4a52b53a984052ba6114 for earlier term 31666. Current term is 31717.
> I0613 11:16:41.506261 26127 raft_consensus.cc:1298] T 87588b06c65d4898a5b8c29d08b3528d P eded59517b14432ab9022cd50d160b8e [term 31717 FOLLOWER]: Handling vote request from an unknown peer 95bc8f3637ed4a52b53a984052ba6114
> I0613 11:16:41.506325 26127 raft_consensus.cc:1558] T 87588b06c65d4898a5b8c29d08b3528d P eded59517b14432ab9022cd50d160b8e [term 31717 FOLLOWER]: Leader election vote request: Denying vote to candidate 95bc8f3637ed4a52b53a984052ba6114 for earlier term 31667. Current term is 31717.
> I0613 11:16:45.440551 26135 raft_consensus.cc:1298] T 87588b06c65d4898a5b8c29d08b3528d P eded59517b14432ab9022cd50d160b8e [term 31717 FOLLOWER]: Handling vote request from an unknown peer 95bc8f3637ed4a52b53a984052ba6114
> I0613 11:16:45.440625 26135 raft_consensus.cc:1558] T 87588b06c65d4898a5b8c29d08b3528d P eded59517b14432ab9022cd50d160b8e [term 31717 FOLLOWER]: Leader election vote request: Denying vote to candidate 95bc8f3637ed4a52b53a984052ba6114 for earlier term 31668. Current term is 31717.
> It seems that there are three followers/voters and one of them has the tablet in a "not running" state.
> On the other follower:
> W0613 11:16:45.437863 18782 leader_election.cc:281] T 87588b06c65d4898a5b8c29d08b3528d P 95bc8f3637ed4a52b53a984052ba6114 [CANDIDATE]: Term 31668 election: Tablet error from VoteRequest() call to peer 8cf59ddd6d154ae99d3b23da840169e0: Illegal state: Tablet not RUNNING: FAILED: Not found: Can't find block: 1363326557009763249
> W0613 11:16:45.438611 18782 leader_election.cc:333] T 87588b06c65d4898a5b8c29d08b3528d P 95bc8f3637ed4a52b53a984052ba6114 [CANDIDATE]: Term 31668 election: Vote denied by peer eded59517b14432ab9022cd50d160b8e with higher term. Message: Invalid argument: T 87588b06c65d4898a5b8c29d08b3528d P eded59517b14432ab9022cd50d160b8e [term 31717 FOLLOWER]: Leader election vote request: Denying vote to candidate 95bc8f3637ed4a52b53a984052ba6114 for earlier term 31668. Current term is 31717.
> I0613 11:16:45.439034 18782 leader_election.cc:336] T 87588b06c65d4898a5b8c29d08b3528d P 95bc8f3637ed4a52b53a984052ba6114 [CANDIDATE]: Term 31668 election: Cancelling election due to peer responding with higher term
> I0613 11:16:45.440032 21807 raft_consensus.cc:1942] T 87588b06c65d4898a5b8c29d08b3528d P 95bc8f3637ed4a52b53a984052ba6114 [term 31668 FOLLOWER]: Snoozing failure detection for election timeout plus an additional 15.493s
> These logs repeat again and again. It seems that the follower with the lower term starts a leader election and gets denied by the follower with the higher term, and the follower with the higher term doesn't know about the first follower for some reason.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
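
For readers following the logs, the repeated denial matches the generic Raft voting rule: a voter rejects a vote request whose term is older than its own. The sketch below is only an illustration of that rule, not Kudu's implementation. The names `handle_vote_request` and `election_won` are hypothetical, the log-matching and already-voted checks of a full Raft implementation are omitted, and the three-voter configuration is the reporter's guess rather than something the logs confirm.

```python
from dataclasses import dataclass


@dataclass
class VoteResponse:
    granted: bool
    reason: str


def handle_vote_request(current_term: int, candidate_term: int) -> VoteResponse:
    """Simplified Raft voter: deny any candidate whose term is older than ours.

    A full implementation would also compare logs and check whether a vote
    was already cast in this term; those checks are left out here.
    """
    if candidate_term < current_term:
        return VoteResponse(
            False,
            f"earlier term {candidate_term}. Current term is {current_term}.",
        )
    return VoteResponse(True, "candidate term is not older than ours")


def election_won(votes_granted: int, num_voters: int) -> bool:
    """A candidate needs a strict majority of the voters in its configuration."""
    return votes_granted >= num_voters // 2 + 1


if __name__ == "__main__":
    # Terms taken from the logs: candidate 95bc... at 31668, voter eded... at 31717.
    response = handle_vote_request(current_term=31717, candidate_term=31668)
    print(response)

    # Assumed three-voter configuration: the candidate only has its own vote,
    # because one peer denies it and the other reports the tablet as FAILED.
    print(election_won(votes_granted=1, num_voters=3))
```

Under these assumptions, the candidate at term 31668 can never collect a second vote while the third replica stays in the FAILED state, which is consistent with the election loop shown in the logs.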