[ 
https://issues.apache.org/jira/browse/KUDU-1483?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

martha updated KUDU-1483:
-------------------------
    Comment: was deleted

(was: [^375.gz])

> in some cases, followers cannot promote to leader.
> --------------------------------------------------
>
>                 Key: KUDU-1483
>                 URL: https://issues.apache.org/jira/browse/KUDU-1483
>             Project: Kudu
>          Issue Type: Bug
>          Components: consensus
>            Reporter: zhangsong
>            Priority: Major
>         Attachments: 375.gz
>
>
> In my environment, the master's web UI shows a tablet with only two 
> followers (and no leader), and it stays in that state indefinitely.
> Some log lines about the tablet from the two followers:
> follower1:
> I0613 11:16:33.244365 26846 leader_election.cc:223] T 
> 87588b06c65d4898a5b8c29d08b3528d P eded59517b14432ab9022cd50d160b8e 
> [CANDIDATE]: Term 31717 election: Requesting vote from peer 
> 8cf59ddd6d154ae99d3b23da840169e0
> W0613 11:16:33.247150 26016 leader_election.cc:281] T 
> 87588b06c65d4898a5b8c29d08b3528d P eded59517b14432ab9022cd50d160b8e 
> [CANDIDATE]: Term 31717 election: Tablet error from VoteRequest() call to 
> peer 8cf59ddd6d154ae99d3b23da840169e0: Illegal state: Tablet not RUNNING: 
> FAILED: Not found: Can't find block: 1363326557009763249
> I0613 11:16:33.247463 26016 leader_election.cc:248] T 
> 87588b06c65d4898a5b8c29d08b3528d P eded59517b14432ab9022cd50d160b8e 
> [CANDIDATE]: Term 31717 election: Election decided. Result: candidate lost.
> I0613 11:16:33.248205 17534 raft_consensus.cc:1942] T 
> 87588b06c65d4898a5b8c29d08b3528d P eded59517b14432ab9022cd50d160b8e [term 
> 31717 FOLLOWER]: Snoozing failure detection for election timeout plus an 
> additional 15.536s
> I0613 11:16:33.248245 17534 raft_consensus.cc:1795] T 
> 87588b06c65d4898a5b8c29d08b3528d P eded59517b14432ab9022cd50d160b8e [term 
> 31717 FOLLOWER]: Leader election lost for term 31717. Reason: None given
> I0613 11:16:34.288436 26137 raft_consensus.cc:1298] T 
> 87588b06c65d4898a5b8c29d08b3528d P eded59517b14432ab9022cd50d160b8e [term 
> 31717 FOLLOWER]: Handling vote request from an unknown peer 
> 95bc8f3637ed4a52b53a984052ba6114
> I0613 11:16:34.288633 26137 raft_consensus.cc:1558] T 
> 87588b06c65d4898a5b8c29d08b3528d P eded59517b14432ab9022cd50d160b8e [term 
> 31717 FOLLOWER]: Leader election vote request: Denying vote to candidate 
> 95bc8f3637ed4a52b53a984052ba6114 for earlier term 31666. Current term is 
> 31717.
> I0613 11:16:41.506261 26127 raft_consensus.cc:1298] T 
> 87588b06c65d4898a5b8c29d08b3528d P eded59517b14432ab9022cd50d160b8e [term 
> 31717 FOLLOWER]: Handling vote request from an unknown peer 
> 95bc8f3637ed4a52b53a984052ba6114
> I0613 11:16:41.506325 26127 raft_consensus.cc:1558] T 
> 87588b06c65d4898a5b8c29d08b3528d P eded59517b14432ab9022cd50d160b8e [term 
> 31717 FOLLOWER]: Leader election vote request: Denying vote to candidate 
> 95bc8f3637ed4a52b53a984052ba6114 for earlier term 31667. Current term is 
> 31717.
> I0613 11:16:45.440551 26135 raft_consensus.cc:1298] T 
> 87588b06c65d4898a5b8c29d08b3528d P eded59517b14432ab9022cd50d160b8e [term 
> 31717 FOLLOWER]: Handling vote request from an unknown peer 
> 95bc8f3637ed4a52b53a984052ba6114
> I0613 11:16:45.440625 26135 raft_consensus.cc:1558] T 
> 87588b06c65d4898a5b8c29d08b3528d P eded59517b14432ab9022cd50d160b8e [term 
> 31717 FOLLOWER]: Leader election vote request: Denying vote to candidate 
> 95bc8f3637ed4a52b53a984052ba6114 for earlier term 31668. Current term is 
> 31717.
> It seems that there are three followers/voters, and one of them has the 
> tablet in a "not running" state.
> On the other follower:
> W0613 11:16:45.437863 18782 leader_election.cc:281] T 
> 87588b06c65d4898a5b8c29d08b3528d P 95bc8f3637ed4a52b53a984052ba6114 
> [CANDIDATE]: Term 31668 election: Tablet error from VoteRequest() call to 
> peer 8cf59ddd6d154ae99d3b23da840169e0: Illegal state: Tablet not RUNNING: 
> FAILED: Not found: Can't find block: 1363326557009763249
> W0613 11:16:45.438611 18782 leader_election.cc:333] T 
> 87588b06c65d4898a5b8c29d08b3528d P 95bc8f3637ed4a52b53a984052ba6114 
> [CANDIDATE]: Term 31668 election: Vote denied by peer 
> eded59517b14432ab9022cd50d160b8e with higher term. Message: Invalid argument: 
> T 87588b06c65d4898a5b8c29d08b3528d P eded59517b14432ab9022cd50d160b8e [term 
> 31717 FOLLOWER]: Leader election vote request: Denying vote to candidate 
> 95bc8f3637ed4a52b53a984052ba6114 for earlier term 31668. Current term is 
> 31717.
> I0613 11:16:45.439034 18782 leader_election.cc:336] T 
> 87588b06c65d4898a5b8c29d08b3528d P 95bc8f3637ed4a52b53a984052ba6114 
> [CANDIDATE]: Term 31668 election: Cancelling election due to peer responding 
> with higher term
> I0613 11:16:45.440032 21807 raft_consensus.cc:1942] T 
> 87588b06c65d4898a5b8c29d08b3528d P 95bc8f3637ed4a52b53a984052ba6114 [term 
> 31668 FOLLOWER]: Snoozing failure detection for election timeout plus an 
> additional 15.493s
> These log messages repeat again and again. It seems that the follower with 
> the lower term starts a leader election and is denied by the follower with 
> the higher term, and the follower with the higher term doesn't know about 
> the first follower for some reason.
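
The stalemate described above can be illustrated with a minimal sketch of a Raft-style vote check and majority tally. This is not Kudu's implementation: the `Voter`, `VoteRequest`, and `run_election` names are hypothetical, the peer IDs are shortened to their first four characters, and the peer lists passed to `run_election` are assumptions based on the reporter's guess that the higher-term follower does not know about the other one. Real Raft also checks log recency and any vote already cast in the term, which is omitted here.

```python
from dataclasses import dataclass


@dataclass
class VoteRequest:
    candidate_id: str
    term: int


@dataclass
class Voter:
    peer_id: str
    current_term: int
    running: bool = True  # False models "Tablet not RUNNING: FAILED"

    def handle_vote_request(self, req):
        """Return (granted, reason) for a vote request."""
        if not self.running:
            return False, "Illegal state: Tablet not RUNNING"
        if req.term < self.current_term:
            # A voter never grants a vote to a candidate with a stale term.
            return False, (f"earlier term {req.term}. "
                           f"Current term is {self.current_term}.")
        # Simplification: log-recency and already-voted checks are skipped.
        self.current_term = req.term
        return True, "granted"


def run_election(candidate, peers):
    """Return True if the candidate collects a majority of its config."""
    n_voters = len(peers) + 1          # the peers plus the candidate itself
    majority = n_voters // 2 + 1
    votes = 1                          # the candidate votes for itself
    req = VoteRequest(candidate.peer_id, candidate.current_term)
    for peer in peers:
        granted, _reason = peer.handle_vote_request(req)
        if granted:
            votes += 1
    return votes >= majority


# Hypothetical reconstruction of the three replicas in the logs above.
high = Voter("eded", 31717)
low = Voter("95bc", 31668)
failed = Voter("8cf5", 0, running=False)

low_wins = run_election(low, [high, failed])   # denied by higher term
high_wins = run_election(high, [failed])       # only asks the failed peer
```

Under these assumptions neither candidate can reach a majority: the lower-term follower is rejected for its stale term, and the higher-term follower only contacts the replica whose tablet has failed.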



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
