Todd Lipcon created KUDU-2156:
---------------------------------

             Summary: Raft should reset backoff counter when a current leader contacts replica
                 Key: KUDU-2156
                 URL: https://issues.apache.org/jira/browse/KUDU-2156
             Project: Kudu
          Issue Type: Bug
          Components: consensus
    Affects Versions: 1.5.0
            Reporter: Todd Lipcon


RaftConsensus maintains a failed_elections_since_stable_leader_ counter that 
makes elections back off after repeated failures. However, if a replica is 
partitioned for a while and starts several pre-elections that fail, and is then 
reconnected to the cluster without the leader having changed, the counter stays 
high: it is reset only on an actual leader change. If the leader later does 
fail, this replica's election backoff is still inflated, so it would not detect 
the failure for a potentially long time.

Instead, we should reset the counter on any successful update from the leader.
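
To make the proposal concrete, here is a minimal sketch of the idea. It is not Kudu's actual RaftConsensus code: the class name, the method names, and the capped exponential backoff formula with its constants are all hypothetical. Only the counter name comes from the description above.

```cpp
#include <cstdint>

// Hypothetical constants; the real backoff policy may differ.
constexpr int64_t kBaseTimeoutMs = 1500;  // assumed base election timeout
constexpr int kMaxShift = 5;              // assumed cap on the backoff exponent

// Simplified, illustrative model of a replica's election backoff state.
class ReplicaBackoffSketch {
 public:
  // A (pre-)election started by this replica failed: back off further.
  void OnElectionFailed() { ++failed_elections_since_stable_leader_; }

  // Behavior described in the issue: the counter resets on a leader change.
  void OnLeaderChanged() { failed_elections_since_stable_leader_ = 0; }

  // Proposed addition: a successful update from the current leader shows the
  // leader is healthy and reachable, so the counter is reset here as well.
  void OnSuccessfulUpdateFromLeader() {
    failed_elections_since_stable_leader_ = 0;
  }

  // Assumed capped exponential backoff derived from the counter.
  int64_t ElectionTimeoutMs() const {
    int shift = failed_elections_since_stable_leader_ < kMaxShift
                    ? failed_elections_since_stable_leader_
                    : kMaxShift;
    return kBaseTimeoutMs << shift;
  }

  int failed_elections() const { return failed_elections_since_stable_leader_; }

 private:
  int failed_elections_since_stable_leader_ = 0;
};
```

In this sketch, a replica that fails three pre-elections while partitioned ends up with an inflated timeout. Calling OnSuccessfulUpdateFromLeader() once it is reconnected brings the timeout back to the base value, so a later leader failure is detected promptly.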



