[ https://issues.apache.org/jira/browse/KUDU-1613?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Todd Lipcon updated KUDU-1613:
------------------------------
    Priority: Major  (was: Critical)

Reducing priority of this, since it's a particular specific case of reformatting a server on a 3-node cluster (on a bigger cluster it would find a new replica location)

> Under certain circumstances, tablet leader does not evict failed replica
> ------------------------------------------------------------------------
>
>                 Key: KUDU-1613
>                 URL: https://issues.apache.org/jira/browse/KUDU-1613
>             Project: Kudu
>          Issue Type: Bug
>          Components: consensus, tablet
>    Affects Versions: 1.0.0
>            Reporter: Adar Dembo
>            Assignee: Dinesh Bhat
>
> Dan found this while working on Kudu training material.
> Suppose you have a three node cluster and a table with a singleton tablet
> (replicated three times). Now suppose you stopped one tserver, deleted all of
> its on-disk data, then restarted it.
> You would expect the following:
> # The tablet's leader replica can no longer reach the replica on the
> reformatted tserver.
> # The leader will evict that replica.
> # The master will notice the tablet's under-replication and ask the leader to
> add a new replica, probably on the reformatted node.
> Instead, there's no eviction at all. The leader replica keeps spewing
> messages like this in its log:
> {noformat}
> W0913 14:13:18.411238 22597 consensus_peers.cc:332] T
> 89dfba0c0a714259acf69d9f611e1e92 P 1540ac6e6cb44c2c9f9c6c6c98fd61f7 -> Peer
> cc2ef23f1c2c42b7a6a02d7183d92884 (dan-test-g-2.gce.cloudera.com:7050):
> Couldn't send request to peer cc2ef23f1c2c42b7a6a02d7183d92884 for tablet
> 89dfba0c0a714259acf69d9f611e1e92. Error code: WRONG_SERVER_UUID (16). Status:
> Invalid argument: UpdateConsensus: Wrong destination UUID requested. Local
> UUID: ef3ea81d59fc4a91b754cfe63b21e6ee. Requested UUID:
> cc2ef23f1c2c42b7a6a02d7183d92884. Retrying in the next heartbeat period.
> Already tried 5821 times.
> {noformat}
> Having looked at the code responsible for starting replica eviction
> (PeerMessageQueue::RequestForPeer) and the code spewing that error
> (Peer::ProcessResponseError), I think I see what's going on. The eviction
> code in RequestForPeer() checks the peer's "last successful communication
> time" to decide whether to evict or not. Intuitively you'd expect that time
> to be updated only when the peer responds successfully, but there are a
> couple of cases in Peer::ProcessResponseError where we update the last
> communication time anyway. Notably:
> # If the RPC controller yielded a RemoteError, or
> # If the RPC controller had no error but the response itself contained an
> error, and the error's code was not TABLET_NOT_FOUND, or
> # If the RPC controller and the response had no error, but the response's
> status had an error, and that error's code was CANNOT_PREPARE.
> I think we're hitting case #2, because there should be no RPC controller
> error (the reformatted tserver did respond to the leader replica), but the
> response does contain a WRONG_SERVER_UUID error.

--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
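
To make the three cases in the description easier to follow, here is a minimal, self-contained sketch of the decision as described there. It is not Kudu's code: the enums, PeerReply, RefreshesLastCommunicationTime, ShouldEvict and EvictedAfter are hypothetical names invented for illustration, and the timeout-based eviction check is an assumption about how the "last successful communication time" is used. The description says "notably", so there may be other cases this sketch does not cover; in particular, it assumes that controller errors other than RemoteError do not refresh the time.

```cpp
#include <chrono>

// Hypothetical, simplified stand-ins for illustration; not Kudu's real types.
enum class RpcOutcome { kOk, kRemoteError, kNetworkError };
enum class ResponseError { kNone, kTabletNotFound, kWrongServerUuid };
enum class StatusError { kNone, kCannotPrepare, kOther };

struct PeerReply {
  RpcOutcome rpc = RpcOutcome::kOk;
  ResponseError response_error = ResponseError::kNone;
  StatusError status_error = StatusError::kNone;
};

// Returns true if this reply would refresh the peer's "last successful
// communication time" under the three cases listed in the description.
bool RefreshesLastCommunicationTime(const PeerReply& reply) {
  if (reply.rpc != RpcOutcome::kOk) {
    // Case 1: a RemoteError still refreshes the time.
    // Assumption: any other controller error does not.
    return reply.rpc == RpcOutcome::kRemoteError;
  }
  if (reply.response_error != ResponseError::kNone) {
    // Case 2: any response error except TABLET_NOT_FOUND refreshes the time,
    // which includes WRONG_SERVER_UUID.
    return reply.response_error != ResponseError::kTabletNotFound;
  }
  if (reply.status_error != StatusError::kNone) {
    // Case 3: only CANNOT_PREPARE refreshes the time.
    return reply.status_error == StatusError::kCannotPrepare;
  }
  return true;  // A fully successful response.
}

using Seconds = std::chrono::seconds;

// Assumption: the leader evicts once the time since the last "successful"
// communication exceeds some timeout.
bool ShouldEvict(Seconds since_last_communication, Seconds eviction_timeout) {
  return since_last_communication > eviction_timeout;
}

// Simulates `heartbeats` identical replies, one per `period`, and reports
// whether the leader would evict the peer at any point.
bool EvictedAfter(const PeerReply& reply, int heartbeats, Seconds period,
                  Seconds eviction_timeout) {
  Seconds since_last{0};
  for (int i = 0; i < heartbeats; ++i) {
    since_last += period;
    if (RefreshesLastCommunicationTime(reply)) {
      since_last = Seconds{0};
    }
    if (ShouldEvict(since_last, eviction_timeout)) {
      return true;
    }
  }
  return false;
}
```

With a reply carrying WRONG_SERVER_UUID, the timer is reset on every heartbeat, so EvictedAfter never returns true no matter how many retries happen, which matches the "Already tried 5821 times" log with no eviction. With TABLET_NOT_FOUND the timer keeps growing and eviction eventually triggers. The one-second period and 300-second timeout used when trying this out are arbitrary illustrative values.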