Mike Percy has posted comments on this change.

Change subject: [tools] Manual recovery tools (part 1)
......................................................................

Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/4834/1/src/kudu/tools/tool_action_local_replica.cc
File src/kudu/tools/tool_action_local_replica.cc:

PS1, Line 276: RETURN_NOT_OK(TSTabletManager::DeleteTabletData(
             :     meta, TabletDataState::TABLET_DATA_DELETED, boost::none));
> Hmmm.... current notion is that we use 'local_replica delete' when we could
> I am still trying to think of a situation where we want to tombstone a bad
> tablet.

Tombstoning should be the default action. Tombstoning actually removes all actual data and WALs so it's hard to imagine a lot of scenarios where a startup crash will result from such a stripped-down tablet: it would have to be a problem with the superblock or the consensus metadata.

If you don't tombstone, but instead manually delete, then it's possible to double-vote, which can cause split-brain scenarios in Raft. So a non-tombstone delete is dangerous.

> I think one important piece as Todd pointed out in recovery doc is to be able
> to exactly identify which tablet is causing the crash.

Agreed.

> Having a tombstone option for a remote_replica may be useful.

Isn't that already implemented?

--
To view, visit http://gerrit.cloudera.org:8080/4834
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I113a25e9b6c14f7c3814140917b61e35030b58d0
Gerrit-PatchSet: 1
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Dinesh Bhat <din...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <a...@cloudera.com>
Gerrit-Reviewer: David Ribeiro Alves <dral...@apache.org>
Gerrit-Reviewer: Dinesh Bhat <din...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mpe...@apache.org>
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <t...@apache.org>
Gerrit-HasComments: Yes
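
Illustrative note on the double-vote concern raised in the comment above. The following is a toy sketch, not Kudu code: `VoteRecord`, `ToyReplica`, `Tombstone()` and `ManualDelete()` are hypothetical names invented for this example. It assumes that a tombstone drops tablet data but keeps the durable Raft vote record, while a manual delete drops that record too, which is the distinction the comment relies on.

```cpp
#include <cstdint>
#include <optional>
#include <string>

// Hypothetical stand-in for the durable Raft vote record.
// This is NOT Kudu's consensus metadata class.
struct VoteRecord {
  int64_t term = 0;
  std::optional<std::string> voted_for;
};

// Toy replica: models only "do we still remember who we voted for?".
struct ToyReplica {
  std::optional<VoteRecord> record;  // empty means nothing is left on disk
  bool has_data = true;

  // Raft rule: grant at most one vote per term.
  bool RequestVote(int64_t term, const std::string& candidate) {
    if (!record) record = VoteRecord{};  // replica re-created from scratch
    if (term < record->term) return false;
    if (term > record->term) {
      record->term = term;
      record->voted_for.reset();
    }
    if (record->voted_for && *record->voted_for != candidate) return false;
    record->voted_for = candidate;
    return true;
  }

  // Assumption: tombstoning removes data but keeps the vote record.
  void Tombstone() { has_data = false; }

  // Assumption: a manual delete removes everything, vote record included.
  void ManualDelete() {
    has_data = false;
    record.reset();
  }
};
```

In this model, a replica that voted for candidate "A" in term 5 and is then tombstoned still refuses a term-5 vote for "B". After `ManualDelete()` the same request is granted, so two candidates can collect this replica's vote in one term, which is how a split-brain could arise.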