Andrew Wong has posted comments on this change.

Change subject: KUDU-1407: reassign failed tablets
......................................................................


Patch Set 17:

(8 comments)

http://gerrit.cloudera.org:8080/#/c/7440/17//COMMIT_MSG
Commit Message:

PS17, Line 30: is added
> this is repeated
Done

PS17, Line 31: failed tablets while running
> tablets that fail while running (due to what?)
Done

http://gerrit.cloudera.org:8080/#/c/7440/17/src/kudu/consensus/consensus_queue.cc
File src/kudu/consensus/consensus_queue.cc:

Line 629: NotifyObserversOfFailedFollower(peer_uuid, current_term, reason);
> nit: No need to hold the lock while calling this method.
Done

http://gerrit.cloudera.org:8080/#/c/7440/17/src/kudu/master/catalog_manager.cc
File src/kudu/master/catalog_manager.cc:

Line 170: DEFINE_bool(master_tombstone_failed_tablet_replicas, true,
> Should be removed per below. See master_tombstone_evicted_tablet_replica
Done

PS17, Line 2473: if (FLAGS_master_tombstone_failed_tablet_replicas) {
               :   SendDeleteReplicaRequest(report.tablet_id(), TABLET_DATA_TOMBSTONED,
               :                            boost::none,
               :                            tablet->table(), ts_desc->permanent_uuid(),
               :                            Substitute("Tablet failed: $0", s.ToString()));
               : }
> Is this required? The leader will now evict a failed follower because of th
I think you're right; when the leader sees the failed tablet, it should evict the replica, perform a config change, and then report to the master.

http://gerrit.cloudera.org:8080/#/c/7440/17/src/kudu/tserver/ts_tablet_manager.cc
File src/kudu/tserver/ts_tablet_manager.cc:

PS17, Line 655: metadata
> Couldn't this simply happen if one of the data disks failed?
Failures when writing to the data directory are downgraded to warnings via WARN_NOT_OK() (see TabletMetadata::DeleteOrphanedBlocks), since the blocks can always be removed later (e.g. when we next start up).

PS17, Line 658: is unclear
> Shouldn't the contract of DeleteTabletData() be a crash-consistent one? In
Ah, I see. I'll update the comment.

Line 752: auto fail_tablet = MakeScopedCleanup([&]() {
> I like this approach. It is quite clean indeed!
Credit to Adar for the suggestion.

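For readers unfamiliar with the scoped-cleanup pattern discussed at Line 752, here is a minimal, self-contained sketch of the idea. It is not the code from this patch and not Kudu's actual MakeScopedCleanup: the ScopedCleanupSketch class, MakeScopedCleanupSketch(), FakeTablet, TabletState, and OpenTabletSketch() are all hypothetical names made up for illustration. The sketch assumes the guard marks the tablet as failed on every early return and is cancelled only once the open succeeds.

```cpp
#include <utility>

// Hypothetical scope guard: runs the stored callable on destruction
// unless cancel() was called first. Illustration only.
template <typename F>
class ScopedCleanupSketch {
 public:
  explicit ScopedCleanupSketch(F f) : f_(std::move(f)) {}
  ~ScopedCleanupSketch() {
    if (!cancelled_) f_();
  }
  ScopedCleanupSketch(const ScopedCleanupSketch&) = delete;
  ScopedCleanupSketch& operator=(const ScopedCleanupSketch&) = delete;
  // Moving transfers responsibility; the moved-from guard must not fire.
  ScopedCleanupSketch(ScopedCleanupSketch&& other)
      : f_(std::move(other.f_)), cancelled_(other.cancelled_) {
    other.cancelled_ = true;
  }
  void cancel() { cancelled_ = true; }

 private:
  F f_;
  bool cancelled_ = false;
};

template <typename F>
ScopedCleanupSketch<F> MakeScopedCleanupSketch(F f) {
  return ScopedCleanupSketch<F>(std::move(f));
}

// Hypothetical tablet state, standing in for the real tablet replica.
enum class TabletState { kInitializing, kRunning, kFailed };

struct FakeTablet {
  TabletState state = TabletState::kInitializing;
};

// Any early return leaves the tablet in kFailed; only the success
// path cancels the guard.
bool OpenTabletSketch(FakeTablet* tablet, bool bootstrap_ok) {
  auto fail_tablet = MakeScopedCleanupSketch([&]() {
    tablet->state = TabletState::kFailed;
  });
  if (!bootstrap_ok) {
    return false;  // Guard fires here and marks the tablet failed.
  }
  tablet->state = TabletState::kRunning;
  fail_tablet.cancel();  // Success: do not mark the tablet failed.
  return true;
}
```

The appeal of this shape is that new early-return paths added later pick up the failure handling automatically, without having to remember to call it at each exit.
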
--
To view, visit http://gerrit.cloudera.org:8080/7440
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I5f61585b02fbe270d215bf7f49c0d390ceee3345
Gerrit-PatchSet: 17
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <a...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: David Ribeiro Alves <davidral...@gmail.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mpe...@apache.org>
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <t...@apache.org>
Gerrit-HasComments: Yes