Andrew Wong has posted comments on this change. Change subject: disk failure: add persistent disk states ......................................................................
Patch Set 15: (30 comments) http://gerrit.cloudera.org:8080/#/c/7270/15//COMMIT_MSG Commit Message: Line 11: failed disk may be partially written and thus should not be used. > I dont quite follow this logic. If we crashed in the middle of writing a bl Right, so the Status.IsDiskFailure() doesn't account for "out of space" errors. In those cases, we'll still crash. There was a bit of discussion about this a couple months ago and it seems like the "out of space" behavior _should_ be different. For now, we just crash, but I don't think it'd be terribly unreasonable to drop some tablets on the out-of-space disk. In any case, yeah, I've updated this. http://gerrit.cloudera.org:8080/#/c/7270/15/src/kudu/fs/block_manager_util-test.cc File src/kudu/fs/block_manager_util-test.cc: Line 151: TEST_F(KuduTest, CheckIntegrity) { > There's a lot to process here. In total, do you get full coverage of every All paths out of CheckIntegrity(), yes. "Update these instance files" is covered in DataDirManagerTest::TestOpenWithFailedDirs. http://gerrit.cloudera.org:8080/#/c/7270/6/src/kudu/fs/block_manager_util.cc File src/kudu/fs/block_manager_util.cc: Line 88: > I don't see this change in PS15. Done. http://gerrit.cloudera.org:8080/#/c/7270/15/src/kudu/fs/block_manager_util.cc File src/kudu/fs/block_manager_util.cc: PS15, Line 54: CloneAndPrepend(msg) > CloneAndPrepend() is work; can we defer it to if !s.ok() and we want to ret Done PS15, Line 69: Status::OK() > This is the default value anyway; you can omit health_status_ from the list Done PS15, Line 121: // If the write fails, don't update the timestamp. : uint64_t original_timestamp = pb->path_set().timestamp_us(); : auto reset_timestamp = MakeScopedCleanup([&] { : pb->mutable_path_set()->set_timestamp_us(original_timestamp); : }); > This doesn't seem that important, but OK. Sure, removed. PS15, Line 137: // Some instances on failed disks (e.g. those that failed to canonicalize : // their path) are prefixed with illegal whitespaces to signify failure. > As we discussed offline, rather than modifying the path to indicate a failu Done PS15, Line 197: // Identify the instance that is the most up-to-date and check that there : // are no duplicate UUIDs among the healthy input instances. > Should add that the old invariant was that all instances have to have ident Since that and the TODO are more function-wide comments, I pushed it to the top. PS15, Line 205: max_timestamp > Nit: add the units (max_timestamp_us?). Done Line 208: const PathSetPB path_set = instances[i]->metadata()->path_set(); > Store a cref; no need to make a copy. Done Line 218: updated_instances->insert(instances[i]); > Can we modify a temporary set first, so that if we return an error we avoid Done Line 231: int path_set_uuids_size; > Isn't this always the same as main_uuids.size() after L232-244? Done Line 232: if (main_path_set->all_paths_size() == 0) { > It's rather easy to forget that this means "legacy". Perhaps store it in an Done PS15, Line 234: main_uuids.insert(main_path_set->deprecated_all_uuids().begin(), : main_path_set->deprecated_all_uuids().end()); : path_set_uuids_size = main_path_set->deprecated_all_uuids_size(); > Nit: flip the order so it's the same as L239-243. Done Line 240: DCHECK_GT(path_set_uuids_size, 0); > I think this is trivially correct (by virtue of being in this else statemen Done Line 242: main_uuids.insert(path.uuid()); > Should this InsertOrDie? Or can we expect duplicates? No, this entire if statement is meant to deduplicate the UUIDs. I've added a comment since it seems that was the source of confusion of one of the above comments. PS15, Line 265: depcreated > deprecated Done PS15, Line 267: main_path_updated > main_path_set_updated? Done PS15, Line 302: main_path_set->mutable_all_paths(i) > How about using pb_at_i here (though make it a pointer, not a cref). Done PS15, Line 325: path > path_set Done http://gerrit.cloudera.org:8080/#/c/7270/15/src/kudu/fs/data_dirs-test.cc File src/kudu/fs/data_dirs-test.cc: PS15, Line 61: // Use log block mode. This should not have much effect on the directory : // manager itself, but log blocks are expected to be more common in production. : static const char* kBlockType = "log"; > This is persisted (and compared) in PathInstanceMetadataFile, so it doesn't Done PS15, Line 94: vector<string> GetDirNames(int num_dirs) { : vector<string> ret; : for (int i = 0; i < num_dirs; i++) { : string dir_name = Substitute("$0-$1", kDirNamePrefix, i); : ret.push_back(GetTestPath(dir_name)); : CHECK_OK(env_->CreateDir(ret[i])); : } : return ret; : } > Don't need this anymore. Done Line 367: class DataDirManagerTest : public DataDirsTest { > Is this class hierarchy absolutely necessary? I'd rather have one fixture t Good point, not particularly. I do think two different SetUp()s are needed, but that warrants two classes, not two subclasses and a superclass. Line 399: FLAGS_env_inject_eio = 1.0; > Does this need to be reset at the end? Or is it OK because of eio_globs? Ah, yeah it should be reset. http://gerrit.cloudera.org:8080/#/c/7270/15/src/kudu/fs/data_dirs.h File src/kudu/fs/data_dirs.h: Line 159: void SetFailed(); > Doc. Removed. Line 394: // Synchronizes the in-memory path sets to match the currently failed dirs. > What's the significance of the return value? Done http://gerrit.cloudera.org:8080/#/c/7270/15/src/kudu/fs/fs_manager.cc File src/kudu/fs/fs_manager.cc: Line 356: LOG(INFO) << root; > Still want this? Done http://gerrit.cloudera.org:8080/#/c/7270/15/src/kudu/fs/log_block_manager-test.cc File src/kudu/fs/log_block_manager-test.cc: Line 1287: ASSERT_OK(env_util::CreateDirIfMissing(env_, dir)); > CreateDir() should suffice. Done http://gerrit.cloudera.org:8080/#/c/7270/15/src/kudu/fs/log_block_manager.cc File src/kudu/fs/log_block_manager.cc: PS15, Line 1537: // Blocks should not be placed in failed directories. : limit = 0; > I don't think this has the effect you want. If we use a limit of 0, we'll s Ah, fair point. Actually in this case, I don't think the limit matters--containers shouldn't be put on a failed disk. PS15, Line 1619: FindDataDirByUuidIndex(uuid_indices[i]) > Isn't this the same iteration order as dd_manager_->data_dirs()? That is, c Done -- To view, visit http://gerrit.cloudera.org:8080/7270 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ifddf0817fe1a82044077f5544c400c88de20769f Gerrit-PatchSet: 15 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Andrew Wong <aw...@cloudera.com> Gerrit-Reviewer: Adar Dembo <a...@cloudera.com> Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com> Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Tidy Bot Gerrit-Reviewer: Todd Lipcon <t...@apache.org> Gerrit-HasComments: Yes