Adar Dembo has posted comments on this change. Change subject: KUDU-2135 (part 1): add persistent disk states ......................................................................
Patch Set 7: (14 comments) I think Todd should review this. http://gerrit.cloudera.org:8080/#/c/8048/7//COMMIT_MSG Commit Message: PS7, Line 11: that "for when" instead? Not sure, that doesn't sound great either. http://gerrit.cloudera.org:8080/#/c/8048/7/src/kudu/fs/data_dirs-test.cc File src/kudu/fs/data_dirs-test.cc: Line 448: CHECK_OK(pb_util::ReadPBContainerFromPath(env_, instance_filename, pb.get())); Can't use ASSERT_OK here? Line 474: TEST_F(DataDirManagerTest, TestUpdateInstanceFiles) { Suspiciously similar test name as TestUpgradeInstanceFiles. Perhaps rename one or the other (or both)? http://gerrit.cloudera.org:8080/#/c/8048/7/src/kudu/fs/data_dirs.cc File src/kudu/fs/data_dirs.cc: Line 437: // Load the directory for all data dirs specified by the user. This comment seems misplaced, and perhaps even confused. What does "Load the directory..." mean? We're loading an instance file and creating a DataDir. Line 441: // Open and lock the data dir's metadata instance file. Nit: trim so this complements the comment on L452 better (i.e. don't need to mention locking here). PS7, Line 533: LOG(ERROR) << Substitute("Could not write $0 new instance files; filesystem opened in " : "read-only mode", updated_instances.size()); Why is this an error? If the very first interaction I have with this new code is via the kudu CLI, I'll probably be in read-only mode, and that doesn't seem like an error to me. PS7, Line 546: // From this point onwards, the in-memory maps and on-disk instances must be : // consistent with the main path set. Why is this important to note? Is it significant for initializing the "fullness" status? Line 599: int max_data_dirs = block_manager_type_ == "file" ? (1 << 16) - 1 : kuint32max; Maybe declare this at the top of the function as "const kMaxDataDirs" to emphasize that it's unchanging state? Alternatively, fold it into the condition on L617. Either way, there's a little less state to worry about while reading through this function. PS7, Line 675: path_set_.all_paths(uuid_idx) Nit: pull this out into a cref variable local to the for loop. PS7, Line 960: path_set_.mutable_all_paths(uuid_idx) Nit: pull out into a local cref. PS7, Line 966: LOG(WARNING) << "Directory health states cannot be written; filesystem is " : "opened read-only mode"; Does it make sense for this to be a WARNING and not INFO, given that the kudu CLI tool is generally read-only? Line 991: std::lock_guard<Mutex> lock(write_instance_mutex_); Should add a comment explaining why serialization of WritePathHealthStates is important. http://gerrit.cloudera.org:8080/#/c/8048/7/src/kudu/fs/data_dirs.h File src/kudu/fs/data_dirs.h: Line 439: // Returns the instances in the order of the main path set's 'all_paths'. Why is this important? Seems like we could cut down on some complexity if we didn't provide this (and if the IndexDirs caller didn't need to use it). If this is to satisfy the input requirements of UpdateInstanceFiles; perhaps we could perform the reordering there? Then there'll be less coupling between these functions. PS7, Line 440: IndexDirs Maybe InitDataDirMaps or InitDataDirIndexes instead? That emphasizes that this is an initialization function, which elsewhere in the Kudu codebase implies it only runs at startup time. -- To view, visit http://gerrit.cloudera.org:8080/8048 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ia7ea6418f6190d7d0c6d5a1aaca4ca592c49ce41 Gerrit-PatchSet: 7 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Andrew Wong <aw...@cloudera.com> Gerrit-Reviewer: Adar Dembo <a...@cloudera.com> Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com> Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Tidy Bot Gerrit-HasComments: Yes