Adar Dembo has posted comments on this change.

Change subject: KUDU-1634. TS and MS should delete tmp metadata files on startup
......................................................................


Patch Set 1:

(14 comments)

http://gerrit.cloudera.org:8080/#/c/5007/1//COMMIT_MSG
Commit Message:

PS1, Line 7: MS
I presume this means 'master'? Could you use 'master' instead?


PS1, Line 12: black
block


http://gerrit.cloudera.org:8080/#/c/5007/1/src/kudu/fs/fs_manager-test.cc
File src/kudu/fs/fs_manager-test.cc:

Line 199:   std::vector<string> sub_objects;
Nit: add using and drop the prefix.


PS1, Line 207: ".tmp"
Let's expose this from within FsManager so that we've got it defined in just 
one place.


Line 224:   env_util::OpenFileForWrite(fs_manager()->env(), tmp_path, 
&tmp_writer);
Should ASSERT_OK() on anything here that returns a Status.


Line 248:   ASSERT_EQ(n_tmp_files, 4);
Nit: with ASSERT_EQ(), we typically place the expected value before the actual 
value. So this would be ASSERT_EQ(4, n_tmp_files).


PS1, Line 253:   n_tmp_files = 0;
             :   for (const string& root : lookup_dirs) {
             :     vector<string> children;
             :     ASSERT_OK(fs_manager()->env()->GetChildren(root, &children));
             :     n_tmp_files += CountTmpFiles(fs_manager()->env(), root, 
children);
             :   }
             :   ASSERT_EQ(n_tmp_files, 0);
Looks like all of this could be encapsulated in a second method that calls into 
CountTmpFiles().


http://gerrit.cloudera.org:8080/#/c/5007/1/src/kudu/fs/fs_manager.cc
File src/kudu/fs/fs_manager.cc:

Line 230:   // Remove leftover tmp files
This behavior should be conditioned on !read_only_.


Line 231:   std::vector<string> children;
Nit: replace std::vector with vector and add a "using std::vector;" statement 
towards the top of the file.


Line 460: void FsManager::CleanTmpFiles(const string& path, const 
std::vector<string>& children) {
Nit: remove std:: prefix here and below too.


Line 463: 
Please add a DCHECK(!read_only_) here.


Line 468:     Status s = env_->GetChildren(sub_path, &sub_objects);
I see you're relying on GetChildren() to also tell you whether you're trying to 
recurse on a file or a directory (it'll fail if the former). Is that strictly 
better than explicitly checking whether a child entry is a 
directory/symlink_to_directory?

The advantage of an explicit check is that you can then treat GetChildren() 
failures as real failures, and LOG(WARNING) them.


Line 470:       CleanTmpFiles(sub_path, sub_objects);
I'd prefer if you didn't use functional recursion, to avoid blowing out the 
thread's stack in the event of a very large tree.

Also, can you make sure this doesn't loop forever in the event of a recursive 
symlink? You may need to keep track of paths you've cleaned to do that.


http://gerrit.cloudera.org:8080/#/c/5007/1/src/kudu/fs/fs_manager.h
File src/kudu/fs/fs_manager.h:

Line 241:   void CleanTmpFiles(const std::string& path, const 
std::vector<std::string>& children);
Please add a comment explaining what this does.


-- 
To view, visit http://gerrit.cloudera.org:8080/5007
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I299e7b22c3f920430ff5d581cf7a507a83eb0101
Gerrit-PatchSet: 1
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Maxim Smyatkin <smyatkinma...@gmail.com>
Gerrit-Reviewer: Adar Dembo <a...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-HasComments: Yes

Reply via email to