Todd Lipcon has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8352 )

Change subject: tool: new actions for adding and removing data directories
......................................................................


Patch Set 2:

(3 comments)

just skimmed this and left some high-level comments

http://gerrit.cloudera.org:8080/#/c/8352/2/src/kudu/tools/kudu-tool-test.cc
File src/kudu/tools/kudu-tool-test.cc:

http://gerrit.cloudera.org:8080/#/c/8352/2/src/kudu/tools/kudu-tool-test.cc@2330
PS2, Line 2330:   ASSERT_STR_CONTAINS(s.ToString(), "Can't find block");
shouldn't the tablet fail because it sees that one of its disks is now missing? 
relying on 'can't find block' is a little scary, because if the block is not 
found, we might accidentally end up re-using this blockid for a new allocation, 
since we start allocating new blocks at max(block-id) + 1, right?


http://gerrit.cloudera.org:8080/#/c/8352/2/src/kudu/tools/tool_action_fs.cc
File src/kudu/tools/tool_action_fs.cc:

http://gerrit.cloudera.org:8080/#/c/8352/2/src/kudu/tools/tool_action_fs.cc@348
PS2, Line 348: Status AddDataDir(const RunnerContext& context) {
when adding a data dir, should we have some option to add it to existing tablet 
pathsets, so the new disk is used for new blocks? This seems to make sense in 
the world where the prior config was to spread all tablets across all disks.


http://gerrit.cloudera.org:8080/#/c/8352/2/src/kudu/tools/tool_action_fs.cc@554
PS2, Line 554: Any tablets with data on the removed data directory "
             :                         "will fail the next time the server is 
started
Is there any way we can force an interactive user to confirm a 'YES' to this? 
Otherwise I'm afraid people may not truly understand this, run the script 
across a bunch of servers, and then lose data.

Maybe we can have it look and see if there are any blocks at all in the removed 
dir, and if not, not confirm/complain? I think a relatively common use case 
here is that someone starts Kudu for the first time with the wrong set of dirs, 
doesn't create any data at all, and then restarts with a different set.

This could also later expand to an option to migrate blocks off of the removed 
disk (though I guess that will be trickier because tablet metadata also needs 
to be updated with the appropriate datadir group)



--
To view, visit http://gerrit.cloudera.org:8080/8352
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6ddbdd6cc6231996e7802a622a8b4691527a0643
Gerrit-Change-Number: 8352
Gerrit-PatchSet: 2
Gerrit-Owner: Adar Dembo <a...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <a...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Todd Lipcon <t...@apache.org>
Gerrit-Comment-Date: Mon, 23 Oct 2017 21:57:28 +0000
Gerrit-HasComments: Yes

Reply via email to