Mike Percy has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10399 )

Change subject: [tools] rebalancer in the kudu CLI tool
......................................................................


Patch Set 15:

(17 comments)

I am still working through this patch, but I thought I'd leave some partial 
feedback.

http://gerrit.cloudera.org:8080/#/c/10399/15/src/kudu/tools/rebalance.h
File src/kudu/tools/rebalance.h:

PS15:
nit: consider naming this rebalancer.h since it has the Rebalancer class


http://gerrit.cloudera.org:8080/#/c/10399/15/src/kudu/tools/rebalance.h@129
PS15, Line 129: results for the next step
nit: using the term "step" here is confusing because it sounds like it's for 
the next run of RunStep()


http://gerrit.cloudera.org:8080/#/c/10399/15/src/kudu/tools/rebalance.h@152
PS15, Line 152:       std::unordered_map<std::string, std::set<size_t>>& 
op_indices,
mutable ref violates style guide


http://gerrit.cloudera.org:8080/#/c/10399/15/src/kudu/tools/rebalance.h@171
PS15, Line 171:   bool UpdateMovesInProgress(const MonoTime& deadline, bool* 
timed_out);
Needs doc, including the API contract


http://gerrit.cloudera.org:8080/#/c/10399/15/src/kudu/tools/rebalance.h@181
PS15, Line 181: Helper containers used to run rebalancing.
This comment is very vague. Please add info about key -> value for the 
containers and what they are tracking for all the helper containers including 
the op count stuff


http://gerrit.cloudera.org:8080/#/c/10399/15/src/kudu/tools/rebalance.cc
File src/kudu/tools/rebalance.cc:

http://gerrit.cloudera.org:8080/#/c/10399/15/src/kudu/tools/rebalance.cc@53
PS15, Line 53: false
true seems like a more desirable default in a 3-4-3 cluster.


http://gerrit.cloudera.org:8080/#/c/10399/15/src/kudu/tools/rebalance.cc@56
PS15, Line 56: rebalance_output_replica_distribution_details
can we just make this a --verbose or -v argument to the kudu cluster rebalance 
command?


http://gerrit.cloudera.org:8080/#/c/10399/15/src/kudu/tools/rebalance.cc@238
PS15, Line 238:     if (clear_scheduled_moves) {
A comment here would be nice to indicate when we want to clear scheduled moves


http://gerrit.cloudera.org:8080/#/c/10399/15/src/kudu/tools/rebalance.cc@239
PS15, Line 239: loger
typo


http://gerrit.cloudera.org:8080/#/c/10399/15/src/kudu/tools/rebalance.cc@250
PS15, Line 250: DCHECK
nit: DCHECK_EQ("", FindNonUniqueTablet(replica_moves)) would print the 
offending tablet if this ever triggered


http://gerrit.cloudera.org:8080/#/c/10399/15/src/kudu/tools/rebalance.cc@479
PS15, Line 479: RunStep
I think RunStep() might be better named GetNextMoveSet() or NextMoves() or 
NextBatch() or NextBatchOfMoves() ... because it's not actually executing the 
moves.


http://gerrit.cloudera.org:8080/#/c/10399/15/src/kudu/tools/rebalance.cc@479
PS15, Line 479: pending_moves
why pass pending_moves into this function instead of just directly referencing 
scheduled_moves_ from within it?


http://gerrit.cloudera.org:8080/#/c/10399/15/src/kudu/tools/rebalance.cc@559
PS15, Line 559: constraint: it's better not to move leader replica
Can you clarify whether this means we prefer to move other replicas before the 
leader, or whether this means we will never move a leader?


http://gerrit.cloudera.org:8080/#/c/10399/15/src/kudu/tools/rebalance.cc@660
PS15, Line 660:     src_op_indices_.emplace(elem.ts_uuid_from, 
set<size_t>()).first->
              :         second.emplace(i);
I'm having trouble understanding why you are adding i to the set here.


http://gerrit.cloudera.org:8080/#/c/10399/15/src/kudu/tools/tool_action_cluster.cc
File src/kudu/tools/tool_action_cluster.cc:

http://gerrit.cloudera.org:8080/#/c/10399/15/src/kudu/tools/tool_action_cluster.cc@212
PS15, Line 212: rebalance_
nit: consider removing the rebalance_ prefix from these command-line flags, 
since they are only exposed when you type "kudu cluster rebalance" anyway.


http://gerrit.cloudera.org:8080/#/c/10399/15/src/kudu/tools/tool_replica_util.h
File src/kudu/tools/tool_replica_util.h:

http://gerrit.cloudera.org:8080/#/c/10399/15/src/kudu/tools/tool_replica_util.h@56
PS15, Line 56: Make
nit: s/Make/Request/


http://gerrit.cloudera.org:8080/#/c/10399/15/src/kudu/tools/tool_replica_util.h@78
PS15, Line 78: // can be null.
nit: doc the contract for completion_status and Status return from this 
function.

Why do we have separate variables for is_complete and completion_status?



--
To view, visit http://gerrit.cloudera.org:8080/10399
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I269ea1dcb0b528ad9f03308bac6b8769e2141238
Gerrit-Change-Number: 10399
Gerrit-PatchSet: 15
Gerrit-Owner: Alexey Serbin <[email protected]>
Gerrit-Reviewer: Alexey Serbin <[email protected]>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <[email protected]>
Gerrit-Reviewer: Will Berkeley <[email protected]>
Gerrit-Comment-Date: Fri, 01 Jun 2018 01:02:44 +0000
Gerrit-HasComments: Yes

Reply via email to