Mike Percy has posted comments on this change. ( http://gerrit.cloudera.org:8080/10399 )
Change subject: [tools] rebalancer in the kudu CLI tool ...................................................................... Patch Set 15: (17 comments) I am still working through this patch, but I thought I'd leave some partial feedback. http://gerrit.cloudera.org:8080/#/c/10399/15/src/kudu/tools/rebalance.h File src/kudu/tools/rebalance.h: PS15: nit: consider naming this rebalancer.h since it has the Rebalancer class http://gerrit.cloudera.org:8080/#/c/10399/15/src/kudu/tools/rebalance.h@129 PS15, Line 129: results for the next step nit: using the term "step" here is confusing because it sounds like it's for the next run of RunStep() http://gerrit.cloudera.org:8080/#/c/10399/15/src/kudu/tools/rebalance.h@152 PS15, Line 152: std::unordered_map<std::string, std::set<size_t>>& op_indices, mutable ref violates style guide http://gerrit.cloudera.org:8080/#/c/10399/15/src/kudu/tools/rebalance.h@171 PS15, Line 171: bool UpdateMovesInProgress(const MonoTime& deadline, bool* timed_out); Needs doc, including the API contract http://gerrit.cloudera.org:8080/#/c/10399/15/src/kudu/tools/rebalance.h@181 PS15, Line 181: Helper containers used to run rebalancing. This comment is very vague. Please add info about key -> value for the containers and what they are tracking for all the helper containers including the op count stuff http://gerrit.cloudera.org:8080/#/c/10399/15/src/kudu/tools/rebalance.cc File src/kudu/tools/rebalance.cc: http://gerrit.cloudera.org:8080/#/c/10399/15/src/kudu/tools/rebalance.cc@53 PS15, Line 53: false true seems like a more desirable default in a 3-4-3 cluster. http://gerrit.cloudera.org:8080/#/c/10399/15/src/kudu/tools/rebalance.cc@56 PS15, Line 56: rebalance_output_replica_distribution_details can we just make this a --verbose or -v argument to the kudu cluster rebalance command? http://gerrit.cloudera.org:8080/#/c/10399/15/src/kudu/tools/rebalance.cc@238 PS15, Line 238: if (clear_scheduled_moves) { A comment here would be nice to indicate when we want to clear scheduled moves http://gerrit.cloudera.org:8080/#/c/10399/15/src/kudu/tools/rebalance.cc@239 PS15, Line 239: loger typo http://gerrit.cloudera.org:8080/#/c/10399/15/src/kudu/tools/rebalance.cc@250 PS15, Line 250: DCHECK nit: DCHECK_EQ("", FindNonUniqueTablet(replica_moves)) would print the offending tablet if this ever triggered http://gerrit.cloudera.org:8080/#/c/10399/15/src/kudu/tools/rebalance.cc@479 PS15, Line 479: RunStep I think RunStep() might be better named GetNextMoveSet() or NextMoves() or NextBatch() or NextBatchOfMoves() ... because it's not actually executing the moves. http://gerrit.cloudera.org:8080/#/c/10399/15/src/kudu/tools/rebalance.cc@479 PS15, Line 479: pending_moves why pass pending_moves into this function instead of just directly referencing scheduled_moves_ from within it? http://gerrit.cloudera.org:8080/#/c/10399/15/src/kudu/tools/rebalance.cc@559 PS15, Line 559: constraint: it's better not to move leader replica Can you clarify whether this means we prefer to move other replicas before the leader, or whether this means we will never move a leader? http://gerrit.cloudera.org:8080/#/c/10399/15/src/kudu/tools/rebalance.cc@660 PS15, Line 660: src_op_indices_.emplace(elem.ts_uuid_from, set<size_t>()).first-> : second.emplace(i); I'm having trouble understanding why you are adding i to the set here. http://gerrit.cloudera.org:8080/#/c/10399/15/src/kudu/tools/tool_action_cluster.cc File src/kudu/tools/tool_action_cluster.cc: http://gerrit.cloudera.org:8080/#/c/10399/15/src/kudu/tools/tool_action_cluster.cc@212 PS15, Line 212: rebalance_ nit: consider removing the rebalance_ prefix from these command-line flags, since they are only exposed when you type "kudu cluster rebalance" anyway. http://gerrit.cloudera.org:8080/#/c/10399/15/src/kudu/tools/tool_replica_util.h File src/kudu/tools/tool_replica_util.h: http://gerrit.cloudera.org:8080/#/c/10399/15/src/kudu/tools/tool_replica_util.h@56 PS15, Line 56: Make nit: s/Make/Request/ http://gerrit.cloudera.org:8080/#/c/10399/15/src/kudu/tools/tool_replica_util.h@78 PS15, Line 78: // can be null. nit: doc the contract for completion_status and Status return from this function. Why do we have separate variables for is_complete and completion_status? -- To view, visit http://gerrit.cloudera.org:8080/10399 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I269ea1dcb0b528ad9f03308bac6b8769e2141238 Gerrit-Change-Number: 10399 Gerrit-PatchSet: 15 Gerrit-Owner: Alexey Serbin <[email protected]> Gerrit-Reviewer: Alexey Serbin <[email protected]> Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Mike Percy <[email protected]> Gerrit-Reviewer: Will Berkeley <[email protected]> Gerrit-Comment-Date: Fri, 01 Jun 2018 01:02:44 +0000 Gerrit-HasComments: Yes
