Will Berkeley has posted comments on this change. ( http://gerrit.cloudera.org:8080/10399 )
Change subject: WIP [tools] first draft of rebalancing in kudu CLI ...................................................................... Patch Set 4: (21 comments) Some random comments. I didn't look at rebalancer.cc since IIRC you've already changed it a lot while testing compared to what's posted here. http://gerrit.cloudera.org:8080/#/c/10399/4/src/kudu/tools/rebalance-test.cc File src/kudu/tools/rebalance-test.cc: http://gerrit.cloudera.org:8080/#/c/10399/4/src/kudu/tools/rebalance-test.cc@140 PS4, Line 140: Convert nit: I think you mean "Test converting..." http://gerrit.cloudera.org:8080/#/c/10399/4/src/kudu/tools/rebalance-test.cc@165 PS4, Line 165: "tablet_a_0" These should all be the same for one tablet. http://gerrit.cloudera.org:8080/#/c/10399/4/src/kudu/tools/rebalance-test.cc@240 PS4, Line 240: { 2, { "table_c", Can we put things in order to match the table order in the KsckResultsInput? http://gerrit.cloudera.org:8080/#/c/10399/4/src/kudu/tools/rebalance.h File src/kudu/tools/rebalance.h: http://gerrit.cloudera.org:8080/#/c/10399/4/src/kudu/tools/rebalance.h@36 PS4, Line 36: ReplicaMoveInfo Do you think we should synchronize the naming between this and TableReplicaMove? TabletReplicaMove? The former is a move of some replica of a table; the latter is a move of a uniquely-identified replica of a tablet. This 'engine' part must choose a specific ReplicaMoveInfo consistent with a TableReplicaMove handed to it by the algorithm. http://gerrit.cloudera.org:8080/#/c/10399/4/src/kudu/tools/rebalance.h@37 PS4, Line 37: tablet_uuid We usually call it a tablet_id and not a tablet_uuid, just like we usually say ts_uuid and not ts_id. This is more me realizing this and scratching my chin about it than suggesting you change anything :) http://gerrit.cloudera.org:8080/#/c/10399/4/src/kudu/tools/rebalance.h@45 PS4, Line 45: step What's one step? http://gerrit.cloudera.org:8080/#/c/10399/4/src/kudu/tools/rebalance.h@46 PS4, Line 46: max_moves Also can you doc the parameters? http://gerrit.cloudera.org:8080/#/c/10399/4/src/kudu/tools/rebalance.h@49 PS4, Line 49: the stats Can you say which stats are included? Also, where are we printing to? In my experience it's a good idea to have an 'out' parameter. It makes the method more flexible and easier to test. http://gerrit.cloudera.org:8080/#/c/10399/4/src/kudu/tools/rebalance.h@59 PS4, Line 59: UUID UUIDs http://gerrit.cloudera.org:8080/#/c/10399/4/src/kudu/tools/rebalance.h@59 PS4, Line 59: is output into nit: This wording leaves it unclear to me whether it replaces the contents of 'tablet_ids' with the output, or whether it appends it. http://gerrit.cloudera.org:8080/#/c/10399/4/src/kudu/tools/rebalance.h@60 PS4, Line 60: found nit: are found http://gerrit.cloudera.org:8080/#/c/10399/4/src/kudu/tools/rebalance.h@60 PS4, Line 60: the 'tablet_ids' is empty OK this makes it clear the results are populated into 'tablet_ids'. http://gerrit.cloudera.org:8080/#/c/10399/4/src/kudu/tools/rebalance.h@60 PS4, Line 60: is nit: will be http://gerrit.cloudera.org:8080/#/c/10399/4/src/kudu/tools/rebalance.h@60 PS4, Line 60: the nit: remove http://gerrit.cloudera.org:8080/#/c/10399/4/src/kudu/tools/tool_action_cluster.cc File src/kudu/tools/tool_action_cluster.cc: http://gerrit.cloudera.org:8080/#/c/10399/4/src/kudu/tools/tool_action_cluster.cc@62 PS4, Line 62: perfectly nit: I don't like this extra word because I don't think it means anything- we don't actually have a distinction between "balanced" and "perfectly balanced". http://gerrit.cloudera.org:8080/#/c/10399/4/src/kudu/tools/tool_action_cluster.cc@64 PS4, Line 64: max_moves_per_step Is this parameter also secretly the max number of moves that can be going on at one time (the size of the 'pool')? I think users would want to know how to limit that, more than the number of moves per step. http://gerrit.cloudera.org:8080/#/c/10399/4/src/kudu/tools/tool_action_cluster.cc@119 PS4, Line 119: LOG(INFO) << "rebalancing is complete: no moves left"; nit: We should straight print this rather than log it. http://gerrit.cloudera.org:8080/#/c/10399/4/src/kudu/tools/tool_action_cluster.cc@125 PS4, Line 125: "tablet $0: moving replica from $1 to $2" As we talked about, I think a brief notation for a move, with the table name included, would be ideal: Table 'foo': T <tablet id>: TS <ts uuid> -> TS <td uuid> We should also be careful to keep the output relatively easily parsed by machine. Could be useful someday. http://gerrit.cloudera.org:8080/#/c/10399/4/src/kudu/tools/tool_action_cluster.cc@127 PS4, Line 127: LOG(INFO) << info_msg I like using LOG for the timestamps, but other than that I usually think of tool output not being LOG(INFO) unless the user can kinda ignore it. The main output will go to stdout and the log messages to stderr, so they can be separated. http://gerrit.cloudera.org:8080/#/c/10399/4/src/kudu/tools/tool_action_cluster.cc@135 PS4, Line 135: LOG(INFO) Likewise. http://gerrit.cloudera.org:8080/#/c/10399/4/src/kudu/tools/tool_action_tablet.cc File src/kudu/tools/tool_action_tablet.cc: http://gerrit.cloudera.org:8080/#/c/10399/4/src/kudu/tools/tool_action_tablet.cc@413 PS4, Line 413: Should be 4 spaces, not 6. -- To view, visit http://gerrit.cloudera.org:8080/10399 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I269ea1dcb0b528ad9f03308bac6b8769e2141238 Gerrit-Change-Number: 10399 Gerrit-PatchSet: 4 Gerrit-Owner: Alexey Serbin <[email protected]> Gerrit-Reviewer: Alexey Serbin <[email protected]> Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Will Berkeley <[email protected]> Gerrit-Comment-Date: Sun, 20 May 2018 07:24:30 +0000 Gerrit-HasComments: Yes
