Hannah Nguyen has posted comments on this change. ( http://gerrit.cloudera.org:8080/14177 )
Change subject: WIP KUDU-2780: create thread for auto-rebalancing ...................................................................... Patch Set 3: (40 comments) http://gerrit.cloudera.org:8080/#/c/14177/3//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/14177/3//COMMIT_MSG@8 PS3, Line 8: : Created auto-rebalancer thread in background tasks of catalog manager. : Set up framework for auto-rebalancing loop. : Loop retrieves information on tservers, tables, and tablets for rebalancing. : The number of replica moves per loop iteration is controlled by a flag. : If there are placement policy violations, the current loop iteration will only : perform replica moves to reinstate the policy. : Otherwise, the auto-rebalancer will perform moves to do inter-location(cross-location), : then intra-location(by table then by tserver) rebalancing. : If the cluster is balanced, the current rebalancing cycle completes, and the : thread will sleep for an interval, controlled by another flag. > nit: I found this a little difficult to read. Could you split it up into pa Done http://gerrit.cloudera.org:8080/#/c/14177/3/src/kudu/master/auto_rebalancer-test.cc File src/kudu/master/auto_rebalancer-test.cc: http://gerrit.cloudera.org:8080/#/c/14177/3/src/kudu/master/auto_rebalancer-test.cc@58 PS3, Line 58: if (cluster_) > nit: it's a good practice to use scope braces for 'if ()' clause even it's Done http://gerrit.cloudera.org:8080/#/c/14177/3/src/kudu/master/auto_rebalancer-test.cc@61 PS3, Line 61: else return -1; > warning: do not use 'else' after 'return' [readability-else-after-return] Done http://gerrit.cloudera.org:8080/#/c/14177/3/src/kudu/master/auto_rebalancer-test.cc@68 PS3, Line 68: else return -1; > warning: do not use 'else' after 'return' [readability-else-after-return] Done http://gerrit.cloudera.org:8080/#/c/14177/3/src/kudu/master/auto_rebalancer-test.cc@85 PS3, Line 85: int kNumMasters = 3; : int kNumTservers = 3; : int kNumTablets = 4; > nit: these are constants, right? Add const/constexpr if so. Done http://gerrit.cloudera.org:8080/#/c/14177/3/src/kudu/master/auto_rebalancer-test.cc@96 PS3, Line 96: 10 > 10+ seconds run-time for a test makes it a 'slow' test. Add a notion of ru Done http://gerrit.cloudera.org:8080/#/c/14177/3/src/kudu/master/auto_rebalancer-test.cc@101 PS3, Line 101: if (i==leader_idx) ASSERT_NE(0, number_of_loop_iterations(i)); : else ASSERT_EQ(0, moves_scheduled_this_round(i)); > Would this test fail if master leadership changes while the test is running Hmm, I think it would. Is it enough to just reset the leader index then restart the for loop? ie: if (leader_changed) { leader_idx = new_leader_idx; i=0; continue; } http://gerrit.cloudera.org:8080/#/c/14177/3/src/kudu/master/auto_rebalancer-test.cc@104 PS3, Line 104: > nit: an extra space Done http://gerrit.cloudera.org:8080/#/c/14177/3/src/kudu/master/auto_rebalancer-test.cc@111 PS3, Line 111: int kNumMasters = 3; : int kNumTservers = 3; : int kNumTablets = 3; > Add const/constexpr. Done http://gerrit.cloudera.org:8080/#/c/14177/3/src/kudu/master/auto_rebalancer-test.cc@132 PS3, Line 132: auto iterations = number_of_loop_iterations(new_leader_idx); : ASSERT_LT(0, iterations); > Might it happen that the rebalancing thread hasn't yet after master leader Would it make sense to sleep for a bit to make sure the auto-rebalancer thread has been initialized and begins iterating? http://gerrit.cloudera.org:8080/#/c/14177/3/src/kudu/master/auto_rebalancer-test.cc@139 PS3, Line 139: kNumTservers, kNumTablets; > These are not constants, but 'kXxx' is used only for constants. Change the Done http://gerrit.cloudera.org:8080/#/c/14177/3/src/kudu/master/auto_rebalancer-test.cc@214 PS3, Line 214: auto-rebalancing flag is paused > nit: how does one pause a flag? How do you allow a flag to be changed via the CLI? http://gerrit.cloudera.org:8080/#/c/14177/3/src/kudu/master/auto_rebalancer-test.cc@241 PS3, Line 241: ASSERT_GE(moves_scheduled_before, moves_scheduled_after); > If there shouldn't be any moves, then why ASSERT_GE(), not ASSERT_EQ()? if the flag to pause auto-rebalancing is set, the loop is still active (ie checking cluster status), just not getting or executing replica moves. It's possible that while auto-rebalancing is paused, the cluster becomes balanced. In this case, the variable "moves_scheduled_this_round" will be reset to 0. So it's possible that moves_scheduled_after < moves_scheduled_before. http://gerrit.cloudera.org:8080/#/c/14177/3/src/kudu/master/auto_rebalancer-test.cc@243 PS3, Line 243: FLAGS_auto_rebalancing_stop_flag = orig_flag; > I don't think this is necessary -- KuduTest does that automatically using g Done http://gerrit.cloudera.org:8080/#/c/14177/3/src/kudu/master/auto_rebalancer.h File src/kudu/master/auto_rebalancer.h: http://gerrit.cloudera.org:8080/#/c/14177/3/src/kudu/master/auto_rebalancer.h@81 PS3, Line 81: // Variables for testing. : int number_of_loop_iterations; : int moves_scheduled_this_round; > If this is for testing, move then under 'private' section and move those te Done http://gerrit.cloudera.org:8080/#/c/14177/3/src/kudu/master/auto_rebalancer.h@93 PS3, Line 93: // The autorebalancing thread. : scoped_refptr<kudu::Thread> thread_; > We don't expect there to be multiple variables containing `thread_` do we? kudu::Thread::Create() returns a scoped_refptr. I don't believe unique_ptr is a compatible conversion http://gerrit.cloudera.org:8080/#/c/14177/3/src/kudu/master/auto_rebalancer.h@104 PS3, Line 104: // The internal Rebalancer object. : std::shared_ptr<rebalance::Rebalancer> rebalancer_; > Same here; are there expected to be multiple threads holding onto `rebalanc Done http://gerrit.cloudera.org:8080/#/c/14177/3/src/kudu/master/auto_rebalancer.h@106 PS3, Line 106: : // Functions for the Rebalancer object: : : // Collects information about the cluster. : Status BuildClusterRawInfo(const boost::optional<std::string>& location, : rebalance::ClusterRawInfo* raw_info) const; : : Status GetMovesUsingRebalancingAlgo( : const rebalance::ClusterRawInfo& raw_info, : std::vector<rebalance::Rebalancer::ReplicaMove>* replica_moves, : rebalance::RebalancingAlgo* algo, : bool cross_location = false) const; : : Status GetMoves(const rebalance::ClusterRawInfo& raw_info, : const rebalance::ClusterInfo& ci, : std::vector<rebalance::Rebalancer::ReplicaMove>* replica_moves, : const rebalance::TabletsPlacementInfo& placement_info, : bool* policy_fixing) const; : : Status GetTabletLeader(const std::string& tablet_id, std::string* leader_uuid, HostPort* leader_hp) const; : : Status ExecuteMoves(const std::vector<rebalance::Rebalancer::ReplicaMove>& replica_moves, const bool& policy_fixing); : : // Returns whether or not the cluster needs rebalancing. : bool IsClusterBalanced(const rebalance::TabletsPlacementInfo& placement_info, : const rebalance::ClusterInfo& ci) const; : : // Helper function to retrieve TSManager. : TSManager* GetTSManager() const; > nit: Could you move these above the private member variables? Done http://gerrit.cloudera.org:8080/#/c/14177/3/src/kudu/master/auto_rebalancer.cc File src/kudu/master/auto_rebalancer.cc: PS3: > nit: got a bunch of whitespace sprinkled throughout. Done http://gerrit.cloudera.org:8080/#/c/14177/3/src/kudu/master/auto_rebalancer.cc@84 PS3, Line 84: using kudu::tserver::ConsensusServiceImpl; > warning: using decl 'ConsensusServiceImpl' is unused [misc-unused-using-dec Done http://gerrit.cloudera.org:8080/#/c/14177/3/src/kudu/master/auto_rebalancer.cc@87 PS3, Line 87: using std::map; > warning: using decl 'map' is unused [misc-unused-using-decls] Done http://gerrit.cloudera.org:8080/#/c/14177/3/src/kudu/master/auto_rebalancer.cc@96 PS3, Line 96: using std::unordered_multimap; > warning: using decl 'unordered_multimap' is unused [misc-unused-using-decls Done http://gerrit.cloudera.org:8080/#/c/14177/3/src/kudu/master/auto_rebalancer.cc@104 PS3, Line 104: DEFINE_string(auto_rebalancing_ignored_tservers, "", : "UUIDs of tablet servers to ignore while auto-rebalancing the cluster " : "(comma-separated list). If specified, allow the auto-rebalancing " : "when some tablet servers in 'ignored_tservers' are unhealthy. " : "If not specified, allow the rebalancing only when all tablet " : "servers are healthy."); : : DEFINE_string(auto_rebalancing_tables, "", : "Tables to rebalance (comma-separated list of table names)" : "If not specified, includes all tables."); : > It seems like these are initialized once and cached, so it seems a bit odd Hm, I kept these from the rebalancer tool mostly to populate the Config struct, but they don't have use in the context of the auto-rebalancer. Will remove. http://gerrit.cloudera.org:8080/#/c/14177/3/src/kudu/master/auto_rebalancer.cc@119 PS3, Line 119: // A typical Kudu cluster has 5+ nodes. > What exactly is the guidance here? Should the value of this flag be correla Comment/response below http://gerrit.cloudera.org:8080/#/c/14177/3/src/kudu/master/auto_rebalancer.cc@120 PS3, Line 120: DEFINE_uint32(auto_rebalancing_max_moves_per_iteration, 5, : "Maximum number of total replica moves to perform during one " : "iteration of the auto-rebalancing loop."); > If I'm not mistaken, in the original rebalancer code constant 5 is used as I think I could just use the same logic here--a flag to set the max_moves_per_server, and scale it by the number of nodes in the cluster to determine max_moves_per_iteration-- but maybe set the default max_moves_per_server to 2? Since the auto-rebalancer should be more conservative than the rebalancer tool. http://gerrit.cloudera.org:8080/#/c/14177/3/src/kudu/master/auto_rebalancer.cc@144 PS3, Line 144: Rebalancer::Config c = Rebalancer::Config( > nit: if you stored this as a const private member, you wouldn't have to add Done http://gerrit.cloudera.org:8080/#/c/14177/3/src/kudu/master/auto_rebalancer.cc@146 PS3, Line 146: master_addresses > Does this get used anywhere? Done http://gerrit.cloudera.org:8080/#/c/14177/3/src/kudu/master/auto_rebalancer.cc@190 PS3, Line 190: { : std::lock_guard<Mutex> l(lock_); : if (closing_) return; : } > If we're only locking to protect `closing_`, a spinlock might be more appro Done http://gerrit.cloudera.org:8080/#/c/14177/3/src/kudu/master/auto_rebalancer.cc@195 PS3, Line 195: // If not the leader, don't do anything. : { : CatalogManager::ScopedLeaderSharedLock l(catalog_manager_); : if (!l.first_failed_status().ok()) { : number_of_loop_iterations = 0; : moves_scheduled_this_round = 0; : continue; : } : } > Rather than repeatedly checking whether we're leader (and locking in the pr Returning from this function will result in the autorebalancer thread exiting, so the thread would have to be re-initialized when the master is elected leader. (Where could this go, if implemented?) I suppose another approach is just to sleep for a bit before continuing--then the decision to be made is about how long to sleep. Do you have any thoughts? http://gerrit.cloudera.org:8080/#/c/14177/3/src/kudu/master/auto_rebalancer.cc@238 PS3, Line 238: SleepFor(MonoDelta::FromSeconds(FLAGS_auto_rebalancing_interval_seconds)); > Should we sleep for all the other continue cases? I intended for this to be the only continue case where the thread sleeps, because if the cluster is balanced, there probably won't be any rebalancing moves necessary for a while--not until something changes. The other continue cases are in response to errors, eg being unable to get/execute replica moves. I think it makes sense for the thread to sleep in these cases, but do you think it makes sense to sleep for the same interval of time as the cluster_balanced case? http://gerrit.cloudera.org:8080/#/c/14177/3/src/kudu/master/auto_rebalancer.cc@254 PS3, Line 254: : // Check flag again, in case auto-rebalancing flag has been paused. : if (FLAGS_auto_rebalancing_stop_flag) continue; > Seems like not much happened between our initial check (no RPCs or anything Done http://gerrit.cloudera.org:8080/#/c/14177/3/src/kudu/master/auto_rebalancer.cc@545 PS3, Line 545: GetTSManager > Why not call catalog_manager_->ts_manager() directly? oh. good point http://gerrit.cloudera.org:8080/#/c/14177/3/src/kudu/master/auto_rebalancer.cc@561 PS3, Line 561: // GetAllTables() requires holding leader lock. > nit: remove this Done http://gerrit.cloudera.org:8080/#/c/14177/3/src/kudu/master/auto_rebalancer.cc@620 PS3, Line 620: } : : else { > nit: join these lines Done http://gerrit.cloudera.org:8080/#/c/14177/3/src/kudu/master/auto_rebalancer.cc@690 PS3, Line 690: if (IsClusterOrTablesSkewed(ci)) return false; > nit: maybe write this so we don't have to have the extra else scope Done http://gerrit.cloudera.org:8080/#/c/14177/3/src/kudu/master/catalog_manager.h File src/kudu/master/catalog_manager.h: http://gerrit.cloudera.org:8080/#/c/14177/3/src/kudu/master/catalog_manager.h@769 PS3, Line 769: // Functions for auto-rebalancer testing. : int auto_rebalancer_iterations() const; : int auto_rebalancer_moves() const; : : master::TSManager* ts_manager() const; > If these are for testing-only purposes, a better place for these is in the Done http://gerrit.cloudera.org:8080/#/c/14177/3/src/kudu/master/catalog_manager.cc File src/kudu/master/catalog_manager.cc: http://gerrit.cloudera.org:8080/#/c/14177/3/src/kudu/master/catalog_manager.cc@287 PS3, Line 287: true > We should probably set this to false by default. Done http://gerrit.cloudera.org:8080/#/c/14177/3/src/kudu/master/ts_descriptor.h File src/kudu/master/ts_descriptor.h: http://gerrit.cloudera.org:8080/#/c/14177/3/src/kudu/master/ts_descriptor.h@160 PS3, Line 160: HostPort last_location_HP() const; > This is a misnomer; how about `last_address()` or something? Done http://gerrit.cloudera.org:8080/#/c/14177/3/src/kudu/master/ts_descriptor.h@159 PS3, Line 159: // Return the TS address: the last known host/port. : HostPort last_location_HP() const; : : // Return a string form of the TS address: the last known host/port. : std::string last_location() const; > Why do we need both of these? Wouldn't it suffice to use `HostPort::ToStrin Done http://gerrit.cloudera.org:8080/#/c/14177/3/src/kudu/master/ts_descriptor.cc File src/kudu/master/ts_descriptor.cc: http://gerrit.cloudera.org:8080/#/c/14177/3/src/kudu/master/ts_descriptor.cc@301 PS3, Line 301: 0 > Can there be multiple of these? What about the other ones? If I understand the usage of ServerRegistrationPB.rpc_addresses correctly, the first one (if there are multiple) will be the most recent one. ie as in TSDescriptor::ToString() below -- To view, visit http://gerrit.cloudera.org:8080/14177 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ifca25d1063c07047cf2123e6792b3c7395be20e4 Gerrit-Change-Number: 14177 Gerrit-PatchSet: 3 Gerrit-Owner: Hannah Nguyen <hannah.ngu...@cloudera.com> Gerrit-Reviewer: Alexey Serbin <aser...@cloudera.com> Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com> Gerrit-Reviewer: Hannah Nguyen <hannah.ngu...@cloudera.com> Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Tidy Bot (241) Gerrit-Comment-Date: Tue, 17 Sep 2019 01:07:52 +0000 Gerrit-HasComments: Yes