Marton Greber has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/24085 )

Change subject: KUDU-3729 Prefer follower moves in auto-rebalancing
......................................................................


Patch Set 3:

(7 comments)

http://gerrit.cloudera.org:8080/#/c/24085/3/src/kudu/master/auto_rebalancer-test.cc
File src/kudu/master/auto_rebalancer-test.cc:

http://gerrit.cloudera.org:8080/#/c/24085/3/src/kudu/master/auto_rebalancer-test.cc@269
PS3, Line 269:   static Status BuildClusterInfoForTest(
BuildClusterInfoForTest always passes an empty MovesInProgress. The new 
ts_with_followers_by_table_and_tag population sits inside the do_count_replica 
guard, so a replica that is the source of an in-progress move should be 
excluded from the follower map. Could we add a case here that passes a fake 
in-progress move and checks the source tserver is absent from the map for that 
{table_id, tag}?


http://gerrit.cloudera.org:8080/#/c/24085/3/src/kudu/master/auto_rebalancer-test.cc@1181
PS3, Line 1181: TEST_F(AutoRebalancerTest, TestRebalancingAvoidsLeaderMoves) {
ts_with_followers_by_table_and_tag is populated in BuildClusterInfo but there's 
no test that verifies this from a live cluster. Could we add a case using 
BuildClusterRawInfoForTest + BuildClusterInfoForTest that checks the map is 
non-empty after data is loaded?


http://gerrit.cloudera.org:8080/#/c/24085/3/src/kudu/master/auto_rebalancer-test.cc@1204
PS3, Line 1204: CheckNoLeaderMovesScheduled
The test's name and its CheckNoLeaderMovesScheduled() call are mismatched. 
CheckNoLeaderMovesScheduled() verifies that the leader-rebalancing thread (a 
separate feature) has 0 moves scheduled.


http://gerrit.cloudera.org:8080/#/c/24085/3/src/kudu/master/auto_rebalancer.cc
File src/kudu/master/auto_rebalancer.cc:

http://gerrit.cloudera.org:8080/#/c/24085/3/src/kudu/master/auto_rebalancer.cc@151
PS3, Line 151: auto_rebalancing_avoid_leader_moves
The flag=false branch has zero integration coverage. The counterpart to the 
existing TestRebalancingAvoidsLeaderMoves simply sets 
FLAGS_auto_rebalancing_avoid_leader_moves = false, adds a tserver, and asserts 
moves are still scheduled. Importantly it documents that disabling the flag 
doesn't break rebalancing


http://gerrit.cloudera.org:8080/#/c/24085/3/src/kudu/master/auto_rebalancer.cc@158
PS3, Line 158: TAG_FLAG(auto_rebalancing_avoid_leader_moves, runtime);
nit: add a new line here.


http://gerrit.cloudera.org:8080/#/c/24085/3/src/kudu/rebalance/rebalance_algo-test.cc
File src/kudu/rebalance/rebalance_algo-test.cc:

http://gerrit.cloudera.org:8080/#/c/24085/3/src/kudu/rebalance/rebalance_algo-test.cc@342
PS3, Line 342: TEST(RebalanceAlgoUnitTest, PreferFollowerMovesByTableAndTag) {
The existing test only covers the non_leader_move branch (source has a follower 
-> pick it). The leader_fallback branch (all tables have only leaders on the 
source) is not tested at all. If ts_with_followers_by_table_and_tag is empty, 
with prefer_follower_moves=true, the algorithm must still produce a move.


http://gerrit.cloudera.org:8080/#/c/24085/3/src/kudu/rebalance/rebalance_algo.cc
File src/kudu/rebalance/rebalance_algo.cc:

http://gerrit.cloudera.org:8080/#/c/24085/3/src/kudu/rebalance/rebalance_algo.cc@375
PS3, Line 375:         break;
Can you please add a comment about why the early break on non_leader_move is 
safe?



--
To view, visit http://gerrit.cloudera.org:8080/24085
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I0b56ae5ef8db5fe15ab25df1d1cebe84fd3b8f2c
Gerrit-Change-Number: 24085
Gerrit-PatchSet: 3
Gerrit-Owner: Gabriella Lotz <[email protected]>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Marton Greber <[email protected]>
Gerrit-Comment-Date: Wed, 25 Mar 2026 14:33:30 +0000
Gerrit-HasComments: Yes

Reply via email to