Bankim Bhavsar has posted comments on this change. ( http://gerrit.cloudera.org:8080/17089 )
Change subject: [test] Fix flakiness with detecting a master in FAILED_UNRECOVERABLE state ...................................................................... Patch Set 1: (5 comments) http://gerrit.cloudera.org:8080/#/c/17089/1//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/17089/1//COMMIT_MSG@28 PS1, Line 28: This change also includes fixes where master index 0 was used : and it'd be good to explicitly use the leader/follower index : though they are not strictly necessary. > I'm not sure I agree with this. Doesn't this reduce test coverage of runnin See comments below. http://gerrit.cloudera.org:8080/#/c/17089/1/src/kudu/master/dynamic_multi_master-test.cc File src/kudu/master/dynamic_multi_master-test.cc: http://gerrit.cloudera.org:8080/#/c/17089/1/src/kudu/master/dynamic_multi_master-test.cc@171 PS1, Line 171: vector<int64_t> updated_gc_count(orig_num_masters_); : ASSERT_EQ(orig_gc_count.size(), updated_gc_count.size()); > nit: why do we need to keep track of these? Why not compare get_sys_catalog Good catch, not needed. http://gerrit.cloudera.org:8080/#/c/17089/1/src/kudu/master/dynamic_multi_master-test.cc@778 PS1, Line 778: ASSERT_OK(cluster_->GetLeaderMasterIndex(&leader_idx)); > Do we need to disable leadership elections here? Is it actually critical to It's not necessary to copy from the leader, copying from follower will be fine as well. Reverting. http://gerrit.cloudera.org:8080/#/c/17089/1/src/kudu/master/dynamic_multi_master-test.cc@1171 PS1, Line 1171: ASSERT_OK(GetFollowerMasterIndex(&non_leader_idx)); With the current order of validation checks issuing this request to leader master would also result in the same error message. However for some reason if the order of validation for leader master is done first then error would be different resulting in test failure. > Do we need to disable leadership elections here? Done. http://gerrit.cloudera.org:8080/#/c/17089/1/src/kudu/master/dynamic_multi_master-test.cc@1172 PS1, Line 1172: auto master_to_remove = cluster_->master(non_leader_idx)->bound_rpc_hostport(); > warning: 1st function call argument is an uninitialized value [clang-analyz This warning is not consistent but will fix it. -- To view, visit http://gerrit.cloudera.org:8080/17089 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Id6017f1601eaed22be28c8a5babd6e35e93b1d2e Gerrit-Change-Number: 17089 Gerrit-PatchSet: 1 Gerrit-Owner: Bankim Bhavsar <[email protected]> Gerrit-Reviewer: Andrew Wong <[email protected]> Gerrit-Reviewer: Bankim Bhavsar <[email protected]> Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Tidy Bot (241) Gerrit-Comment-Date: Fri, 19 Feb 2021 23:20:45 +0000 Gerrit-HasComments: Yes
