Alexey Serbin has posted comments on this change. ( http://gerrit.cloudera.org:8080/10237 )
Change subject: [tests] fixed flake in consensus_peer_health_status
......................................................................

Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/10237/1//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/10237/1//COMMIT_MSG@11
PS1, Line 11: happened when the target tablet server was shutdown during an on-going
            : tablet copy. In that situation, the source tablet server had
            : c
> curious why you took this approach instead of just waiting for the tablet c

Ah, it seems I omitted some useful information on where the flake happened; I'll add that.

The flake happened in the last line of the test scenario and, as I understand it, the tablet copy that holds the anchor was triggered by AddServer() at line 197.

Yes, I tried ASSERT_OK(WaitUntilTabletRunning(follower_ts, tablet_id_, kTimeout)) instead of NO_FATALS(wait_for_health_state(leader_ts, follower_uuid, HEALTHY)) at line 198 and it also worked. I thought it might be worth keeping this interesting situation with a hung tablet copy (even if it's rare).

On the other hand, if we want to be explicit and just start from scratch, then WaitUntilTabletRunning() after AddServer() is the best approach, I think.

All right, I'll change to that one.

-- 
To view, visit http://gerrit.cloudera.org:8080/10237
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I8eb640604e98361029aa3342ffa3050e922b6629
Gerrit-Change-Number: 10237
Gerrit-PatchSet: 1
Gerrit-Owner: Alexey Serbin <aser...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <aser...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon <t...@apache.org>
Gerrit-Comment-Date: Mon, 30 Apr 2018 18:09:01 +0000
Gerrit-HasComments: Yes