Alexey Serbin has posted comments on this change. ( http://gerrit.cloudera.org:8080/10237 )

Change subject: [tests] fixed flake in consensus_peer_health_status
......................................................................


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/10237/1//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/10237/1//COMMIT_MSG@11
PS1, Line 11: happened when the target tablet server was shutdown during an on-going
            : tablet copy.  In that situation, the source tablet server had
            : c
> curious why you took this approach instead of just waiting for the tablet c
Ah, it seems I omitted some useful information on where the flake happened.
I'll add that.

The flake happened in the last line of the test scenario and, as I
understand it, the tablet copy that held the anchor was triggered by
AddServer() at line 197.

Yes, I tried ASSERT_OK(WaitUntilTabletRunning(follower_ts, tablet_id_,
kTimeout)) instead of NO_FATALS(wait_for_health_state(leader_ts, follower_uuid,
HEALTHY)) at line 198, and it also worked.  I thought it might be worth keeping
this interesting situation with a hung tablet copy (even if it's rare).

On the other hand, if we want to be explicit and just start from scratch,
then WaitUntilTabletRunning() after AddServer() is the best approach, I think.
All right, I'll change to that one.
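
For illustration only, the wait-after-AddServer() idea can be sketched outside
of Kudu as a generic poll-until-deadline loop.  WaitUntil() and FakeReplica
below are hypothetical names made up for this sketch, not Kudu's test helpers,
and the real WaitUntilTabletRunning() may well be implemented differently.

```cpp
#include <chrono>
#include <functional>
#include <thread>

// Hypothetical helper (not a Kudu API): polls 'cond' until it returns true
// or 'timeout' elapses. Returns true if the condition was met in time.
inline bool WaitUntil(const std::function<bool()>& cond,
                      std::chrono::milliseconds timeout,
                      std::chrono::milliseconds poll_interval =
                          std::chrono::milliseconds(10)) {
  const auto deadline = std::chrono::steady_clock::now() + timeout;
  while (true) {
    if (cond()) return true;
    if (std::chrono::steady_clock::now() >= deadline) return false;
    std::this_thread::sleep_for(poll_interval);
  }
}

// Hypothetical stand-in for a replica that is still being copied: it reports
// "running" only once it has been polled 'polls_until_running' times.
struct FakeReplica {
  explicit FakeReplica(int n) : polls_until_running(n) {}
  bool IsRunning() { return ++polls >= polls_until_running; }

  int polls_until_running;
  int polls = 0;
};
```

In a test, the scenario would proceed past the "add server" step only once
something like WaitUntil([&] { return replica.IsRunning(); }, timeout) has
returned true, so later steps never race with a copy that is still in progress.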



--
To view, visit http://gerrit.cloudera.org:8080/10237
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I8eb640604e98361029aa3342ffa3050e922b6629
Gerrit-Change-Number: 10237
Gerrit-PatchSet: 1
Gerrit-Owner: Alexey Serbin <aser...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <aser...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon <t...@apache.org>
Gerrit-Comment-Date: Mon, 30 Apr 2018 18:09:01 +0000
Gerrit-HasComments: Yes
