Adar Dembo has submitted this change and it was merged.

Change subject: raft_consensus-itest: don't assume SIGSTOP is synchronous
......................................................................


raft_consensus-itest: don't assume SIGSTOP is synchronous

While looping raft_consensus-itest in slow mode 1000 times, I saw the
following failure once:

  I0801 03:05:02.165644 24426 raft_consensus-itest.cc:1682] Pausing 2 tablet servers in config of size 3
  /data/1/adar/kudu/src/kudu/integration-tests/raft_consensus-itest.cc:1702: Failure
  Value of: s.IsTimedOut()
    Actual: false
  Expected: true
  OK
  /data/1/adar/kudu/src/kudu/integration-tests/raft_consensus-itest.cc:1814: Failure
  Expected: AssertMajorityRequiredForElectionsAndWrites(active_tablet_servers, leader_uuid) doesn't generate new fatal failures in the current thread.
    Actual: it does.

One explanation: Pause() sends SIGSTOP, which is delivered asynchronously,
so the write issued by the test after Pause() can be handled before a
majority of replicas are actually paused.
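
For illustration only, here is a minimal sketch of one generic way to close
this kind of race on POSIX: send SIGSTOP, then block until the kernel reports
that the process has stopped. This is not necessarily how this change fixes
the test. PauseAndWaitUntilStopped is a hypothetical helper, not part of Kudu,
and it assumes the target is a direct child of the caller, since waitpid()
with WUNTRACED only reports on children.

```cpp
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#include <cerrno>

// Hypothetical helper (not from the Kudu source): sends SIGSTOP to a direct
// child process and blocks until the kernel reports that the child has
// stopped. Returns true if the child is stopped, false if the signal could
// not be sent, the wait failed, or the child changed state some other way
// (for example, it exited).
bool PauseAndWaitUntilStopped(pid_t child) {
  if (kill(child, SIGSTOP) != 0) {
    return false;
  }
  int status = 0;
  for (;;) {
    // WUNTRACED makes waitpid() return for stopped children as well as
    // terminated ones.
    pid_t rc = waitpid(child, &status, WUNTRACED);
    if (rc == -1) {
      if (errno == EINTR) {
        continue;  // Interrupted by a signal handler; retry the wait.
      }
      return false;
    }
    return WIFSTOPPED(status);
  }
}
```

With a helper like this, a write issued after it returns true cannot be
handled by the paused process until it receives SIGCONT.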

Change-Id: Id3202378a0e03a1bb29f32993498399e67b584d5
Reviewed-on: http://gerrit.cloudera.org:8080/7557
Reviewed-by: Todd Lipcon <[email protected]>
Tested-by: Kudu Jenkins
---
M src/kudu/integration-tests/raft_consensus-itest.cc
1 file changed, 10 insertions(+), 5 deletions(-)

Approvals:
  Todd Lipcon: Looks good to me, approved
  Kudu Jenkins: Verified



-- 
To view, visit http://gerrit.cloudera.org:8080/7557
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: merged
Gerrit-Change-Id: Id3202378a0e03a1bb29f32993498399e67b584d5
Gerrit-PatchSet: 2
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Adar Dembo <[email protected]>
Gerrit-Reviewer: Adar Dembo <[email protected]>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon <[email protected]>
