Hello Mike Percy,
I'd like you to reexamine a change. Please visit
http://gerrit.cloudera.org:8080/2479
to look at the new patch set (#2).
Change subject: client_failover-itest: fix flakiness with opid mismatches
......................................................................
client_failover-itest: fix flakiness with opid mismatches
This fixes a common source of flakiness, particularly in TSAN builds. The issue
was that we were assuming that, if the TestWorkload wrote N batches, that would
correspond to exactly N log operations on the server side. That isn't actually
the case -- there are some interleavings in which the client 'Batcher' can
split a single Flush call into multiple RPCs, and we don't make any strong
guarantee that a Flush is atomic, even though it almost always is.
The fix is simple: switch to single-row batches, which can't be split up
by the client.
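To illustrate why single-row batches remove the mismatch, here is a toy
model. It is not Kudu code: RpcsForFlush, TotalRpcs and the max_rows_per_rpc
parameter are hypothetical names, and the real split depends on thread
interleavings rather than a fixed row cap. The sketch assumes each RPC ends up
as one log operation and only shows that a one-row Flush has nothing to split.

```cpp
#include <algorithm>
#include <cstdint>

// Toy model (assumption, not the Kudu Batcher): number of RPCs produced by one
// Flush of `rows` rows if at most `max_rows_per_rpc` rows go into each RPC.
int64_t RpcsForFlush(int64_t rows, int64_t max_rows_per_rpc) {
  if (rows <= 0) return 0;
  max_rows_per_rpc = std::max<int64_t>(1, max_rows_per_rpc);
  // Ceiling division: a partially filled RPC still counts as one RPC.
  return (rows + max_rows_per_rpc - 1) / max_rows_per_rpc;
}

// Total RPCs (and, under the stated assumption, log operations) produced by
// `num_flushes` Flush calls of `rows_per_batch` rows each.
int64_t TotalRpcs(int64_t num_flushes, int64_t rows_per_batch,
                  int64_t max_rows_per_rpc) {
  return num_flushes * RpcsForFlush(rows_per_batch, max_rows_per_rpc);
}
```

In this model, TotalRpcs(n, 1, k) equals n for any k >= 1, whereas a
multi-row batch can yield more than n RPCs whenever a split occurs.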
With an earlier revision of this patch, I was able to run the
DeleteLeaderWhileScanning tests 5000 times in TSAN with only a few failures[1].
I ran 1000 iterations on the latest revision[2]. The remaining failures seem to
be caused by an unrelated data race on RaftConsensus shutdown.
[1] http://dist-test.cloudera.org/job?job_id=todd.1457413245.29963
[2] http://dist-test.cloudera.org/job?job_id=todd.1457464975.21171
Change-Id: Ib3df1b3f5b0903f069a5e7ae3ba2a64c1c52a427
---
M src/kudu/integration-tests/client_failover-itest.cc
M src/kudu/integration-tests/test_workload.h
2 files changed, 10 insertions(+), 0 deletions(-)
git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/79/2479/2
--
To view, visit http://gerrit.cloudera.org:8080/2479
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ib3df1b3f5b0903f069a5e7ae3ba2a64c1c52a427
Gerrit-PatchSet: 2
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Todd Lipcon <[email protected]>
Gerrit-Reviewer: Mike Percy <[email protected]>
Gerrit-Reviewer: Todd Lipcon <[email protected]>