Hello Mike Percy, helifu,

I'd like you to do a code review. Please visit

    http://gerrit.cloudera.org:8080/12254

to review the following change.


Change subject: KUDU-2665: deflake block_manager-stress-test
......................................................................

KUDU-2665: deflake block_manager-stress-test

After commit 0c501979b was merged, this test became very flaky (failing
roughly 50% of the time in some environments). I think it's due to the new
nature of the log block manager, which may now delete dead containers in the
background.

Specifically, if two transactions delete the last blocks from a full
container, it's possible for one to get scheduled for an (asynchronous) hole
punch, and for the other to mark the container as dead. Later, when the
hole punch runs, the container's last ref will be dropped, causing the dead
container to be deleted.

While perhaps surprising, this new behavior is desirable, and it's now
incorrect to assume that a cessation in user threads implies an end to LBM
activity. block_manager-stress-test makes this assumption by using the
LBMCorruptor to inject inconsistencies after test threads have been joined.
To fix, we must explicitly quiesce the LBM; destroying it will do the trick.
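
To illustrate the lifetime issue, here is a minimal, deterministic C++
sketch. It is not the LBM code: Container, DeferredPool, and SimulateRace
are hypothetical names, and the pool only models the asynchronous hole
punch by holding queued tasks until it is drained or destroyed.

```cpp
#include <cassert>
#include <functional>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical stand-in for a log block container. When the last reference
// to a container marked dead goes away, its files are removed (modeled here
// by setting a flag owned by the caller).
class Container {
 public:
  explicit Container(bool* files_deleted) : files_deleted_(files_deleted) {}
  ~Container() {
    if (dead_) *files_deleted_ = true;
  }
  void MarkDead() { dead_ = true; }

 private:
  bool dead_ = false;
  bool* files_deleted_;
};

// Hypothetical stand-in for the pool that runs hole punches in the
// background. Tasks sit in the queue until Drain() is called, which also
// happens when the pool is destroyed.
class DeferredPool {
 public:
  ~DeferredPool() { Drain(); }
  void Submit(std::function<void()> task) { tasks_.push_back(std::move(task)); }
  void Drain() {
    std::vector<std::function<void()>> pending;
    pending.swap(tasks_);
    for (auto& task : pending) task();
    // 'pending' is destroyed on return, dropping any refs the tasks held.
  }

 private:
  std::vector<std::function<void()>> tasks_;
};

// Returns {files deleted once user work is done, files deleted after the
// pool has been destroyed}.
std::pair<bool, bool> SimulateRace() {
  bool files_deleted = false;
  bool deleted_before_quiesce = false;
  {
    DeferredPool pool;
    auto container = std::make_shared<Container>(&files_deleted);

    // Transaction A: schedules a hole punch that holds a container ref.
    pool.Submit([ref = container]() { /* the hole punch would run here */ });

    // Transaction B: deletes the last block, marks the container dead, and
    // drops its own ref.
    container->MarkDead();
    container.reset();

    // The "user threads" are finished, but the queued task still holds the
    // last ref, so the dead container has not been deleted yet.
    deleted_before_quiesce = files_deleted;
  }  // Destroying the pool drains it; the last ref drops and deletion runs.
  return {deleted_before_quiesce, files_deleted};
}
```

In this sketch SimulateRace() returns {false, true}: the container's files
still exist when the user-side work finishes and are gone only after the
pool is destroyed. That is the window in which the test must not inject
inconsistencies.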

What is surprising is that, for the life of me, I can't reproduce the
failure. Not locally, not on a CentOS 6.6 machine, not looped in dist-test
with stress threads, not ever. I even tried adding some "creative" sleep
calls in a few places to tickle the race, to no avail.

Change-Id: I0be328f740056cd6b64c9881759225c8b961a935
---
M src/kudu/fs/block_manager-stress-test.cc
1 file changed, 5 insertions(+), 0 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/54/12254/1
--
To view, visit http://gerrit.cloudera.org:8080/12254
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I0be328f740056cd6b64c9881759225c8b961a935
Gerrit-Change-Number: 12254
Gerrit-PatchSet: 1
Gerrit-Owner: Adar Dembo <a...@cloudera.com>
Gerrit-Reviewer: Mike Percy <mpe...@apache.org>
Gerrit-Reviewer: helifu <hzhel...@corp.netease.com>
