Hello Todd Lipcon,

I'd like you to do a code review.  Please visit

    http://gerrit.cloudera.org:8080/4614

to review the following change.

Change subject: mem_tracker: fix race between FindTracker() and destructor
......................................................................

mem_tracker: fix race between FindTracker() and destructor

In very rare cases, it's possible for an interleaving between these two
functions to lead to a recursive lock acquisition of child_trackers_lock_
in the destructor.

For example:
1. The tracker hierarchy contains one parent (P) and one child (C1).
2. Thread 1 creates a second child (C2) parented to P. It has the sole ref
   to C2.
3. Thread 2 calls FindTracker() looking for C1.
4. Thread 2 runs as far as the loop in FindTrackerUnlocked(), getting
   descheduled just as it has locked a ref to C2. It also holds P's
   child_trackers_lock_.
5. Thread 1 is rescheduled and drops its ref to C2.
6. Thread 2 is rescheduled. It also drops its ref to C2, which was the last
   ref, so it runs C2's destructor. This acquires P's child_trackers_lock_,
   which it already owns. Boom.

The fix is simple: when looking up a tracker, make a local copy of the list
of children while holding child_trackers_lock_, then iterate without it.
It gets a little complicated due to the unusual requirements of
FindOrCreateTracker(); I ended up introducing a special static lock to
handle that case. The rest of the changes are basically code motion.

Unfortunately, the new test does not fail reliably without the fix. In my
testing, it failed maybe once every few hundred runs. But it doesn't fail at
all with the fix in place.

Change-Id: I69610f782c7a00d161bfb48d2487c29c0309c985
---
M src/kudu/consensus/log_cache.cc
M src/kudu/util/cache.cc
M src/kudu/util/mem_tracker-test.cc
M src/kudu/util/mem_tracker.cc
M src/kudu/util/mem_tracker.h
5 files changed, 85 insertions(+), 61 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/14/4614/1
-- 
To view, visit http://gerrit.cloudera.org:8080/4614
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: I69610f782c7a00d161bfb48d2487c29c0309c985
Gerrit-PatchSet: 1
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Adar Dembo <a...@cloudera.com>
Gerrit-Reviewer: Todd Lipcon <t...@apache.org>

Reply via email to