NOTE: mark patchset RFC because "make check" will easily fail; but I didn't
yet dig into why as I'm not familiar with the code paths that triggers, it
can be bugs hidden or something I missed.  So RFC to just have some thoughts.

The first patch converts the new timedwait to use DEBUG_MUTEX paths too.
IMO this one is pretty much wanted.  The second patch add a strict version
of pthread_mutex_unlock() check by making sure the lock is locked first.

This comes from a debugging of migration code where we have had functions
like:

  /* func() must be with lockA held */
  func() {
    ...
    /* Temporarily release the lock */
    qemu_mutex_unlock(lockA);
    ...
    /* Retake the lock */
    qemu_mutex_lock(lockA);
    ...
  }

Since I found that pthread lib is very "friendly" to unpaired unlock and
just silently ignore it, returning 0 as succeed.  It means when func() is
called without lockA held the unlock() above will be ignored, but the
follow up lock() will be real.  Later it will easily cause deadlock
afterwards in func() above because they just don't pair anymore right after
the one ignored unlock().

Since it's harder to know where should we take the lock, it's still easily
to fail the unlock() upon a lock not being held at all, so it's at least
earlier than the deadlock later on lockA in some other thread.

Patch 2 can also be used to further replace [sg]et_iothread_locked(), I
think, then we need to move the "locked" to be outside DEBUG_MUTEX but only
keep the assert() inside.  But we can discuss that later.

Comments welcomed, thanks.

Peter Xu (2):
  qemu-thread: Enable the new timedwait to use DEBUG_MUTEX too
  qemu-thread: Fail hard for suspecious mutex unlocks

 include/qemu/thread-posix.h |  1 +
 util/qemu-thread-common.h   | 10 ++++++++++
 util/qemu-thread-posix.c    |  4 ++--
 3 files changed, 13 insertions(+), 2 deletions(-)

-- 
2.37.3


Reply via email to