[jira] Commented: (QPID-2256) cluster_test hangs with threads deadlocked on mutex in DeletionManager.
[ https://issues.apache.org/jira/browse/QPID-2256?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12788259#action_12788259 ]

Andrew Stitcher commented on QPID-2256:
---

This might be related to having client and broker in the same process, since in that one case the DeletionManager would be shared.

cluster_test hangs with threads deadlocked on mutex in DeletionManager.
--

Key: QPID-2256
URL: https://issues.apache.org/jira/browse/QPID-2256
Project: Qpid
Issue Type: Bug
Components: C++ Broker
Affects Versions: 0.6
Reporter: Alan Conway
Assignee: Alan Conway
Priority: Blocker
Fix For: 0.6

Running cluster_test in a loop will fairly quickly result in a deadlock. The test is blocked waiting for a child broker to exit. The broker appears deadlocked around a mutex in DeletionManager. Here are the stack traces of the deadlocked broker:

Thread 10 (Thread 0x414b2940 (LWP 2351)):
#0 0x003da400af70 in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1 0x2af11703c21d in qpid::sys::Condition::wait (this=0x1accd4e0, mut...@0x1accd4b8, absoluteti...@0x1acccae0) at ../../cpp/include/qpid/sys/posix/Condition.h:69
#2 0x2af11703c48b in qpid::sys::Monitor::wait (this=0x1accd4b8, absoluteti...@0x1acccae0) at ../../cpp/include/qpid/sys/Monitor.h:45
#3 0x2af1170396ac in qpid::sys::Timer::run (this=0x1accd4b0) at ../../cpp/src/qpid/sys/Timer.cpp:139
#4 0x2af116f4f7cc in runRunnable (p=0x1accd4b0) at ../../cpp/src/qpid/sys/posix/Thread.cpp:35
#5 0x003da4006617 in start_thread () from /lib64/libpthread.so.0
#6 0x003da34d3c2d in clone () from /lib64/libc.so.6

Thread 9 (Thread 0x4217c940 (LWP 2353)):
#0 0x003da400d2e4 in __lll_lock_wait () from /lib64/libpthread.so.0
#1 0x003da4008c55 in _L_lock_1127 () from /lib64/libpthread.so.0
#2 0x003da4008b53 in pthread_mutex_lock () from /lib64/libpthread.so.0
#3 0x2af116965181 in qpid::sys::Mutex::lock (this=0x2af11730c360) at ../../cpp/include/qpid/sys/posix/Mutex.h:116
#4 0x2af1169651e6 in ScopedLock (this=0x4217bd70, l...@0x2af11730c360) at ../../cpp/include/qpid/sys/Mutex.h:33
#5 0x2af116f5c29a in qpid::sys::DeletionManager<qpid::sys::PollerHandlePrivate>::AllThreadsStatuses::delThreadStatus (this=0x2af11730c360, t=0x1acd4d00) at ../../cpp/src/qpid/sys/DeletionManager.h:148
#6 0x2af116f5c340 in qpid::sys::DeletionManager<qpid::sys::PollerHandlePrivate>::destroyThreadState () at ../../cpp/src/qpid/sys/DeletionManager.h:83
#7 0x2af116f56b52 in qpid::sys::Poller::run (this=0x1acc9b50) at ../../cpp/src/qpid/sys/epoll/EpollPoller.cpp:488
#8 0x2af1170379c2 in qpid::sys::Dispatcher::run (this=0x7fff608801d0) at ../../cpp/src/qpid/sys/Dispatcher.cpp:37
#9 0x2af116f4f7cc in runRunnable (p=0x7fff608801d0) at ../../cpp/src/qpid/sys/posix/Thread.cpp:35
#10 0x003da4006617 in start_thread () from /lib64/libpthread.so.0
#11 0x003da34d3c2d in clone () from /lib64/libc.so.6

Thread 8 (Thread 0x42b7d940 (LWP 2354)):
#0 0x003da400d2e4 in __lll_lock_wait () from /lib64/libpthread.so.0
#1 0x003da4008c55 in _L_lock_1127 () from /lib64/libpthread.so.0
#2 0x003da4008b53 in pthread_mutex_lock () from /lib64/libpthread.so.0
#3 0x2af116965181 in qpid::sys::Mutex::lock (this=0x2af11730c360) at ../../cpp/include/qpid/sys/posix/Mutex.h:116
#4 0x2af1169651e6 in ScopedLock (this=0x42b7cd70, l...@0x2af11730c360) at ../../cpp/include/qpid/sys/Mutex.h:33
#5 0x2af116f5c29a in qpid::sys::DeletionManager<qpid::sys::PollerHandlePrivate>::AllThreadsStatuses::delThreadStatus (this=0x2af11730c360, t=0x1acd6160) at ../../cpp/src/qpid/sys/DeletionManager.h:148
#6 0x2af116f5c340 in qpid::sys::DeletionManager<qpid::sys::PollerHandlePrivate>::destroyThreadState () at ../../cpp/src/qpid/sys/DeletionManager.h:83
#7 0x2af116f56b52 in qpid::sys::Poller::run (this=0x1acc9b50) at ../../cpp/src/qpid/sys/epoll/EpollPoller.cpp:488
#8 0x2af1170379c2 in qpid::sys::Dispatcher::run (this=0x7fff608801d0) at ../../cpp/src/qpid/sys/Dispatcher.cpp:37
#9 0x2af116f4f7cc in runRunnable (p=0x7fff608801d0) at ../../cpp/src/qpid/sys/posix/Thread.cpp:35
#10 0x003da4006617 in start_thread () from /lib64/libpthread.so.0
#11 0x003da34d3c2d in clone () from /lib64/libc.so.6

Thread 7 (Thread 0x4357e940 (LWP 2355)):
#0 0x003da400d2e4 in __lll_lock_wait () from /lib64/libpthread.so.0
#1 0x003da4008c55 in _L_lock_1127 () from /lib64/libpthread.so.0
#2 0x003da4008b53 in pthread_mutex_lock () from /lib64/libpthread.so.0
#3 0x2af116965181 in qpid::sys::Mutex::lock (this=0x1acd4bc0) at
[jira] Commented: (QPID-2256) cluster_test hangs with threads deadlocked on mutex in DeletionManager.
[ https://issues.apache.org/jira/browse/QPID-2256?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12788275#action_12788275 ]

Alan Conway commented on QPID-2256:
---

There is a lock hierarchy error causing the deadlock, as follows:

# threads 1-6,8,9
DeletionManager::destroyThreadState    threadStatus->lock
AllThreadsStatuses::delThreadStatus    lock

# thread 7
AllThreadsStatuses::handleAdder        ptr->lock
AllThreadsStatuses::addHandle          lock

Here threadStatus == ptr, and the same AllThreadsStatuses instance is involved in all cases.

cluster_test hangs with threads deadlocked on mutex in DeletionManager.
--

Key: QPID-2256
URL: https://issues.apache.org/jira/browse/QPID-2256
Project: Qpid
Issue Type: Bug
Components: C++ Broker
Affects Versions: 0.6
Reporter: Alan Conway
Assignee: Alan Conway
Priority: Blocker
Fix For: 0.6

Running cluster_test in a loop will fairly quickly result in a deadlock. The test is blocked waiting for a child broker to exit.
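The lock hierarchy error described in Alan Conway's comment can be illustrated with a small sketch. As read from the comment and the traces, the threads in destroyThreadState take a per-thread status lock and then the shared AllThreadsStatuses lock, while thread 7 in addHandle takes the shared lock and then a per-thread lock, so each side can end up waiting on a lock the other holds. The code below is not the Qpid code and not necessarily the fix that was applied: the types are hypothetical stand-ins with invented fields, and runDemo is an illustrative helper. It shows one common remedy, which is to take the two locks in the same order (shared lock first) on every path.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <memory>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical stand-in for a per-thread status record.
struct ThreadStatus {
    std::mutex lock;   // plays the role of threadStatus->lock / ptr->lock
    int handles = 0;
};

// Hypothetical stand-in for the shared registry of per-thread records.
struct AllThreadsStatuses {
    std::mutex lock;                        // registry-wide lock
    std::vector<ThreadStatus*> statuses;

    // Shared lock first, then each per-thread lock.
    void addHandle() {
        std::lock_guard<std::mutex> all(lock);
        for (ThreadStatus* s : statuses) {
            std::lock_guard<std::mutex> one(s->lock);
            ++s->handles;
        }
    }

    // Same order here. Acquiring status->lock in the caller before calling
    // this function would recreate the inversion from the issue.
    void delThreadStatus(ThreadStatus* status) {
        assert(status != nullptr);
        std::lock_guard<std::mutex> all(lock);
        std::lock_guard<std::mutex> one(status->lock);
        statuses.erase(std::remove(statuses.begin(), statuses.end(), status),
                       statuses.end());
    }
};

// Illustrative driver: each worker adds handles, then deregisters itself.
// Returns the number of statuses still registered once all workers exit.
std::size_t runDemo(int workers, int rounds) {
    AllThreadsStatuses all;
    std::vector<std::unique_ptr<ThreadStatus>> owned;
    for (int i = 0; i < workers; ++i) {
        owned.push_back(std::make_unique<ThreadStatus>());
        all.statuses.push_back(owned.back().get());
    }
    std::vector<std::thread> threads;
    for (int i = 0; i < workers; ++i) {
        ThreadStatus* mine = owned[i].get();
        threads.emplace_back([&all, mine, rounds] {
            for (int r = 0; r < rounds; ++r) all.addHandle();
            all.delThreadStatus(mine);
        });
    }
    for (std::thread& t : threads) t.join();
    return all.statuses.size();
}
```

Calling runDemo(4, 1000) should return 0 without hanging, because no thread ever holds a per-thread lock while waiting for the shared one. Where a fixed order is hard to enforce, std::lock or std::scoped_lock (C++17) can acquire both mutexes with built-in deadlock avoidance.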