[jira] Commented: (QPID-2256) cluster_test hangs with threads deadlocked on mutex in DeletionManager.

2009-12-09 Thread Andrew Stitcher (JIRA)

[ 
https://issues.apache.org/jira/browse/QPID-2256?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12788259#action_12788259
 ] 

Andrew Stitcher commented on QPID-2256:
---

This might be related to having the client and the broker in the same process, 
since in that one case the DeletionManager would be shared between them.
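
To illustrate why running both in one process matters, here is a minimal sketch, not the Qpid code. It assumes the manager keeps its bookkeeping in static state of a class template, so every user of the same instantiation in a process sees the same object. All names (SketchDeletionManager, SketchHandle, clientSideRegistry, brokerSideRegistry) are hypothetical, and the sketch is deliberately single-threaded.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a deletion manager: the registry is static,
// so there is exactly one per template instantiation in the process.
template <typename H>
class SketchDeletionManager {
public:
    static void markForDeletion(H* handle) { pending().push_back(handle); }
    static std::size_t pendingCount() { return pending().size(); }
    static const void* registryAddress() { return &pending(); }

private:
    static std::vector<H*>& pending() {
        static std::vector<H*> instance;  // shared by every caller in the process
        return instance;
    }
};

struct SketchHandle {};

// Hypothetical "client" and "broker" code paths living in the same process.
inline const void* clientSideRegistry() {
    return SketchDeletionManager<SketchHandle>::registryAddress();
}

inline const void* brokerSideRegistry() {
    return SketchDeletionManager<SketchHandle>::registryAddress();
}
```

Both helper functions return the same address, so any lock guarding that state would be contended by client and broker threads alike.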

 cluster_test hangs with threads deadlocked on mutex in DeletionManager.
 --

 Key: QPID-2256
 URL: https://issues.apache.org/jira/browse/QPID-2256
 Project: Qpid
  Issue Type: Bug
  Components: C++ Broker
Affects Versions: 0.6
Reporter: Alan Conway
Assignee: Alan Conway
Priority: Blocker
 Fix For: 0.6


 Running cluster_test in a loop fairly quickly results in a deadlock. The 
 test is blocked waiting for a child broker to exit. The broker appears to be 
 deadlocked on a mutex in DeletionManager. Here are the stack traces of 
 the deadlocked broker:
 Thread 10 (Thread 0x414b2940 (LWP 2351)):
 #0  0x003da400af70 in pthread_cond_timedwait@@GLIBC_2.3.2 () from 
 /lib64/libpthread.so.0
 #1  0x2af11703c21d in qpid::sys::Condition::wait (this=0x1accd4e0, 
 mut...@0x1accd4b8, absoluteti...@0x1acccae0) at 
 ../../cpp/include/qpid/sys/posix/Condition.h:69
 #2  0x2af11703c48b in qpid::sys::Monitor::wait (this=0x1accd4b8, 
 absoluteti...@0x1acccae0) at ../../cpp/include/qpid/sys/Monitor.h:45
 #3  0x2af1170396ac in qpid::sys::Timer::run (this=0x1accd4b0) at 
 ../../cpp/src/qpid/sys/Timer.cpp:139
 #4  0x2af116f4f7cc in runRunnable (p=0x1accd4b0) at 
 ../../cpp/src/qpid/sys/posix/Thread.cpp:35
 #5  0x003da4006617 in start_thread () from /lib64/libpthread.so.0
 #6  0x003da34d3c2d in clone () from /lib64/libc.so.6
 Thread 9 (Thread 0x4217c940 (LWP 2353)):
 #0  0x003da400d2e4 in __lll_lock_wait () from /lib64/libpthread.so.0
 #1  0x003da4008c55 in _L_lock_1127 () from /lib64/libpthread.so.0
 #2  0x003da4008b53 in pthread_mutex_lock () from /lib64/libpthread.so.0
 #3  0x2af116965181 in qpid::sys::Mutex::lock (this=0x2af11730c360) at 
 ../../cpp/include/qpid/sys/posix/Mutex.h:116
 #4  0x2af1169651e6 in ScopedLock (this=0x4217bd70, l...@0x2af11730c360) 
 at ../../cpp/include/qpid/sys/Mutex.h:33
 #5  0x2af116f5c29a in 
 qpid::sys::DeletionManager<qpid::sys::PollerHandlePrivate>::AllThreadsStatuses::delThreadStatus
  (this=0x2af11730c360, t=0x1acd4d00)
 at ../../cpp/src/qpid/sys/DeletionManager.h:148
 #6  0x2af116f5c340 in 
 qpid::sys::DeletionManager<qpid::sys::PollerHandlePrivate>::destroyThreadState
  () at ../../cpp/src/qpid/sys/DeletionManager.h:83
 #7  0x2af116f56b52 in qpid::sys::Poller::run (this=0x1acc9b50) at 
 ../../cpp/src/qpid/sys/epoll/EpollPoller.cpp:488
 #8  0x2af1170379c2 in qpid::sys::Dispatcher::run (this=0x7fff608801d0) at 
 ../../cpp/src/qpid/sys/Dispatcher.cpp:37
 #9  0x2af116f4f7cc in runRunnable (p=0x7fff608801d0) at 
 ../../cpp/src/qpid/sys/posix/Thread.cpp:35
 #10 0x003da4006617 in start_thread () from /lib64/libpthread.so.0
 #11 0x003da34d3c2d in clone () from /lib64/libc.so.6
 Thread 8 (Thread 0x42b7d940 (LWP 2354)):
 #0  0x003da400d2e4 in __lll_lock_wait () from /lib64/libpthread.so.0
 #1  0x003da4008c55 in _L_lock_1127 () from /lib64/libpthread.so.0
 #2  0x003da4008b53 in pthread_mutex_lock () from /lib64/libpthread.so.0
 #3  0x2af116965181 in qpid::sys::Mutex::lock (this=0x2af11730c360) at 
 ../../cpp/include/qpid/sys/posix/Mutex.h:116
 #4  0x2af1169651e6 in ScopedLock (this=0x42b7cd70, l...@0x2af11730c360) 
 at ../../cpp/include/qpid/sys/Mutex.h:33
 #5  0x2af116f5c29a in 
 qpid::sys::DeletionManager<qpid::sys::PollerHandlePrivate>::AllThreadsStatuses::delThreadStatus
  (this=0x2af11730c360, t=0x1acd6160)
 at ../../cpp/src/qpid/sys/DeletionManager.h:148
 #6  0x2af116f5c340 in 
 qpid::sys::DeletionManager<qpid::sys::PollerHandlePrivate>::destroyThreadState
  () at ../../cpp/src/qpid/sys/DeletionManager.h:83
 #7  0x2af116f56b52 in qpid::sys::Poller::run (this=0x1acc9b50) at 
 ../../cpp/src/qpid/sys/epoll/EpollPoller.cpp:488
 #8  0x2af1170379c2 in qpid::sys::Dispatcher::run (this=0x7fff608801d0) at 
 ../../cpp/src/qpid/sys/Dispatcher.cpp:37
 #9  0x2af116f4f7cc in runRunnable (p=0x7fff608801d0) at 
 ../../cpp/src/qpid/sys/posix/Thread.cpp:35
 #10 0x003da4006617 in start_thread () from /lib64/libpthread.so.0
 #11 0x003da34d3c2d in clone () from /lib64/libc.so.6
 Thread 7 (Thread 0x4357e940 (LWP 2355)):
 #0  0x003da400d2e4 in __lll_lock_wait () from /lib64/libpthread.so.0
 #1  0x003da4008c55 in _L_lock_1127 () from /lib64/libpthread.so.0
 #2  0x003da4008b53 in pthread_mutex_lock () from /lib64/libpthread.so.0
 #3  0x2af116965181 in qpid::sys::Mutex::lock (this=0x1acd4bc0) at 
 

[jira] Commented: (QPID-2256) cluster_test hangs with threads deadlocked on mutex in DeletionManager.

2009-12-09 Thread Alan Conway (JIRA)

[ 
https://issues.apache.org/jira/browse/QPID-2256?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12788275#action_12788275
 ] 

Alan Conway commented on QPID-2256:
---

There is a lock hierarchy error causing the deadlock, as follows:

   # threads 1-6,8,9
   DeletionManager::destroyThreadState threadStatus->lock
   AllThreadsStatuses::delThreadStatus lock

   # thread 7
   AllThreadsStatuses::handleAdder ptr->lock
   AllThreadsStatuses::addHandle lock
   ptr->lock

Here threadStatus == ptr, and the same AllThreadsStatuses instance is involved 
in all cases.
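
One common way to avoid this kind of inversion is to fix a single acquisition order and follow it on every path. The sketch below is not the Qpid code or the eventual fix. It reads the listing above as one path taking the per-thread lock before the registry lock and the other path doing the reverse, and it shows a registry where the order is always registry lock first, then per-thread lock. All names (SketchThreadStatus, SketchAllStatuses, runSketch) are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical per-thread state with its own lock
// (plays the role of `ptr` / `threadStatus` in the listing).
struct SketchThreadStatus {
    std::mutex lock;
    std::vector<int> handles;
};

// Hypothetical registry. Lock order everywhere: registry lock first,
// then a per-thread lock, never the reverse.
class SketchAllStatuses {
public:
    void addThreadStatus(SketchThreadStatus* t) {
        std::lock_guard<std::mutex> l(lock_);
        statuses_.push_back(t);
    }

    void addHandle(int handle) {
        std::lock_guard<std::mutex> l(lock_);            // 1st: registry lock
        for (SketchThreadStatus* s : statuses_) {
            std::lock_guard<std::mutex> sl(s->lock);     // 2nd: per-thread lock
            s->handles.push_back(handle);
        }
    }

    // The caller must not hold t->lock when calling this; doing so would
    // take the two locks in the opposite order to addHandle().
    void delThreadStatus(SketchThreadStatus* t) {
        std::lock_guard<std::mutex> l(lock_);
        statuses_.erase(std::remove(statuses_.begin(), statuses_.end(), t),
                        statuses_.end());
    }

    std::size_t size() {
        std::lock_guard<std::mutex> l(lock_);
        return statuses_.size();
    }

private:
    std::mutex lock_;
    std::vector<SketchThreadStatus*> statuses_;
};

// Stress helper: each worker repeatedly registers, broadcasts a handle,
// and unregisters. Returns the number of statuses left registered.
inline std::size_t runSketch(int workers, int rounds) {
    SketchAllStatuses all;
    std::vector<std::thread> threads;
    for (int i = 0; i < workers; ++i) {
        threads.emplace_back([&all, rounds] {
            for (int r = 0; r < rounds; ++r) {
                SketchThreadStatus status;
                all.addThreadStatus(&status);
                all.addHandle(r);
                all.delThreadStatus(&status);  // status.lock is not held here
            }
        });
    }
    for (std::thread& t : threads) t.join();
    return all.size();
}
```

Because no path ever waits for the registry lock while holding a per-thread lock, the circular wait described above cannot form in this sketch. Whether the same ordering suits the real DeletionManager depends on details not shown in this thread.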

