Jason Dillaman created QPID-4256:
------------------------------------

             Summary: HA failover caused by unresponsive broker during queue cleaner invocation
                 Key: QPID-4256
                 URL: https://issues.apache.org/jira/browse/QPID-4256
             Project: Qpid
          Issue Type: Bug
          Components: C++ Broker
    Affects Versions: 0.18
            Reporter: Jason Dillaman


With the ring queue policy and tens of thousands of messages in a queue, the HA 
primary broker can become unresponsive for long enough to cause a failover.  
The queue cleaner holds the queue lock for the entire cleaning pass, so every 
other worker thread that tries to consume messages from, or deliver messages 
to, that queue blocks until the pass finishes.

Queue cleaner thread:

#0  0x00000039472135ff in std::deque<qpid::broker::QueuedMessage, std::allocator<qpid::broker::QueuedMessage> >::erase(std::_Deque_iterator<qpid::broker::QueuedMessage, qpid::broker::QueuedMessage&, qpid::broker::QueuedMessage*>) () from /usr/lib64/libqpidbroker.so.6
#1  0x000000394720ecdd in qpid::broker::RingQueuePolicy::find(qpid::broker::QueuedMessage const&, std::deque<qpid::broker::QueuedMessage, std::allocator<qpid::broker::QueuedMessage> >&, bool) () from /usr/lib64/libqpidbroker.so.6
#2  0x0000003947210b57 in qpid::broker::RingQueuePolicy::dequeued(qpid::broker::QueuedMessage const&) () from /usr/lib64/libqpidbroker.so.6
#3  0x00000039471f0325 in qpid::broker::Queue::dequeue(qpid::broker::TransactionContext*, qpid::broker::QueuedMessage const&) () from /usr/lib64/libqpidbroker.so.6
#4  0x00000039471f2095 in qpid::broker::Queue::dequeueIf(boost::function1<bool, qpid::broker::QueuedMessage&>, std::deque<qpid::broker::QueuedMessage, std::allocator<qpid::broker::QueuedMessage> >&) () from /usr/lib64/libqpidbroker.so.6
#5  0x00000039471f2a26 in qpid::broker::Queue::purgeExpired(qpid::sys::Duration) () from /usr/lib64/libqpidbroker.so.6
#6  0x0000003947202d58 in qpid::broker::QueueCleaner::fired() () from /usr/lib64/libqpidbroker.so.6
#7  0x0000003948226326 in qpid::sys::Timer::fire(boost::intrusive_ptr<qpid::sys::TimerTask>) () from /usr/lib64/libqpidcommon.so.6
#8  0x00000039482276a9 in qpid::sys::Timer::run() () from /usr/lib64/libqpidcommon.so.6

Other worker threads:

#0  0x00000036f420dff4 in __lll_lock_wait () from /lib64/libpthread.so.0
#1  0x00000036f4209343 in _L_lock_892 () from /lib64/libpthread.so.0
#2  0x00000036f4209227 in pthread_mutex_lock () from /lib64/libpthread.so.0
#3  0x000000394716097a in qpid::sys::Mutex::lock() () from /usr/lib64/libqpidbroker.so.6
#4  0x00000039471f59c6 in qpid::broker::Queue::consumeNextMessage(qpid::broker::QueuedMessage&, boost::shared_ptr<qpid::broker::Consumer>&) () from /usr/lib64/libqpidbroker.so.6
#5  0x00000039471f67ec in qpid::broker::Queue::getNextMessage(qpid::broker::QueuedMessage&, boost::shared_ptr<qpid::broker::Consumer>&) () from /usr/lib64/libqpidbroker.so.6
#6  0x00000039471f687e in qpid::broker::Queue::dispatch(boost::shared_ptr<qpid::broker::Consumer>) () from /usr/lib64/libqpidbroker.so.6
#7  0x000000394722c507 in qpid::broker::SemanticState::ConsumerImpl::doDispatch() () from /usr/lib64/libqpidbroker.so.6
#8  0x0000003947228941 in qpid::broker::SemanticState::ConsumerImpl::doOutput() () from /usr/lib64/libqpidbroker.so.6
#9  0x00000039482195f2 in qpid::sys::AggregateOutput::doOutput() () from /usr/lib64/libqpidcommon.so.6
#10 0x000000394718db49 in qpid::broker::Connection::doOutput() () from /usr/lib64/libqpidbroker.so.6
#11 0x000000394715ce39 in qpid::amqp_0_10::Connection::encode(char const*, unsigned long) () from /usr/lib64/libqpidbroker.so.6
#12 0x000000394821c377 in qpid::sys::AsynchIOHandler::idle(qpid::sys::AsynchIO&) () from /usr/lib64/libqpidcommon.so.6
#13 0x000000394813eba6 in qpid::sys::posix::AsynchIO::writeable(qpid::sys::DispatchHandle&) () from /usr/lib64/libqpidcommon.so.6
#14 0x00000039482225f3 in boost::function1<void, qpid::sys::DispatchHandle&>::operator()(qpid::sys::DispatchHandle&) const () from /usr/lib64/libqpidcommon.so.6
#15 0x000000394821f4be in qpid::sys::DispatchHandle::processEvent(qpid::sys::Poller::EventType) () from /usr/lib64/libqpidcommon.so.6
#16 0x000000394814b08d in qpid::sys::Poller::run() () from /usr/lib64/libqpidcommon.so.6
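
Illustrative sketch (not Qpid code, and not a proposed patch): the traces above 
show one thread erasing from a std::deque under Queue::purgeExpired while the 
other threads wait in Mutex::lock. One general way to bound how long such a 
lock is held is to purge in small batches and release the lock between them. 
The toy below assumes a hypothetical ToyQueue, ToyMessage and 
purgeExpiredInBatches; none of these names exist in the broker, and the real 
Queue and RingQueuePolicy classes are more involved.

```cpp
#include <cstddef>
#include <deque>
#include <mutex>

// Hypothetical stand-ins for illustration only; not Qpid's actual classes.
struct ToyMessage {
    int id;
    bool expired;
};

struct PurgeResult {
    std::size_t removed;  // total expired messages erased
    std::size_t batches;  // number of separate lock acquisitions
};

class ToyQueue {
public:
    void push(const ToyMessage& m) {
        std::lock_guard<std::mutex> guard(lock_);
        messages_.push_back(m);
    }

    std::size_t size() const {
        std::lock_guard<std::mutex> guard(lock_);
        return messages_.size();
    }

    // Erase at most maxBatch expired messages per lock acquisition, so that
    // other threads get a chance to take the lock between batches.
    PurgeResult purgeExpiredInBatches(std::size_t maxBatch) {
        PurgeResult result{0, 0};
        if (maxBatch == 0) return result;
        for (;;) {
            std::size_t removed = 0;
            {
                std::lock_guard<std::mutex> guard(lock_);
                // Rescan from the front each time: the deque may have changed
                // while the lock was released, so no iterator is kept across
                // batches. Erasing from the middle of a deque is itself linear.
                auto it = messages_.begin();
                while (it != messages_.end() && removed < maxBatch) {
                    if (it->expired) {
                        it = messages_.erase(it);
                        ++removed;
                    } else {
                        ++it;
                    }
                }
            }  // lock released here before the next batch
            result.removed += removed;
            ++result.batches;
            if (removed < maxBatch) break;  // nothing more to purge
        }
        return result;
    }

private:
    mutable std::mutex lock_;
    std::deque<ToyMessage> messages_;
};
```

With ten messages of which every other one is expired, calling 
purgeExpiredInBatches(2) removes five messages over three lock acquisitions. 
The trade-off is extra rescanning in exchange for shorter individual lock 
holds; whether that is acceptable for the broker is not something this sketch 
can answer.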

