Re: System stalling

Gordon Sim Tue, 10 Sep 2013 06:28:54 -0700

On 09/06/2013 02:50 PM, Jimmy Jones wrote:

I've done some further digging, and managed to simplify the system a
little to reproduce the problem. The system is now an external
process that posts messages to the default headers exchange on my
machine, which has a ring queue to receive effectively all messages
from the default headers exchange, process them, and post to another
headers exchange. There is now nothing listening on the subsequent
headers exchange, and all exchanges are non-durable. I've also tried
Fraser's suggestion of marking the link as unreliable on the queue
which seems to have no effect (is there any way in the qpid utilities
to confirm the link has been set to unreliable?)


So essentially what happens is the system happily processes away,
normally with an empty ring queue, sometimes it spikes up a bit and
goes back down again, with my ingest process using ~70% CPU and qpidd
~50% CPU, on a machine with 8 CPU cores. However sometimes the queue
spikes up to 2GB (the max), starts throwing messages away, and qpid
hits 100%+ CPU and the ingest process goes to about 3% CPU. I can see
messages are being very slowly processed.

I've tried attaching to qpidd with gdb a few times, and all threads
apart from one seem to be idle in epoll_wait or pthread_cond_wait.
The running thread always seems to be somewhere under
DispatchHandle::processEvent.

In this simplified system, is the ingest process still blocking on waitForCompletion() in send()?

If so, I think that is the key symptom. That slows down the processing of messages into the ingest process, which in turn causes the producer rate to the input queue to exceed the consume rate, the queue backs up and then messages need to be dropped.
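To illustrate that chain of events, here is a toy model in Python. It is only a sketch and not qpid code: the simulate() function, the per-tick rates and the capacity of 100 messages are all made up for illustration, and the ring policy is assumed to discard the oldest message once the queue is full.

```python
from collections import deque


def simulate(produce_rate, consume_rate, capacity, ticks):
    """Toy ring queue (hypothetical model, not the broker's implementation).

    Each tick the producer enqueues produce_rate messages, then the
    consumer dequeues up to consume_rate messages.
    Returns (final queue depth, number of messages dropped).
    """
    queue = deque()
    dropped = 0
    next_id = 0
    for _ in range(ticks):
        for _ in range(produce_rate):
            if len(queue) == capacity:
                # Assumed ring policy: discard the oldest message when full.
                queue.popleft()
                dropped += 1
            queue.append(next_id)
            next_id += 1
        # The consumer takes what it can, limited by its own rate.
        for _ in range(min(consume_rate, len(queue))):
            queue.popleft()
    return len(queue), dropped


# Consumer keeps up: the queue drains every tick.
healthy = simulate(produce_rate=10, consume_rate=12, capacity=100, ticks=50)

# Consumer mostly blocked (e.g. waiting on completions): the queue fills
# to capacity and messages start being discarded.
stalled = simulate(produce_rate=10, consume_rate=1, capacity=100, ticks=50)

print("healthy (depth, dropped):", healthy)
print("stalled (depth, dropped):", stalled)
```

In the second run the consumer's effective rate falls well below the producer's, so the depth climbs to the limit and stays there while messages are thrown away, which matches what you describe seeing.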

The question is why the completions aren't being sent by the broker for the messages resent by the ingest process. You don't have any queues bound to the exchange they are being sent to. Do you have an alternate-exchange specified for that second headers exchange? (And if so, what queues, if any, are bound to that?) What are the stats for that second exchange at the point the problem occurs (qpid-stat -e)? What is the capacity of the sender in the ingest process (or is it left at the default value)? Is it the only sender on the session?

What level of logging do you have on? If you don't have it already, maybe see if you can reproduce with logging at info+, just in case there are any clues there.

When this situation occurs, if you stop the external process, does the system eventually clear itself, or does the ingest process remain blocked once it gets into that situation?

That's rather a lot of questions I'm afraid... just looking for some clue to latch on to.


