On 03/09/2012 04:10 PM, Fraser Adams wrote:
On 09/03/12 14:22, Gordon Sim wrote:

Have you tracked (or can you track) the queue depth, connection and session
stats for the broker exhibiting the problem? Anything you can think of
that might correlate with the rate of growth (e.g. does it look like
it's per message)?

Unfortunately nothing looks especially anomalous. As I mentioned earlier
in this thread we did have what I suspect was a dodgy network switch
which caused the producer host NIC to negotiate half-duplex. That hosed
network performance and caused things to back up on the (circular) queue
and was also causing regular reconnections (not mega-fast, ~every 10
minutes) so initially I wondered if that was where the problem may lie,
however since we sorted the network problem it has been running as fast
as I'd expect with nothing obvious in the logs (no reconnections now)
and a queue that now stays pretty close to empty.

As I say below we've got a 0.8 qpid::client producer delivering to
amq.match on a broker co-located on the same host which is federated to
another 0.8 broker (all brokers are c++) via a source queue route.

That is all that's happening on the problem broker?

Yeah the problem broker doesn't have much going on at all. It has a
single queue, the producer is writing to amq.match and there are a
couple of matches in the x-match-all binding. The queue is sized to 1G
the server has 24GB. The queue is a basic in-memory ring queue so no
persistence or anything going on.
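
For anyone following the thread, the ring behaviour described above (a size-bounded in-memory queue that discards its oldest messages rather than growing) can be sketched roughly as below. This is only an illustration under my own assumptions: RingQueue, push, pop, bytes and dropped are hypothetical names, and this is not how the qpid broker implements ring queues.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <string>

// Hypothetical illustration of a byte-bounded ring queue.
// Not the qpid broker's implementation.
class RingQueue {
public:
    explicit RingQueue(std::size_t maxBytes) : maxBytes_(maxBytes) {
        assert(maxBytes > 0);
    }

    // Enqueue a message, evicting the oldest messages until it fits.
    // Returns false if the message alone exceeds the queue limit.
    bool push(const std::string& msg) {
        if (msg.size() > maxBytes_) return false;
        while (bytes_ + msg.size() > maxBytes_) {
            bytes_ -= messages_.front().size();
            messages_.pop_front();
            ++dropped_;
        }
        messages_.push_back(msg);
        bytes_ += msg.size();
        return true;
    }

    // Dequeue the oldest remaining message; returns false if empty.
    bool pop(std::string& out) {
        if (messages_.empty()) return false;
        out = messages_.front();
        bytes_ -= out.size();
        messages_.pop_front();
        return true;
    }

    std::size_t bytes() const { return bytes_; }      // payload bytes held
    std::size_t dropped() const { return dropped_; }  // messages evicted

private:
    std::size_t maxBytes_;
    std::size_t bytes_ = 0;
    std::size_t dropped_ = 0;
    std::deque<std::string> messages_;
};
```

The point of the sketch is that payload held by such a queue never exceeds its configured limit, so a queue sized to 1G that stays close to empty would not by itself account for steady memory growth.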

We are using source routes set up with qpid-config -s

That has me thinking I guess that I should try standard routes just to rule
out source routes as a potential problem. I don't suppose you are aware
of any potential issues with source routes (he says clutching at
straws....)

Definitely worth trying as that is the less travelled path and it's always possible there is something we haven't seen before. There is nothing I'm aware of however.

Are you using acknowledgements on your federated bridge?

The producer is a slightly unusual fairly real time thing that mmaps
its main chunk of memory and uses an internal memory pool to optimise
multi-threaded memory access but I believe that it is only grabbing ~8GB
rpm -qv <name> will give you the versioned name of the rpms which may
shed some light...

Is there a list of libraries that qpid has dependencies on available
anywhere so I can compare the info I get from the above command with
what *should* be in place on the host?

There is no formal list. Many (most?) of them are optional also, so it depends what you compiled with. Boost is the key dependency and you do want to make sure you use the same version of boost that you compiled against. Not doing so causes problems, though usually more pronounced than this so I don't suggest that is your issue.

What versions of the qpid rpms and boost are you using on the two boxes?

Nothing that should cause leaks... are you running any RHEL6?
I *think* it might be RHEL5, unfortunately I can't check at the moment
as I'm mailing from home. Why did you ask about RHEL6, are there any
issues with that?

There was an issue with the memory allocation strategy in RHEL6, where the per-thread pools of memory didn't work well in the case of a thread that always worked on producing (hence allocating) and another that did all the consuming (hence freeing).
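
To make the allocation pattern concrete, here is a minimal C++ sketch of one thread that only allocates and another that only frees. It is an illustration of the pattern only: runHandoff and its parameters are hypothetical names, it is not taken from qpid, and it is not claimed to reproduce the RHEL6 behaviour.

```cpp
#include <condition_variable>
#include <cstddef>
#include <memory>
#include <mutex>
#include <queue>
#include <thread>
#include <utility>

// Hypothetical helper: the producer thread allocates `count` buffers of
// `size` bytes and hands them over; the consumer thread frees every one.
// Returns the number of buffers freed by the consumer.
std::size_t runHandoff(std::size_t count, std::size_t size) {
    std::queue<std::unique_ptr<char[]>> pending;
    std::mutex m;
    std::condition_variable cv;
    bool done = false;
    std::size_t freed = 0;  // written only by the consumer, read after join

    std::thread consumer([&] {
        for (;;) {
            std::unique_ptr<char[]> buf;
            {
                std::unique_lock<std::mutex> lock(m);
                cv.wait(lock, [&] { return done || !pending.empty(); });
                if (pending.empty()) return;  // producer finished, queue drained
                buf = std::move(pending.front());
                pending.pop();
            }
            buf.reset();  // memory is released on the consumer thread
            ++freed;
        }
    });

    std::thread producer([&] {
        for (std::size_t i = 0; i < count; ++i) {
            // Memory is always obtained on the producer thread.
            std::unique_ptr<char[]> buf(new char[size]);
            {
                std::lock_guard<std::mutex> lock(m);
                pending.push(std::move(buf));
            }
            cv.notify_one();
        }
        {
            std::lock_guard<std::mutex> lock(m);
            done = true;
        }
        cv.notify_one();
    });

    producer.join();
    consumer.join();
    return freed;
}
```

Every buffer is allocated on one thread and released on the other, which is the case where allocator pools kept per thread may not get their memory back on the thread that needs it.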
