On Sep 23, 2011, at 9:11 AM, Fraser Adams wrote:

> I'll mention that to the guys when I get back to the office. Though it seems a bit counterintuitive to me; I'd have thought that having a lower number of worker threads wouldn't utilise the available cores. By "logic", running two (or even eight) worker threads on your 48 core server seems low - any idea what's going on to explain your results??
I can't say for sure, but I would guess there's maybe more lock contention going on in the broker when you have more threads.

> So have you reproduced the MRG paper results? That paper, which is over three years old now, has figures of 380,000 256-octet messages in plus out on a 2 x 4 core Xeon box. We've not come *close* to that figure and my developers are far from dummies. The paper describes the methodology quite well, but doesn't quite spell out as a tutorial exactly what the setup was.

What numbers are you getting, and how are you testing?

> I don't suppose you (or anyone else) have any help on the other part of my question about consumer flow control??

I'm not too familiar with consumer flow control, unless you're talking about using a prefetch capacity on a receiver (there's a rough sketch of what I mean appended at the foot of this mail).

Andy

> Cheers,
> Frase
>
> Andy Goldstein wrote:
>> As an experiment, try lowering the # of worker threads for the broker. For example, we saw an order of magnitude increase in performance when we dropped worker threads from 8 to 2 (on a 48-core server). Our test involved creating a ring queue with a max queue count of 250,000 messages. We pre-filled the queue with 259-byte messages, and then had a multi-threaded client start at least 3 threads, 1 connection/session/sender per thread, and had them try to send as many 259-byte messages/second as possible. Decreasing the # of worker threads in the broker gave us better throughput.
>>
>> Andy
>>
>> On Sep 23, 2011, at 8:05 AM, Fraser Adams wrote:
>>
>>> Hi Andy,
>>> I'm afraid that I can't tell you for sure as I'm doing this a bit by "remote control" (I've tasked some of my developers to try and replicate the MRG whitepaper throughput results to give us a baseline top level performance figure).
>>>
>>> However, when I last spoke to them they had tried sending a load of ~900 octet messages to a ring queue set to 2GB, but to rule out any memory issues (shouldn't be an issue, as the box has 24GB) they have also tried with a ring queue of the default size of 100MB - they got the same problem, it just happened a lot sooner, obviously.
>>>
>>> Fraser
>>>
>>> Andy Goldstein wrote:
>>>> Hi Fraser,
>>>>
>>>> How many messages can the ring queue hold before it starts dropping old messages to make room for new ones?
>>>>
>>>> Andy
>>>>
>>>> On Sep 23, 2011, at 5:21 AM, Fraser Adams wrote:
>>>>
>>>>> Hello all,
>>>>> I was chatting to some colleagues yesterday who are trying to do some stress testing and have noticed some weird results.
>>>>>
>>>>> I'm afraid I've not personally reproduced this yet, but I wanted to post on a Friday whilst the list was more active.
>>>>>
>>>>> The set up is firing off messages of ~900 octets in size into a queue with a ring limit policy, and I'm pretty sure they are using Qpid 0.8.
>>>>>
>>>>> As I understand it they have a few producers and consumers, and the "steady state" message rate is OKish, but if they kill off a couple of consumers to force the queue to start filling, what seems to happen (as described to me) is that when the (ring) queue fills up to its limit (and I guess starts overwriting) the consumer rate plummets massively.
>>>>>
>>>>> As I say, I've not personally tried this yet, but as it happens another colleague was doing something independently and he reported something similar.
>>>>> He was using the C++ qpid::client API and, from what I can gather, did a bit of digging and found a command to disable consumer flow control, which seemed to solve his particular issue.
>>>>>
>>>>> Do the scenarios above sound like flow control issues? I'm afraid I've not looked much at this, and the only documentation I can find relates to the producer flow control feature introduced in 0.10, which isn't applicable here as a) the issues were seen on a 0.8 broker and b) as far as the doc goes, producer flow control isn't applied on ring queues.
>>>>>
>>>>> The colleague who did the tinkering on qpid::client I believe figured it out from the low-level doxygen API documentation, but I've not seen anything in the higher level documents and I've certainly not seen anything in the qpid::messaging or JMS stuff (which is mostly where my own experience comes from). I'd definitely like to be able to disable it from Java and qpid::messaging too.
>>>>>
>>>>> I'd appreciate a brain dump of distilled flow control knowledge that I can pass on if that's possible!!!
>>>>>
>>>>> As an aside, another thing seemed slightly weird to me. My colleagues are running on a 16 core Linux box and the worker threads are set to 17, as expected. However, despite running with (I think) 8 producers and 32 consumers, the CPU usage reported by top maxes out at 113%. That seems massively low on a 16 core box; I'd have hoped to see a much higher message rate than they are actually seeing, and the CPU usage getting closer to 1600%. Is there something "special" that needs to be done to make best use of a nice big multicore Xeon box? IIRC the MRG whitepaper mentions "Use taskset to start qpid-daemon on all cpus". This isn't something I'm familiar with, but it looks like it relates to CPU affinity; to my mind that doesn't account for maxing out at only a fraction of the available CPU capacity (it's not network bound, BTW).
>>>>>
>>>>> Are there any tutorials on how to obtain the absolute maximum super turbo message throughput :-) We're not even coming *close* to the figures quoted in the MRG whitepaper despite running on more powerful hardware, so I'm assuming we're doing something wrong unless the MRG figures are massively exaggerated.
>>>>> Many thanks,
>>>>> Frase

---------------------------------------------------------------------
Apache Qpid - AMQP Messaging Implementation
Project: http://qpid.apache.org
Use/Interact: mailto:[email protected]
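For reference, here's a minimal, untested sketch of what the "prefetch capacity on a receiver" mentioned above looks like with the C++ qpid::messaging API, against a ring queue roughly like the one described in this thread. The queue name, ring size and capacity value are illustrative assumptions, not taken from the thread; the broker thread count discussed above is normally set with qpidd's --worker-threads option.

// Hedged sketch (untested): consumer-side prefetch with qpid::messaging.
// Queue name, ring size and capacity are made-up values for illustration.
#include <qpid/messaging/Connection.h>
#include <qpid/messaging/Session.h>
#include <qpid/messaging/Receiver.h>
#include <qpid/messaging/Message.h>
#include <qpid/messaging/Duration.h>

using namespace qpid::messaging;

int main() {
    Connection connection("localhost:5672");
    connection.open();
    Session session = connection.createSession();

    // Ring queue limited to ~100MB; oldest messages are overwritten when full.
    Receiver receiver = session.createReceiver(
        "test-ring-queue; {create: always, node: {x-declare: {arguments:"
        " {'qpid.policy_type': ring, 'qpid.max_size': 104857600}}}}");

    // Prefetch capacity: how many messages the client library may buffer
    // ahead of fetch() calls. The default is 0, i.e. one message at a time.
    receiver.setCapacity(1000);

    Message msg;
    while (receiver.fetch(msg, Duration::SECOND)) {
        // ... process msg ...
        session.acknowledge();
    }

    connection.close();
    return 0;
}

A larger capacity mainly saves a broker round trip per message on the consuming side; whether it helps with the ring-queue stall described above is something that would have to be measured.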

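And a hedged guess at the kind of thing the colleague on the older C++ qpid::client API may have done to "disable consumer flow control": per-subscription flow control can be set to unlimited via SubscriptionSettings. This is an assumption about what was meant; the names are from the 0.8-era qpid::client headers as I remember them, and the code is an untested sketch.

// Hedged sketch (untested): switching off credit-based (consumer) flow
// control for a subscription with the old qpid::client API. The queue
// name is a placeholder.
#include <qpid/client/Connection.h>
#include <qpid/client/Session.h>
#include <qpid/client/SubscriptionManager.h>
#include <qpid/client/LocalQueue.h>
#include <qpid/client/Message.h>
#include <qpid/sys/Time.h>

using namespace qpid::client;

int main() {
    Connection connection;
    connection.open("localhost", 5672);
    Session session = connection.newSession();

    SubscriptionManager subscriptions(session);

    SubscriptionSettings settings;
    // Unlimited message/byte credit: the broker pushes as fast as it can
    // rather than throttling on a credit window.
    settings.flowControl = FlowControl::unlimited();
    // Alternatively, a large but bounded credit window, e.g.:
    // settings.flowControl = FlowControl::messageCredit(10000);

    LocalQueue incoming;
    subscriptions.subscribe(incoming, "test-ring-queue", settings);

    Message msg;
    while (incoming.get(msg, qpid::sys::TIME_SEC)) {
        // ... process msg ...
    }

    connection.close();
    return 0;
}

As far as I know the closest equivalent in qpid::messaging is the receiver capacity shown in the first sketch; I'm not aware of a separate "disable" switch there, so a large capacity is the nearest match.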