Thanks, Tim. I'm in the process of subscribing to advisories. I'd like to have at least that done before I post the code, which I still hope to do before the end of the day today.

-Jim

On Thu, Dec 4, 2008 at 1:13 PM, Timothy Bish <[EMAIL PROTECTED]> wrote:

> Feel free to post or send any code that you would like reviewed in
> regards to the modified CPP examples, I will take a look and let you know
> if I see any obvious gotchas. The deadlock that is currently known
> seems to only crop up when using CmsTemplate and only at shutdown, so if
> it is deadlocking on high volume it's probably something new that we
> haven't seen yet.
>
> Obviously if you can come up with some samples that can lock up the
> client those would be invaluable in finding the root cause.
>
> Regards
> Tim
>
> On Thu, 2008-12-04 at 11:55 -0800, Jim Lloyd wrote:
> > Hello,
> >
> > I have experience with very high volume pub/sub using Tibco Rendezvous
> > (multicast) for an internal monitoring & business analytics system that
> > I led the development of at eBay. That system routinely had over 1Gbps
> > of data in flight on the datacenter's GigE network, with dozens of blade
> > servers publishing, and even more blade servers subscribing.
> >
> > I'm now at a different company, and we're building products that will
> > have a similar architecture, though likely more modest data volumes.
> > We're using the ActiveMQ 5.2.0 release and ActiveMQ-CPP 2.2.2 release.
> > I'm still coming up to speed on the ActiveMQ architecture,
> > configuration, tools, etc. Over the last couple weeks I've modified the
> > TopicPublisher and TopicListener examples to determine what level of
> > throughput can be obtained.
> >
> > My modified TopicPublisher spins up multiple connections, each
> > connection publishing to multiple topics. The messages published are
> > BytesMessages that simply have an array of 1000 random bytes. I use a
> > ScheduledExecutorService.scheduleAtFixedRate() to run tasks that are
> > triggered every 10 milliseconds. The tasks send a burst of messages.
> > The number of messages in the burst is computed to achieve a desired
> > aggregate bandwidth of data published, specified in megabits per second.
> >
> > I was very pleased to find that with the servers I have available for
> > testing (8-core 1.6GHz Xeons with 8GB RAM running CentOS 5.2) I was
> > able to sustain about 500Mbps of physical data (i.e. including TCP
> > header and OpenWire overhead) from one publisher, through one broker,
> > to one listener, and run this test for hours without any problems. (For
> > those used to thinking in terms of messages per second, this is 50K
> > messages/second with 1K byte messages.) Even better, I can add a second
> > listener, connecting to the broker on a 2nd ethernet interface, such
> > that the broker was delivering a total of ~1Gbps of data to the two
> > listeners. This is excellent performance and gave me a great deal of
> > confidence that we could use ActiveMQ for our products.
> >
> > However, I am now trying to write a listener using ActiveMQ-CPP 2.2.2,
> > and finding that it can't even come close to achieving the throughput
> > that the Java listener achieves. I started with the SimpleAsyncConsumer
> > sample and modified it to spin up multiple connections, with each
> > connection subscribing to a different topic (equivalent to my modified
> > Java TopicListener). The only thing this application does is receive
> > the messages as fast as possible, and for each message use
> > BytesMessage::getBodyLength() to keep a running total of bytes received
> > (again, equivalent to the Java listener).
> >
> > So far, the C++ listener can only handle less than 1/4th of the volume
> > of data that the Java listener can handle. If I keep the data rate low
> > enough, the C++ listener seems to be able to run fine. But when I push
> > the data rate up to 120Mbps, all three components (publisher, broker,
> > listener) freeze up in less than a half minute.
> > The broker admin console shows greater than 90% of the memory in use.
> > Killing the listener and the publisher leaves the broker in the same
> > state, and so far the only solution I am aware of is to kill and
> > restart the broker.
> >
> > I don't yet know if this is purely a "slow consumer" problem, or if the
> > consumer becomes "slow" because it deadlocks (I have a pstack output
> > that I'm going to study today and would be happy to make available). I
> > suspect the latter, since I haven't yet seen any indications of just
> > "slow" performance before the lockup happens (but I am not yet looking
> > at advisory messages, which I realize is a major oversight).
> >
> > FYI, I am currently using the default configuration for the broker, but
> > I do the following at runtime to configure the pub/sub:
> >
> > In the Java publisher:
> >
> >    1. Sessions are created with AUTO_ACKNOWLEDGE
> >    2. Delivery mode is NON_PERSISTENT
> >    3. Time to live is 10 seconds
> >
> > In the Java TopicListener:
> >
> >    1. Sessions are created with AUTO_ACKNOWLEDGE
> >    2. Broker URI does not specify any parameters (i.e. do not specify
> >       jms.prefetchPolicy.all)
> >    3. Topic URIs do not specify any parameters (i.e. do not specify
> >       consumer.maximumPendingMessageLimit)
> >
> > In the C++ Consumer:
> >
> >    1. Sessions are created with AUTO_ACKNOWLEDGE
> >    2. Broker URI includes "?jms.prefetchPolicy.all=2000"
> >    3. Topic URIs include "?consumer.maximumPendingMessageLimit=4000"
> >
> > Note that while both the TopicListener and the SimpleAsyncConsumer used
> > asynchronous dispatch, I have modified both to do synchronous receives
> > in their own threads. For the C++ consumer, this results in 3 threads
> > per connection, and I have been testing with 8 connections. One
> > experiment I want to do today is revert to asynchronous dispatch,
> > assuming this will bring me back to 2 threads per connection.
> >
> > I still have other investigation that I want to do, and it is possible
> > that this investigation will result in being able to provide enough
> > specifics to file a bug report. It's also possible that I'll find that
> > I've made some newbie mistake. However, in some of the research I've
> > done so far I've seen indications that ActiveMQ-CPP 2.2.1 had known
> > problems similar to these, and at least one known deadlock related to
> > CmsTemplate still exists in the 2.2.2 release.
> >
> > I am writing because I would appreciate help from AMQ developers or any
> > experienced users in the AMQ community who would be interested in
> > checking my work to rule out newbie mistakes. I would be happy to make
> > the source code for my modified examples available to anyone that is
> > interested.
> >
> > Some questions I would like to ask here: What is the right way to
> > configure publishers, brokers, and listeners for high volumes of
> > messages when some data loss is considered entirely acceptable? Suppose
> > a system is allowed to have only two nines (99.0%) SLA (measured
> > monthly) for message delivery if that is required to achieve high
> > stability? Can the broker be configured such that it will never
> > deadlock even if a consumer deadlocks?
> >
> > Thanks,
> > Jim Lloyd
> > Principal Architect
> > Silver Tail Systems Inc.
> >
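
For readers following the publisher description in the quoted message (a task fired every 10 milliseconds by ScheduledExecutorService.scheduleAtFixedRate() that sends a burst sized to hit a target rate in megabits per second), here is a minimal sketch of how such a burst size might be computed. It is not the modified TopicPublisher, which was not posted in this message. The names BurstPacer, Sender, messagesPerBurst and start are hypothetical, Sender is only a stand-in for a JMS producer's send call, and the calculation counts payload bits only, ignoring TCP and OpenWire overhead.

```java
import java.util.Random;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;

final class BurstPacer {
    // Hypothetical stand-in for a JMS MessageProducer.send(...) call.
    interface Sender { void send(byte[] payload); }

    // Messages to send on each tick so the payload rate approximates targetMbps.
    // Assumption: only payload bits are counted, no protocol overhead.
    static long messagesPerBurst(double targetMbps, int payloadBytes, long periodMillis) {
        double bitsPerTick = targetMbps * 1_000_000.0 * periodMillis / 1000.0;
        return Math.round(bitsPerTick / (payloadBytes * 8.0));
    }

    // Schedules a fixed-rate task that sends one burst per period.
    static ScheduledExecutorService start(Sender sender, double targetMbps,
                                          int payloadBytes, long periodMillis) {
        long burst = messagesPerBurst(targetMbps, payloadBytes, periodMillis);
        byte[] payload = new byte[payloadBytes];
        new Random().nextBytes(payload); // random body, as in the description
        ScheduledExecutorService pool = Executors.newSingleThreadScheduledExecutor();
        pool.scheduleAtFixedRate(() -> {
            for (long i = 0; i < burst; i++) {
                sender.send(payload);
            }
        }, 0, periodMillis, TimeUnit.MILLISECONDS);
        return pool;
    }

    public static void main(String[] args) throws InterruptedException {
        AtomicLong sent = new AtomicLong();
        // Count messages instead of publishing them to a broker.
        ScheduledExecutorService pool =
                start(p -> sent.incrementAndGet(), 400.0, 1000, 10);
        Thread.sleep(200);
        pool.shutdownNow();
        System.out.println("burst size: " + messagesPerBurst(400.0, 1000, 10));
        System.out.println("messages counted in ~200 ms: " + sent.get());
    }
}
```

Under these assumptions, a 400 Mbps payload target with 1000-byte bodies and a 10 ms period works out to 500 messages per burst, i.e. 50,000 messages per second, which lines up with the "50K messages/second" figure if the quoted 500Mbps is read as payload plus roughly 25% wire overhead. The 120Mbps rate at which the C++ listener locks up corresponds to 150 messages per burst.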