Hello, I have experience with very high volume pub/sub using TIBCO Rendezvous (multicast) for an internal monitoring and business analytics system that I led the development of at eBay. That system routinely had over 1 Gbps of data in flight on the datacenter's GigE network, with dozens of blade servers publishing and even more blade servers subscribing.
I'm now at a different company, and we're building products that will have a similar architecture, though likely with more modest data volumes. We're using the ActiveMQ 5.2.0 release and the ActiveMQ-CPP 2.2.2 release, and I'm still coming up to speed on the ActiveMQ architecture, configuration, tools, etc.

Over the last couple of weeks I've modified the TopicPublisher and TopicListener examples to determine what level of throughput can be obtained. My modified TopicPublisher spins up multiple connections, each connection publishing to multiple topics. The messages published are BytesMessages that carry an array of 1000 random bytes. I use ScheduledExecutorService.scheduleAtFixedRate() to run tasks that fire every 10 milliseconds; each task sends a burst of messages, and the number of messages in the burst is computed to achieve a desired aggregate publish bandwidth, specified in megabits per second.

I was very pleased to find that with the servers I have available for testing (8-core 1.6 GHz Xeons with 8 GB RAM running CentOS 5.2) I was able to sustain about 500 Mbps of physical data (i.e. including TCP header and OpenWire overhead) from one publisher, through one broker, to one listener, and run this test for hours without any problems. (For those used to thinking in terms of messages per second, this is 50K messages/second with 1 KB messages.) Even better, I can add a second listener, connecting to the broker on a second Ethernet interface, so that the broker delivers a total of ~1 Gbps of data to the two listeners. This is excellent performance and gave me a great deal of confidence that we could use ActiveMQ for our products.

However, I am now trying to write a listener using ActiveMQ-CPP 2.2.2, and finding that it can't come close to the throughput that the Java listener achieves.
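For concreteness, the burst sizing in my publisher works out roughly like this (a Java sketch; the class and method names here are mine, not from the modified example, and this computes payload bandwidth only, excluding wire overhead):

```java
// Sketch of how the publisher sizes its bursts: a task fires every 10 ms
// and sends enough 1000-byte BytesMessages to hit a target aggregate
// payload bandwidth, specified in megabits per second.
public class BurstSizer {
    static final int PERIOD_MS = 10;    // scheduleAtFixedRate() period
    static final int BODY_BYTES = 1000; // random payload per BytesMessage

    // Messages to send per burst to achieve targetMbps of payload data.
    static int messagesPerBurst(double targetMbps) {
        double bytesPerSecond = targetMbps * 1e6 / 8.0;
        double bytesPerBurst = bytesPerSecond * PERIOD_MS / 1000.0;
        return (int) Math.round(bytesPerBurst / BODY_BYTES);
    }

    public static void main(String[] args) {
        // e.g. 400 Mbps of payload ~= 50,000 msgs/sec -> 500 msgs per burst
        System.out.println(messagesPerBurst(400));
    }
}
```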
I started with the SimpleAsyncConsumer sample and modified it to spin up multiple connections, each connection subscribing to a different topic (equivalent to my modified Java TopicListener). The only thing this application does is receive messages as fast as possible and, for each message, use BytesMessage::getBodyLength() to keep a running total of bytes received (again, equivalent to the Java listener). So far, the C++ listener can handle less than a quarter of the data volume that the Java listener can. If I keep the data rate low enough, the C++ listener seems to run fine, but when I push the rate up to 120 Mbps, all three components (publisher, broker, listener) freeze up in less than half a minute. The broker admin console shows greater than 90% of memory in use. Killing the listener and the publisher leaves the broker in the same state, and so far the only remedy I know of is to kill and restart the broker.

I don't yet know if this is purely a "slow consumer" problem, or if the consumer becomes "slow" because it deadlocks (I have a pstack output that I'm going to study today and would be happy to make available). I suspect the latter, since I haven't yet seen any indication of merely slow performance before the lockup happens (though I am not yet looking at advisory messages, which I realize is a major oversight).

FYI, I am currently using the default configuration for the broker, but I do the following at runtime to configure the pub/sub:

In the Java publisher:
1. Sessions are created with AUTO_ACKNOWLEDGE
2. Delivery mode is NON_PERSISTENT
3. Time to live is 10 seconds

In the Java TopicListener:
1. Sessions are created with AUTO_ACKNOWLEDGE
2. The broker URI does not specify any parameters (i.e. it does not specify jms.prefetchPolicy.all)
3. The topic URIs do not specify any parameters (i.e. they do not specify consumer.maximumPendingMessageLimit)

In the C++ consumer:
1. Sessions are created with AUTO_ACKNOWLEDGE
2. The broker URI includes "?jms.prefetchPolicy.all=2000"
3. The topic URIs include "?consumer.maximumPendingMessageLimit=4000"

Note that while both the TopicListener and the SimpleAsyncConsumer originally used asynchronous dispatch, I have modified both to do synchronous receives in their own threads. For the C++ consumer this results in 3 threads per connection, and I have been testing with 8 connections. One experiment I want to do today is to revert to asynchronous dispatch, which I assume will bring me back to 2 threads per connection.

I still have other investigation to do, and it's possible it will yield enough specifics to file a bug report. It's also possible that I'll find I've made some newbie mistake. However, some of the research I've done so far suggests that ActiveMQ-CPP 2.2.1 had known problems similar to these, and at least one known deadlock related to CmsTemplate still exists in the 2.2.2 release.

I am writing because I would appreciate help from AMQ developers, or any experienced users in the AMQ community, who would be interested in checking my work to rule out newbie mistakes. I would be happy to make the source code for my modified examples available to anyone who is interested. Some questions I would like to ask here:

1. What is the right way to configure publishers, brokers, and listeners for high volumes of messages when some data loss is considered entirely acceptable? Suppose a system is allowed only a two-nines (99.0%) SLA (measured monthly) for message delivery if that is what's required to achieve high stability.
2. Can the broker be configured such that it will never deadlock even if a consumer deadlocks?

Thanks,
Jim Lloyd
Principal Architect
Silver Tail Systems Inc.
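P.S. For anyone who wants to check my work before I post the full sources, the consumer-side URIs boil down to this (sketched in Java for brevity; the host, port, and topic name are placeholders, and only the two parameters I listed above are shown):

```java
// Sketch of how the C++ consumer's URIs are assembled: the prefetch limit
// rides on the broker (connection) URI, while the pending-message limit
// rides on each topic URI.
public class ConsumerUris {
    static String brokerUri(String host, int port, int prefetch) {
        return "tcp://" + host + ":" + port
                + "?jms.prefetchPolicy.all=" + prefetch;
    }

    static String topicUri(String topic, int maxPending) {
        return topic + "?consumer.maximumPendingMessageLimit=" + maxPending;
    }

    public static void main(String[] args) {
        System.out.println(brokerUri("localhost", 61616, 2000));
        System.out.println(topicUri("PERF.TEST.0", 4000));
    }
}
```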