Jason Shepherd created AMQ-5077:
-----------------------------------
Summary: Improve performance of ConcurrentStoreAndDispatch
Key: AMQ-5077
URL: https://issues.apache.org/jira/browse/AMQ-5077
Project: ActiveMQ
Issue Type: Wish
Components: Message Store
Affects Versions: 5.9.0
Environment: 5.9.0.redhat-610343
Reporter: Jason Shepherd
Priority: Minor
We have publishers publishing to a topic that has 5 topic -> queue routings,
and the maximum attainable message rate is ~833 messages/sec, with each message
around 5k in size.
To test this, I set up a JMS config with the following topics:
Topic
TopicRouted.1
...
TopicRouted.11
Each topic has an increasing number of routings to queues, and a client is set
up to subscribe to all the queues.
Rough message rates:
routings    messages/sec
0           2500
1           1428
2           2000
3           1428
4           1111
5           833
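As a rough sanity check on the table above, the sketch below multiplies each publish rate by its number of routings. It assumes that every published message is copied once to each routed queue, which the report implies but does not state; the class and method names are hypothetical. Under that assumption the aggregate rate of queue enqueues stays between roughly 4000 and 4450 per second for 2 to 5 routings, which would be consistent with a broker-side ceiling rather than a per-publisher one.

```java
// Aggregate queue-enqueue rate implied by the measured publish rates.
// Assumption: each published message is copied once to every routed queue.
final class RoutingRates {
    // Rows copied from the table: {routings, messages/sec seen by the publisher}.
    static final int[][] MEASURED = {
        {0, 2500}, {1, 1428}, {2, 2000}, {3, 1428}, {4, 1111}, {5, 833}
    };

    /** Queue enqueues per second = publish rate x number of routings. */
    static int enqueuesPerSecond(int routings, int publishRate) {
        return routings * publishRate;
    }

    public static void main(String[] args) {
        for (int[] row : MEASURED) {
            System.out.printf("%d routings: %d msg/s published, %d queue enqueues/s%n",
                    row[0], row[1], enqueuesPerSecond(row[0], row[1]));
        }
    }
}
```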
This occurs whether producerFlowControl in the broker config is set to true or
false, and with KahaDB disk syncing turned off. We also tried
experimenting with concurrentStoreAndDispatch, but that didn't seem to help.
LevelDB didn't give any notable performance improvement either.
We also have asyncSend enabled on the producer, and have a requirement to use
persistent messages. We have also experimented with sending messages in a
transaction, but that hasn't really helped.
It seems like producer throughput rate across all queue destinations, all
connections and all publisher machines is limited by something on the broker,
through a mechanism which is not producer flow control. I think the prime
suspect is still contention on the index.
We ran some tests with the YourKit profiler.
The profiler was attached to the broker at startup and allowed to run, and then
a topic publisher was started, routing to 5 queues.
Profiler statistics were reset, the publisher was allowed to run for 60 seconds,
and then a profiling snapshot was taken. During that time, ~9600 messages were
logged as being sent, for a rate of ~160/sec.
This ties in roughly with the invocation counts recorded in the snapshot (I
think) - ~43k calls.
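The arithmetic behind that comparison can be sketched as follows. It assumes one store call per message per routed queue, which is a guess about what the invocation count measures; the class and method names are hypothetical.

```java
// Back-of-the-envelope check of the 60 second profiling window.
// Assumption: one store invocation per message per routed queue.
final class SampleWindow {
    /** Average publish rate over the sample window. */
    static double ratePerSecond(long messages, long seconds) {
        return (double) messages / seconds;
    }

    /** Store invocations expected if every message is added to every routed queue. */
    static long expectedQueueAdds(long messages, int routings) {
        return messages * routings;
    }

    public static void main(String[] args) {
        long messages = 9600;  // messages logged as sent during the window
        long seconds = 60;     // length of the window
        int routings = 5;      // queues the topic routes to

        System.out.printf("publish rate: %.0f msg/s%n", ratePerSecond(messages, seconds));
        System.out.printf("expected queue adds: %d (snapshot shows ~43k calls)%n",
                expectedQueueAdds(messages, routings));
    }
}
```

The expected figure of 48000 is in the same range as the ~43k calls in the snapshot, so the two numbers agree only roughly.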
From what I can work out from the snapshot (filtering out everything but
org.apache.activemq.store.kahadb), for the 60 second sample period:
24.8 seconds elapsed in
org.apache.activemq.store.kahadb.KahaDbTransactionStore$1.removeAsyncMessage(ConnectionContext, MessageAck).
18.3 seconds elapsed in
org.apache.activemq.store.kahadb.KahaDbTransactionStore$1.asyncAddQueueMessage(ConnectionContext, Message, boolean).
Of these, a further large portion of the time is spent inside MessageDatabase:
org.apache.activemq.store.kahadb.MessageDatabase.process(KahaRemoveMessageCommand, Location) - 10 secs elapsed
org.apache.activemq.store.kahadb.MessageDatabase.process(KahaAddMessageCommand, Location) - 8.5 secs elapsed.
As both of these lock on indexLock.writeLock(), and both take place on the NIO
transport threads, I think this accounts for at least some of the message
throughput limits. As messages are added to and removed from the index one by
one, regardless of sync type settings, this adds a fair amount of overhead.
While we're not synchronising on writes to disk, we are performing work on the
NIO worker thread which can block on locks, and could account for the behaviour
we've seen client side.
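To illustrate the pattern described above, here is a toy model of an index guarded by a single write lock. It is not KahaDB code: ToyIndex, addOne and addBatch are hypothetical names, and the batched variant is included only as a contrast, not as a claim about what KahaDB supports or how the problem should be fixed.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.locks.ReentrantReadWriteLock;

/** Toy stand-in for a store index guarded by one write lock. Not KahaDB code. */
final class ToyIndex {
    private final ReentrantReadWriteLock indexLock = new ReentrantReadWriteLock();
    private final Map<String, Long> index = new HashMap<>();
    private long writeLockAcquisitions;  // only touched while holding the write lock

    /** One write-lock acquisition per message, as described in the report. */
    void addOne(String messageId, long location) {
        indexLock.writeLock().lock();
        try {
            writeLockAcquisitions++;
            index.put(messageId, location);
        } finally {
            indexLock.writeLock().unlock();
        }
    }

    /** Hypothetical contrast: one write-lock acquisition for a whole batch. */
    void addBatch(List<String> messageIds, long firstLocation) {
        indexLock.writeLock().lock();
        try {
            writeLockAcquisitions++;
            long location = firstLocation;
            for (String messageId : messageIds) {
                index.put(messageId, location++);
            }
        } finally {
            indexLock.writeLock().unlock();
        }
    }

    int size() {
        indexLock.readLock().lock();
        try {
            return index.size();
        } finally {
            indexLock.readLock().unlock();
        }
    }

    long writeLockAcquisitions() {
        indexLock.readLock().lock();
        try {
            return writeLockAcquisitions;
        } finally {
            indexLock.readLock().unlock();
        }
    }

    public static void main(String[] args) throws InterruptedException {
        ToyIndex perMessage = new ToyIndex();
        ToyIndex batched = new ToyIndex();
        Thread[] workers = new Thread[4];  // stand-ins for transport threads
        for (int t = 0; t < workers.length; t++) {
            final int id = t;
            workers[t] = new Thread(() -> {
                List<String> batch = new ArrayList<>();
                for (int i = 0; i < 10_000; i++) {
                    String messageId = "t" + id + "-" + i;
                    perMessage.addOne(messageId, i);
                    batch.add(messageId);
                    if (batch.size() == 100) {
                        batched.addBatch(batch, i - 99);
                        batch.clear();
                    }
                }
            });
            workers[t].start();
        }
        for (Thread worker : workers) {
            worker.join();
        }
        System.out.println("per-message lock acquisitions: " + perMessage.writeLockAcquisitions());
        System.out.println("batched lock acquisitions:     " + batched.writeLockAcquisitions());
    }
}
```

Both indexes end up with the same 40000 entries, but the per-message path takes the write lock 40000 times against 400 for the batched path. Every acquisition is a point where a worker thread can block behind another, which is the kind of stall the profile suggests.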
To Reproduce:
1. Install a broker and use the attached configuration.
2. Use the 5.8.0 example ant script to consume from the queues,
TopicQueueRouted.1 - 5, e.g.:
ant consumer -Durl=tcp://localhost:61616 -Dsubject=TopicQueueRouted.1 -Duser=admin -Dpassword=admin -Dmax=-1
3. Use the modified version of the 5.8.0 example ant script (attached) to send
messages to topics, TopicRouted.1 - 5, e.g.:
ant producer -Durl='tcp://localhost:61616?jms.useAsyncSend=true&wireFormat.tightEncodingEnabled=false&keepAlive=true&wireFormat.maxInactivityDuration=60000&socketBufferSize=32768' -Dsubject=TopicRouted.1 -Duser=admin -Dpassword=admin -Dmax=1 -Dtopic=true -DsleepTime=0 -Dmax=10000 -DmessageSize=5000
This modified version of the script prints the number of messages per second
to the console.
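The attached script is not reproduced here. As a minimal sketch of how such a per-second rate could be measured, assuming the producer calls a counter once per sent message, something like the following would work. RateMeter and record are hypothetical names and are not taken from the attachment; the timestamps in main are simulated rather than read from a real producer.

```java
/** Hypothetical per-second rate counter; not the code from the attached script. */
final class RateMeter {
    private static final long ONE_SECOND_NANOS = 1_000_000_000L;

    private long windowStartNanos;
    private long countInWindow;

    RateMeter(long startNanos) {
        this.windowStartNanos = startNanos;
    }

    /**
     * Records one sent message. Returns the rate in messages/sec once at least
     * one second has elapsed since the window started, otherwise -1.
     */
    double record(long nowNanos) {
        countInWindow++;
        long elapsed = nowNanos - windowStartNanos;
        if (elapsed < ONE_SECOND_NANOS) {
            return -1;
        }
        double rate = countInWindow * 1e9 / elapsed;
        windowStartNanos = nowNanos;  // start a new window
        countInWindow = 0;
        return rate;
    }

    public static void main(String[] args) {
        // Simulated timestamps: one send every 1.2 ms instead of a real producer.
        RateMeter meter = new RateMeter(0);
        for (int i = 1; i <= 10_000; i++) {
            double rate = meter.record(i * 1_200_000L);
            if (rate >= 0) {
                System.out.printf("%.0f messages/sec%n", rate);
            }
        }
    }
}
```

In a real producer the argument to record would be System.nanoTime(), called right after each send.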