[jira] [Commented] (AMQ-5077) Improve performance of ConcurrentStoreAndDispatch

2014-04-30 Thread Gary Tully (JIRA)

[ 
https://issues.apache.org/jira/browse/AMQ-5077?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13985636#comment-13985636
 ] 

Gary Tully commented on AMQ-5077:
-

On the store write thread delay: currently a single thread pulls from the 
pending async writes queue (so long as there are > 1 pending writes) - so 
there is an implicit delay when many concurrent producers are relying on a single 
thread. The difficulty was getting multiple writes queued up; a single 
connection can now queue up writes. To improve on this, use multiple concurrent 
producers.
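For illustration, a minimal sketch (not the actual KahaDBStore code) of the single-writer 
pattern described above - the writer only gets a batch when more than one write is 
already queued by the time it wakes up:

{code}
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Illustrative only: one writer thread drains whatever async store tasks have
// accumulated and commits them as a single batch.
public class BatchingStoreWriter implements Runnable {

    interface StoreTask { void write(); }

    private final BlockingQueue<StoreTask> pendingWrites = new LinkedBlockingQueue<>();

    public void enqueue(StoreTask task) {
        pendingWrites.add(task);
    }

    @Override
    public void run() {
        List<StoreTask> batch = new ArrayList<>();
        while (!Thread.currentThread().isInterrupted()) {
            try {
                // block for the first task, then grab everything else already queued
                batch.add(pendingWrites.take());
                pendingWrites.drainTo(batch);
                for (StoreTask task : batch) {
                    task.write();       // append to the journal
                }
                // a real store would sync/flush once here for the whole batch
                batch.clear();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
    }
}
{code}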

 Improve performance of ConcurrentStoreAndDispatch
 -

 Key: AMQ-5077
 URL: https://issues.apache.org/jira/browse/AMQ-5077
 Project: ActiveMQ
  Issue Type: Wish
  Components: Message Store
Affects Versions: 5.9.0
 Environment: 5.9.0.redhat-610343
Reporter: Jason Shepherd
Assignee: Gary Tully
 Attachments: Test combinations.xlsx, compDesPerf.tar.gz, 
 topicRouting.zip


 We have publishers publishing to a topic which has 5 topic -> queue routings, 
 and the maximum message rate attainable is ~833 messages/sec, with each 
 message around 5k in size.
 To test this I set up a JMS config with topics routed to queues:
 Topic
 TopicRouted.1
 ...
 TopicRouted.11
 Each topic has an increasing number of routings to queues, and a client is 
 set up to subscribe to all the queues.
 Rough message rates:
 routings  messages/sec
 0         2500
 1         1428
 2         2000
 3         1428
 4         
 5         833
 This occurs whether the broker config has producerFlowControl set to true or 
 false, and with KahaDB disk syncing turned off. We also tried 
 experimenting with concurrentStoreAndDispatch, but that didn't seem to help. 
 LevelDB didn't give any notable performance improvement either.
 We also have asyncSend enabled on the producer, and have a requirement to use 
 persistent messages. We have also experimented with sending messages in a 
 transaction, but that hasn't really helped.
 It seems like producer throughput rate across all queue destinations, all 
 connections and all publisher machines is limited by something on the broker, 
 through a mechanism which is not producer flow control. I think the prime 
 suspect is still contention on the index.
 We did some tests with the YourKit profiler.
 Profiler was attached to broker at startup, allowed to run and then a topic 
 publisher was started, routing to 5 queues. 
 Profiler statistics were reset, the publisher allowed to run for 60 seconds, 
 and then profiling snapshot was taken. During that time, ~9600 messages were 
 logged as being sent for a rate of ~160/sec.
 This ties in roughly with the invocation counts recorded in the snapshot (I 
 think) - ~43k calls. 
 From what I can work out in the snapshot (filtering everything but 
 org.apache.activemq.store.kahadb), for the 60-second sample period: 
 24.8 seconds elapsed in 
 org.apache.activemq.store.kahadb.KahaDbTransactionStore$1.removeAsyncMessage(ConnectionContext,
  MessageAck).
 18.3 seconds elapsed in 
 org.apache.activemq.store.kahadb.KahaDbTransactionStore$1.asyncAddQueueMessage(ConnectionContext,
  Message, boolean).
 From these, a further large portion of the time is spent inside 
 MessageDatabase:
 org.apache.activemq.store.kahadb.MessageDatabase.process(KahaRemoveMessageCommand,
  Location) - 10 secs elapsed
 org.apache.activemq.store.kahadb.MessageDatabase.process(KahaAddMessageCommand,
  Location) - 8.5 secs elapsed.
 As both of these lock on indexLock.writeLock(), and both take place on the 
 NIO transport threads, I think this accounts for at least some of the message 
 throughput limits. As messages are added and removed from the index one by 
 one, regardless of sync type settings, this adds a fair amount of overhead. 
 While we're not synchronising on writes to disk, we are performing work on 
 the NIO worker thread which can block on locks, and this could account for the 
 behaviour we've seen client side. 
 To Reproduce:
 1. Install a broker and use the attached configuration.
 2. Use the 5.8.0 example ant script to consume from the queues 
 TopicQueueRouted.1 - 5, e.g.:
ant consumer -Durl=tcp://localhost:61616 -Dsubject=TopicQueueRouted.1 
 -Duser=admin -Dpassword=admin -Dmax=-1
 3. Use the modified version of the 5.8.0 example ant script (attached) to send 
 messages to the topics TopicRouted.1 - 5, e.g.:
ant producer 
 -Durl='tcp://localhost:61616?jms.useAsyncSend=true&wireFormat.tightEncodingEnabled=false&keepAlive=true&wireFormat.maxInactivityDuration=6&socketBufferSize=32768'
  -Dsubject=TopicRouted.1 -Duser=admin -Dpassword=admin -Dmax=1 -Dtopic=true 
 -DsleepTime=0 -Dmax=1 -DmessageSize=5000
 This modified version of the script calculates the number of messages per second 
 and prints it to the console.




[jira] [Commented] (AMQ-5077) Improve performance of ConcurrentStoreAndDispatch

2014-04-23 Thread Gary Tully (JIRA)

[ 
https://issues.apache.org/jira/browse/AMQ-5077?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13978299#comment-13978299
 ] 

Gary Tully commented on AMQ-5077:
-

[~rwagg] I added a concurrentSend option to the composite destination and this 
reduces the latency because the writes can be batched.

{code}<compositeTopic name="TopicRouted.5.Embedded" forwardOnly="true" concurrentSend="true"/>{code}
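For reference, a hedged Java sketch of an equivalent programmatic broker configuration. 
The forwardTo queue names are illustrative, and the setConcurrentSend setter assumes a 
broker version that includes the change referenced in this comment; the XML form above 
is what the comment actually describes.

{code}
import java.util.Arrays;

import org.apache.activemq.broker.BrokerService;
import org.apache.activemq.broker.region.DestinationInterceptor;
import org.apache.activemq.broker.region.virtual.CompositeTopic;
import org.apache.activemq.broker.region.virtual.VirtualDestination;
import org.apache.activemq.broker.region.virtual.VirtualDestinationInterceptor;
import org.apache.activemq.command.ActiveMQQueue;

public class ConcurrentSendBrokerConfig {
    public static void main(String[] args) throws Exception {
        CompositeTopic composite = new CompositeTopic();
        composite.setName("TopicRouted.5.Embedded");
        composite.setForwardOnly(true);
        // option discussed in this comment; assumes the AMQ-5077 change is present
        composite.setConcurrentSend(true);
        // illustrative target queues, following the naming used in this ticket
        composite.setForwardTo(Arrays.asList(
                new ActiveMQQueue("TopicQueueRouted.1"),
                new ActiveMQQueue("TopicQueueRouted.2")));

        VirtualDestinationInterceptor interceptor = new VirtualDestinationInterceptor();
        interceptor.setVirtualDestinations(new VirtualDestination[]{composite});

        BrokerService broker = new BrokerService();
        broker.setDestinationInterceptors(new DestinationInterceptor[]{interceptor});
        broker.addConnector("tcp://localhost:61616");
        broker.start();
        broker.waitUntilStopped();
    }
}
{code}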

Note, concurrentStoreAndDispatch gets in the way when there are no consumers, so 
for the graph/test[1] it is disabled. 

With concurrentStoreAndDispatch enabled, the pending write queue can get some 
depth which will allow fast consumers to negate the write.

changes: http://git-wip-us.apache.org/repos/asf/activemq/commit/08bb172f

[1] http://www.chartgo.com/get.do?id=7f99485050


[jira] [Commented] (AMQ-5077) Improve performance of ConcurrentStoreAndDispatch

2014-04-01 Thread Gary Tully (JIRA)

[ 
https://issues.apache.org/jira/browse/AMQ-5077?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13956461#comment-13956461
 ] 

Gary Tully commented on AMQ-5077:
-

ahh sorry, you are correct, the CompositeDestinationFilter is taking on the 
fanout send and it ignores the producerWindow altogether.

So I think there are two ways to improve this in a derivative 
CompositeDestinationFilter (a new one can be provided in xml config) that will 
still allow the sends to be pending and will avoid the need for a routing topic.

1) introduce an executor that can forward in parallel - so we can make better 
use of concurrent store and dispatch and batching to disk for the composite 
fanout.
2) respect the producerWindow for the executor queue, such that lots of pending 
sends can accumulate, allowing producer bursts up to a limit (a rough sketch of 
both ideas follows below).
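A rough sketch of both ideas, using a hypothetical forwarding class; the names here 
are illustrative and this is not the actual CompositeDestinationFilter code:

{code}
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Semaphore;

// Illustrative only: parallel fanout with a bounded number of pending sends,
// standing in for a derivative CompositeDestinationFilter.
public class ParallelFanoutForwarder {

    interface Destination { void send(byte[] message) throws Exception; }

    private final ExecutorService executor = Executors.newFixedThreadPool(4);
    // 2) respect a producer-window-like limit on pending forwarded sends
    private final Semaphore pendingWindow;
    private final List<Destination> forwardTo;

    public ParallelFanoutForwarder(List<Destination> forwardTo, int maxPending) {
        this.forwardTo = forwardTo;
        this.pendingWindow = new Semaphore(maxPending);
    }

    // 1) forward to each destination in parallel so concurrent store and
    // dispatch can batch the resulting disk writes
    public void send(byte[] message) throws InterruptedException {
        for (Destination dest : forwardTo) {
            pendingWindow.acquire();              // block the producer only at the limit
            executor.submit(() -> {
                try {
                    dest.send(message);
                } catch (Exception e) {
                    // a real broker would propagate/handle the failure
                } finally {
                    pendingWindow.release();
                }
            });
        }
    }
}
{code}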


[jira] [Commented] (AMQ-5077) Improve performance of ConcurrentStoreAndDispatch

2014-04-01 Thread Richard Wagg (JIRA)

[ 
https://issues.apache.org/jira/browse/AMQ-5077?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13956570#comment-13956570
 ] 

Richard Wagg commented on AMQ-5077:
---

Hi,
I'm happy that those 2 changes would help the producer in not getting blocked 
by the JMS.

It is also likely to improve throughput to the consumers, but I think we could 
make further improvements there. 
A new executor using concurrent store and dispatch is likely to result in more 
messages in flight to the consumers - but I'm still concerned that the 
performance of this (this being the ultimate queue writes/sec the broker can 
achieve) will be hard to determine, as it will be a function of how quick the 
underlying disk store is as well as the average roundtrip time for the consumer 
to receive, process and ACK each message. If we have a large queue of messages 
being written to the diskstore, and relatively quick consumers, then we can 
optimise away the disk writes - but I'm not sure that this is visible at the 
moment. 

I would still be interested in, I guess, some form of delay queue for the 
diskstore writes - a configurable property for a minimum delay to wait before 
writing messages through to the index/diskstore, which you could benchmark 
against your expected consumer ACK roundtrip time to determine whether you expect 
to be able to optimise away the disk writes completely in most situations. 
If this delay is small enough, and the producer window doesn't get decremented 
until either the diskstore write completes or the consumer ACK arrives, we could 
still have some resilience against message loss with this in place. 
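For illustration, a sketch of the kind of delay queue described above. The class and 
names are hypothetical and no such broker option exists in the source; it simply shows 
the "dispatch now, persist later unless the ACK wins" shape:

{code}
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;

// Illustrative delay queue for store writes: dispatch immediately, schedule the
// index/journal write after a configurable delay, and cancel it if the consumer
// ACK arrives first.
public class DelayedStoreWriter {

    interface StoreWrite { void persist(); }

    private final ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
    private final Map<String, ScheduledFuture<?>> pending = new ConcurrentHashMap<>();
    private final long delayMillis;   // benchmark against the expected consumer ACK roundtrip

    public DelayedStoreWriter(long delayMillis) {
        this.delayMillis = delayMillis;
    }

    public void onDispatch(String messageId, StoreWrite write) {
        pending.put(messageId, scheduler.schedule(() -> {
            pending.remove(messageId);
            write.persist();          // only hit the diskstore if no ACK arrived in time
        }, delayMillis, TimeUnit.MILLISECONDS));
    }

    public void onAck(String messageId) {
        ScheduledFuture<?> task = pending.remove(messageId);
        if (task != null) {
            task.cancel(false);       // ACK beat the write: the disk write is optimised away
        }
    }
}
{code}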

Do you think that would be a useful option, or cause more problems than it 
could solve?

Thanks,
Richard 


[jira] [Commented] (AMQ-5077) Improve performance of ConcurrentStoreAndDispatch

2014-03-31 Thread Richard Wagg (JIRA)

[ 
https://issues.apache.org/jira/browse/AMQ-5077?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13955378#comment-13955378
 ] 

Richard Wagg commented on AMQ-5077:
---

Calling 
connectionFactory.setProducerWindowSize() 
with sizes varying from 10k to 10Mb has no effect on the throughput I can 
attain. All stack traces I take of the producer catch it in code like:
{noformat}
"main" prio=10 tid=0x0bc3b000 nid=0x4109 runnable [0x41ebe000]
   java.lang.Thread.State: RUNNABLE
at java.net.SocketOutputStream.socketWrite0(Native Method)
at java.net.SocketOutputStream.socketWrite(SocketOutputStream.java:113)
at java.net.SocketOutputStream.write(SocketOutputStream.java:159)
at 
org.apache.activemq.transport.tcp.TcpBufferedOutputStream.flush(TcpBufferedOutputStream.java:115)
at java.io.DataOutputStream.flush(DataOutputStream.java:123)
at 
org.apache.activemq.transport.tcp.TcpTransport.oneway(TcpTransport.java:176)
at 
org.apache.activemq.transport.AbstractInactivityMonitor.doOnewaySend(AbstractInactivityMonitor.java:304)
at 
org.apache.activemq.transport.AbstractInactivityMonitor.oneway(AbstractInactivityMonitor.java:286)
at 
org.apache.activemq.transport.TransportFilter.oneway(TransportFilter.java:85)
at 
org.apache.activemq.transport.WireFormatNegotiator.oneway(WireFormatNegotiator.java:104)
at 
org.apache.activemq.transport.failover.FailoverTransport.oneway(FailoverTransport.java:658)
- locked <0x00050f60c5e8> (a java.lang.Object)
at 
org.apache.activemq.transport.MutexTransport.oneway(MutexTransport.java:68)
at 
org.apache.activemq.transport.ResponseCorrelator.oneway(ResponseCorrelator.java:60)
at 
org.apache.activemq.ActiveMQConnection.doAsyncSendPacket(ActiveMQConnection.java:1321)
at 
org.apache.activemq.ActiveMQConnection.asyncSendPacket(ActiveMQConnection.java:1315)
at org.apache.activemq.ActiveMQSession.send(ActiveMQSession.java:1853)
- locked <0x00050f60c668> (a java.lang.Object)
at 
org.apache.activemq.ActiveMQMessageProducer.send(ActiveMQMessageProducer.java:289)
at 
org.apache.activemq.ActiveMQMessageProducer.send(ActiveMQMessageProducer.java:224)
at 
org.apache.activemq.ActiveMQMessageProducerSupport.send(ActiveMQMessageProducerSupport.java:269)

{noformat}


My understanding of flow control & the producer window size (a simplified sketch 
follows after the broker-side notes below): 

Client side: 
- window size is set.
- Before each send, current size of all messages in flight is checked to see if 
window is exceeded. 
- if producerWindow.waitForSpace() doesn't block, then the message is sent. 
- After the message is sent, the producer in flight size is incremented by the 
message size (and decremented when the ack is received). 

Broker side:
- Each queue has a memory limit set, as well as overall memory limit and disk 
store limit. 
- For each message dispatched for a given queue, each of these limits is 
checked. 
- If any limit is reached and sendFailIfNoSpace is set to true, the producer should 
get an exception sent back. 
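A simplified sketch of the client-side window accounting described above. This is not 
the ActiveMQ MemoryUsage/producer window implementation; the names and shape are 
illustrative only:

{code}
// Simplified stand-in for the producer window check: block before each send if
// the unacknowledged bytes in flight would exceed the window, decrement on ACK.
public class ProducerWindow {

    private final int windowSizeBytes;
    private int inFlightBytes;

    public ProducerWindow(int windowSizeBytes) {
        this.windowSizeBytes = windowSizeBytes;
    }

    // called before each async send; mirrors the waitForSpace() check described above
    public synchronized void waitForSpace(int messageSize) throws InterruptedException {
        while (inFlightBytes + messageSize > windowSizeBytes) {
            wait();
        }
        inFlightBytes += messageSize;
    }

    // called when the broker ACK for the send arrives
    public synchronized void decrease(int messageSize) {
        inFlightBytes -= messageSize;
        notifyAll();
    }
}
{code}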

In none of my tests have I caught any thread stuck inside the flow control 
handling logic. In all cases they're inside network code - producer side as 
above, broker side in something like: 
{noformat}
"ActiveMQ NIO Worker 29" daemon prio=10 tid=0x1775d000 nid=0x6a0d runnable [0x4473d000]
   java.lang.Thread.State: WAITING (parking)
at sun.misc.Unsafe.park(Native Method)
- parking to wait for <0x000638b62430> (a 
org.apache.activemq.store.kahadb.KahaDBStore$StoreQueueTask$InnerFutureTask)
at java.util.concurrent.locks.LockSupport.park(LockSupport.java:186)
at java.util.concurrent.FutureTask.awaitDone(FutureTask.java:425)
at java.util.concurrent.FutureTask.get(FutureTask.java:187)
at org.apache.activemq.broker.region.Queue.doMessageSend(Queue.java:942)
at org.apache.activemq.broker.region.Queue.send(Queue.java:727)
at 
org.apache.activemq.broker.region.AbstractRegion.send(AbstractRegion.java:395)
at 
org.apache.activemq.broker.region.RegionBroker.send(RegionBroker.java:441)
at 
org.apache.activemq.broker.jmx.ManagedRegionBroker.send(ManagedRegionBroker.java:297)
at 
org.apache.activemq.broker.region.virtual.CompositeDestinationFilter.send(CompositeDestinationFilter.java:86)
at 
org.apache.activemq.broker.region.AbstractRegion.send(AbstractRegion.java:395)
at 
org.apache.activemq.broker.region.RegionBroker.send(RegionBroker.java:441)
at 
org.apache.activemq.broker.jmx.ManagedRegionBroker.send(ManagedRegionBroker.java:297)
at 
org.apache.activemq.broker.CompositeDestinationBroker.send(CompositeDestinationBroker.java:96)
at 
org.apache.activemq.broker.TransactionBroker.send(TransactionBroker.java:307)
at 

[jira] [Commented] (AMQ-5077) Improve performance of ConcurrentStoreAndDispatch

2014-03-28 Thread Gary Tully (JIRA)

[ 
https://issues.apache.org/jira/browse/AMQ-5077?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13950866#comment-13950866
 ] 

Gary Tully commented on AMQ-5077:
-

With the router, a persistent message to a topic with no durable consumers is 
pass-through - so the messages back up in the subscription pending dispatch.
This is not unlike a send to the virtual topic with a very large 
producerWindow, in that case the messages will back up (to the window limit) 
pending send. 
In both cases the messages are pending in the broker memory, but in the 
producerWindow[1] case, a failover client may retain the messages pending a 
reply so a failover would resend.
It is really a case of where to store the messages in memory and whether they 
need recovery.

[1] org.apache.activemq.ActiveMQConnectionFactory#setProducerWindowSize


[jira] [Commented] (AMQ-5077) Improve performance of ConcurrentStoreAndDispatch

2014-03-28 Thread Richard Wagg (JIRA)

[ 
https://issues.apache.org/jira/browse/AMQ-5077?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13951061#comment-13951061
 ] 

Richard Wagg commented on AMQ-5077:
---

Leaving aside the question of message loss on failover, we have two goals here:
- getting the maximum possible throughput to consumers 
- never blocking/delaying a producer until the JMS hits an understood/visible 
limit (memory/diskstore). 
I need to come up with a better test case to see how a larger 
ProducerWindowSize affects this, but for the moment I don't believe it's 
working as we would want it to. 

Currently the limit on the rate at which we're able to deliver messages from 
producers to consumers is the speed at which the JMS can write/remove messages 
to/from the index & diskstore. 
This happens in such a way that producers block on the send() call. 

org.apache.activemq.store.kahadb.KahaDBStore:
public Future<Object> asyncAddQueueMessage(final ConnectionContext context, 
final Message message)
public void removeAsyncMessage(ConnectionContext context, MessageAck ack)

My reading of the code is that messages can be dispatched before the store task 
has completed, and if the ACK arrives before the store completes, then the 
store operation is cancelled. 
This also implies that a message could be delivered without being written to 
disk. I'm not sure at what point in this process the producer receives the ACK. 
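For illustration, a hedged sketch of that reading - not the actual KahaDBStore 
StoreQueueTask code. The store task is queued for the writer thread, and an ACK that 
arrives first cancels the write entirely:

{code}
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.atomic.AtomicBoolean;

// Illustrative concurrent-store-and-dispatch task: the message is dispatched
// immediately, the store write is queued, and an ACK that arrives before the
// writer thread runs the task cancels the write.
public class StoreAndDispatchTask {

    private final AtomicBoolean done = new AtomicBoolean(false);
    private final Runnable journalWrite;

    public StoreAndDispatchTask(Runnable journalWrite) {
        this.journalWrite = journalWrite;
    }

    /** Called when the consumer ACK arrives; true if the disk write was avoided. */
    public boolean cancelIfNotStored() {
        return done.compareAndSet(false, true);
    }

    /** Called by the store writer thread. */
    public void store() {
        if (done.compareAndSet(false, true)) {
            journalWrite.run();   // only persist if no ACK beat us here
        }
    }

    public static void main(String[] args) throws InterruptedException {
        BlockingQueue<StoreAndDispatchTask> asyncStoreQueue = new LinkedBlockingQueue<>();
        StoreAndDispatchTask task =
                new StoreAndDispatchTask(() -> System.out.println("wrote to journal"));
        asyncStoreQueue.put(task);
        // fast consumer: ACK arrives before the writer drains the queue
        System.out.println("write avoided: " + task.cancelIfNotStored());
        asyncStoreQueue.take().store();   // no-op because the task was already cancelled
    }
}
{code}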

If the consumer were quick enough to receive, process and ACK the messages in 
question, then we'd optimise away the need to ever write to the diskstore, and 
not have an issue here. 
However, our SAN seems to be fast enough, combined with network latency, to 
ensure that the disk writes are nearly always in progress before the ACK 
arrives. 

In this case, all the work to write/remove messages from the diskstore & index, 
and the synchronisation overheads of doing this, happen on the NIO worker 
threads. 
This delays the producers in a way that isn't visible to the producer. Whether 
sending messages sync or async, all the producer code sees is that calling 
send() takes longer. 
My understanding is that increasing the producer send window would allow it to 
keep more messages in flight before it has received ACKs for them - but would 
not help when it's blocked at the network level. 
I'll see if I can come up with a more specific test case that shows the effect 
of varying the producer send window. 

What I think we're looking for is some option along the lines of 
ConcurrentDispatchThenStoreIfNeeded - first dispatch the message, then wait for 
a timeout period, and only persist the message to the diskstore, incurring 
the disk/synchronisation penalties, if an ACK doesn't arrive in time. 
This would use a low (100ms?) timeout, would respect memory limits on the broker 
for total messages in flight, and would allow the producer send rate to scale with 
the slowest consumer receive speed, rather than the sum of all queue writes 
possible on the JMS. 

Current behaviour: 

Producer -> broker with topic -> queue routings -> consumers:
- Producer is blocked by the speed at which the broker can write to all queues. 
Consumers receive messages at the speed the JMS can write. Queue write limit is global. 

Producer -> broker with embedded routing bean -> consumers (router waits for the 
send call to complete before acking the message received): 
- Producer is able to write up to the producer window size. Embedded bean is 
blocked by broker write speed - consumers receive messages at the speed the JMS can 
write. Queue write limit is global. 

Ideal situation: 
Producer -> broker with topic -> queue routings and 
ConcurrentDispatchThenStoreIfNeeded set: 
- Producer is able to write up to the producer window size. Broker is able to 
dispatch to consumers at the consumer receive rate limit, only writing to disk if 
consumers slow down. 

Does that make sense, or do you think I'm misunderstanding the issue here? 

Thanks,
Richard


[jira] [Commented] (AMQ-5077) Improve performance of ConcurrentStoreAndDispatch

2014-02-28 Thread Gary Tully (JIRA)

[ 
https://issues.apache.org/jira/browse/AMQ-5077?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13915908#comment-13915908
 ] 

Gary Tully commented on AMQ-5077:
-

Some thoughts - without having looked at the particulars.
For async producers - are you setting a producerWindow? That should allow 
pending sends to accumulate broker side.
I think the root problem is ack contention for the index lock - some sort of 
ack batching would help there.
One other thought - for composite dests we could send to each dest in 
parallel - I don't think we do at the moment.
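For reference, a minimal client-side sketch of setting a producer window alongside 
async sends; the window size, credentials, and destination are illustrative:

{code}
import javax.jms.Connection;
import javax.jms.MessageProducer;
import javax.jms.Session;

import org.apache.activemq.ActiveMQConnectionFactory;
import org.apache.activemq.command.ActiveMQTopic;

public class AsyncProducerWindowExample {
    public static void main(String[] args) throws Exception {
        ActiveMQConnectionFactory factory =
                new ActiveMQConnectionFactory("tcp://localhost:61616?jms.useAsyncSend=true");
        // allow up to ~1MB of unacknowledged async sends to accumulate broker side
        factory.setProducerWindowSize(1024 * 1024);

        Connection connection = factory.createConnection("admin", "admin");
        connection.start();
        Session session = connection.createSession(false, Session.AUTO_ACKNOWLEDGE);
        MessageProducer producer = session.createProducer(new ActiveMQTopic("TopicRouted.5"));
        producer.send(session.createTextMessage("test"));
        connection.close();
    }
}
{code}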



[jira] [Commented] (AMQ-5077) Improve performance of ConcurrentStoreAndDispatch

2014-02-28 Thread Richard Wagg (JIRA)

[ 
https://issues.apache.org/jira/browse/AMQ-5077?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13915975#comment-13915975
 ] 

Richard Wagg commented on AMQ-5077:
---

Hi,
We weren't setting a producer window before, so the default was used. 
I have tried some quick tests now - setting abnormally high/low sizes has no 
effect (1k, 10k, 100k, 1mb, 10mb).
Is this used if flow control is disabled? 

Client side, we've seen calls blocked at the network level, rather than in any AMQ 
specific code - I think that, regardless of the window size, it's not been hit at the 
time the network call is blocked. I've been using mostly NMS clients for the 
testing, as we'd thought the problem was initially in the NMS library - I'll get 
a Java test set up next week and go through the code path it takes and where 
the time is spent waiting. 

I think we can either say that singular acks contending on the index lock are a 
problem, or that we write the messages to the index too quickly in the first 
place. If we just want to handle the ideal case, where consumers are able to 
ACK the message near-instantaneously, then we could remove the need to write 
the messages and then remove them in short order from the index, just by 
delaying the writes by the roundtrip time from consumer to broker. It might 
reduce durability, but we've already accepted that tradeoff with other settings 
- so it makes sense to me to try and optimise away the index writes/contention, 
rather than attempt to batch up the acks, which might cause less contention but 
pretty much guarantees that the messages are written in the first place. 

Thanks,
Richard


[jira] [Commented] (AMQ-5077) Improve performance of ConcurrentStoreAndDispatch

2014-02-27 Thread Richard Wagg (JIRA)

[ 
https://issues.apache.org/jira/browse/AMQ-5077?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13914756#comment-13914756
 ] 

Richard Wagg commented on AMQ-5077:
---

Hi,
Attached are some more test configurations I've tried, and the results. 
Nothing seems to massively affect the throughput for better or worse. 
In none of the tests have the consumers had any message backlog. 

To try and sum up the problem: 
- Under normal operation we want our producers to remain unblocked for as long 
as possible (flow control is fine, but that to me means that the producer works 
uninterrupted up until a memory/disk limit is reached, then PFC kicks in, 
rather than a gradual degradation).
- Clients all run in CLIENT_ACK mode - messages are ACKed one by one. 
- Both the diskstore and the network are relatively quick - tests running 
against topics show a roundtrip time of ~0.740ms (producer -> broker -> 
consumer -> broker -> producer reply). 
- The ability of the producers to send messages is currently limited by some 
TCP level limitation, due to the amount of work the broker is doing on its 
receive threads. 
- The observed behaviour in producer code is that whether sending sync or 
async, the call to producer.send() just blocks - so even async sends are 
affected by JMS throughput, in a manner which isn't flow control. 
- In none of our tests have the consumers ever had a large pending message 
count - the blocking factor is not consumer speed or memory/queue limits. 
- Taking thread dumps throughout these tests, we can see contention around the 
synchronised access points in MessageDatabase - the process() methods taking 
KahaAddMessageCommand and KahaRemoveMessageCommand both lock around the page 
index. 
- Any sort of option to batch consumer acks might reduce the number of single 
message removals, but would also delay the ACKs to the point where more is 
written to the store. 
- Some options in the JMS (ConcurrentStoreAndDispatch) are supposed to allow 
optimisations in this scenario, but appear to have limited or no effect. 

I think there are 2 options for the root problem: 
1. Disk writes are too quick (or too many disk writes are allowed to be in 
progress) - by the time the roundtrip from broker -> consumer and back 
completes, the kahaDb write is already done or in progress.
2. Thread contention stops the ACKs arriving in time to prevent the diskstore 
writes from happening, negating the benefit of allowing 
concurrentStoreAndDispatch from a performance point of view (clients might 
receive messages quicker, but broker still has to add and remove each message 
serially from the diskstore). 

I think the second option is most likely - we're effectively doing a lot of 
disk based work that we don't need to do, just because the consumer ACKs aren't 
coming back quickly enough, or aren't able to be received before the disk write is 
in progress. This causes a double hit: first on thread synchronisation, and then 
again at the disk level, adding and removing the same message inside a small 
timeframe. 

Ideas welcome for configurations to try or areas to look at. Stuff I'm going to 
try:
- Add some debug logging to see the queue length of the asyncQueueJobQueue in 
KahaDBStore.
- Changing the client ack mode - optimizeAcknowledge, DUPS_OK_ACK etc. I don't think 
it'll have much effect but it's something else to rule out (a quick sketch of both 
follows below). 
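A quick consumer-side sketch of both ack-mode experiments; the queue name follows the 
test setup in this ticket and the settings are illustrative, not a recommendation:

{code}
import javax.jms.Connection;
import javax.jms.MessageConsumer;
import javax.jms.Session;

import org.apache.activemq.ActiveMQConnectionFactory;
import org.apache.activemq.command.ActiveMQQueue;

public class AckModeExperiment {
    public static void main(String[] args) throws Exception {
        ActiveMQConnectionFactory factory =
                new ActiveMQConnectionFactory("tcp://localhost:61616");
        // batch acks client side instead of acking one by one
        factory.setOptimizeAcknowledge(true);

        Connection connection = factory.createConnection("admin", "admin");
        connection.start();

        // DUPS_OK lets the provider ack lazily, at the cost of possible redelivery
        Session session = connection.createSession(false, Session.DUPS_OK_ACKNOWLEDGE);
        MessageConsumer consumer = session.createConsumer(new ActiveMQQueue("TopicQueueRouted.1"));
        System.out.println("received: " + consumer.receive(5000));
        connection.close();
    }
}
{code}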

I would be interested in trying any code which would allow the disk writes to 
go on a delay queue - something along the lines of: dispatch straight away, only 
write to KahaDB if the ACK hasn't arrived after a configurable interval. 
I'm still not sure that a lot of this work should be done on the NIO send 
threads - even if the contention on the KahaDB store needs to be allowed to 
happen, I would expect requests to be allowed to queue up in memory until a 
PFC limit kicks in. Until that point I wouldn't expect producer send 
performance to be affected. 

Diskstore performance for comparison:
[activemq@londngnfrjms01 lib]$ /opt/java/x64/jdk1.7.0_51/bin/java -cp 
activemq-kahadb-store-5.9.0.redhat-610350.jar 
org.apache.activemq.store.kahadb.disk.util.DiskBenchmark 
/jms_vsp/activemq-rh590/data/test.dat
Benchmarking: /jms_vsp/activemq-rh590/data/test.dat
Writes:
  1023993 writes of size 4096 written in 10.69 seconds.
  95789.805 writes/second.
  374.17892 megs/second.

Sync Writes:
  49746 writes of size 4096 written in 10.001 seconds.
  4974.1025 writes/second.
  19.430088 megs/second.

Reads:
  5468429 reads of size 4096 read in 10.001 seconds.
  546788.25 writes/second.
  2135.8916 megs/second.


Re: [jira] [Commented] (AMQ-5077) Improve performance of ConcurrentStoreAndDispatch

2014-02-27 Thread artnaseef
Did I read that correctly - there's a possible concern that disk writes are
too fast?





[jira] [Commented] (AMQ-5077) Improve performance of ConcurrentStoreAndDispatch

2014-02-26 Thread Jason Shepherd (JIRA)

[ 
https://issues.apache.org/jira/browse/AMQ-5077?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13914140#comment-13914140
 ] 

Jason Shepherd commented on AMQ-5077:
-

This issue is also logged in the enterprise 6.1 branch here:

   https://issues.jboss.org/browse/ENTMQ-569
