[ https://issues.apache.org/jira/browse/QPID-8238?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16644766#comment-16644766 ]
Alex Rudyy commented on QPID-8238: ---------------------------------- Apart from lover throughput of NIO layer, there is another aspect to the performance issues I had failed to notice before. When running my performance tests for longer amount of time I started to see long GC pauses on cleaning of JVM Metaspace: {noformat} 2018-10-10T05:32:11.250-0400: 1918.070: [GC (Allocation Failure) 2018-10-10T05:32:11.250-0400: 1918.071: [ParNew: 3145728K->3145728K(3145728K), 0.0000665 secs]2018-10-10T05:32:11.250-0400: 1918.071: [CMS2018-10-10T05:32:11.363-0400: 1918.183: [CMS-concurrent-mark: 1.696/1.700 secs] [Times: user=51.70 sys=1.93, real=1.70 secs] (concurrent mode failure): 6418793K->5067015K(6990528K), 43.9299868 secs] 9564521K->5067015K(10136256K), [Metaspace: 37679K->37679K(1083392K)], 43.9303498 secs] [Times: user=45.39 sys=0.00, real=43.92 secs] 2018-10-10T05:32:55.181-0400: 1962.001: Total time for which application threads were stopped: 43.9323704 seconds, Stopping threads took: 0.0004930 seconds 2018-10-10T05:33:01.804-0400: 1968.624: [GC (Allocation Failure) 2018-10-10T05:33:01.804-0400: 1968.624: [ParNew: 3145728K->349504K(3145728K), 1.1281568 secs] 8881780K->6564365K(10136256K), 1.1284601 secs] [Times: user=47.24 sys=0.01, real=1.13 secs] 2018-10-10T05:33:02.933-0400: 1969.753: Total time for which application threads were stopped: 1.1302133 seconds, Stopping threads took: 0.0003888 seconds 2018-10-10T05:33:04.515-0400: 1971.335: [GC (Allocation Failure) 2018-10-10T05:33:04.515-0400: 1971.335: [ParNew: 3145728K->349504K(3145728K), 1.1778553 secs] 9360589K->7056409K(10136256K), 1.1781560 secs] [Times: user=50.64 sys=0.00, real=1.18 secs] 2018-10-10T05:33:05.693-0400: 1972.513: Total time for which application threads were stopped: 1.1804099 seconds, Stopping threads took: 0.0007907 seconds 2018-10-10T05:33:07.300-0400: 1974.120: [GC (Allocation Failure) 2018-10-10T05:33:07.300-0400: 1974.120: [ParNew: 3145728K->3145728K(3145728K), 0.0000554 secs]2018-10-10T05:33:07.300-0400: 1974.120: [CMS2018-10-10T05:33:13.021-0400: 1979.841: [CMS-concurrent-preclean: 8.160/9.850 secs] [Times: user=110.86 sys=3.14, real=9.85 secs] (concurrent mode failure): 6706905K->5608432K(6990528K), 48.4980083 secs] 9852633K->5608432K(10136256K), [Metaspace: 37689K->37689K(1083392K)], 48.4983623 secs] [Times: user=48.53 sys=0.00, real=48.49 secs] 2018-10-10T05:33:55.798-0400: 2022.619: Total time for which application threads were stopped: 48.5008915 seconds, Stopping threads took: 0.0008321 seconds 2018-10-10T05:33:59.596-0400: 2026.417: [GC (Allocation Failure) 2018-10-10T05:33:59.596-0400: 2026.417: [ParNew: 3145728K->349504K(3145728K), 1.2127988 secs] 8892795K->6665263K(10136256K), 1.2130993 secs] [Times: user=46.20 sys=0.10, real=1.21 secs] 2018-10-10T05:34:00.810-0400: 2027.630: Total time for which application threads were stopped: 1.2151635 seconds, Stopping threads took: 0.0006465 seconds 2018-10-10T05:34:02.374-0400: 2029.194: [GC (Allocation Failure) 2018-10-10T05:34:02.374-0400: 2029.194: [ParNew: 3145728K->3145728K(3145728K), 0.0000684 secs]2018-10-10T05:34:02.374-0400: 2029.195: [CMS2018-10-10T05:34:04.462-0400: 2031.282: [CMS-concurrent-mark: 3.616/3.626 secs] [Times: user=63.40 sys=3.03, real=3.62 secs] (concurrent mode failure): 6315759K->5859983K(6990528K), 44.3028251 secs] 9461487K->5859983K(10136256K), [Metaspace: 37695K->37695K(1083392K)], 44.3032096 secs] [Times: user=60.53 sys=0.71, real=44.30 secs] 2018-10-10T05:34:46.678-0400: 2073.498: Total time for which application threads were stopped: 44.3052052 seconds, Stopping threads took: 0.0005032 seconds {noformat} The nature of the pauses is not clear to me at the moment. I need to investigate this further. > [Broker-J] Improve performance of asynchronous publishing of transient > messages into topic exchange having queues bound using non-overlapping > selectors > -------------------------------------------------------------------------------------------------------------------------------------------------------- > > Key: QPID-8238 > URL: https://issues.apache.org/jira/browse/QPID-8238 > Project: Qpid > Issue Type: Improvement > Components: Broker-J > Affects Versions: qpid-java-broker-7.0.6 > Reporter: Alex Rudyy > Priority: Major > Fix For: qpid-java-broker-7.1.0, qpid-java-broker-7.0.7 > > Attachments: > 0001-QPID-8238-Use-java.lang.String-for-keys-and-values-i.patch > > > The performance of asynchronous publishing of transient messages into topic > exchange which routes messages into queues bound using non-overlapping > selectors is 2-3 times slower than performance of 0.32 broker. The > performance degradation is observed with AMQP 0.9, though, I suspect that the > AMQP 0-10 protocol could be affected as well. > I was running tests with 10 concurrent producers publishing messages on > separate connections using the same routing key into 10 different queues > (subscribers queues) bound to the exchange using non-overlapping selectors. > My testing showed that performance of 7.0 broker for this particular use case > was 2-3 times worse than performance of 0.32 broker. > The following factors contributed to degradation of performance: > • Copying data from direct memory into heap memory whilst decoding > message headers. Due to this factor, the decoding of message headers is > around twice slower. It seems it contributes around 70% to total performance > degradation > • The message routing algorithm is slower due to need to support a new > feature to route messages into bound exchanges (in addition to queues) using > replacement routing key. > • AMQ short strings caching contributes 5-10% to total performance > degradation. The caching was added to manage heap space more efficiently. > The numbers provided here could be inaccurate due instrumentation overhead > whilst profiling the issue. > Potentially, caching can be turned off but that will not improve performance > much. > On other hand, adding of additional caching of strings to amqp-short-strings > would improve the performance a bit. Whilst evaluating selectors, the fields > used in selector expressions are represented as java strings but they get > converted every time into amqp-short-strings when looking up for message > header values. If 10 queues are bound to the exchange using the same binding > key, the selector expression is evaluated 10 times for the incoming message. > Thus, all selector field names are get converted into amqp-short-strings 10 > times as well. It seems adding caching here can improve the performance. -- This message was sent by Atlassian JIRA (v7.6.3#76005) --------------------------------------------------------------------- To unsubscribe, e-mail: dev-unsubscr...@qpid.apache.org For additional commands, e-mail: dev-h...@qpid.apache.org