[
https://issues.apache.org/jira/browse/AMQ-5486?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Benjamin Huang updated AMQ-5486:
---------------------------------
Description:
There are about 20 topics with virtual topics enabled, and hundreds of
consumers/producers connected to the broker over the NIO transport connector.
During the run about 12,000 messages flow in per second, not a very high rate,
but ActiveMQ consumes a lot of CPU (about 600%~1000%). To find the most
CPU-consuming code path, I used JProfiler to dig into the process.
Most of the NIO worker threads were frequently blocked and did only a little
work in the 'unblocked' intervals, whereas they are expected to spend most of
their time slices waiting for work items and processing them (a standalone
sketch of this effect follows the screenshot).
!nio_worker_blocked_frequently.png!
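The effect is easy to reproduce outside the broker. Below is a minimal
standalone sketch (my own illustration, not ActiveMQ code): many threads
funneled through one synchronized method spend most of their time blocked on
the monitor instead of doing work, which matches the thread states above.
{code:java}
import java.util.concurrent.atomic.AtomicLong;

public class MonitorContentionDemo {
    static final AtomicLong calls = new AtomicLong();

    // One tiny critical section behind a single monitor, like a map lookup.
    static synchronized void hotPath() {
        calls.incrementAndGet();
    }

    public static void main(String[] args) throws InterruptedException {
        int workers = Runtime.getRuntime().availableProcessors() * 2;
        for (int i = 0; i < workers; i++) {
            Thread t = new Thread(new Runnable() {
                public void run() {
                    while (true) {
                        hotPath();
                    }
                }
            });
            t.setDaemon(true);
            t.start();
        }
        Thread.sleep(5000);
        // A thread dump taken during the run shows most workers BLOCKED on
        // the MonitorContentionDemo.class monitor, and throughput barely
        // improves as the worker count grows.
        System.out.println("calls in 5s: " + calls.get());
    }
}
{code}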
After reviewing the monitor usage history and stats, I think these NIO workers
were competing fiercely with each other to execute a synchronized method
(DestinationMap::get), which is also the hottest spot in the program. I also
noticed that the caller, AbstractRegion::getDestinations, acquires a read lock
before calling it, so I guess the synchronized keyword is a leftover and a
read lock is the lock type actually required here (see the sketch after the
screenshots).
!monitor_usage_stats.png!
!nio_worker_blocked_1.png!
!nio_worker_blocked_2.png!
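To make the suspicion concrete, here is a minimal sketch of the pattern as I
read it from the profiler traces (class and method bodies are simplified by
me, not copied from the ActiveMQ source):
{code:java}
import java.util.Collections;
import java.util.Set;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Simplified sketch of the suspected pattern, not the actual broker code.
class RegionSketch {
    private final ReentrantReadWriteLock destinationsLock =
            new ReentrantReadWriteLock();
    private final DestinationMapSketch destinationMap =
            new DestinationMapSketch();

    // Like AbstractRegion::getDestinations: readers may enter in parallel...
    Set<Object> getDestinations(Object destination) {
        destinationsLock.readLock().lock();
        try {
            return destinationMap.get(destination);
        } finally {
            destinationsLock.readLock().unlock();
        }
    }
}

class DestinationMapSketch {
    // ...but this monitor (as in DestinationMap::get) serializes them all
    // again, so the read lock above buys nothing on the read path.
    synchronized Set<Object> get(Object key) {
        return Collections.emptySet(); // placeholder for the real lookup
    }
}
{code}
If every writer already takes the write lock before mutating the map, dropping
synchronized from the get method and relying on the caller's read/write lock
should let the readers run in parallel.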
It's too difficult for me to list all the critical sections between NIO
workers, or between NIO workers and BrokerService, that add up to the overall
synchronization overhead, so I am attaching the relevant info in the hope of
finding a complete solution to this.
was:
There are about 20 topics with virtual topics enabled, and hundreds of
consumers/producers connected to the broker over the NIO transport connector.
During the run about 12,000 messages flow in per second, not a very high rate,
but ActiveMQ consumes a lot of CPU (about 600%~1000%). To find the most
CPU-consuming code path, I used JProfiler to dig into the process.
Most of the NIO worker threads were frequently blocked and did only a little
work in the 'unblocked' intervals, whereas they are expected to spend most of
their time slices waiting for work items and processing them.
!nio_worker_blocked_frequently.png!
After reviewing the monitor usage history and stats, I think these NIO workers
were competing fiercely with each other to execute a synchronized method
(DestinationMap::get), which is also the hottest spot in the program. I also
noticed that the caller, AbstractRegion::getDestinations, acquires a read lock
before calling it, so I guess the synchronized keyword is a leftover and a
read lock is the lock type actually required here.
!monitor_usage_stats.png!
!nio_worker_blocked_1.png!
!nio_worker_blocked_2.png!
As a rookie to Java, it's too difficult for me to list all the critical
sections between NIO workers, or between NIO workers and BrokerService, that
add up to the overall synchronization overhead, so I am attaching the relevant
info in the hope of finding a feasible solution to this.
> Thread synchronization overhead is unexpectedly high
> ----------------------------------------------------
>
> Key: AMQ-5486
> URL: https://issues.apache.org/jira/browse/AMQ-5486
> Project: ActiveMQ
> Issue Type: Bug
> Components: Broker
> Affects Versions: 5.9.1
> Environment: Ubuntu Server 12.04 x86_64, Linux kernel 3.2.0-23, OpenJDK
> 1.7.0_51 64-Bit Server VM, Xms 2G, Xmx 8G, CPU: E5-2620 v2 x 2, 64 GB RAM
> Reporter: Benjamin Huang
> Attachments: Method_Statistics.html, Monitor_History.zip,
> Monitor_Usage_Statistics.html, monitor_usage_stats.png,
> nio_worker_blocked_1.png, nio_worker_blocked_2.png,
> nio_worker_blocked_frequently.png
>
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)