Qpid Java Broker Statistics

Page added by Andrew Kennedy

Broker Statistics

The following statistics are provided as a basic, useful minimum.
These statistics are generated for both delivered and received messages, per connection, aggregated per virtualhost, and for the entire broker. Delivered messages are those that have been sent to a client by the broker, and may be counted more than once due to rollbacks and re-delivery. Received messages are those that are published by a client and received by the broker.

Mechanism

The statistics are generated using a sample counter, which is thread-safe and is triggered on incoming message events. The totals for messages and data (bytes) are recorded as type long, and the rates are recorded as double since they may be fractional. The default sample period is one second, during which events are recorded cumulatively. This can be changed for all counters in the broker by setting the system property qpid.statistics.samplePeriod. Rates are calculated as an average over this sample period, so if the traffic is very bursty a small sample period will fail to capture events most of the time, and the reported peak rates will be inflated. Since rates are calculated per second, the sample period defaults to 1000 ms; if this is increased to, say, 10000 ms, then messages are counted over ten-second periods and the resulting totals are divided by ten to give the rate.
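The rate calculation described above can be illustrated with a minimal sketch; this is not the broker's actual code, and the class and variable names are invented for illustration.

```java
// Illustrative sketch (not the broker's code) of how a sample-period
// counter derives a per-second rate from a cumulative period total.
public class SamplePeriodRate {
    public static void main(String[] args) {
        long samplePeriodMs = 10000L;   // e.g. 10 s instead of the 1 s default
        long bytesInPeriod = 45000L;    // bytes counted during the period

        // The rate is the period total divided by the period length in
        // seconds, so a ten-second period divides the count by ten.
        double rate = bytesInPeriod / (samplePeriodMs / 1000.0);
        System.out.println("rate=" + rate);  // 4500.0 bytes/s
    }
}
```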
When all statistics generation is enabled there are 12 counters in use, since statistics are calculated for the product of the following sets: { Broker, Virtualhost, Connection } x { Delivery, Receipt } x { Messages, Data }.

Reporting

If statistics are being generated for the broker or for virtualhosts then periodic reporting can be enabled. This is generated at specified intervals, with the option to reset the statistics after outputting the current data.

[Statistics-Reporting] BRK-1008 : received : 4.395 kB/s peak : 4500 bytes total
[Statistics-Reporting] BRK-1008 : delivered : 4.395 kB/s peak : 4500 bytes total
[Statistics-Reporting] BRK-1009 : received : 45 msg/s peak : 45 msgs total
[Statistics-Reporting] BRK-1009 : delivered : 45 msg/s peak : 45 msgs total
[Statistics-Reporting] VHT-1003 : localhost : received : 1.465 kB/s peak : 1500 bytes total
[Statistics-Reporting] VHT-1003 : localhost : delivered : 1.465 kB/s peak : 1500 bytes total
[Statistics-Reporting] VHT-1004 : localhost : received : 15 msg/s peak : 15 msgs total
[Statistics-Reporting] VHT-1004 : localhost : delivered : 15 msg/s peak : 15 msgs total
[Statistics-Reporting] VHT-1003 : test : received : 0.977 kB/s peak : 1000 bytes total
[Statistics-Reporting] VHT-1003 : test : delivered : 0.977 kB/s peak : 1000 bytes total
[Statistics-Reporting] VHT-1004 : test : received : 10 msg/s peak : 10 msgs total
[Statistics-Reporting] VHT-1004 : test : delivered : 10 msg/s peak : 10 msgs total
[Statistics-Reporting] VHT-1003 : development : received : 1.953 kB/s peak : 2000 bytes total
[Statistics-Reporting] VHT-1003 : development : delivered : 1.953 kB/s peak : 2000 bytes total
[Statistics-Reporting] VHT-1004 : development : received : 20 msg/s peak : 20 msgs total
[Statistics-Reporting] VHT-1004 : development : delivered : 20 msg/s peak : 20 msgs total

Sample statistics report log messages
Configuration

The statistics generation is configured via the config.xml broker configuration mechanism.

config.xml
<statistics>
  <generation>
    <!-- default false -->
    <broker>true</broker>
    <virtualhosts>true</virtualhosts>
    <connections>true</connections>
  </generation>
  <reporting>
    <period>3600</period><!-- seconds -->
    <reset>true</reset>
  </reporting>
</statistics>
It is not possible to enable statistics generation for a single virtualhost: all virtualhosts will generate statistics if the //statistics/generation/virtualhosts element is set to true. It only makes sense to enable reporting if either the broker or virtualhosts also have statistics generation enabled; however, the broker will simply ignore the reporting configuration if no statistics are being generated. If the //statistics/reporting/reset element is set to true, then after reporting on the statistics in the log, the statistics will be reset to zero for the entire broker. Even if statistics generation is completely disabled, it is still possible to activate statistics on an individual connection while the broker is running. The JMX attribute statisticsEnabled on a connection MBean can be set to true, which will start statistics generation; totals and rates will be calculated from this point onward, until it is set to false again. Additionally, the following two system properties can be set to configure the statistics counter:
These two properties are exposed on the StatisticsCounter class as the static fields DEFAULT_SAMPLE_PERIOD and DISABLE_STATISTICS.

Design

StatisticsCounter Class

This class implements the counting of event data and generates the total and rate statistics. There should be one instance of this class per type of statistic, such as messages or bytes. The instance methods that are called to add an event are:
There are three constructors:
These are chained in that order, using a default name of counter and the default sample period of 2000 ms, or the value set in the qpid.statistics.samplePeriod property. To retrieve the data, there are methods to return the current rate, peak rate and total, as well as the start time, sample period and name of the counter, and also a method to reset the counter.

StatisticsGatherer Interface

This interface is implemented by the broker business objects that generate statistics. It provides the following methods:
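The StatisticsCounter behaviour described above (chained constructors, event registration, accessors and reset) might look like the following sketch. The method and property names follow the text, but the implementation is invented for illustration and omits the per-period rollover logic of the real class.

```java
// Illustrative StatisticsCounter-style class; names follow the design
// text above, bodies are a simplified sketch (no period rollover).
public class StatisticsCounterSketch {
    private final String name;
    private final long samplePeriod;   // milliseconds
    private final long start = System.currentTimeMillis();
    private long total;
    private long periodCount;
    private long peak;                 // highest per-period count seen

    // The three constructors chain, supplying the default name and the
    // default sample period (or the qpid.statistics.samplePeriod property).
    public StatisticsCounterSketch() { this("counter"); }
    public StatisticsCounterSketch(String name) {
        this(name, Long.getLong("qpid.statistics.samplePeriod", 2000L));
    }
    public StatisticsCounterSketch(String name, long samplePeriod) {
        this.name = name;
        this.samplePeriod = samplePeriod;
    }

    public synchronized void registerEvent(long value) {
        total += value;
        periodCount += value;
        if (periodCount > peak) peak = periodCount;
    }

    // Accessors for rate, peak, total, start time, period and name.
    public synchronized double getRate() { return periodCount / (samplePeriod / 1000.0); }
    public synchronized long getTotal() { return total; }
    public synchronized long getPeak() { return peak; }
    public long getStart() { return start; }
    public long getPeriod() { return samplePeriod; }
    public String getName() { return name; }
    public synchronized void reset() { total = 0; periodCount = 0; peak = 0; }

    public static void main(String[] args) {
        StatisticsCounterSketch c = new StatisticsCounterSketch("messages", 2000);
        for (int i = 0; i < 10; i++) c.registerEvent(1);
        System.out.println(c.getName() + " total=" + c.getTotal() + " rate=" + c.getRate());
    }
}
```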
These statistics are exposed using the separate JMX MBean interfaces detailed below, which call these methods to retrieve the underlying StatisticsCounter objects and return their attributes. This interface gives a standard way for parts of the broker to set up and configure statistics generation. When creating these objects, there should be a parent/child relationship between them, normally from broker to virtualhost to connection. This means that the lowest-level gatherer can record statistics (if enabled) and also pass on the notification to its parent object to allow higher-level aggregation of statistics. Resetting statistics works in the opposite direction, with higher-level gatherers also resetting all of their child objects. Note that this parent/child relationship is never explicitly specified, and is dependent on the implementation of registerMessageDelivery and resetStatistics, in order to allow more flexibility.

JMX Interface

The Qpid JMX interface level that supports statistics is 1.9. Each object (MBean) that can generate statistics adds the following attributes and operations:
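The parent/child propagation and reverse-direction reset described above might be sketched as follows. The interface and method names follow the text; the classes and bodies are invented for illustration.

```java
// Sketch of the StatisticsGatherer parent/child idea: a child gatherer
// records locally and forwards the event upward, so the virtualhost and
// broker levels aggregate; reset flows downward instead.
interface StatisticsGatherer {
    void registerMessageDelivery(long messageSize);
    void resetStatistics();
}

class GathererSketch implements StatisticsGatherer {
    private final GathererSketch parent;  // null at the broker level
    private final java.util.List<GathererSketch> children = new java.util.ArrayList<>();
    long messages;

    GathererSketch(GathererSketch parent) {
        this.parent = parent;
        if (parent != null) parent.children.add(this);
    }

    // Record locally, then notify the parent for higher-level aggregation.
    public void registerMessageDelivery(long messageSize) {
        messages++;
        if (parent != null) parent.registerMessageDelivery(messageSize);
    }

    // Reset works in the opposite direction: parents reset their children.
    public void resetStatistics() {
        messages = 0;
        for (GathererSketch child : children) child.resetStatistics();
    }

    public static void main(String[] args) {
        GathererSketch broker = new GathererSketch(null);
        GathererSketch vhost = new GathererSketch(broker);
        GathererSketch connection = new GathererSketch(vhost);
        connection.registerMessageDelivery(100);
        connection.registerMessageDelivery(100);
        System.out.println("broker=" + broker.messages + " vhost=" + vhost.messages
                + " connection=" + connection.messages);
        broker.resetStatistics();
        System.out.println("after reset broker=" + broker.messages);
    }
}
```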
The following MBeans have had statistics attributes added:
The JMX attributes that record statistics are always present; if generation is not enabled they will have a value of 0/0.0, and the statisticsEnabled attribute will be set to false.
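Toggling the statisticsEnabled attribute over JMX, as described in the configuration section, could look like the sketch below. The ObjectName shown is hypothetical; the real name depends on how the broker registers its connection MBeans and should be looked up (e.g. in JConsole). The sketch only builds the attribute rather than connecting to a live broker.

```java
import javax.management.Attribute;
import javax.management.ObjectName;

public class EnableConnectionStats {
    public static void main(String[] args) throws Exception {
        // Hypothetical object name for a connection MBean; the actual
        // name must be taken from the running broker's JMX registry.
        ObjectName connection = new ObjectName(
                "org.apache.qpid:type=VirtualHost.Connection,name=\"client-1\"");

        // Setting this attribute to true starts statistics generation on
        // the connection from that point onward; false stops it again.
        Attribute enable = new Attribute("statisticsEnabled", true);
        System.out.println(connection.getDomain() + " -> "
                + enable.getName() + "=" + enable.getValue());

        // Against a live broker one would apply it with:
        //   mbeanServerConnection.setAttribute(connection, enable);
    }
}
```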
Usage

The JMX JConsole application can be used to view the attributes, both as discrete values and as graphs. Sample output is shown below, illustrating the virtualhost statistics for a broker with a producer and a consumer attached and sending messages.

(Screenshots not available: jconsole-statistics-zero.png, jconsole-statistics-graphs.png)

Testing

One caveat related to testing is that these statistics, apart from totals, are by their nature time-dependent and subject to race conditions: sending messages as part of a test can overlap two sample periods instead of one, causing seemingly incorrect results.

Unit

The StatisticsCounter class is suitable for unit testing. Tests checking class creation and operation verify the total, rate and peak data, as well as other class properties. It is difficult to write a test that provably shows thread safety when updating the counter with new events, but there are tests to check that out-of-order events will not cause problems with the data generated.

System

System testing covers the generation of statistics using a running broker, as well as the configuration mechanisms, the operation of the JMX interface to the data, and the operational logging output. The following system tests have been written, all using a shared parent test case class, MessageStatisticsTestCase, which sets up the JMX connection to the broker.
These tests can be run with the following command, from the systests module directory:

Performance

To measure performance, a release of 2.6.0.7 was generated using the standard release mechanism. The data should be compared with the previous results for the 2.6.0.6 release. The performance test suite was run on noc-qpiddev02.jpmorganchase.com with the broker running on noc-qpiddev01.jpmorganchase.com. Statistics were enabled on the broker by editing the etc/persistent_config.xml file to set statistics generation to true for the broker, virtualhosts and connections, and to report every hour (without resetting the statistics). The Disabled configuration used the provided default configuration file, and the Broker Only configuration used the same as the first set of tests, but with the //statistics/generation/virtualhosts and //statistics/generation/connections elements set to false. The following command was executed for each of the configurations, and the results are aggregated into the tables below:
Initial Results

These results use the code from SVN revision r1033077, and were generated on 09 and 10 November 2010.

Throughput Numbers
There was no obvious difference between broker-only and all statistics, so that testing was discontinued. Comparison with the previous release:
Analysis
Further testing with different combinations of broker, virtualhost and connection statistics enabled will need to be carried out to determine where the greatest performance penalty lies. Based on this, there may be changes required to the StatisticsCounter class, or to the propagation mechanism that links child and parent counter objects. It is also important to determine the maximum acceptable loss of throughput for a customer. It would seem desirable to have less than a 5% decrease in throughput due to statistics generation; however, a performance penalty is inevitable when adding code to the message delivery and receipt paths in the broker, and the aim is only to minimise this, not to eliminate it.

Further Testing

The 7% performance loss was felt to be unacceptable, so modifications were made to the StatisticsCounter code to reduce the time spent carrying out computations. The following changes were made:
Throughput Numbers

The same 2.6.0.7 broker with QPID_OPTS set to -Dqpid.statistics.samplePeriod=N, where N is 2000 or 5000.
Old is an unmodified 2.6.0.7 broker; the new broker has QPID_OPTS set to -Dqpid.statistics.samplePeriod=5000 and some changes to the StatisticsCounter class that move computation to the end of the sample period. All three have QPID_JAVA_MEM set to -Xmx2g. Results are the average of three runs.
More Tests

Two 5-minute runs of a simple test, producing and consuming messages as fast as possible in a single thread. This was run on an HP DL585 (quad dual-core 2.2 GHz Opteron, 8 GB RAM), hence the difference in reported rates.
Profiling

To properly determine hot spots in the code, a profiling run was made using the YourKit profiler. The simple test used in the previous section was run with the profiler attached, and CPU statistics were collated. The profiler detected the following hotspots:
Since the AtomicLong calls are made inside registerEvent, it seems that this is where the CPU is spending the extra time that is causing the slowdown. In order to mitigate this, the registerEvent method was changed to be synchronized and the internal logic changed to use primitive long types and standard arithmetic operations. After this change, the profiler no longer showed registerEvent as a hotspot in the code when another test run was made. The updated code was copied to the broker being used for throughput testing, and the TTBT-NA-Qpid-01 test was re-run, with results shown below (again, an average of three runs was taken).
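The two update mechanisms being compared can be sketched side by side; the class and method names here are illustrative, not the broker's actual code.

```java
import java.util.concurrent.atomic.AtomicLong;

// Sketch of the two counter-update mechanisms discussed: an AtomicLong
// CAS-based update versus a synchronized method using primitive longs.
public class CounterMechanisms {
    private final AtomicLong atomicTotal = new AtomicLong();
    private long plainTotal;

    // Lock-free update: scales better with many concurrent threads.
    public void registerEventAtomic(long value) {
        atomicTotal.addAndGet(value);
    }

    // Synchronized update with primitive arithmetic: cheaper for a
    // single producer/consumer, as the profiling above suggested.
    public synchronized void registerEventSynchronized(long value) {
        plainTotal += value;
    }

    public static void main(String[] args) {
        CounterMechanisms c = new CounterMechanisms();
        for (int i = 0; i < 1000; i++) {
            c.registerEventAtomic(1);
            c.registerEventSynchronized(1);
        }
        System.out.println("atomic=" + c.atomicTotal.get()
                + " synchronized=" + c.plainTotal);
    }
}
```

Both variants are thread-safe; the trade-off is purely where the contention cost is paid, which is why the conclusion below finds each preferable in a different scenario.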
Finally, the single-threaded test was also run against 2.6.0.6 and the latest 2.6.0.7 build, with the following results:
In order to test this more fully, the script which starts the TTBT-NA test, and the Throughput.sh script itself, were copied and modified. A series of scripts, STAT-XX.sh, was created, consisting of identical calls to the TTBT-NA test apart from the number of consumers, which varied from 01 to 16 and is represented by XX, and a Statistics.sh script was created which calls each of these in turn. The STAT throughput tests were run against a standard 2.6.0.6 broker and the 2.6.0.7 build with all statistics generation enabled. It is expected that the difference in throughput should increase as the number of consumers increases. Note that these tests were only run once, so they are not as accurate as the averaged, multi-run datasets.

2.6.0.6 Throughput Numbers

$ QPID_JAVA_MEM=-Xmx2g ./qpid-server -c ../etc/persistent_config.xml
2.6.0.7 Throughput Numbers

$ QPID_OPTS=-Dqpid.statistics.samplePeriod=5000 QPID_JAVA_MEM=-Xmx2g ./qpid-server -c ../etc/persistent_config.xml
Although it is difficult to draw any real or meaningful conclusions from these results, note that the single-consumer performance degradation is negligible. A useful further test would be to run the 2.6.0.7 broker with only connection-level statistics being generated. This ought to remove the contention for the lock on the counters for virtualhosts and the broker when many consumers are being delivered to.

2.6.0.7 Throughput Numbers

$ QPID_OPTS=-Dqpid.statistics.samplePeriod=5000 QPID_JAVA_MEM=-Xmx2g ./qpid-server -c ../etc/connection_config.xml
Conclusion?

It would appear that there is a penalty of a few percent (up to 4%) for a single thread, rising to over 10% when there are many threads. Further testing will be required to determine how the number of connections (i.e. threads) causes the performance to vary. This is, of course, hampered by the fact that the throughput tests exhibit up to 4% variability between the minimum and maximum results for a particular test. Additionally, it is unclear which of the two mechanisms (AtomicLong versus a synchronized registerEvent method) used in the counter is preferable: for the many-threaded, multi-consumer/producer case the AtomicLong seems the better choice; for a single producer/consumer the synchronized registerEvent method is better. Another point to note is that the actual per-message latency increase is quite small; it is only when sending large numbers of messages using many connections that the degradation comes into play, and the ratio of messages to data is small, i.e. the data throughput does not concern us, it is purely the number of messages. This means that for most applications there should not be a noticeable difference in performance, with any small change in per-message latency being lost in the noise. Another point to note is that in these tests garbage collection

Further Modifications

This data was generated after modifying the broker code to place statistics recording for message delivery after the message is enqueued at the broker (or at least, after an attempt has been made to do so).
As a cross-check, the tests were repeated using a stock 2.6.0.6 broker, then again (twice, in fact) with the 2.6.0.7 broker, as the first time, in case any environmental factors were skewing the results; and finally with the 2.6.0.7 broker but with statistics generation disabled. These results show that the persistent testing produces results that are about the same for both versions, whether statistics are enabled or disabled. This is due to the naturally lower throughput causing less contention and allowing the statistics generation to run unimpeded. The transient tests get worse with statistics generation enabled, due to the much higher message throughput and the fact that many messages are trying to update the counters at the same time.

Latency

As a general recommendation, I would suggest that statistics generation only be enabled in cases of low message traffic (count, not size), where the broker is not being 100% utilised in terms of CPU. Persistent messaging is necessarily I/O bound, in terms of disk access, and this is usually the determining factor in message latency, so adding the statistics counters places no large additional burden and will not add much to the latency, certainly less than 5% degradation. It is obvious that any additional processing that takes place during the act of delivery or receipt of a message will inevitably add latency and decrease throughput on the broker. If a broker is already operating at peak performance, something that is only possible with transient messages, which are dependent only on the amount of CPU and RAM available, then enabling statistics counters will have a greater effect, possibly degrading performance by over 10%.

Tasks
QPID-2932 Add statistics generation for broker message delivery
