[ https://issues.apache.org/jira/browse/CASSANDRA-14281?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16380418#comment-16380418 ]
Chris Lohfink commented on CASSANDRA-14281:
-------------------------------------------

https://github.com/apache/cassandra/blob/6dfd11c30a9c85581b77c93cfcdbef37a5d497c6/src/java/org/apache/cassandra/utils/EstimatedHistogram.java
is used to back the Histogram and Timer metrics. I think the performance impact
might come from the wrapper introduced in CASSANDRA-11752,
https://github.com/apache/cassandra/blob/8b3a60b9a7dbefeecc06bace617279612ec7092d/src/java/org/apache/cassandra/metrics/DecayingEstimatedHistogramReservoir.java
particularly the landmark rescaling lock. That's a spike every ~30 min, if I recall correctly.

> LatencyMetrics performance
> --------------------------
>
>                 Key: CASSANDRA-14281
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-14281
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>            Reporter: Michael Burman
>            Priority: Major
>
> Currently, for each write/read/range query/CAS touching the CFS we record a
> latency metric, which takes a lot of processing time (up to 66% of the total
> processing time if the update was empty).
> Latencies are recorded using both a Dropwizard "Timer" and a "Counter". The
> latter backs totalLatency, while the former is a decaying metric for rates
> and certain percentile metrics. We then replicate all of these CFS writes to
> the KeyspaceMetrics and globalWriteLatencies.
> For example, for each CFS write we first write to the CFS's metrics, then to
> the Keyspace's metrics, and finally to the global metrics. A Timer maintains
> a Histogram and a Meter and updates both when the Timer is updated. The Meter
> in turn updates four different values (the 1-minute, 5-minute and 15-minute
> rates and a counter).
> So for each CFS write we actually do 15 different counter updates, and of
> course we maintain their states at the same time while writing. Combined,
> these operations are very slow.
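The 15-updates-per-write fan-out described above can be sketched as follows. These are simplified stand-ins, not the actual Dropwizard `Timer`/`Meter` classes, and they only count writes rather than computing real EWMA rates:

```java
// Simplified model of the update fan-out: a Timer wraps a Histogram plus a
// Meter, and each latency is recorded at table, keyspace and global level.
import java.util.concurrent.atomic.AtomicLong;

public class FanOutSketch {
    static final AtomicLong UPDATES = new AtomicLong();

    // A Meter keeps 1/5/15-minute EWMA rates plus a count: 4 updates per mark.
    static class Meter {
        void mark() { UPDATES.addAndGet(4); }
    }

    // A Timer updates one histogram bucket and marks its Meter: 5 updates.
    static class Timer {
        final Meter meter = new Meter();
        void update(long latencyNanos) {
            UPDATES.incrementAndGet(); // histogram bucket increment
            meter.mark();
        }
    }

    public static void main(String[] args) {
        // Table-, keyspace- and global-level Timers, as in LatencyMetrics
        // with two parents.
        Timer table = new Timer(), keyspace = new Timer(), global = new Timer();
        long latencyNanos = 123_000; // arbitrary sample
        table.update(latencyNanos);
        keyspace.update(latencyNanos);
        global.update(latencyNanos);
        System.out.println(UPDATES.get()); // 3 timers x 5 = 15 updates
    }
}
```

Three timers times (1 histogram write + 4 meter writes) gives the 15 counter updates per CFS write mentioned in the report.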
> A small JMH benchmark doing an update against a single LatencyMetrics with
> 4 threads gives us around 5.2M updates/second. With the current writeLatency
> metric (having 2 parents) we get only 1.6M updates/second.
> I'm proposing to replace this with a small circular-buffer HdrHistogram
> implementation. We would maintain a rolling buffer with the last 15 minutes
> of histograms (30 seconds per histogram) and update the correct bucket each
> time. When metrics are requested, we would merge the requested number of
> buckets into a new histogram and parse the results from it. This moves some
> of the load from writing the metrics to reading them (a much less frequent
> operation), including the parent metrics. It also allows us to keep the
> current metrics structure, if we wish to do so.
> My prototype with this approach improves the performance to around 13.8M
> updates/second, almost 9 times faster than the current approach. We also
> already ship HdrHistogram in Cassandra's lib directory, so there are no new
> dependencies to add (the java-driver also uses it).
> FUTURE:
> This opens up some possibilities, such as replacing all Dropwizard
> Histograms/Meters with the new approach (to reduce overhead elsewhere in the
> codebase). It would also allow us to supply downloadable histograms directly
> from Cassandra, or store them to disk each time a bucket is filled, if the
> user wishes to monitor latency history or graph all percentiles.
> HdrHistogram also provides the ability to "fix" these histograms for pause
> tracking, such as GC pauses, which we currently can't do (as Dropwizard
> histograms can't be merged).

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
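The proposed rolling-buffer design can be sketched with the JDK alone. This is a hedged illustration of the idea, not Cassandra's or HdrHistogram's actual code: the class name, window count, and the simple log2 bucketing all stand in for HdrHistogram's indexing, and a real implementation would also zero a window when its time interval rolls over before reuse:

```java
// Sketch of the rolling-buffer idea: a ring of per-interval latency
// histograms. Writers do a single lock-free atomic increment on the current
// window; readers merge the last n windows on demand.
import java.util.concurrent.atomic.AtomicLongArray;

public class RollingLatencyHistogram {
    static final int WINDOWS = 30;  // 15 min at 30 s per window
    static final int BUCKETS = 64;  // log2 latency buckets (illustrative)

    private final AtomicLongArray[] windows = new AtomicLongArray[WINDOWS];
    private final long windowMillis;

    public RollingLatencyHistogram(long windowMillis) {
        this.windowMillis = windowMillis;
        for (int i = 0; i < WINDOWS; i++)
            windows[i] = new AtomicLongArray(BUCKETS);
    }

    private int windowIndex(long nowMillis) {
        return (int) ((nowMillis / windowMillis) % WINDOWS);
    }

    private static int bucketOf(long latencyNanos) {
        // Simple log2 bucketing stands in for HdrHistogram's indexing.
        return Math.min(BUCKETS - 1,
                64 - Long.numberOfLeadingZeros(Math.max(1, latencyNanos)));
    }

    // Hot path: one atomic increment, no lock, no decay bookkeeping.
    public void record(long latencyNanos, long nowMillis) {
        windows[windowIndex(nowMillis)].incrementAndGet(bucketOf(latencyNanos));
    }

    // Cold path: merge the most recent n windows into a plain bucket array,
    // from which percentiles can then be computed.
    public long[] merge(int n, long nowMillis) {
        long[] merged = new long[BUCKETS];
        int current = windowIndex(nowMillis);
        for (int w = 0; w < n; w++) {
            AtomicLongArray win = windows[(current - w + WINDOWS) % WINDOWS];
            for (int b = 0; b < BUCKETS; b++)
                merged[b] += win.get(b);
        }
        return merged;
    }
}
```

Because windows are only merged when metrics are read, parent (keyspace/global) metrics can likewise be served by merging the children's windows at read time instead of triple-writing on the hot path.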