[
https://issues.apache.org/jira/browse/CASSANDRA-20250?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17930432#comment-17930432
]
Dmitry Konstantinov edited comment on CASSANDRA-20250 at 2/25/25 8:03 PM:
--------------------------------------------------------------------------
I looked through Reporter implementations listed here
[https://metrics.dropwizard.io/4.2.0/manual/third-party.html#reporters]
# Almost all of them are dead/archived.
# All of them inherit from ScheduledReporter and implement:
{code:java}
public abstract void report(SortedMap<String, Gauge> gauges,
                            SortedMap<String, Counter> counters,
                            SortedMap<String, Histogram> histograms,
                            SortedMap<String, Meter> meters,
                            SortedMap<String, Timer> timers);{code}
method. So I suppose there is no way to change the metrics API without
breaking at least source compatibility with the reporter implementations.
Several examples of the reporters:
* [https://github.com/coursera/metrics-datadog/blob/master/metrics-datadog/src/main/java/org/coursera/metrics/datadog/DatadogReporter.java]
* [https://github.com/elastic/elasticsearch-metrics-reporter-java/blob/master/src/main/java/org/elasticsearch/metrics/ElasticsearchReporter.java]
* [https://github.com/hawkular/hawkular-dropwizard-reporter/blob/master/hawkular-dropwizard-reporter/src/main/java/org/hawkular/metrics/dropwizard/HawkularReporter.java]
* [https://github.com/iZettle/dropwizard-metrics-influxdb/blob/master/metrics-influxdb/src/main/java/com/izettle/metrics/influxdb/InfluxDbReporter.java]
* [https://github.com/e-gineering/metrics-instrumental/blob/master/src/main/java/com/e_gineering/metrics/instrumental/InstrumentalReporter.java]
* [https://github.com/newrelic/dropwizard-metrics-newrelic/blob/main/src/main/java/com/codahale/metrics/newrelic/NewRelicReporter.java]
* [https://github.com/zenmoto/metrics-splunk/blob/master/src/main/java/io/github/zenmoto/metrics/SplunkReporter.java]
* [https://github.com/hengyunabc/metrics-zabbix/blob/master/src/main/java/io/github/hengyunabc/metrics/ZabbixReporter.java]
I have also checked the 5.x version (it looks like it is paused):
[https://github.com/dropwizard/metrics/tree/release/5.0.x/metrics-core/src/main/java/io/dropwizard/metrics5]
The API in question is almost the same: the metrics are still concrete classes,
and ScheduledReporter and the listeners depend on them in their API.
----
I also looked through the already reported issues and found the following
similar and unsuccessful :( attempt to introduce interfaces for metrics:
[https://github.com/dropwizard/metrics/issues/2186]
as well as other older attempts:
* [https://github.com/dropwizard/metrics/issues/252]
* [https://github.com/dropwizard/metrics/issues/264]
* [https://github.com/dropwizard/metrics/issues/703]
* [https://github.com/dropwizard/metrics/pull/487]
* [https://github.com/dropwizard/metrics/issues/479]
* [https://github.com/dropwizard/metrics/issues/253]
So I doubt that one more request can change anything in this area for
Dropwizard metrics.
The only non-breaking way I see for them to introduce such a change is to
make the existing Counter/Meter/Histogram/Timer classes delegate the actual
implementation to interfaces, but I have seen quite a lot of resistance in the
past to supporting custom metric implementations.
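To make the delegation option concrete, here is a rough sketch of what it could look like. Everything in it is an assumption for illustration: Counter is a simplified stand-in for the library class, and CounterImpl, LongAdderCounterImpl and PlainLongCounterImpl are hypothetical names that do not exist in Dropwizard Metrics. The point is only that reporter code typed against SortedMap<String, Counter> keeps compiling while the counting strategy becomes pluggable.

```java
import java.util.SortedMap;
import java.util.TreeMap;
import java.util.concurrent.atomic.LongAdder;

// Hypothetical strategy interface; Dropwizard Metrics does not define it.
interface CounterImpl {
    void inc(long n);
    long getCount();
}

// Default strategy that mirrors LongAdder-based counting.
final class LongAdderCounterImpl implements CounterImpl {
    private final LongAdder count = new LongAdder();
    public void inc(long n) { count.add(n); }
    public long getCount() { return count.sum(); }
}

// Custom strategy supplied by an application (illustrative, not thread-safe).
final class PlainLongCounterImpl implements CounterImpl {
    private long count;
    public void inc(long n) { count += n; }
    public long getCount() { return count; }
}

// Simplified stand-in for the library's concrete Counter class.
// The public type and methods stay the same; only the internals delegate.
class Counter {
    private final CounterImpl impl;
    Counter() { this(new LongAdderCounterImpl()); }
    Counter(CounterImpl impl) { this.impl = impl; }
    public void inc() { inc(1); }
    public void inc(long n) { impl.inc(n); }
    public long getCount() { return impl.getCount(); }
}

// Reporter-style code written against the concrete type, unchanged.
final class ReporterSketch {
    static long total(SortedMap<String, Counter> counters) {
        long sum = 0;
        for (Counter c : counters.values()) {
            sum += c.getCount();
        }
        return sum;
    }
}

final class DelegationDemo {
    public static void main(String[] args) {
        SortedMap<String, Counter> counters = new TreeMap<>();
        Counter standard = new Counter();
        Counter custom = new Counter(new PlainLongCounterImpl());
        standard.inc(5);
        custom.inc(2);
        counters.put("standard", standard);
        counters.put("custom", custom);
        System.out.println(ReporterSketch.total(counters));
    }
}
```

The cost of this shape is one extra indirection per update, and it only helps if the library itself accepts the extra constructor, which is exactly the part that met resistance in the linked issues.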
> Optimize Counter, Meter and Histogram metrics using thread local counters
> -------------------------------------------------------------------------
>
> Key: CASSANDRA-20250
> URL: https://issues.apache.org/jira/browse/CASSANDRA-20250
> Project: Apache Cassandra
> Issue Type: New Feature
> Components: Observability/Metrics
> Reporter: Dmitry Konstantinov
> Assignee: Dmitry Konstantinov
> Priority: Normal
> Fix For: 5.x
>
> Attachments: 5.1_profile_cpu.html,
> 5.1_profile_cpu_without_metrics.html, 5.1_tl4_profile_cpu.html,
> Histogram_AtomicLong.png, async_profiler_cpu_profiles.zip,
> cpu_profile_insert.html, image-2025-02-18-23-22-19-983.png, jmh-result.json,
> vmstat.log, vmstat_without_metrics.log
>
> Time Spent: 0.5h
> Remaining Estimate: 0h
>
> Cassandra collects a lot of metrics, and many of them are collected per
> table, so the number of instances is multiplied by the number of tables. On
> the one hand this gives better observability; on the other hand metrics are
> not free, there is an overhead associated with them:
> 1) CPU overhead: for a simple CPU-bound load I already see about 5.5% of
> total CPU spent on metrics in CPU flamegraphs for a read load and 11% for a
> write load.
> Example: [^cpu_profile_insert.html] (search by "codahale" pattern). The
> flamegraph is captured using the Async profiler build
> async-profiler-3.0-29ee888-linux-x64
> 2) Memory overhead: we spend memory on the entities used to aggregate metrics,
> such as LongAdders and reservoirs, plus on MBeans (String concatenation within
> object names is a major cause of it: a new String is created for each
> table+metric name combination).
> LongAdder is used by Dropwizard Counter/Meter and Histogram metrics for
> counting purposes. It has severe memory overhead, and while it scales better
> than AtomicLong we still pay some cost for the concurrent operations.
> Additionally, in the case of Meter we have non-optimal behaviour: we
> count the same things several times.
> The idea (suggested by [~benedict]) is to switch to thread-local counters,
> which we can store in a common thread-local array to reduce memory overhead.
> In this way we can avoid the overhead and contention of concurrent updates and
> reduce the memory footprint as well.
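For readers who want a picture of the thread-local idea in the quoted description, here is a minimal sketch. It is not the patch attached to this ticket; ThreadLocalCounters, ThreadLocalCounter and the fixed CAPACITY are hypothetical names and simplifications. Each counter gets an index, each thread owns one array of slots shared by all counters, updates touch only the current thread's array, and a read sums that index across all threads' arrays.

```java
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.atomic.AtomicLongArray;

// Hypothetical sketch of counters stored in a common per-thread array.
final class ThreadLocalCounters {
    // Assumed fixed upper bound on the number of counters (a simplification).
    static final int CAPACITY = 1024;

    private static final AtomicInteger NEXT_ID = new AtomicInteger();

    // Every per-thread array ever created, so readers can sum across threads.
    // Arrays of finished threads are kept forever here; a real implementation
    // would have to fold them into a shared total and release them.
    private static final CopyOnWriteArrayList<AtomicLongArray> ALL =
            new CopyOnWriteArrayList<>();

    private static final ThreadLocal<AtomicLongArray> LOCAL =
            ThreadLocal.withInitial(() -> {
                AtomicLongArray slots = new AtomicLongArray(CAPACITY);
                ALL.add(slots);
                return slots;
            });

    private ThreadLocalCounters() {}

    // Reserves the next free slot index for a new counter.
    static int allocate() {
        int id = NEXT_ID.getAndIncrement();
        if (id >= CAPACITY) {
            throw new IllegalStateException("too many counters");
        }
        return id;
    }

    // Only the owning thread writes to its array, so no CAS is needed;
    // lazySet publishes the new value to readers without a full fence.
    static void add(int id, long n) {
        AtomicLongArray slots = LOCAL.get();
        slots.lazySet(id, slots.get(id) + n);
    }

    // Sums one slot across all threads; the result may lag in-flight updates.
    static long sum(int id) {
        long total = 0;
        for (AtomicLongArray slots : ALL) {
            total += slots.get(id);
        }
        return total;
    }
}

// A counter is just an index into the per-thread arrays.
final class ThreadLocalCounter {
    private final int id = ThreadLocalCounters.allocate();

    void inc() { ThreadLocalCounters.add(id, 1); }
    void inc(long n) { ThreadLocalCounters.add(id, n); }
    long getCount() { return ThreadLocalCounters.sum(id); }
}
```

In this shape the memory cost is one array per thread rather than one LongAdder per metric, and the update path has a single writer per slot. Reads become more expensive because they walk all threads, which seems acceptable for metrics that are written far more often than they are read.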