[ 
https://issues.apache.org/jira/browse/CASSANDRA-20250?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17925301#comment-17925301
 ] 

Dmitry Konstantinov edited comment on CASSANDRA-20250 at 2/8/25 4:42 PM:
-------------------------------------------------------------------------

JMH results, same tests but using AverageTime mode

Laptop (macOS, OpenJDK 11, 2.6 GHz 6-Core Intel Core i7)
{code:java}
4 threads ==
     [java] Benchmark                                        (type)  Mode  Cnt  Score   Error  Units
     [java] ThreadLocalMetricsBench.increment             LongAdder  avgt   16  8.488 ± 1.140  ns/op
     [java] ThreadLocalMetricsBench.increment          LazySetArray  avgt   16  4.813 ± 0.821  ns/op
     [java] ThreadLocalMetricsBench.increment        PiggybackArray  avgt   16  4.430 ± 0.218  ns/op
{code}
Server (VM, Linux, OpenJdk-11.0.26+4, Intel(R) Xeon(R) CPU E5-2680 v4 @ 2.40GHz, 16 cores)
{code:java}
server, 4 threads ==
     [java] Benchmark                                  (type)  Mode  Cnt   Score   Error  Units
     [java] ThreadLocalMetricsBench.increment       LongAdder  avgt   16  11.457 ± 0.340  ns/op
     [java] ThreadLocalMetricsBench.increment    LazySetArray  avgt   16   5.587 ± 0.003  ns/op
     [java] ThreadLocalMetricsBench.increment  PiggybackArray  avgt   16   4.293 ± 0.002  ns/op

server, 8 threads ==
     [java] Benchmark                                  (type)  Mode  Cnt   Score   Error  Units
     [java] ThreadLocalMetricsBench.increment       LongAdder  avgt   16  11.136 ± 0.003  ns/op
     [java] ThreadLocalMetricsBench.increment    LazySetArray  avgt   16   5.707 ± 0.107  ns/op
     [java] ThreadLocalMetricsBench.increment  PiggybackArray  avgt   16   4.299 ± 0.017  ns/op

server, 16 threads ==
     [java] Benchmark                                  (type)  Mode  Cnt   Score   Error  Units
     [java] ThreadLocalMetricsBench.increment       LongAdder  avgt   16  11.160 ± 0.018  ns/op
     [java] ThreadLocalMetricsBench.increment    LazySetArray  avgt   16   5.637 ± 0.016  ns/op
     [java] ThreadLocalMetricsBench.increment  PiggybackArray  avgt   16   4.611 ± 0.029  ns/op
{code}

I am now checking how many LongAdder operations we perform per request in the 
Cassandra write flow (in the e2e benchmark). I will also try to run the e2e 
benchmark with metrics disabled to make sure that the async-profiler graph is 
not biased (note: I used -XX:+DebugNonSafepoints when I captured it).
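
For readers who have not seen the benchmarked variants: this message does not include the code behind LazySetArray or PiggybackArray, so the following is only a rough sketch of the general idea such names suggest, namely a counter where every thread writes to its own cell with an ordered store instead of the CAS that LongAdder performs. The class name PerThreadCounterSketch and all of its members are hypothetical and are not taken from the patch.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical single-writer counter; NOT the benchmarked LazySetArray code. */
final class PerThreadCounterSketch {
    // One cell per thread that has ever incremented; readers iterate over all cells.
    private final List<AtomicLong> cells = new CopyOnWriteArrayList<>();

    private final ThreadLocal<AtomicLong> localCell = ThreadLocal.withInitial(() -> {
        AtomicLong cell = new AtomicLong();
        cells.add(cell);
        return cell;
    });

    /** Hot path: only the owning thread writes its cell, so no CAS is needed. */
    void increment() {
        AtomicLong cell = localCell.get();
        cell.lazySet(cell.get() + 1); // ordered store instead of an atomic read-modify-write
    }

    /** Cold path: sums all cells; may lag slightly behind in-flight increments. */
    long sum() {
        long total = 0;
        for (AtomicLong cell : cells) {
            total += cell.get();
        }
        return total;
    }

    /** Starts the given number of threads, each incrementing perThread times, and returns the total. */
    static long runDemo(int threads, int perThread) {
        PerThreadCounterSketch counter = new PerThreadCounterSketch();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < perThread; j++) {
                    counter.increment();
                }
            });
            workers[i].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join(); // join() makes each worker's writes visible to this thread
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return counter.sum();
    }
}
```

In this sketch cells are never removed when a thread dies, which a real implementation would have to handle; the point is only to show why the hot path can be cheaper than a LongAdder increment.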


> Provide the ability to disable specific metrics collection
> ----------------------------------------------------------
>
>                 Key: CASSANDRA-20250
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-20250
>             Project: Apache Cassandra
>          Issue Type: New Feature
>          Components: Observability/Metrics
>            Reporter: Dmitry Konstantinov
>            Assignee: Dmitry Konstantinov
>            Priority: Normal
>         Attachments: async_profiler_cpu_profiles.zip, 
> cpu_profile_insert.html, jmh-result.json
>
>
> Cassandra collects a lot of metrics, and many of them are collected per 
> table, so the number of metric instances is multiplied by the number of 
> tables. On the one hand this gives better observability; on the other hand 
> metrics are not free, and there is overhead associated with them:
> 1) CPU overhead: under a simple CPU-bound load I already see about 5.5% of 
> total CPU spent on metrics in CPU flame graphs for a read load and 11% for a 
> write load. 
> Example: [^cpu_profile_insert.html] (search for the "codahale" pattern). The 
> flame graph was captured using the async-profiler build 
> async-profiler-3.0-29ee888-linux-x64.
> 2) Memory overhead: we spend memory on the entities used to aggregate 
> metrics, such as LongAdders and reservoirs, and on MBeans (String 
> concatenation within object names is a major cause: a new String is created 
> for each table + metric name combination).
>  
> The idea of this ticket is to allow an operator to configure a list of 
> disabled metrics in cassandra.yaml, like:
> {code:java}
> disabled_metrics:
>     - metric_a
>     - metric_b
> {code}
> From an implementation point of view I see two possible approaches (which 
> can be combined):
>  # Generic: when a metric is being registered, if it is listed in 
> disabled_metrics we do not publish it via JMX and we provide a no-op 
> implementation of the metric object (such as a histogram) for it.
> Logging analogy: the log level check inside a log method.
>  # Specialized: for some metrics the value calculation itself is not free 
> and introduces overhead as well; in such cases it would be useful for the 
> specific logic to check via an API (like isMetricEnabled) whether the 
> calculation is needed at all. Example of such a metric: 
> ClientRequestSizeMetrics.recordRowAndColumnCountMetrics
> Logging analogy: an explicit 'if (isDebugEnabled())' condition used when a 
> message parameter is expensive.


