[ 
https://issues.apache.org/jira/browse/CASSANDRA-14318?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16415923#comment-16415923
 ] 

Alexander Dejanovski commented on CASSANDRA-14318:
--------------------------------------------------

For the record, the same tests on 3.11.2 didn't show any notable performance 
difference between debug on and off : 

Cassandra 3.11.2 debug on : 
{noformat}
Results:
Op rate : 18 777 op/s [read_event_1: 3 165 op/s, read_event_2: 3 109 op/s, 
read_event_3: 12 562 op/s]
Partition rate : 6 215 pk/s [read_event_1: 3 165 pk/s, read_event_2: 3 109 
pk/s, read_event_3: 0 pk/s]
Row rate : 6 215 row/s [read_event_1: 3 165 row/s, read_event_2: 3 109 row/s, 
read_event_3: 0 row/s]
Latency mean : 6,7 ms [read_event_1: 6,7 ms, read_event_2: 6,7 ms, 
read_event_3: 6,6 ms]
Latency median : 5,0 ms [read_event_1: 5,0 ms, read_event_2: 5,0 ms, 
read_event_3: 4,9 ms]
Latency 95th percentile : 15,6 ms [read_event_1: 15,5 ms, read_event_2: 15,9 
ms, read_event_3: 15,5 ms]
Latency 99th percentile : 43,3 ms [read_event_1: 42,7 ms, read_event_2: 44,2 
ms, read_event_3: 43,2 ms]
Latency 99.9th percentile : 82,0 ms [read_event_1: 80,3 ms, read_event_2: 82,4 
ms, read_event_3: 82,1 ms]
Latency max : 272,4 ms [read_event_1: 272,4 ms, read_event_2: 268,7 ms, 
read_event_3: 245,1 ms]
Total partitions : 330 970 [read_event_1: 165 386, read_event_2: 165 584, 
read_event_3: 0]
Total errors : 0 [read_event_1: 0, read_event_2: 0, read_event_3: 0]
Total GC count : 42
Total GC memory : 13,102 GiB
Total GC time : 1,8 seconds
Avg GC time : 42,4 ms
StdDev GC time : 1,3 ms
Total operation time : 00:00:53{noformat}
 


Cassandra 3.11.2 debug off : 
{noformat}
Results:
Op rate : 18 853 op/s [read_event_1: 3 138 op/s, read_event_2: 3 137 op/s, 
read_event_3: 12 578 op/s]
Partition rate : 6 275 pk/s [read_event_1: 3 138 pk/s, read_event_2: 3 137 
pk/s, read_event_3: 0 pk/s]
Row rate : 6 275 row/s [read_event_1: 3 138 row/s, read_event_2: 3 137 row/s, 
read_event_3: 0 row/s]
Latency mean : 6,7 ms [read_event_1: 6,7 ms, read_event_2: 6,7 ms, 
read_event_3: 6,7 ms]
Latency median : 5,0 ms [read_event_1: 5,1 ms, read_event_2: 5,1 ms, 
read_event_3: 5,0 ms]
Latency 95th percentile : 15,5 ms [read_event_1: 15,5 ms, read_event_2: 15,6 
ms, read_event_3: 15,4 ms]
Latency 99th percentile : 39,9 ms [read_event_1: 41,0 ms, read_event_2: 39,6 
ms, read_event_3: 39,6 ms]
Latency 99.9th percentile : 73,3 ms [read_event_1: 73,4 ms, read_event_2: 71,6 
ms, read_event_3: 73,6 ms]
Latency max : 367,0 ms [read_event_1: 240,5 ms, read_event_2: 250,3 ms, 
read_event_3: 367,0 ms]
Total partitions : 332 852 [read_event_1: 166 447, read_event_2: 166 405, 
read_event_3: 0]
Total errors : 0 [read_event_1: 0, read_event_2: 0, read_event_3: 0]
Total GC count : 46
Total GC memory : 14,024 GiB
Total GC time : 2,0 seconds
Avg GC time : 42,7 ms
StdDev GC time : 3,9 ms
Total operation time : 00:00:53{noformat}
The improvement over 2.2 is nice though :)

 

> Debug logging can create massive performance issues
> ---------------------------------------------------
>
>                 Key: CASSANDRA-14318
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-14318
>             Project: Cassandra
>          Issue Type: Bug
>            Reporter: Alexander Dejanovski
>            Assignee: Alexander Dejanovski
>            Priority: Major
>              Labels: lhf, performance
>             Fix For: 2.2.x, 3.0.x, 3.11.x, 4.x
>
>         Attachments: cassandra-2.2-debug.yaml, debuglogging.png, flame22 
> nodebug sjk svg.png, flame22-nodebug-sjk.svg, flame22-sjk.svg, 
> flame_graph_snapshot.png
>
>
> Debug logging can involve in many cases (especially very low latency ones) a 
> very important overhead on the read path in 2.2 as we've seen when upgrading 
> clusters from 2.0 to 2.2.
> The performance impact was especially noticeable on the client side metrics, 
> where p99 could go up to 10 times higher, while ClientRequest metrics 
> recorded by Cassandra didn't show any overhead.
> Below shows latencies recorded on the client side with debug logging on 
> first, and then without it :
> !debuglogging.png!  
> We generated a flame graph before turning off debug logging that shows the 
> read call stack is dominated by debug logging : 
> !flame_graph_snapshot.png!
> I've attached the original flame graph for exploration.
> Once disabled, the new flame graph shows that the read call stack gets 
> extremely thin, which is further confirmed by client recorded metrics : 
> !flame22 nodebug sjk svg.png!
> The query pager code has been reworked since 3.0 and it looks like 
> log.debug() calls are gone there, but for 2.2 users and to prevent such 
> issues to appear with default settings, I really think debug logging should 
> be disabled by default.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org

Reply via email to