[ 
https://issues.apache.org/jira/browse/CASSANDRA-15810?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17176876#comment-17176876
 ] 

Robert Stupp edited comment on CASSANDRA-15810 at 8/13/20, 9:36 AM:
--------------------------------------------------------------------

Yes, I think the biggest "consumer" is probably the list of tokens, which is 
also unique per node, so there are no savings for that one.
Potential savings might come from "release-version", "dc", "rack" and the 
current "schema-version". Those are not particularly long strings, so I doubt 
it's actually worth the {{String.intern()}}, as the total savings are rather 
low. The other values of {{VersionedValue}} are unique per peer and put 
unnecessary pressure on the string-intern-map.
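
To illustrate the point about shared versus per-peer values, here is a minimal sketch. The class {{InternSketch}} and its helpers are hypothetical names made up for this example, not Cassandra code, and the sample values are invented. It only shows that interning collapses repeated values into one object while unique values each add an entry to the intern table.

```java
import java.util.IdentityHashMap;
import java.util.Map;

class InternSketch {
    // Builds a new String object at runtime, roughly the way a value decoded
    // from a gossip message would be (i.e. not a compile-time constant).
    static String decode(String value) {
        return new String(value.toCharArray());
    }

    // Interns each decoded value and counts the distinct String objects
    // (compared by identity) that remain afterwards.
    static int distinctAfterIntern(String... values) {
        Map<String, Boolean> seen = new IdentityHashMap<>();
        for (String v : values) {
            seen.put(decode(v).intern(), Boolean.TRUE);
        }
        return seen.size();
    }

    public static void main(String[] args) {
        // The same "dc" value reported by three peers collapses to one object.
        System.out.println(distinctAfterIntern("dc1", "dc1", "dc1"));
        // Per-peer values stay distinct; each one only adds a table entry.
        System.out.println(distinctAfterIntern("token-a", "token-b", "token-c"));
    }
}
```

The first call prints 1 and the second prints 3, which matches the argument above: only the handful of values shared across peers can benefit from interning.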

Other usages of {{String.intern()}} in the whole production code (i.e. 
including JDK and libraries) are (intentionally) not on a hot-path.

IMO this change is safe for 4.0 and also safe to backport to 3.11 and 3.0.


was (Author: snazy):
Yea - I think, the biggest "consumer" is probably the list of tokens, which is 
also quite unique per node. So no savings for that one.
Potential saving might happen for "release-version", "dc", "rack" and current 
"schema-version" - not particularly long strings so I doubt that it's actually 
not worth the {{String.intern()}} as the total savings are rather low. The 
other values of {{VersionedValue}} are unique per peer and put unnecessary 
pressure on the string-intern-map.

Other usages of {{String.intern()}} in the whole production code (i.e. 
including JDK and libraries) are (intentionally) not on a hot-path.

This change feels IMO safe for 4.0 and also safe to be backported to 3.11 + 3.0.

> Default StringTableSize parameter causes GC slowdown
> ----------------------------------------------------
>
>                 Key: CASSANDRA-15810
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-15810
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Local/Config
>            Reporter: Tom van der Woerdt
>            Priority: Normal
>              Labels: gc, performance
>
> While looking at tail latency on a Cassandra cluster, it came up that the 
> default StringTableSize in Cassandra is set to a million:
> {code:java}
> # Larger interned string table, for gossip's benefit (CASSANDRA-6410)
> -XX:StringTableSize=1000003{code}
> This was done for CASSANDRA-6410 by [~jbellis] in 2013, to optimize heap usage 
> on a test case running with 500 nodes and num_tokens=512.
> Until Java 13, this string table is implemented as native code, and has to be 
> traversed entirely during the GC initial marking phase, which is a STW event.
> Some testing on my end shows that the pause time of a GC cycle can be reduced 
> by approximately 10 milliseconds if we lower the string table size back to 
> the Java 8 default of 60013 entries.
> Thus, I would recommend this patch (3.11 branch, similar patch for 4.0):
> {code:java}
> diff --git a/conf/jvm.options b/conf/jvm.options
> index 01bb1685b3..c184d18c5d 100644
> --- a/conf/jvm.options
> +++ b/conf/jvm.options
> @@ -107,9 +107,6 @@
>  # Per-thread stack size.
>  -Xss256k
> -# Larger interned string table, for gossip's benefit (CASSANDRA-6410)
> --XX:StringTableSize=1000003
> -
>  # Make sure all memory is faulted and zeroed on startup.
>  # This helps prevent soft faults in containers and makes
>  # transparent hugepage allocation more effective.
>  {code}
> It does need some testing on more extreme clusters than I have access to, but 
> I ran some Cassandra nodes with {{-XX:+PrintStringTableStatistics}} which 
> suggested that the Java default will suffice here.
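
As a complement to {{-XX:+PrintStringTableStatistics}}, the sketch below shows one way to check which StringTableSize a running JVM actually uses and whether it was set on the command line. It assumes a HotSpot JVM, where {{com.sun.management.HotSpotDiagnosticMXBean}} is available; the class name {{StringTableSizeCheck}} is hypothetical and not part of Cassandra.

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import com.sun.management.VMOption;
import java.lang.management.ManagementFactory;

class StringTableSizeCheck {
    // Looks up the StringTableSize flag of the current JVM.
    // Assumption: this runs on HotSpot; other JVMs may not expose this bean.
    static VMOption stringTableSizeOption() {
        HotSpotDiagnosticMXBean bean =
                ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
        return bean.getVMOption("StringTableSize");
    }

    public static void main(String[] args) {
        VMOption option = stringTableSizeOption();
        // The origin tells whether the value is the JVM default or was
        // supplied at startup, e.g. through conf/jvm.options.
        System.out.println("StringTableSize = " + option.getValue()
                + " (origin: " + option.getOrigin() + ")");
    }
}
```

Running this with and without the {{-XX:StringTableSize=1000003}} line is a quick way to confirm that the option was really removed from the effective JVM arguments.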


