[ https://issues.apache.org/jira/browse/CASSANDRA-14605?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16562449#comment-16562449 ]
Joseph Lynch commented on CASSANDRA-14605:
------------------------------------------

[~krummas] I've set the parameter and taken another flamegraph. All of the {{moveStarts}} usage is gone (nice!) and the compaction appears to be proceeding about twice as fast (it will take 2 days instead of 4). I've attached the new flamegraph, which looks much more reasonable, although I now notice 50% of the time being spent in {{SSTableReader::getCachedPosition}}, which I believe is just invalidating the key cache. In particular, {{ConcurrentLinkedHashMap}} appears to be pretty slow for queries with a high miss rate. There may be some low-hanging-fruit performance improvements to be had there as well.

[^sstable_reopen.svg]

> Major compaction of LCS tables very slow
> ----------------------------------------
>
>                 Key: CASSANDRA-14605
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-14605
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Compaction
>         Environment: AWS, i3.4xlarge instance (very fast local nvme storage),
> Linux 4.13
> Cassandra 3.0.16
>            Reporter: Joseph Lynch
>            Assignee: Benedict
>            Priority: Minor
>              Labels: lcs, performance
>         Attachments: slow_major_compaction_lcs.svg, sstable_reopen.svg
>
>
> We've recently started deploying 3.0.16 more heavily in production and today
> I noticed that full compaction of LCS tables takes a much longer time than it
> should. In particular it appears to be faster to convert a large dataset to
> STCS, run full compaction, and then convert it to LCS (with re-leveling) than
> it is to just run full compaction on LCS (with re-leveling).
> I was able to get a CPU flame graph showing 50% of the major compaction's CPU
> time being spent in
> [{{SSTableRewriter::maybeReopenEarly}}|https://github.com/apache/cassandra/blob/6ba2fb9395226491872b41312d978a169f36fcdb/src/java/org/apache/cassandra/io/sstable/SSTableRewriter.java#L184]
> calling
> [{{SSTableRewriter::moveStarts}}|https://github.com/apache/cassandra/blob/6ba2fb9395226491872b41312d978a169f36fcdb/src/java/org/apache/cassandra/io/sstable/SSTableRewriter.java#L223].
> I've attached the flame graph here. It was generated by running Cassandra
> with {{-XX:+PreserveFramePointer}}, using jstack to get the compaction
> thread's native thread id (nid), and then passing that id to perf to record
> on-CPU time:
> {noformat}
> perf record -t <compaction thread> -o <output file> -F 49 -g sleep 60 >/dev/null
> {noformat}
> I took this data and collapsed it using the steps described in [Brendan
> Gregg's java in flames
> blogpost|https://medium.com/netflix-techblog/java-in-flames-e763b3d32166]
> (Instructions section) to generate the graph.
> The results are that, at least on this dataset (700GB of data compressed,
> 2.2TB uncompressed), we are spending 50% of our CPU time in {{moveStarts}},
> and I am unsure that we need to be doing that as frequently as we are. I'll
> see if I can come up with a clean reproduction to confirm whether it's a
> general problem or specific to this particular dataset.
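
One detail of the profiling steps in the description is easy to trip over: jstack prints the native thread id in hexadecimal ({{nid=0x...}}), whereas {{perf record -t}} takes a decimal thread id. Below is a minimal sketch of that conversion. The helper name {{nid_to_tid}} is hypothetical, the sample thread line uses made-up values, and the sketch assumes a HotSpot-style jstack thread header.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of Cassandra, jstack or perf): pull the hex nid
# out of one jstack thread header line and print it as a decimal thread id.
nid_to_tid() {
    local line="$1" hex
    # Capture the hex digits that follow "nid=0x"; prints nothing if absent.
    hex=$(printf '%s\n' "$line" | sed -n 's/.*nid=0x\([0-9a-fA-F]\{1,\}\).*/\1/p')
    [ -n "$hex" ] || return 1
    # printf accepts a 0x-prefixed constant and formats it as decimal.
    printf '%d\n' "0x$hex"
}

# Sample thread header with made-up values, shaped like JDK 8 jstack output.
sample='"CompactionExecutor:3" #87 daemon prio=1 os_prio=4 tid=0x00007f3a5c0e1000 nid=0x1a2b runnable [0x00007f39f1d2c000]'
nid_to_tid "$sample"
```

For the sample line this prints 6699, which is the value that would replace {{<compaction thread>}} in the perf command quoted above. Filtering jstack output by the thread name {{CompactionExecutor}} is an assumption about how the compaction thread is named on a given node.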