[ https://issues.apache.org/jira/browse/CASSANDRA-3855?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13419006#comment-13419006 ]
Sylvain Lebresne commented on CASSANDRA-3855:
---------------------------------------------

Agreed that it is wrong, but I think more than just the first line is wrong. I think that method should be:
{noformat}
public boolean hasIrrelevantData(int gcBefore)
{
    if (deletionInfo().isLive())
        return false;

    // Do we have gcable deletion infos?
    if (!deletionInfo().purge(gcBefore).equals(deletionInfo()))
        return true;

    // Do we have columns that are either deleted by the container or gcable tombstones?
    for (IColumn column : columns)
        if (deletionInfo().isDeleted(column) || column.hasIrrelevantData(gcBefore))
            return true;

    return false;
}
{noformat}

> RemoveDeleted dominates compaction time for large sstable counts
> ----------------------------------------------------------------
>
>                 Key: CASSANDRA-3855
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-3855
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>    Affects Versions: 1.1.0
>            Reporter: Stu Hood
>            Assignee: Yuki Morishita
>              Labels: compaction, deletes, leveled
>         Attachments: with-cleaning-java.hprof.txt
>
>
> With very large numbers of sstables (2000+ generated by a `bin/stress -n
> 100,000,000` run with LeveledCompactionStrategy),
> PrecompactedRow.removeDeletedAndOldShards dominates compaction runtime, such
> that commenting it out takes compaction throughput from 200KB/s to 12MB/s.
> Stack attached.
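
For anyone who wants to try the proposed hasIrrelevantData logic outside of Cassandra, here is a minimal, self-contained Java sketch. The Column and DeletionInfo classes below are hypothetical stand-ins, not Cassandra's IColumn or DeletionInfo, and their purge/isDeleted semantics (a single top-level tombstone compared by timestamp) are assumptions made purely for illustration.

```java
import java.util.Arrays;
import java.util.List;

final class IrrelevantDataSketch {

    /** Hypothetical stand-in for IColumn: a write timestamp plus a tombstone deletion time. */
    static final class Column {
        final long timestamp;
        final int localDeletionTime; // Integer.MAX_VALUE means "not a tombstone" (assumption)

        Column(long timestamp, int localDeletionTime) {
            this.timestamp = timestamp;
            this.localDeletionTime = localDeletionTime;
        }

        /** A column is gcable here if it is a tombstone older than gcBefore. */
        boolean hasIrrelevantData(int gcBefore) {
            return localDeletionTime < gcBefore;
        }
    }

    /** Hypothetical stand-in for the container's deletion info: one top-level tombstone. */
    static final class DeletionInfo {
        static final DeletionInfo LIVE = new DeletionInfo(Long.MIN_VALUE, Integer.MAX_VALUE);

        final long markedForDeleteAt;
        final int localDeletionTime;

        DeletionInfo(long markedForDeleteAt, int localDeletionTime) {
            this.markedForDeleteAt = markedForDeleteAt;
            this.localDeletionTime = localDeletionTime;
        }

        boolean isLive() {
            return equals(LIVE);
        }

        /** Drops the tombstone if it is older than gcBefore, otherwise returns this unchanged. */
        DeletionInfo purge(int gcBefore) {
            return localDeletionTime < gcBefore ? LIVE : this;
        }

        /** A column is shadowed if it was written at or before the container tombstone. */
        boolean isDeleted(Column column) {
            return column.timestamp <= markedForDeleteAt;
        }

        @Override
        public boolean equals(Object o) {
            if (!(o instanceof DeletionInfo)) return false;
            DeletionInfo other = (DeletionInfo) o;
            return markedForDeleteAt == other.markedForDeleteAt
                && localDeletionTime == other.localDeletionTime;
        }

        @Override
        public int hashCode() {
            return 31 * Long.hashCode(markedForDeleteAt) + localDeletionTime;
        }
    }

    /** Same three checks, in the same order, as the method proposed in the comment. */
    static boolean hasIrrelevantData(DeletionInfo info, List<Column> columns, int gcBefore) {
        // Early return as in the comment: with live deletion info, columns are not inspected.
        if (info.isLive())
            return false;

        // Do we have gcable deletion info? Purging changes it only if something was dropped.
        if (!info.purge(gcBefore).equals(info))
            return true;

        // Do we have columns that are shadowed by the container or are gcable tombstones?
        for (Column column : columns)
            if (info.isDeleted(column) || column.hasIrrelevantData(gcBefore))
                return true;

        return false;
    }

    public static void main(String[] args) {
        DeletionInfo tombstone = new DeletionInfo(100L, 50);
        List<Column> newer = Arrays.asList(new Column(200L, Integer.MAX_VALUE));
        List<Column> shadowed = Arrays.asList(new Column(90L, Integer.MAX_VALUE));

        System.out.println("gcable container tombstone: " + hasIrrelevantData(tombstone, newer, 60));
        System.out.println("nothing to purge:           " + hasIrrelevantData(tombstone, newer, 40));
        System.out.println("shadowed column:            " + hasIrrelevantData(tombstone, shadowed, 40));
    }
}
```

Comparing purge(gcBefore) against the original deletion info is a cheap way to ask "would purging change anything?" without enumerating tombstones by hand. It relies on equals being value-based, which is why the stand-in class overrides it.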