Thanks Robert!!
The JIRA was very helpful in understanding how the tombstone threshold is implemented. The ticket also says that running a major compaction weekly is an alternative. What I actually want to understand is: if I run a major compaction on a CF with 500 GB of data, a single giant sstable is created. Do you see any problems with Cassandra processing such a huge file? Is there a max sstable size beyond which performance etc. degrades? What are the implications?

Thanks
Anuj Wadehra

From: "Robert Coli" <rc...@eventbrite.com>
Date: Fri, 17 Apr, 2015 at 10:55 pm
Subject: Re: Drawbacks of Major Compaction now that Automatic Tombstone Compaction Exists

On Tue, Apr 14, 2015 at 8:29 PM, Anuj Wadehra <anujw_2...@yahoo.co.in> wrote:

By automatic tombstone compaction, I am referring to the tombstone_threshold sub-property under the compaction strategy in CQL. It is 0.2 by default. So what I understand from the DataStax documentation is that even if an sstable does not find sstables of similar size (STCS), an automatic tombstone compaction will trigger on that sstable when 20% of its data is tombstones. This compaction works on a single sstable only.

Overall system behavior is discussed here:
https://issues.apache.org/jira/browse/CASSANDRA-6654?focusedCommentId=13914587&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13914587

They are talking about LCS, but the principles apply, with an overlay of how STCS behaves.

=Rob
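For readers of the archive, a minimal sketch of the two options being compared in this thread. The keyspace and table names (my_ks / my_cf) are made up, and the threshold value shown is just the default mentioned above; verify the exact syntax against the CQL and nodetool documentation for your Cassandra version.

    -- Per-sstable tombstone compaction: tombstone_threshold (default 0.2)
    -- is set as a sub-property of the table's compaction strategy.
    ALTER TABLE my_ks.my_cf
      WITH compaction = {
        'class': 'SizeTieredCompactionStrategy',
        'tombstone_threshold': '0.2'
      };

    # Alternative mentioned in the ticket: run a major compaction manually
    # (e.g. weekly), which merges all of the CF's sstables into one file.
    nodetool compact my_ks my_cf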