[ https://issues.apache.org/jira/browse/CASSANDRA-1074?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12908792#action_12908792 ]
Jonathan Ellis commented on CASSANDRA-1074:
-------------------------------------------

Committed w/ minor changes. Happy to merge a backport to 0.6 as well.

Thanks Sylvain!

> check bloom filters to make minor compaction able to delete (some) tombstones
> -----------------------------------------------------------------------------
>
>                 Key: CASSANDRA-1074
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-1074
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>            Reporter: Robert Coli
>            Assignee: Sylvain Lebresne
>             Fix For: 0.7 beta 2
>
>         Attachments: 0001-Purge-tombstone-on-minor-compaction-after-gc_grace_p.patch
>
>
> Given a tombstoned key which is older than GCGraceSeconds, the current (0.6.1)
> compaction implementation still requires a major compaction for the key to
> actually be deleted. The major compaction is required because we must know
> whether there is a version of the key inside any of the SSTables associated
> with the columnfamily, including ones not involved in minor compactions.
> However, as we have bloom filters for each one of these SSTables, minor
> compaction can relatively inexpensively check for existence of this key in
> SSTable files not involved in the current minor compaction, and thereby
> delete the key, assuming all bloom filters return negative. If any filter
> returns positive, a major compaction would of course still be required.
>
> For use cases like CASSANDRA-1041 where users are strongly motivated to not
> do a major compaction, this seems to answer the jbellis objection:
>
> "You don't want to skip large files in major compactions, since the
> definition of major is "compact everything so it is safe to remove
> tombstones.""
>
> The above-described improvement appears to provide "safe to remove (some)
> tombstones" without requiring "compact everything", and so may be a useful
> optimization.
>
> =Rob
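
The check described in the issue can be illustrated with a small, self-contained sketch. This is not Cassandra's code or the committed patch: `BloomFilter`, `Row`, `SSTable`, and `minor_compact` are hypothetical names invented for the illustration, and the filter is a toy one built on SHA-256. It assumes a tombstone is represented as a row whose value is `None` and whose timestamp (in seconds) is the deletion time.

```python
import hashlib
from dataclasses import dataclass, field
from typing import Dict, List, Optional


class BloomFilter:
    """Toy Bloom filter: may give false positives, never false negatives."""

    def __init__(self, size_bits: int = 1024, num_hashes: int = 3):
        self.size_bits = size_bits
        self.num_hashes = num_hashes
        self.bits = 0

    def _positions(self, key: str):
        # Derive num_hashes bit positions from salted SHA-256 digests.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{key}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.size_bits

    def add(self, key: str) -> None:
        for pos in self._positions(key):
            self.bits |= 1 << pos

    def might_contain(self, key: str) -> bool:
        return all((self.bits >> pos) & 1 for pos in self._positions(key))


@dataclass
class Row:
    timestamp: int            # write time, or deletion time for a tombstone
    value: Optional[str]      # None marks a tombstone (assumption of this sketch)


@dataclass
class SSTable:
    rows: Dict[str, Row]
    bloom: BloomFilter = field(init=False)

    def __post_init__(self):
        # Every key written to the table is recorded in its Bloom filter.
        self.bloom = BloomFilter()
        for key in self.rows:
            self.bloom.add(key)


def minor_compact(inputs: List[SSTable], others: List[SSTable],
                  now: int, gc_grace_seconds: int) -> SSTable:
    """Merge `inputs`; purge expired tombstones that no table in `others` may hold."""
    merged: Dict[str, Row] = {}
    for table in inputs:
        for key, row in table.rows.items():
            # Newest timestamp wins when the same key appears in several inputs.
            if key not in merged or row.timestamp > merged[key].timestamp:
                merged[key] = row

    kept: Dict[str, Row] = {}
    for key, row in merged.items():
        expired = row.value is None and now - row.timestamp > gc_grace_seconds
        # A positive answer may be a false positive, so it only blocks the purge.
        maybe_elsewhere = any(t.bloom.might_contain(key) for t in others)
        if expired and not maybe_elsewhere:
            continue  # all filters negative: safe to drop the tombstone
        kept[key] = row
    return SSTable(kept)
```

Because a Bloom filter never returns a false negative, a negative answer from every SSTable outside the compaction is enough to drop the tombstone. A positive answer only means the key might exist there, so the tombstone is kept and a later (or major) compaction has to deal with it.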