[ https://issues.apache.org/jira/browse/CASSANDRA-3354?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13126455#comment-13126455 ]
Sylvain Lebresne commented on CASSANDRA-3354:
---------------------------------------------

No, expiring columns don't need two compactions to get gc'ed. The conversion of an expiring column to a tombstone is only a space optimization (to potentially reclaim the space of the column value during the usually fairly long gc_grace period), but it does not change when the column is gc'ed. So an expired expiring column is gc'ed as soon as it can be, in one shot (I just tried it to be sure, and it works).

Now, as for what you are seeing, I'm not sure what it is, but here are some things to check:
* For an expired column to be gc'ed during a compaction, it needs to be gcable at the *start* of the compaction (the same goes for tombstones, actually). That could make a difference for a long-running compaction (and yes, we could probably improve that, but I doubt it has a big impact in practice).
* Related to the previous point, expiring columns are converted to tombstones at read time. This is true in particular for the reads done by sstable2json. This means that when sstable2json shows you a tombstone, it could be that inside the sstable it is actually an expired column, and that this column was not yet expired at the time of the compaction.
* Only major compactions are guaranteed to gc all tombstones. Though if you've used 'nodetool compact', then you've triggered a major one.

> tombstone not removed after compaction
> --------------------------------------
>
>                 Key: CASSANDRA-3354
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-3354
>             Project: Cassandra
>          Issue Type: Bug
>            Reporter: Yang Yang
>            Assignee: Sylvain Lebresne
>            Priority: Minor
>
> I set GC_grace to 2 hours, for testing.
> then I compacted the sstables using nodecmd,
> but the resulting sstables contained many Deletion records older than 2 hours
> "0000000000000d5e3263303666346331000000000000000100000000":
> [["00000132f8820139303030303030303030303030303030303030303030303030303030303030303030303263303666346332","4e95a659",1318429297125,"d"]],
> yyang@ip-10-71-86-162:~/src/svn/whisky$ perl -e 'print gmtime(1318429297)."\n" '
> Wed Oct 12 14:21:37 2011
> -rw-r--r-- 1 yyang yyang 381366163 2011-10-12 16:39 /mnt/cass/lib/cassandra/data/testBudget_items/multi_click_filter-h-511-Data.db
> but it seems that after running a few more compactions, these records are gone
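
The "gcable at the start of the compaction" rule from the comment can be illustrated with a small sketch. This is not Cassandra's code: the function names `gc_before` and `is_gcable` are hypothetical, the strict `<` comparison is an assumption, and so is the reading of the second field of the sstable2json entry (`4e95a659`) as a hex-encoded local deletion time in epoch seconds. Under that reading, the deletion time is 1000 seconds after the write timestamp, which would be consistent with an expiring column, although the report does not say a TTL was used.

```python
from datetime import datetime, timezone

GC_GRACE_SECONDS = 2 * 60 * 60  # "GC_grace to 2 hours" from the report


def gc_before(compaction_start: int, gc_grace: int = GC_GRACE_SECONDS) -> int:
    """Cutoff in epoch seconds, fixed once when the compaction starts."""
    return compaction_start - gc_grace


def is_gcable(local_deletion_time: int, compaction_start: int,
              gc_grace: int = GC_GRACE_SECONDS) -> bool:
    """Assumed rule: a tombstone (or expired column) can be dropped only if
    its local deletion time is strictly older than the cutoff."""
    return local_deletion_time < gc_before(compaction_start, gc_grace)


# Values taken from the sstable2json line in the report.
write_ts = 1318429297125 // 1000           # column timestamp, ms -> s
local_deletion_time = int("4e95a659", 16)  # assumed: hex epoch seconds

# Earliest instant at which a compaction could start and still drop the record.
earliest_gcable_start = local_deletion_time + GC_GRACE_SECONDS + 1

if __name__ == "__main__":
    print("local deletion time:",
          datetime.fromtimestamp(local_deletion_time, timezone.utc))
    print("seconds after write:", local_deletion_time - write_ts)
    print("gcable for compactions starting at or after:",
          datetime.fromtimestamp(earliest_gcable_start, timezone.utc))
```

Because the cutoff is fixed when the compaction begins, a compaction that starts even slightly before that instant would keep the record, no matter how long it runs afterwards; a later compaction would then remove it.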