[ 
https://issues.apache.org/jira/browse/CASSANDRA-4396?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13404265#comment-13404265
 ] 

Jonathan Ellis commented on CASSANDRA-4396:
-------------------------------------------

I'm okay with not calling removeDeleted on flush; in fact, I think it's
probably the right tradeoff, given that the extra call would be a no-op most
of the time. Compaction, however, should definitely evict the shadowed
subcolumn data.
                
> Subcolumns not removed when compacting tombstoned super column
> --------------------------------------------------------------
>
>                 Key: CASSANDRA-4396
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-4396
>             Project: Cassandra
>          Issue Type: Bug
>    Affects Versions: 1.0.0
>            Reporter: Nick Bailey
>            Assignee: Jonathan Ellis
>             Fix For: 1.0.11, 1.1.3
>
>
> When we compact a tombstone for a super column with the old data for that 
> super column, we end up writing both the deleted super column and all of 
> its now-worthless subcolumn data to the new sstable. This is especially 
> inefficient when reads need to scan tombstones during a slice.
> Here is the output of a simple test I ran to confirm:
> insert supercolumn, then flush
> {noformat}
> [Nicks-MacBook-Pro:12:20:52 cassandra-1.0] cassandra$ bin/sstable2json 
> ~/.ccm/1node/node1/data/Keyspace2/Super4-hd-1-Data.db 
> {
> "6b657931": {"supercol1": {"deletedAt": -9223372036854775808, "subColumns": 
> [["737562636f6c31","7468697320697320612074657374",1340990212532000]]}}
> }
> {noformat}
> delete supercolumn, flush again
> {noformat}
> [Nicks-MacBook-Pro:12:20:59 cassandra-1.0] cassandra$ bin/nodetool -h 
> localhost flush
> [Nicks-MacBook-Pro:12:22:41 cassandra-1.0] cassandra$ bin/sstable2json 
> ~/.ccm/1node/node1/data/Keyspace2/Super4-hd-2-Data.db 
> {
> "6b657931": {"supercol1": {"deletedAt": 1340990544005000, "subColumns": []}}
> }
> {noformat}
> compact and check resulting sstable
> {noformat}
> [Nicks-MacBook-Pro:12:22:55 cassandra-1.0] cassandra$ bin/nodetool -h 
> localhost compact 
> [Nicks-MacBook-Pro:12:23:09 cassandra-1.0] cassandra$ bin/sstable2json 
> ~/.ccm/1node/node1/data/Keyspace2/Super4-hd-3-Data.db 
> {
> "6b657931": {"supercol1": {"deletedAt": 1340990544005000, "subColumns": 
> [["737562636f6c31","7468697320697320612074657374",1340990212532000]]}}
> }
> [Nicks-MacBook-Pro:12:23:20 cassandra-1.0] cassandra$ 
> {noformat}
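
For illustration, the result the reporter expected from the compaction step above can be modeled in a few lines of Python. This is only a sketch of the intended outcome, not Cassandra's implementation: the function name `merge_super_column` and the constant `LIVE` are hypothetical, and the rule that a subcolumn is shadowed when its timestamp is not newer than the super column's `deletedAt` is an assumption made for the example. The dictionaries reuse the values printed by sstable2json in the test.

```python
# Illustrative model only; this is not Cassandra's code.
# LIVE is a hypothetical name for the value sstable2json prints
# when the super column carries no tombstone (-2**63).
LIVE = -9223372036854775808


def merge_super_column(versions):
    """Merge several versions of one super column, as a compaction might.

    Each version is a dict with "deletedAt" and "subColumns", where each
    subcolumn is [name, value, timestamp] as printed by sstable2json.
    """
    # The newest tombstone wins.
    deleted_at = max(v["deletedAt"] for v in versions)

    # Reconcile subcolumns by name, keeping the newest timestamp.
    newest = {}
    for version in versions:
        for name, value, timestamp in version["subColumns"]:
            if name not in newest or timestamp > newest[name][2]:
                newest[name] = [name, value, timestamp]

    # Assumption: a subcolumn is shadowed unless it is newer than the tombstone.
    survivors = [c for c in newest.values() if c[2] > deleted_at]
    return {"deletedAt": deleted_at, "subColumns": sorted(survivors)}


# The two sstables from the test: the original insert and the later delete.
before_delete = {
    "deletedAt": LIVE,
    "subColumns": [
        ["737562636f6c31", "7468697320697320612074657374", 1340990212532000]
    ],
}
after_delete = {"deletedAt": 1340990544005000, "subColumns": []}

merged = merge_super_column([before_delete, after_delete])
print(merged)
```

Under these assumptions the merged super column keeps its tombstone but has an empty subcolumn list, which matches the second sstable in the test and not the third. How long the tombstone itself must be retained is not modeled here.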
