[ 
https://issues.apache.org/jira/browse/CASSANDRA-2589?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13047475#comment-13047475
 ] 

Jonathan Ellis commented on CASSANDRA-2589:
-------------------------------------------

removeDeletedColumnsOnly has this behavior:
{noformat}
            // remove columns if
            // (a) the column itself is tombstoned or
            // (b) the CF is tombstoned and the column is not newer than it
{noformat}

(a) is a problem here, because we DO want to preserve column tombstones (unless 
they are shadowed by a newer CF-level tombstone).
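
To make the distinction concrete, here is a minimal sketch of a filter that applies only rule (b) and leaves column tombstones alone. It is not the attached patch and not Cassandra's actual code: the Col class, the removeShadowed method, and the use of Long.MIN_VALUE as a "CF not deleted" sentinel are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class ShadowedColumnFilter {
    /** Hypothetical stand-in for a column; not Cassandra's IColumn. */
    static final class Col {
        final String name;
        final long timestamp;
        final boolean tombstone; // true if the column itself is a tombstone

        Col(String name, long timestamp, boolean tombstone) {
            this.name = name;
            this.timestamp = timestamp;
            this.tombstone = tombstone;
        }
    }

    /** Sentinel for "no CF-level tombstone" (an assumption of this sketch). */
    static final long NOT_DELETED = Long.MIN_VALUE;

    /**
     * Applies rule (b) only: drop a column when the CF is tombstoned and the
     * column is not newer than that tombstone. Rule (a) is deliberately
     * absent, so a column tombstone newer than the CF tombstone is kept.
     */
    static List<Col> removeShadowed(List<Col> columns, long markedForDeleteAt) {
        List<Col> kept = new ArrayList<>();
        for (Col c : columns) {
            boolean shadowed = markedForDeleteAt != NOT_DELETED
                    && c.timestamp <= markedForDeleteAt;
            if (!shadowed) {
                kept.add(c);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        List<Col> cols = Arrays.asList(
                new Col("live-old", 5L, false),
                new Col("tombstone-old", 8L, true),
                new Col("tombstone-new", 12L, true),
                new Col("live-new", 15L, false));
        // CF-level tombstone at timestamp 10.
        for (Col c : removeShadowed(cols, 10L)) {
            System.out.println(c.name + (c.tombstone ? " (tombstone)" : ""));
        }
    }
}
```

With a CF-level tombstone at timestamp 10, the two older columns are dropped regardless of type, while the newer column tombstone is still written out so it can shadow older data in other SSTables.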

> row deletes do not remove columns
> ---------------------------------
>
>                 Key: CASSANDRA-2589
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-2589
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>            Reporter: Aaron Morton
>            Assignee: Aaron Morton
>            Priority: Minor
>             Fix For: 0.8.1
>
>         Attachments: 
> 0001-remove-deleted-columns-before-flushing-memtable-v07.patch, 
> 0001-remove-deleted-columns-before-flushing-memtable-v08.patch
>
>
> When a row delete is issued, CF.delete() sets the localDeletionTime and 
> markedForDeleteAt values but does not remove columns which have a lower 
> timestamp. As a result:
> # Memory which could be freed is held on to (probably not too bad as it's 
> already counted)
> # The deleted columns are serialised to disk, along with the CF info to say 
> they are no longer valid. 
> # NamesQueryFilter and SliceQueryFilter have to do more work as they filter 
> out the irrelevant columns using QueryFilter.isRelevant()
> # Also, columns written with a lower timestamp after the deletion are added 
> to the CF without checking markedForDeleteAt.
> This can cause RR to fail, will create another ticket for that and link. This 
> ticket is for a fix to removing the columns. 
> Two options I could think of:
> # Check for deletion when serialising to SSTable and ignore columns if they 
> have a lower timestamp. Otherwise leave as is so dead columns stay in memory. 
> # Ensure at all times if the CF is deleted all columns it contains have a 
> higher timestamp. 
> ## I *think* this would include all column types (DeletedColumn as well) as 
> the CF deletion has the same effect. But not sure.
> ## Deleting (potentially) all columns in delete() will take time. Could track 
> the highest timestamp in the CF so the normal case of deleting all cols does 
> not need to iterate. 
>  

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
