[ 
https://issues.apache.org/jira/browse/CASSANDRA-2101?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12990081#comment-12990081
 ] 

Sylvain Lebresne commented on CASSANDRA-2101:
---------------------------------------------

Before commenting on the patch itself, I want to use this ticket to point out 
again that counter deletes are intrinsically broken. This has been said before, 
but I'll use this comment to explain why in more depth and to keep a trace of 
it for the record.

First, I'll use the following notation:
{noformat}
  c(x, 3)@[4, 2] - for a counter column of name x, value 3, timestamp 4 and 
timestampOfLastDelete 2 (I'll use -1 as the min timestampOfLastDelete).
{noformat}
and
{noformat}
  d(x)@[5] - for a tombstone of name x and timestamp 5
{noformat}

And now suppose that the following inserts are done (in that order):
{noformat}
   c(x, 1)@[1, -1]
   d(x)@[2]
   c(x, 1)@[3, -1]
{noformat}

If these inserts are resolved in that order, everything is fine:
{noformat}
   c(x, 1)@[1, -1]
 + d(x)@[2]
=> d(x)@[2]
 + c(x, 1)@[3, -1]
=> c(x, 1)@[3, 2]
{noformat}

However, some reorderings don't work. Namely, if you merge the two counts 
together before you merge one of the counts with the delete, the first 
increment survives the delete and you end up with a value of 2 instead of 1:
{noformat}
   c(x, 1)@[1, -1]
 + c(x, 1)@[3, -1]
=> c(x, 2)@[3, -1]
 + d(x)@[2]
=> c(x, 2)@[3, 2]
{noformat}

The problem is that the resolve operation is not commutative when you consider 
counter columns and tombstones. But Cassandra relies heavily on resolve being 
commutative. (As a side note, I never understood the reason for the 
CommutativeType terminology in the code. It suggests that regular columns are 
not commutative, while they are as far as resolve is concerned. Resolve is not 
idempotent on counters, however.)
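
The two traces above can be replayed with a small Python sketch. This is only 
a toy model: the Counter and Tombstone classes and the resolve() rules are 
assumptions inferred from the notation and the traces in this comment, not 
Cassandra's actual reconcile code. It shows that the same three inserts give a 
different final value depending on which pair is merged first.

```python
from dataclasses import dataclass
from functools import reduce
from typing import Union

MIN_DELETE = -1  # "no delete seen yet", as in the notation above


@dataclass(frozen=True)
class Counter:
    # c(name, value)@[timestamp, last_delete]
    name: str
    value: int
    timestamp: int
    last_delete: int = MIN_DELETE  # timestampOfLastDelete


@dataclass(frozen=True)
class Tombstone:
    # d(name)@[timestamp]
    name: str
    timestamp: int


Column = Union[Counter, Tombstone]


def resolve(a: Column, b: Column) -> Column:
    """Toy pairwise resolve; the rules are assumptions, not Cassandra's code."""
    if isinstance(a, Tombstone) and isinstance(b, Tombstone):
        return a if a.timestamp >= b.timestamp else b
    if isinstance(a, Tombstone):
        a, b = b, a  # normalise so that a is always the counter
    if isinstance(b, Tombstone):
        if b.timestamp > a.timestamp:
            return b  # the delete is newer: the counter is dropped
        # the counter is newer: keep it and remember the delete
        return Counter(a.name, a.value, a.timestamp,
                       max(a.last_delete, b.timestamp))
    # counter + counter: drop a count that predates the other's last delete
    if a.timestamp < b.last_delete:
        return b
    if b.timestamp < a.last_delete:
        return a
    # otherwise sum the values and keep the highest timestamps
    return Counter(a.name, a.value + b.value,
                   max(a.timestamp, b.timestamp),
                   max(a.last_delete, b.last_delete))


inserts = [Counter("x", 1, 1), Tombstone("x", 2), Counter("x", 1, 3)]

# Resolved in insertion order: count, then delete, then count.
in_order = reduce(resolve, inserts)

# The two counts are merged first, and the delete only afterwards.
counts_first = resolve(resolve(inserts[0], inserts[2]), inserts[1])

print(in_order)
print(counts_first)
```

Under these assumed rules, in_order ends with value 1 and counts_first with 
value 2. Once the two counts have been summed, the tombstone can no longer 
tell which part of the total it was supposed to erase.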

Not only is there no guarantee on the order in which the inserts will be 
received by each node, but even if they arrive in the right order, there is no 
guarantee that (minor) compaction won't screw this up.

Hence I think that there is not much of a guarantee we can give on deletes. The 
only one I can think of is that when you issue a delete, you must wait, before 
issuing any following update, until the delete has reached all the nodes and 
all of them have been fully compacted.

That being said, we can keep counter deletes. They are at least useful for 
cases where you know that you will never reuse a counter and want to reclaim 
the disk space. But I would add a very strong warning to the documentation.

Lastly, the deletion of full counter rows or super columns suffers from the 
same problem for the same reason.


> support deletes in counters
> ---------------------------
>
>                 Key: CASSANDRA-2101
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-2101
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>    Affects Versions: 0.8
>            Reporter: Kelvin Kakugawa
>            Assignee: Kelvin Kakugawa
>             Fix For: 0.8
>
>         Attachments: 
> 0001-CASSANDRA-2101-fix-timestampOfLastDelete-reconciliat.patch
>
>
> Obey timestampOfLastDelete during reconciliation.

-- 
This message is automatically generated by JIRA.
-
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
