[jira] [Commented] (CASSANDRA-2759) Scrub could lose increments and replicate that loss

Sylvain Lebresne (JIRA) Fri, 10 Jun 2011 09:56:54 -0700

    [ 
https://issues.apache.org/jira/browse/CASSANDRA-2759?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13047285#comment-13047285
 ]


Sylvain Lebresne commented on CASSANDRA-2759:
---------------------------------------------

It's picking a new UUID for the current node to use for new counter increments.

The problem is that on a given node we store deltas for its current nodeId (to 
avoid synchronized read-before-write, but I'm starting to wonder if that was 
the smartest idea ever). Anyway, if scrub skips a row, it may skip some of 
those deltas. Let's say at first there are no increments coming for this row 
with A as 'first distinguished replica'. So far we are still kind of good, 
because on a read (with CL > ONE) the result coming from A will have a 
'version' for its own sub-count smaller than the one on the other replicas, so 
we will use the sub-count from those replicas and return the correct value.

However, as soon as A acknowledges new increments for this row, it will start 
inserting new deltas while it is not intrinsically up to date. This will 
result in a definitive undercount.

The goal of renewing the node id of A is to make sure that second part never 
happens (because after the renew A will add new deltas as A', not A anymore).
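
To make the scenario above concrete, here is a toy model in Python. It is only
a sketch under simplifying assumptions, not Cassandra's actual counter code: a
counter is a dict mapping a node id to a (version, sub_count) pair, and a merge
keeps, per node id, the pair with the highest version. The names merge, total
and increment are made up for the illustration.

```python
# Toy model (assumption): counter = {node_id: (version, sub_count)}.
# This is NOT Cassandra's counter implementation, just an illustration.

def merge(*replicas):
    """Per node id, keep the sub-count carrying the highest version."""
    merged = {}
    for replica in replicas:
        for node_id, (version, count) in replica.items():
            if node_id not in merged or version > merged[node_id][0]:
                merged[node_id] = (version, count)
    return merged

def total(counter):
    """Counter value = sum of the sub-counts of all node ids."""
    return sum(count for _, count in counter.values())

def increment(counter, node_id, delta):
    """Apply a delta under node_id and bump that sub-count's version."""
    version, count = counter.get(node_id, (0, 0))
    counter[node_id] = (version + 1, count + delta)

# Three increments of 10 are led by A and replicated to B.
a = {}
for _ in range(3):
    increment(a, "A", 10)
b = dict(a)                            # B holds A: (3, 30)

# Scrub on A skips the corrupted row: A's own sub-count is gone.
a = {}
read_after_scrub = total(merge(a, b))  # B's higher version still wins

# Case 1: A keeps its id and accepts 4 increments of 5 (true total: 50).
a_same = {}
for _ in range(4):
    increment(a_same, "A", 5)          # A: (4, 20), version now above B's
same_id_total = total(merge(a_same, b))

# Case 2: A renews its id to A' before accepting the same increments.
a_renewed = {}
for _ in range(4):
    increment(a_renewed, "A'", 5)      # A': (4, 20), B's A: (3, 30) untouched
renewed_total = total(merge(a_renewed, b))

print(read_after_scrub, same_id_total, renewed_total)
```

In this model the read right after scrub still sees 30, the same-id case ends
up at 20 instead of 50 once A's version passes B's, and the renewed-id case
sums to 50 on a merged read. The model only covers the intent of the renewal
and says nothing about how A's own copy would or would not get repaired.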

Anyway, now that I've plugged the brain in, this patch doesn't really work 
because A's now inconsistent value will never be repaired by the other nodes.

So I have no clue how to actually fix that.

> Scrub could lose increments and replicate that loss
> ---------------------------------------------------
>
>                 Key: CASSANDRA-2759
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-2759
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>    Affects Versions: 0.8.0
>            Reporter: Sylvain Lebresne
>            Assignee: Sylvain Lebresne
>              Labels: counters
>             Fix For: 0.8.1
>
>         Attachments: 0001-Renew-nodeId-in-scrub-when-skipping-rows.patch
>
>
> If scrub cannot 'repair' a corrupted row, it will skip it. On node A, if the 
> row contains some sub-count for A's id, those will be lost forever since A is 
> the source of truth on its current id. We should thus renew node A's id when 
> that happens to avoid this (not unlike what we do in cleanup).

