[ https://issues.apache.org/jira/browse/CASSANDRA-9143?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15440059#comment-15440059 ]

Paulo Motta commented on CASSANDRA-9143:
----------------------------------------

bq. Both are really manifestations of the same root problem: incremental repair 
behaves unpredictably because data being repaired isn't kept separate from 
unrepaired data during repair. Maybe we should expand the problem description, 
and close CASSANDRA-8858 as a dupe?

Thanks for clarifying, we should definitely update the title and description, 
since the problem being tackled here is more general than the one originally 
stated on the ticket. I agree we should close CASSANDRA-8858, since it will be 
superseded by this.

bq. We’d have to be optimistic and anti-compact all the tables and ranges we’re 
going to be repairing prior to validation. Obviously, failed ranges would have 
to be re-anticompacted back into unrepaired. The cost of this would have to be 
compared to the higher network io caused by the current state of things, and 
the frequency of failed ranges.

I think that's a good idea, and it could also help mitigate the repair impact 
on vnode clusters caused by the multiple flushes needed to run validations for 
every vnode (CASSANDRA-9491, CASSANDRA-10862), since we would only validate the 
sstables anti-compacted at the beginning of the parent repair session. On the 
other hand, we should think carefully about how sstables in the pending repair 
bucket will be handled, since holding back compaction of these sstables for a 
long time could lead to poor read performance and extra compaction I/O after 
repair.
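
To make the pending repair bucket idea concrete, here is a minimal sketch 
(hypothetical class and method names, not the actual Cassandra compaction 
code) of how sstables anti-compacted at the start of a parent repair session 
could be held apart from normal compaction, then promoted to repaired on 
success or released back to unrepaired on failure:

{code:java}
import java.util.*;
import java.util.concurrent.ConcurrentHashMap;

// Illustrative only: sstables anti-compacted at the start of a parent repair
// session are tracked in a separate "pending repair" bucket, keyed by the
// parent session id, so they are neither compacted with unrepaired data nor
// re-validated mid-session.
class PendingRepairTracker
{
    // parent repair session id -> sstables held out of normal compaction
    private final Map<UUID, Set<String>> pending = new ConcurrentHashMap<>();

    void addToPending(UUID parentSessionId, Collection<String> sstables)
    {
        pending.computeIfAbsent(parentSessionId, id -> ConcurrentHashMap.<String>newKeySet())
               .addAll(sstables);
    }

    // On successful completion: promote the whole bucket to repaired.
    Set<String> promoteToRepaired(UUID parentSessionId)
    {
        return pending.remove(parentSessionId);
    }

    // On failure (or timeout): release the bucket back to unrepaired so the
    // sstables become eligible for normal compaction again.
    Set<String> releaseToUnrepaired(UUID parentSessionId)
    {
        return pending.remove(parentSessionId);
    }

    // Compaction would skip anything currently held in a pending bucket.
    boolean isPending(String sstable)
    {
        return pending.values().stream().anyMatch(s -> s.contains(sstable));
    }
}
{code}

Keeping the bucket keyed by the parent session id makes it cheap to promote or 
release the whole set in one step when the session finishes, fails or times out.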

For frequently run incremental repairs this shouldn't be a problem since 
repairs should be fast, but if many unrepaired sstables pile up (or in the case 
of full repairs), then this could become a problem. One approach would be to 
skip the upfront anti-compaction if the unrepaired set is above some size 
threshold (or for full repairs) and fall back to anti-compaction at the end, as 
done now. There should probably also be some safety mechanism (a timeout, etc.) 
that releases sstables from the pending repair bucket if they stay there for a 
long time, as Marcus suggested on CASSANDRA-5351.
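
A rough sketch of that fallback logic; the size threshold, timeout and method 
names below are placeholders for illustration, not existing Cassandra options:

{code:java}
import java.util.concurrent.TimeUnit;

class RepairPlanner
{
    // Assumed tunables, values are placeholders only.
    static final long MAX_UPFRONT_ANTICOMPACTION_BYTES = 50L * 1024 * 1024 * 1024;
    static final long PENDING_BUCKET_TIMEOUT_MILLIS = TimeUnit.HOURS.toMillis(24);

    /**
     * Decide whether to anti-compact before validation (isolating the data to
     * be repaired) or to fall back to the current behaviour of anti-compacting
     * only at the end of the repair session.
     */
    static boolean shouldAnticompactUpfront(boolean fullRepair, long unrepairedBytes)
    {
        if (fullRepair)
            return false; // full repairs validate everything anyway
        return unrepairedBytes <= MAX_UPFRONT_ANTICOMPACTION_BYTES;
    }

    /** Safety valve: release a pending bucket that has been held too long. */
    static boolean shouldReleasePendingBucket(long bucketCreatedAtMillis, long nowMillis)
    {
        return nowMillis - bucketCreatedAtMillis > PENDING_BUCKET_TIMEOUT_MILLIS;
    }
}
{code}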

> Improving consistency of repairAt field across replicas 
> --------------------------------------------------------
>
>                 Key: CASSANDRA-9143
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-9143
>             Project: Cassandra
>          Issue Type: Improvement
>            Reporter: sankalp kohli
>            Assignee: Blake Eggleston
>            Priority: Minor
>
> We currently send an anticompaction request to all replicas. During this, a 
> node will split sstables and mark the appropriate ones repaired. 
> The problem is that this could fail on some replicas for many reasons, 
> leading to problems in the next repair. 
> This is what I am suggesting to improve it. 
> 1) Send the anticompaction request to all replicas. This can be done at the 
> session level. 
> 2) During anticompaction, sstables are split but not marked repaired. 
> 3) When we get a positive ack from all replicas, the coordinator will send 
> another message called markRepaired. 
> 4) On getting this message, replicas will mark the appropriate sstables as 
> repaired. 
> This will reduce the window of failure. We can also think of "hinting" the 
> markRepaired message if required. 
> Also the sstables which are streamed can be marked as repaired, as is done 
> now. 
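
A minimal coordinator-side sketch of steps 1-4 above, assuming hypothetical 
message and helper names rather than the actual Cassandra repair messaging 
classes:

{code:java}
import java.util.*;

class MarkRepairedCoordinator
{
    interface Replica
    {
        // Steps 1-2: split sstables for the repaired ranges, but don't mark them.
        boolean requestAnticompaction(UUID parentSession, Collection<String> ranges);
        // Step 4: flip repairedAt on the previously split sstables.
        void markRepaired(UUID parentSession);
    }

    /** Returns true only if every replica acked anticompaction and was told to mark repaired. */
    static boolean run(UUID parentSession, Collection<String> ranges, List<Replica> replicas)
    {
        // Steps 1+2: ask every replica to split sstables without marking them repaired.
        for (Replica replica : replicas)
            if (!replica.requestAnticompaction(parentSession, ranges))
                return false; // a failed ack leaves all replicas unmarked, shrinking the failure window

        // Steps 3+4: only after positive acks from all replicas, send markRepaired.
        for (Replica replica : replicas)
            replica.markRepaired(parentSession);
        return true;
    }
}
{code}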


