[ https://issues.apache.org/jira/browse/CASSANDRA-9143?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15649177#comment-15649177 ]
Blake Eggleston commented on CASSANDRA-9143:
--------------------------------------------

| [trunk|https://github.com/bdeggleston/cassandra/tree/9143-trunk] | [dtest|http://cassci.datastax.com/view/Dev/view/bdeggleston/job/bdeggleston-9143-trunk-dtest/] | [testall|http://cassci.datastax.com/view/Dev/view/bdeggleston/job/bdeggleston-9143-trunk-testall/] |
| [3.0|https://github.com/bdeggleston/cassandra/tree/9143-3.0] | [dtest|http://cassci.datastax.com/view/Dev/view/bdeggleston/job/bdeggleston-9143-3.0-dtest/] | [testall|http://cassci.datastax.com/view/Dev/view/bdeggleston/job/bdeggleston-9143-3.0-testall/] |

[dtest branch|https://github.com/bdeggleston/cassandra-dtest/tree/9143]

I've tried to break this up into logical commits for each component of the change to make reviewing easier.

The new incremental repair would work as follows:
# persist the session locally on each repair participant
# anti-compact all unrepaired sstables intersecting the range being repaired into a pending repair bucket
# perform validation/sync against the sstables segregated in the pending anti-compaction step
# perform a 2PC to promote the pending repair sstables to repaired
#* if this, or the validation/sync phase, fails, the sstables are moved back into unrepaired

Since incremental repair is the default in 3.0, I've also included a patch which fixes the consistency problems in 3.0 and is backwards compatible with the existing repair. That said, I'm not really convinced that making a change like this to repair in 3.0.x is a great idea. I'd be more in favor of disabling incremental repair, or at least not making it the default, in 3.0.x. The compaction that gets kicked off after streamed sstables are added to the cfs means that whether repaired data is ultimately placed in the repaired or unrepaired bucket by anti-compaction is basically a crapshoot.
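The session lifecycle above can be sketched roughly as the following state machine. This is a minimal illustration only, assuming a hypothetical {{PendingRepairSession}} class with made-up names ({{antiCompact}}, {{finish}}, etc.); it is not Cassandra's actual implementation:

```java
import java.util.*;

// Hypothetical sketch of the proposed session flow; class and method names
// are illustrative, not real Cassandra APIs.
class PendingRepairSession {
    enum State { PERSISTED, PENDING, VALIDATED, REPAIRED, FAILED }

    final UUID sessionId = UUID.randomUUID();
    State state = State.PERSISTED;                 // 1. session persisted locally
    final List<String> pendingSSTables = new ArrayList<>();

    // 2. anti-compact unrepaired sstables intersecting the repaired range
    //    into a pending repair bucket, keyed by the session id
    void antiCompact(List<String> unrepairedIntersecting) {
        pendingSSTables.addAll(unrepairedIntersecting);
        state = State.PENDING;
    }

    // 3. validation/sync runs only against the segregated pending sstables
    boolean validateAndSync() {
        state = State.VALIDATED;
        return true;
    }

    // 4. second phase of the 2PC: promote pending sstables to repaired if
    //    every participant acked, otherwise move them back to unrepaired
    List<String> finish(boolean allReplicasAcked) {
        if (allReplicasAcked && state == State.VALIDATED) {
            state = State.REPAIRED;
        } else {
            state = State.FAILED;  // sstables return to the unrepaired bucket
        }
        return new ArrayList<>(pendingSSTables);
    }
}
```

The point of the pending bucket is that a failure at any step leaves the sstables either clearly pending or clearly unrepaired, never half-promoted.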
> Improving consistency of repairAt field across replicas
> --------------------------------------------------------
>
> Key: CASSANDRA-9143
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9143
> Project: Cassandra
> Issue Type: Improvement
> Reporter: sankalp kohli
> Assignee: Blake Eggleston
> Priority: Minor
>
> We currently send an anticompaction request to all replicas. During this, a node will split sstables and mark the appropriate ones repaired.
> The problem is that this could fail on some replicas for many reasons, leading to problems in the next repair.
> This is what I am suggesting to improve it.
> 1) Send the anticompaction request to all replicas. This can be done at the session level.
> 2) During anticompaction, sstables are split but not marked repaired.
> 3) When we get a positive ack from all replicas, the coordinator will send another message called markRepaired.
> 4) On getting this message, replicas will mark the appropriate sstables as repaired.
> This will reduce the window of failure. We can also think of "hinting" the markRepaired message if required.
> Also the sstables which are streaming can be marked as repaired like it is done now.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
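The markRepaired protocol proposed in the issue description could be sketched as follows. This is an illustrative toy only, assuming hypothetical {{Replica}} and {{Coordinator}} classes; the names and in-memory sets stand in for real anticompaction and sstable metadata updates:

```java
import java.util.*;

// Illustrative sketch of the proposed two-message protocol; these are
// made-up classes, not Cassandra's repair implementation.
class Replica {
    final Set<String> split = new HashSet<>();      // anti-compacted, not yet marked
    final Set<String> repaired = new HashSet<>();

    // step 2: split sstables during anticompaction without marking them repaired
    boolean antiCompact(List<String> sstables) {
        split.addAll(sstables);
        return true;                                // positive ack to the coordinator
    }

    // step 4: on markRepaired, flip the previously split sstables to repaired
    void markRepaired() {
        repaired.addAll(split);
        split.clear();
    }
}

class Coordinator {
    // steps 1 and 3: request anticompaction everywhere, and send markRepaired
    // only after a positive ack from every replica
    static boolean repair(List<Replica> replicas, List<String> sstables) {
        boolean allAcked = replicas.stream().allMatch(r -> r.antiCompact(sstables));
        if (allAcked) {
            replicas.forEach(Replica::markRepaired);
        }
        return allAcked;
    }
}
```

Splitting the mark into a separate message narrows the failure window: a replica that never receives markRepaired is left with split-but-unrepaired sstables rather than a repairedAt value that disagrees with its peers.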