[ https://issues.apache.org/jira/browse/CASSANDRA-9143?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15709261#comment-15709261 ]
Blake Eggleston commented on CASSANDRA-9143:
--------------------------------------------

bq. Should we prioritize the pending-repair-cleanup compactions?

Makes sense.

bq. Is there any point in doing anticompaction after repair with -full repairs? Can we always do consistent repairs? We would need to anticompact already repaired sstables into pending, but that should not be a big problem?

Good point. I'd say we should keep full repairs simple: don't do anti-compaction on them, and don't make them consistent. Given the newness and relative complexity of consistent repair, it would be smart to have a full workaround in case we find a problem with it. If we're not going to do anti-compaction, though, we should preserve the repairedAt values of the sstables we're streaming around as part of a full repair. That will make it possible to fix corrupted or lost data in the repair buckets without adversely affecting the next incremental repair.

bq. In handleStatusRequest - if we don't have the local session, we should probably return that the session is failed?

That makes sense.

> Improving consistency of repairedAt field across replicas
> ---------------------------------------------------------
>
>                 Key: CASSANDRA-9143
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-9143
>             Project: Cassandra
>          Issue Type: Improvement
>            Reporter: sankalp kohli
>            Assignee: Blake Eggleston
>
> We currently send an anticompaction request to all replicas. During this, a
> node will split sstables and mark the appropriate ones repaired.
> The problem is that this could fail on some replicas for many reasons,
> leading to problems in the next repair.
> This is what I am suggesting to improve it:
> 1) Send an anticompaction request to all replicas. This can be done at the
> session level.
> 2) During anticompaction, sstables are split but not marked repaired.
> 3) When we get a positive ack from all replicas, the coordinator will send
> another message called markRepaired.
> 4) On getting this message, replicas will mark the appropriate sstables as
> repaired.
> This will reduce the window of failure. We can also think of "hinting" the
> markRepaired message if required.
> Also, the sstables which are streamed can be marked as repaired like it is
> done now.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
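The four steps in the description amount to a two-phase finalize: replicas split sstables into a pending state first, and nothing is marked repaired until the coordinator has a positive ack from every replica. A minimal Java sketch of that flow follows; the `Replica` and `RepairCoordinator` classes and their method names are illustrative only, not Cassandra's actual repair APIs.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative replica: tracks whether its sstables have been split
// (anticompacted into a pending state) and whether they were marked repaired.
class Replica {
    boolean split = false;      // phase 1 done: sstables split, still pending
    boolean repaired = false;   // phase 2 done: sstables marked repaired
    final boolean healthy;      // whether anticompaction can succeed here

    Replica(boolean healthy) { this.healthy = healthy; }

    // Phase 1 (steps 1-2): split sstables for the repaired ranges but do
    // NOT mark them repaired; return an ack indicating success or failure.
    boolean anticompact() {
        if (!healthy) return false;
        split = true;
        return true;
    }

    // Phase 2 (step 4): invoked only after the coordinator saw all acks.
    void markRepaired() { repaired = true; }
}

public class RepairCoordinator {
    // Returns true if the session completed and sstables were marked repaired.
    static boolean runSession(List<Replica> replicas) {
        // Steps 1-2: send the anticompaction request to every replica.
        for (Replica r : replicas) {
            if (!r.anticompact())
                return false; // any failed ack aborts before marking anything
        }
        // Steps 3-4: all acks positive, so send markRepaired to everyone.
        for (Replica r : replicas)
            r.markRepaired();
        return true;
    }

    public static void main(String[] args) {
        List<Replica> allHealthy = new ArrayList<>();
        for (int i = 0; i < 3; i++) allHealthy.add(new Replica(true));
        System.out.println("all healthy -> " + runSession(allHealthy));

        List<Replica> oneFailed = new ArrayList<>();
        oneFailed.add(new Replica(true));
        oneFailed.add(new Replica(false));
        System.out.println("one failed -> " + runSession(oneFailed));
        // Even the healthy replica stays pending: it split its sstables but
        // never received markRepaired, which is the window this design shrinks.
        System.out.println("any repaired -> "
                + oneFailed.stream().anyMatch(r -> r.repaired));
    }
}
```

The point of the second phase is that a replica which fails during anticompaction leaves every replica's sstables in the pending state rather than leaving some marked repaired and some not, so the next incremental repair sees a consistent view.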