[jira] [Commented] (CASSANDRA-14763) Fail incremental repair prepare phase if it encounters sstables from un-finalized sessions

Blake Eggleston (JIRA) Wed, 19 Sep 2018 10:33:22 -0700


    [ 
https://issues.apache.org/jira/browse/CASSANDRA-14763?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16620926#comment-16620926
 ]


Blake Eggleston commented on CASSANDRA-14763:
---------------------------------------------

Pushed up fixes and started a new circle run.

Good catch with the race condition. I realized there’s another race where 
another pending anti-compaction could complete after we’d filtered our sstables 
and before we locked the sstables, potentially leaking data from one session to 
another which I fixed in AcquisitionCallback.

> Fail incremental repair prepare phase if it encounters sstables from 
> un-finalized sessions
> ------------------------------------------------------------------------------------------
>
>                 Key: CASSANDRA-14763
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-14763
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Repair
>            Reporter: Blake Eggleston
>            Assignee: Blake Eggleston
>            Priority: Major
>             Fix For: 4.0
>
>
> Raised in CASSANDRA-14685. If we encounter sstables from other IR sessions 
> during an IR prepare phase, we should fail the new session. If we don't, the 
> expectation that all data received before a repair session is consistent when 
> it completes wouldn't always be true.
> In more detail: 
> We don’t have a foolproof way of determining if a repair session has hung. To 
> prevent hung repair sessions from locking up sstables indefinitely, 
> incremental repair sessions will auto-fail after 24 hours. During this time, 
> the sstables for this session will remain isolated from the rest of the data 
> set. Afterwards, the sstables are moved back into the unrepaired set.
>  
> During the prepare phase of an incremental repair, we isolate the data to be 
> repaired. However, we ignore other sstables marked pending repair for the 
> same token range. I think the intention here was to prevent a hung repair 
> from locking up incremental repairs for 24 hours without manual intervention. 
> Assuming the session succeeds, it’s data will be moved to repaired. _However 
> the data from a hung session will eventually be moved back to unrepaired._ 
> This means that you can’t use the most recent successful incremental repair 
> as the high water mark for fully repaired data.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org

[jira] [Commented] (CASSANDRA-14763) Fail incremental repair prepare phase if it encounters sstables from un-finalized sessions

Reply via email to