[ https://issues.apache.org/jira/browse/CASSANDRA-8169?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14185727#comment-14185727 ]
Robert Coli commented on CASSANDRA-8169:
----------------------------------------

My belief is that marking things unrepaired in the bitrot case is critically important for incremental repair to be safe. In order to provide the guarantee that "marked repaired" asserts, we must mark all replicas for this data unrepaired immediately when we detect that this replica is corrupt due to bitrot.

I do not believe an offline tool a la CASSANDRA-5791 is capable of accomplishing this goal; I believe that marking tables unrepaired as a result of a failed CRC check on both the compressed and uncompressed read paths is the only opportunity we have to accomplish it. Unfortunately, there is currently no CRC on the uncompressed read path, so incremental repair is probably trivially provably unsafe there: one doesn't even realize the bitrot has occurred unless the SSTable is now structurally corrupt.

If this is not the right ticket for this note about bitrot vs. "marked repaired", I am glad to open a separate ticket for the specific concern I describe. I would hate for the great work on incremental repair to get a bad name when someone illustrates this potentially serious edge case.

> Background bitrot detector to avoid client exposure
> ---------------------------------------------------
>
>                 Key: CASSANDRA-8169
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-8169
>             Project: Cassandra
>          Issue Type: New Feature
>            Reporter: John Sumsion
>
> With a lot of static data sitting in SSTables, and with only a relatively
> small add/edit rate, incremental repair sounds very good. However, there is
> one significant cost to switching away from full repair.
> If/when bitrot corrupts an SSTable, there is nothing standing between a user
> query and a corruption/failure-response event except for the other replicas.
> This combined with a rolling restart or upgrade can make a token range
> non-writable via quorum CL.
> While you could argue that full repairs should be scheduled on a longer-term
> regular basis, I don't really care about all the repair overhead, I just want
> something that can run ahead of user queries whose only responsibility is to
> detect bitrot, so that I can replace nodes in an aggressive way instead of
> having it be a failure-response situation.
> This bitrot detector need not incur the full cross-cluster cost of repair,
> and so would be less of a burden to run periodically.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
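To make the two ideas in this thread concrete (a scan that runs ahead of user reads, and demoting data from "repaired" when a checksum fails), here is a minimal sketch in Python. It is not Cassandra code: `Table`, `write_table`, `scan`, the `repaired` flag, and the 64-byte `CHUNK_SIZE` are all hypothetical names and values chosen for illustration, and the sketch only flips a local flag. Propagating the "unrepaired" status to the other replicas, as the comment above asks for, is out of scope here.

```python
import zlib
from dataclasses import dataclass
from typing import List

CHUNK_SIZE = 64  # hypothetical chunk length in bytes, kept small for the demo


def chunk_checksums(data: bytes, chunk_size: int = CHUNK_SIZE) -> List[int]:
    """Compute a CRC32 for each fixed-size chunk of data."""
    return [zlib.crc32(data[i:i + chunk_size])
            for i in range(0, len(data), chunk_size)]


@dataclass
class Table:
    """Hypothetical stand-in for an on-disk table plus its stored checksums."""
    name: str
    data: bytes
    checksums: List[int]
    repaired: bool = True  # assumed to have been marked repaired earlier


def write_table(name: str, data: bytes) -> Table:
    """Record per-chunk checksums at write time, when the data is known good."""
    return Table(name=name, data=data, checksums=chunk_checksums(data))


def scan(table: Table) -> List[int]:
    """Recompute checksums and return the indexes of chunks that no longer match.

    Any mismatch (or a change in the number of chunks) demotes the table
    to unrepaired, so a later repair would not skip it.
    """
    current = chunk_checksums(table.data)
    bad = [i for i, (stored, now) in enumerate(zip(table.checksums, current))
           if stored != now]
    if bad or len(current) != len(table.checksums):
        table.repaired = False
    return bad


if __name__ == "__main__":
    t = write_table("demo", bytes(range(200)))
    print(scan(t), t.repaired)      # clean scan leaves the flag alone

    rotted = bytearray(t.data)
    rotted[70] ^= 0x01              # flip one bit inside the second chunk
    t.data = bytes(rotted)
    print(scan(t), t.repaired)      # the damaged chunk is reported
```

Because the scan only reads local data and compares it against locally stored checksums, it needs no cross-node traffic, which is the property the issue description asks for.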