John Sumsion created CASSANDRA-8169:
---------------------------------------

             Summary: Background bitrot detector to avoid client exposure
                 Key: CASSANDRA-8169
                 URL: https://issues.apache.org/jira/browse/CASSANDRA-8169
             Project: Cassandra
          Issue Type: New Feature
            Reporter: John Sumsion


With a lot of static data sitting in SSTables, and with only a relatively small 
add/edit rate, incremental repair sounds very good.  However, there is one 
significant cost to switching away from full repair.

If/when bitrot corrupts an SSTable, there is nothing standing between a user 
query and a corruption/failure-response event except for the other replicas.  
This, combined with a rolling restart or upgrade, can make a token range 
non-writable at quorum consistency level (CL).

While you could argue that full repairs should be scheduled on a longer-term 
regular basis, I don't really care about all the repair overhead.  I just want 
something that can run ahead of user queries and whose only responsibility is 
to detect bitrot, so that I can replace nodes aggressively instead of waiting 
until it becomes a failure-response situation.

This bitrot detector need not incur the full cross-cluster cost of repair, and 
so would be less of a burden to run periodically.
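
To make the idea concrete, here is a minimal sketch of a purely local scan, 
written in Python for brevity.  It is not Cassandra code: the ".crc32" sidecar 
file and the function names (file_crc32, record_checksums, scan_for_bitrot) 
are hypothetical and do not model Cassandra's real on-disk files.  The point 
is only that each node can re-read its own immutable files and compare them 
to a checksum recorded at write time, without talking to any other replica.

```python
import os
import zlib


def file_crc32(path, chunk_size=1 << 20):
    """Stream a file through CRC32 so large files need not fit in memory."""
    crc = 0
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            crc = zlib.crc32(chunk, crc)
    return crc & 0xFFFFFFFF


def record_checksums(data_dir):
    """Write a hypothetical '<name>.crc32' sidecar next to each data file."""
    for name in sorted(os.listdir(data_dir)):
        path = os.path.join(data_dir, name)
        if name.endswith(".crc32") or not os.path.isfile(path):
            continue
        with open(path + ".crc32", "w") as f:
            f.write(str(file_crc32(path)))


def scan_for_bitrot(data_dir):
    """Return names of files whose contents no longer match their sidecar."""
    corrupted = []
    for name in sorted(os.listdir(data_dir)):
        path = os.path.join(data_dir, name)
        sidecar = path + ".crc32"
        if name.endswith(".crc32") or not os.path.isfile(sidecar):
            continue
        with open(sidecar) as f:
            expected = int(f.read().strip())
        if file_crc32(path) != expected:
            corrupted.append(name)
    return corrupted
```

A periodic job could call scan_for_bitrot() on each data directory and flag 
the node for replacement when the returned list is non-empty.  A real 
implementation would presumably also throttle its I/O so the scan does not 
compete with user queries.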



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
