John Sumsion created CASSANDRA-8169:
---------------------------------------
             Summary: Background bitrot detector to avoid client exposure
                 Key: CASSANDRA-8169
                 URL: https://issues.apache.org/jira/browse/CASSANDRA-8169
             Project: Cassandra
          Issue Type: New Feature
            Reporter: John Sumsion

With a lot of static data sitting in SSTables, and with only a relatively small add/edit rate, incremental repair sounds very good. However, there is one significant cost to switching away from full repair. If/when bitrot corrupts an SSTable, there is nothing standing between a user query and a corruption/failure-response event except for the other replicas. This combined with a rolling restart or upgrade can make a token range non-writable via quorum CL.

While you could argue that full repairs should be scheduled on a longer-term regular basis, I don't really care about all the repair overhead, I just want something that can run ahead of user queries whose only responsibility is to detect bitrot, so that I can replace nodes in an aggressive way instead of having it be a failure-response situation.

This bitrot detector need not incur the full cross-cluster cost of repair, and so would be less of a burden to run periodically.
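
To make the request concrete, here is a minimal sketch of the kind of node-local check described above. It is an illustration only, not a proposed implementation. It assumes each data file has a sidecar digest file whose first whitespace-separated token is the hex SHA-1 of the data file. The function names, the suffix parameters and their default values are hypothetical and are not a statement about Cassandra's actual on-disk layout.

```python
import hashlib
from pathlib import Path


def file_sha1(path, chunk_size=1 << 20):
    """Return the hex SHA-1 of a file, read in chunks to bound memory use."""
    digest = hashlib.sha1()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_corrupt_files(data_dir, data_suffix="-Data.db",
                       digest_suffix="-Digest.sha1"):
    """Return data files whose current SHA-1 differs from the recorded one.

    The suffixes are assumptions made for this sketch. Data files with a
    missing or empty sidecar digest are skipped, since there is nothing
    to compare against.
    """
    corrupt = []
    for data_file in sorted(Path(data_dir).rglob("*" + data_suffix)):
        # Derive the sidecar name by swapping the data suffix for the digest suffix.
        digest_file = data_file.with_name(
            data_file.name[: -len(data_suffix)] + digest_suffix
        )
        if not digest_file.is_file():
            continue
        tokens = digest_file.read_text().split()
        if not tokens:
            continue
        recorded = tokens[0].lower()
        # Purely local comparison: no other replica is contacted.
        if file_sha1(data_file) != recorded:
            corrupt.append(data_file)
    return corrupt
```

A scheduler could call `find_corrupt_files` on each node's data directory during quiet periods and flag the node for replacement when the returned list is non-empty. Because the check only reads local files, it would avoid the cross-cluster cost of repair, at the price of reading every data file in full.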