I should have clarified: we have 3 copies, so in that case, as long as 2 match, we should be OK?
Even if there were checksumming at the SSTable level, I assume it would have to check and report these errors on compaction (or node repair)? I have seen some open JIRA issues on this (47 and 1717), but if I need something today, is a read repair (or a node repair) the only viable option?

On Mon, Feb 7, 2011 at 12:09 PM, Peter Schuller <peter.schul...@infidyne.com> wrote:
> > Our application space is such that there is data that might not be read
> > for a long time. The data is mostly immutable. How should I approach
> > detecting/solving the bitrot problem? One approach is to read the data
> > and let read repair do the detection, but given the size of the data,
> > that does not look very efficient.
>
> Note that read repair is not really intended to repair arbitrary
> corruption. Unless I'm mistaken, unless the corruption triggers a
> serialization failure that causes row skipping, it's a toss-up which
> version of the data is retained (or both, if the corruption is in the
> key). Given the same key and column timestamp, the tie breaker is the
> column value. So depending on whether the corruption results in a
> "lesser" or "greater" value, you might get the corrupt or the
> non-corrupt data.
>
> > Has anybody solved/worked around this, or has any other suggestions to
> > detect and fix bitrot?
>
> My feel/tentative opinion is that the clean fix is for Cassandra to
> support strong checksumming at the SSTable level.
>
> Deploying on e.g. ZFS would help a lot with this, but that's a problem
> for deployment on Linux (which is the recommended platform for
> Cassandra).
>
> --
> / Peter Schuller
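
To make the "2 out of 3 copies match" idea concrete: this is a minimal sketch of offline majority voting across three replica copies of the same cell, assuming you can fetch the raw value from each replica yourself (the function and variable names here are hypothetical, not anything Cassandra provides):

```python
import hashlib
from collections import Counter

def majority_value(replicas):
    """Given the same cell fetched from each of 3 replicas, return the
    value that at least 2 copies agree on, or None if no quorum exists
    (e.g. two independently corrupted copies)."""
    digests = [hashlib.sha256(v).hexdigest() for v in replicas]
    digest, count = Counter(digests).most_common(1)[0]
    if count < 2:
        return None  # no two copies match; cannot decide
    return replicas[digests.index(digest)]

# One replica suffered a bit flip; the other two still agree.
good = b"immutable payload"
flipped = b"immutable paylaod"
print(majority_value([good, flipped, good]) == good)  # True
```

With replication factor 3 this tolerates corruption of any single copy; if two copies rot independently, there is no quorum and the function deliberately refuses to guess.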
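
Peter's point about the tie breaker is worth illustrating, because it is why read repair cannot be relied on against bitrot. A toy model of the reconciliation rule he describes (higher timestamp wins; on a timestamp tie, the greater value wins; this is a simplification, not Cassandra's actual code):

```python
def reconcile(a, b):
    """Toy reconciliation of two versions of a cell, each a
    (timestamp, value) pair: higher timestamp wins; on a tie,
    the lexically greater value wins."""
    ts_a, val_a = a
    ts_b, val_b = b
    if ts_a != ts_b:
        return a if ts_a > ts_b else b
    return a if val_a >= val_b else b

original = (1000, b"hello")
corrupt = (1000, b"hellp")  # single flipped bit, same timestamp
# The corrupt byte happens to compare greater, so corruption "wins":
print(reconcile(original, corrupt))  # (1000, b'hellp')
```

Since the corrupted value carries the same key and timestamp as the original, whether the good or the bad copy survives depends only on how the flipped bits happen to compare.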