On Tue, Oct 22, 2013 at 5:17 PM, java8964 java8964 <[email protected]> wrote:
> Any way I can verify how often the system being "repaired"? I can ask
> another group who maintain the Cassandra cluster. But do you mean that even
> the failed writes will be stored in the SSTable files?

"repair" sessions are logged in system.log, and the "best practice" is to run a repair once every gc_grace_seconds, which defaults to 10 days.

A "failed" write means only that it "failed" to meet its ConsistencyLevel within the request_timeout. It does not mean that it failed to write everywhere it tried to write. There is no rollback, so in practice with RF>1 it is likely that a "failed" write succeeded at least somewhere. And if any failure is noted, Cassandra will generate a hint for hinted handoff and attempt to redeliver the "failed" write. Also, many/most client applications will respond to a TimedOutException by attempting to re-write the "failed" write, using the same client timestamp (see the sketch at the end of this mail).

Repair has a fixed granularity, so the larger your dataset, the more "over-repair" any given repair will cause. Duplicates occur as a natural consequence of this: if you have 1 row which differs within a merkle tree chunk and that chunk is, for example, 1000 rows, you will "repair" the one row and "duplicate" the other 999.
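To illustrate the retry-with-the-same-timestamp point, here is a minimal, hypothetical sketch using the DataStax Java driver (2.x). The contact point, keyspace, table and values are all invented for the example, and a real application would usually wrap this in proper retry/backoff logic rather than a single blind retry:

    import com.datastax.driver.core.Cluster;
    import com.datastax.driver.core.Session;
    import com.datastax.driver.core.exceptions.WriteTimeoutException;

    public class RetrySameTimestamp {
        public static void main(String[] args) {
            Cluster cluster = Cluster.builder().addContactPoint("127.0.0.1").build();
            Session session = cluster.connect("my_keyspace");  // hypothetical keyspace

            // Choose the client-side timestamp once (microseconds) and reuse it for retries.
            long ts = System.currentTimeMillis() * 1000;
            String insert = "INSERT INTO my_table (k, v) VALUES (1, 'x') USING TIMESTAMP " + ts;

            try {
                session.execute(insert);
            } catch (WriteTimeoutException e) {
                // The write may already have landed on some replicas. Retrying with the
                // same timestamp keeps the retry idempotent instead of creating a newer
                // version of the same data.
                session.execute(insert);
            } finally {
                cluster.close();
            }
        }
    }

The same idea applies with the older Thrift-based clients: the important part is that the timestamp is chosen on the client and reused, so the original attempt and the retry resolve to the same version of the data.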
=Rob