Re: Switching to Incremental Repair

2024-02-07 Thread Bowen Song via user
The over-streaming is only problematic for the repaired SSTables, but it can be triggered by inconsistencies within the unrepaired SSTables during an incremental repair session. This is because, although an incremental repair only compares the unrepaired SSTables, it will stream both

Re: Switching to Incremental Repair

2024-02-07 Thread Sebastian Marsching
Thank you very much for your explanation. Streaming happens on the token range level, not the SSTable level, right? So, when running an incremental repair before the full repair, the problem that “some unrepaired SSTables are being marked as repaired on one node but not on another” should not

Re: Switching to Incremental Repair

2024-02-07 Thread Bowen Song via user
Unfortunately, repair doesn't compare each partition individually. Instead, it groups multiple partitions together, calculates a hash of them, stores the hash in a leaf of a Merkle tree, and then compares the Merkle trees between replicas during a repair session. If any one of the partitions
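As a rough illustration of the grouping described above, here is a minimal Python sketch. It is not Cassandra's implementation: the ring size, the number of leaves, the use of SHA-256, and the helper names `leaf_hashes` and `mismatched_leaves` are all assumptions made for the example, and it compares a flat list of leaf hashes rather than building a full tree. It only shows why a single differing partition flags every partition that shares its leaf.

```python
import hashlib

def leaf_hashes(partitions, num_leaves, ring_size):
    """Hash all partitions that fall into each token sub-range into one leaf."""
    hashers = [hashlib.sha256() for _ in range(num_leaves)]
    for token in sorted(partitions):
        leaf = token * num_leaves // ring_size  # which sub-range owns this token
        hashers[leaf].update(f"{token}:{partitions[token]}".encode())
    return [h.hexdigest() for h in hashers]

def mismatched_leaves(replica_a, replica_b, num_leaves, ring_size):
    """Return the indices of leaves whose hashes differ between two replicas."""
    a = leaf_hashes(replica_a, num_leaves, ring_size)
    b = leaf_hashes(replica_b, num_leaves, ring_size)
    return [i for i in range(num_leaves) if a[i] != b[i]]

# Hypothetical toy ring: 100 partitions spread over tokens 0..990, 4 leaves.
RING_SIZE = 1000
NUM_LEAVES = 4
replica_a = {token: f"value-{token}" for token in range(0, RING_SIZE, 10)}
replica_b = dict(replica_a)
replica_b[420] = "stale-value"  # a single inconsistent partition

diff = mismatched_leaves(replica_a, replica_b, NUM_LEAVES, RING_SIZE)
# Every partition sharing a mismatched leaf is a candidate for streaming.
affected = [t for t in replica_a if t * NUM_LEAVES // RING_SIZE in diff]
print(diff, len(affected))  # [1] 25
```

In this toy setup one stale partition marks a whole leaf as different, so 25 partitions would be streamed instead of 1. With fewer leaves or more partitions per leaf the ratio gets worse.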

Re: Switching to Incremental Repair

2024-02-07 Thread Sebastian Marsching
> Caution, using the method you described, the amount of data streamed at the end with the full repair is not the amount of data written between stopping the first node and the last node, but depends on the table size, the number of partitions written, their distribution in the ring and

Re: Switching to Incremental Repair

2024-02-07 Thread Bowen Song via user
Caution, using the method you described, the amount of data streamed at the end with the full repair is not the amount of data written between stopping the first node and the last node, but depends on the table size, the number of partitions written, their distribution in the ring and the

Re: Switching to Incremental Repair

2024-02-07 Thread Sebastian Marsching
> That's a feature we need to implement in Reaper. I think disallowing the start of the new incremental repair would be easier to manage than pausing the full repair that's already running. It's also what I think I'd expect as a user.
>
> I'll create an issue to track this. Thank you,

Re: Switching to Incremental Repair

2024-02-07 Thread Sebastian Marsching
> Full repair running for an entire week sounds excessively long. Even if you've got 1 TB of data per node, 1 week means the repair speed is less than 2 MB/s, that's very slow. Perhaps you should focus on finding the bottleneck of the full repair speed and work on that instead.

We store
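The "less than 2 MB/s" figure in the quoted text can be checked with a short calculation. This is only a sketch, assuming a decimal terabyte (10^12 bytes) and a repair that runs continuously for exactly seven days; the function name is made up for the example.

```python
SECONDS_PER_WEEK = 7 * 24 * 60 * 60  # 604800 seconds

def repair_throughput_mb_per_s(data_bytes, duration_s):
    """Average throughput in MB/s (1 MB = 10**6 bytes)."""
    return data_bytes / duration_s / 1_000_000

one_tb = 10**12  # assumption: decimal terabyte
rate = repair_throughput_mb_per_s(one_tb, SECONDS_PER_WEEK)
print(f"{rate:.2f} MB/s")  # 1.65 MB/s
```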

Re: Switching to Incremental Repair

2024-02-07 Thread Bowen Song via user
Just one more thing. Make sure you run 'nodetool repair -full' instead of just 'nodetool repair'. That's because the command's default was changed in Cassandra 2.x. The default was full repair before that change, but the default is now incremental repair. On 07/02/2024 10:28, Bowen Song

Re: Switching to Incremental Repair

2024-02-07 Thread Bowen Song via user
Not disabling auto-compaction may result in repaired SSTables getting compacted together with unrepaired SSTables before the repair state is set on them, which leads to a mismatch in the repaired data between nodes, and potentially very expensive over-streaming in a future full repair. You