Hi Sebastian,
It's better to walk down a path on which others have walked before you
and had great success than down a path that nobody has ever walked. For
the former, you know it's relatively safe and that it works. The same
can hardly be said for the latter.
You said it takes a week to run the full repair for your entire cluster,
not each node. Depending on the number of nodes in your cluster, each
node should take significantly less time than that unless you have RF
set to the total number of nodes. Keep in mind that you only need to
disable the auto-compaction for the duration of a full repair on each
node, not the whole cluster.
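In practice, that can be as simple as the following on each node in
turn (a rough sketch; <keyspace> is a placeholder, and note that
"nodetool disableautocompaction" and "nodetool enableautocompaction"
can also be limited to specific keyspaces and tables):

    nodetool disableautocompaction
    nodetool repair --full <keyspace>
    nodetool enableautocompaction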
Now, you asked: how do I know whether that is going to be an issue or
not? That depends on a few factors, such as:
* how long does it take for each node to complete a full repair for that
node
* how many SSTables currently exist on each node (try "find
/var/lib/cassandra/data -name '*-Data.db' | wc -l")
* how frequently is the memtable getting flushed on each node
* what's the number of open file descriptors limit (see "cat
/proc/[PID]/limits" and "sysctl fs.nr_open")
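To put concrete numbers on these, something along these lines should do
(a sketch; it assumes the default data and log directories, and [PID]
is the Cassandra process ID; the exact log message may vary between
versions):

    # existing SSTables on this node
    find /var/lib/cassandra/data -name '*-Data.db' | wc -l
    # per-process and system-wide open FD limits
    grep 'open files' /proc/[PID]/limits
    sysctl fs.nr_open
    # rough idea of the memtable flush frequency
    grep -c 'Enqueuing flush' /var/log/cassandra/system.log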
If the total number of SSTables (existing, plus the number of memtable
flushes that happen while auto-compaction is turned off) is going to be
significantly less than half of the open FD limit, you'll have nothing
to worry about. Otherwise, you may want to consider temporarily
increasing the open FD limit, reducing the memtable flush frequency
(e.g. by increasing the memtable size or reducing the number of write
requests), reducing the existing number of SSTables (e.g. by
compacting), or just taking the risk and betting that Cassandra is not
going to open all the SSTables at the same time (not recommended).
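To illustrate with completely made-up numbers:

    existing=100000        # SSTables currently on the node
    flushes_per_hour=10    # memtable flushes while auto-compaction is off
    repair_hours=48        # time for this node's full repair
    fd_limit=1048576       # open FD limit of the Cassandra process
    total=$(( existing + flushes_per_hour * repair_hours ))
    echo "roughly $(( total * 2 )) FDs needed, limit is $fd_limit"

The factor of 2 is explained below.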
You may be wondering: why only half of the open FD limit? That's
because Cassandra usually keeps both the *-Index.db and the *-Data.db
files open when an SSTable is in use.
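If you want to double-check the actual number of files the process has
open, "ls /proc/[PID]/fd | wc -l" will tell you.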
I hope that helps.
Regards,
Bowen
On 23/11/2023 23:30, Sebastian Marsching wrote:
Hi,
we are currently in the process of migrating from C* 3.11 to C* 4.1
and we want to start using incremental repairs after the upgrade has
been completed. It seems like all the really bad bugs that made using
incremental repairs dangerous in C* 3.x have been fixed in 4.x, and
for our specific workload, incremental repairs should offer a
significant performance improvement.
Therefore, I am currently devising a plan for how we could migrate to
using incremental repairs. I am aware of the guide from DataStax
(https://docs.datastax.com/en/cassandra-oss/3.0/cassandra/operations/opsRepairNodesMigration.html),
but this guide is quite old and was written with C* 3.0 in mind, so I
am not sure whether this still fully applies to C* 4.x.
In addition to that, I am not sure whether this approach fits our
workload. In particular, I am wary about disabling autocompaction for
an extended period of time (if you are interested in the reasons why,
they are at the end of this e-mail).
Therefore, I am wondering whether a slightly different process might
work better for us:
1. Run a full repair (we periodically run those anyway).
2. Mark all SSTables as repaired (see the sketch after this list), even
though they will include data that has not been repaired yet because it
was added while the repair process was running.
3. Run another full repair.
4. Start using incremental repairs (and the occasional full repair in
order to handle bit rot etc.).
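For step 2, my understanding is that the usual way (as in the DataStax
guide) is the offline sstablerepairedset tool, run on each node while
Cassandra is stopped, along these lines (a sketch; the path and
<keyspace> are placeholders):

    # with Cassandra stopped on this node
    find /var/lib/cassandra/data/<keyspace> -name '*-Data.db' > sstables.txt
    sstablerepairedset --really-set --is-repaired -f sstables.txt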
If I understood the interactions between full repairs and incremental
repairs correctly, step 3 should repair potential inconsistencies in
the SSTables that were marked as repaired in step 2 while avoiding the
problem of overstreaming that would happen when only marking those
SSTables as repaired that already existed before step 1.
Does anyone see a flaw in this concept, or have experience with a
similar scenario (migrating to incremental repairs in an environment
with high-density nodes, where a single table contains most of the data)?
I am also interested in hearing about potential problems other C*
users experienced when migrating to incremental repairs, so that we
get a better idea of what to expect.
Thanks,
Sebastian
Here is the explanation of why I am being cautious:
More than 95 percent of our data is stored in a single table, and we
use high density nodes (storing about 3 TB of data per node). This
means that a full repair for the whole cluster takes about a week.
The reason for this layout is that most of our data is “cold”, meaning
that it is written once, never updated, and rarely deleted or read.
However, new data is added continuously, so disabling autocompaction
for the duration of a full repair would lead to a high number of small
SSTables accumulating over the course of the week, and I am not sure
how well the cluster would handle such a situation (and the increased
load when autocompaction is enabled again).