OK, I think we'll give incremental repairs a try on a limited number of CFs first, and if it goes well we'll progressively switch more CFs to incremental.
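If I understand it right, on 2.1 that would mean something like this per
table (just a sketch -- the keyspace/table names are placeholders):

    # sketch: incremental repair of a single table on 2.1
    nodetool repair -par -inc my_keyspace my_table

or letting the Reaper fork schedule the equivalent for us.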
I'm not sure I understand the problem with anticompaction and validation
running concurrently. As far as I can tell, right now when a CF is repaired
(either via Reaper or via nodetool) there may be compactions running at the
same time. In fact, it happens very often. Is that a problem?

As for big partitions, the biggest one we have is around 3.3 GB. Other large
partitions are around 500 MB and below.

On Thu, Oct 27, 2016, at 05:37 PM, Alexander Dejanovski wrote:
> Oh right, that's what they advise :)
> I'd say that you should skip the full repair phase in the migration
> procedure, as that will obviously fail, and just mark all sstables as
> repaired (skip steps 1, 2 and 6).
> Anyway, you can't do better, so take a leap of faith there.
>
> Intensity is already very low, and 10000 segments is a whole lot for 9
> nodes; you should not need that many.
>
> You can definitely pick which CFs you'll run incremental repair on, and
> still run full repair on the rest.
> If you pick our Reaper fork, watch out for schema changes that add
> incremental repair fields, and I do not advise running incremental
> repair without it, otherwise you might have issues with anticompaction
> and validation compactions running concurrently from time to time.
>
> One last thing: can you check if you have particularly big partitions
> in the CFs that fail to get repaired? You can run nodetool
> cfhistograms to check that.
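> For example, something like this (keyspace/table are placeholders):
>
>     # sketch -- replace with the actual keyspace/table
>     nodetool cfhistograms my_keyspace my_table
>
> and look at the partition size percentiles in the output.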
> Cheers,
>
> On Thu, Oct 27, 2016 at 5:24 PM Vincent Rischmann
> <m...@vrischmann.me> wrote:
>> Thanks for the response.
>>
>> We do break up repairs between tables, and we also tried our best to
>> have no overlap between repair runs. Each repair run has 10000
>> segments (a purely arbitrary number that seemed to help at the time).
>> Some runs have an intensity of 0.4, some as low as 0.05.
>>
>> Still, sometimes one particular app (which does a lot of
>> read/modify/write batches at QUORUM) gets slowed down to the point
>> where we have to stop the repair run.
>>
>> More annoyingly, for the past 2 to 3 weeks, as I said, runs stop
>> progressing after some time. Every time I restart Reaper it repairs
>> correctly again, up until it gets stuck. I have no idea why that
>> happens now, but it means I have to babysit Reaper, and it's becoming
>> annoying.
>>
>> Thanks for the suggestion about incremental repairs. It would probably
>> be a good thing, but I think it's a little challenging to set up.
>> Right now a full repair of all keyspaces (via nodetool repair) would
>> take a lot of time, probably 5 days or more. We were never able to run
>> one to completion. I'm not sure it's a good idea to disable
>> autocompaction for that long.
>>
>> But maybe I'm wrong. Is it possible to use incremental repair on some
>> column families only?
>>
>> On Thu, Oct 27, 2016, at 05:02 PM, Alexander Dejanovski wrote:
>>> Hi Vincent,
>>>
>>> Most people handle repair with:
>>> - pain (running nodetool commands by hand)
>>> - cassandra_range_repair:
>>>   https://github.com/BrianGallew/cassandra_range_repair
>>> - Spotify Reaper
>>> - and the OpsCenter repair service for DSE users
>>>
>>> Reaper is a good option I think, and you should stick to it. If it
>>> cannot do the job here then no other tool will.
>>>
>>> You have several options from here:
>>> * Try to break up your repair table by table and see which ones
>>>   actually get stuck
>>> * Check your logs for any repair/streaming errors
>>> * Avoid repairing everything:
>>>   * you may have expendable tables
>>>   * you may have TTL-only tables with no deletes, accessed at QUORUM
>>>     CL only
>>> * You can try to relieve repair pressure in Reaper by lowering the
>>>   repair intensity (on the tables that get stuck)
>>> * You can try adding steps to your repair process by setting a higher
>>>   segment count in Reaper (on the tables that get stuck)
>>> * And lastly, you can turn to incremental repair. As you're familiar
>>>   with Reaper already, you might want to take a look at our Reaper
>>>   fork that handles incremental repair:
>>>   https://github.com/thelastpickle/cassandra-reaper
>>>   If you go down that way, make sure you first mark all sstables as
>>>   repaired before you run your first incremental repair, otherwise
>>>   you'll end up in anticompaction hell (a bad, bad place):
>>>   https://docs.datastax.com/en/cassandra/2.1/cassandra/operations/opsRepairNodesMigration.html
>>>   Even if people say that's not necessary anymore, it'll save you
>>>   from a very bad first experience with incremental repair.
>>>   Furthermore, make sure you run repair daily after your first
>>>   incremental repair run, in order to work on small repairs.
>>>
>>> Cheers,
>>>
>>> On Thu, Oct 27, 2016 at 4:27 PM Vincent Rischmann <m...@vrischmann.me>
>>> wrote:
>>>> Hi,
>>>>
>>>> we have two Cassandra 2.1.15 clusters at work and are having some
>>>> trouble with repairs.
>>>>
>>>> Each cluster has 9 nodes, and the amount of data is not gigantic,
>>>> but some column families have 300+ GB of data.
>>>> We tried to use `nodetool repair` for these tables, but at the time
>>>> we tested it, it put too much load on the whole cluster and impacted
>>>> our production apps.
>>>>
>>>> Next we saw https://github.com/spotify/cassandra-reaper , tried it
>>>> and had some success until recently. For the past 2 to 3 weeks it
>>>> has never completed a repair run, somehow deadlocking itself.
>>>>
>>>> I know DSE includes a repair service, but I'm wondering: how do
>>>> other Cassandra users manage repairs?
>>>>
>>>> Vincent.
>>> --
>>> -----------------
>>> Alexander Dejanovski
>>> France
>>> @alexanderdeja
>>>
>>> Consultant
>>> Apache Cassandra Consulting
>>> http://www.thelastpickle.com
>
> --
> -----------------
> Alexander Dejanovski
> France
> @alexanderdeja
>
> Consultant
> Apache Cassandra Consulting
> http://www.thelastpickle.com
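P.S. For the marking step itself, if I'm reading the DataStax procedure
right, it would look roughly like this on each node, table by table (just
a sketch -- the data paths are placeholders, and sstablerepairedset lives
in the tools/bin directory of the Cassandra distribution):

    # sketch of the "mark all sstables as repaired" step, per table
    # (node stopped first, as in the migration procedure)
    find /var/lib/cassandra/data/my_keyspace/my_table-* -iname "*Data.db*" \
        | xargs sstablerepairedset --really-set --is-repaired
    # ... then restart the node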