Yeah, that particular table is badly designed. I intend to fix it when
the roadmap allows :) What is the recommended maximum partition size?

Thanks for all the information.
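As a reference point for that question: partition size percentiles can
be measured per table, and a commonly cited rule of thumb for Cassandra
2.x is to keep partitions below roughly 100 MB. A minimal sketch, with
placeholder keyspace and table names:

    # Print latency and partition size percentiles for one table; the
    # "Partition Size" column exposes oversized partitions.
    nodetool cfhistograms my_keyspace my_table

    # Report the largest partition ever compacted for that table, in bytes.
    nodetool cfstats my_keyspace.my_table | grep "Compacted partition maximum bytes"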
On Thu, Oct 27, 2016, at 08:14 PM, Alexander Dejanovski wrote:
> 3.3GB is already too high, and it surely isn't helping compactions
> perform well. I know changing a data model is no easy thing to do,
> but you should try to do something here.
> Anticompaction is a special type of compaction, and if an sstable is
> being anticompacted, any attempt to run a validation compaction on
> it will fail, telling you that an sstable cannot be part of 2 repair
> sessions at the same time. Incremental repair must therefore be run
> one node at a time, waiting for anticompactions to end before moving
> from one node to the next.
> Be mindful of running incremental repair on a regular basis once
> you've started, as you'll have two separate pools of sstables
> (repaired and unrepaired) that won't get compacted together, which
> could be a problem if you want tombstones to be purged efficiently.
> Cheers,
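A minimal sketch of the node-at-a-time sequencing described above,
assuming SSH access to each node and placeholder hostnames; on 2.1 the
incremental variant is invoked with `-par -inc`:

    #!/usr/bin/env bash
    # Repair one node at a time; wait for its anticompactions to finish
    # before moving on, so validation compactions on the next repair
    # session don't collide with a running anticompaction.
    for host in node1 node2 node3; do
        ssh "$host" nodetool repair -par -inc my_keyspace
        # Anticompaction shows up in compactionstats after repair returns.
        while ssh "$host" nodetool compactionstats | grep -qi anticompaction; do
            sleep 60
        done
    done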
> On Thu, Oct 27, 2016 at 5:57 PM, Vincent Rischmann
> <m...@vrischmann.me> wrote:
>> Ok, I think we'll give incremental repairs a try on a limited
>> number of CFs first, and if that goes well we'll progressively
>> switch more CFs to incremental.
>>
>> I'm not sure I understand the problem with anticompaction and
>> validation running concurrently. As far as I can tell, right now
>> when a CF is repaired (either via Reaper or via nodetool) there may
>> be compactions running at the same time. In fact, it happens very
>> often. Is that a problem?
>>
>> As for big partitions, the biggest one we have is around 3.3GB.
>> The next biggest are around 500MB and less.
>>
>> On Thu, Oct 27, 2016, at 05:37 PM, Alexander Dejanovski wrote:
>>> Oh right, that's what they advise :)
>>> I'd say that you should skip the full repair phase in the
>>> migration procedure, as that will obviously fail, and just mark
>>> all sstables as repaired (skip steps 1, 2 and 6).
>>> Anyway, you can't do better, so take a leap of faith there.
>>>
>>> Intensity is already very low, and 10000 segments is a whole lot
>>> for 9 nodes; you should not need that many.
>>>
>>> You can definitely pick which CFs you'll run incremental repair
>>> on, and still run full repair on the rest.
>>> If you pick our Reaper fork, watch out for the schema changes that
>>> add the incremental repair fields. I do not advise running
>>> incremental repair without it; otherwise you might have issues
>>> with anticompaction and validation compactions running
>>> concurrently from time to time.
>>>
>>> One last thing: can you check whether you have particularly big
>>> partitions in the CFs that fail to get repaired? You can run
>>> nodetool cfhistograms to check that.
>>>
>>> Cheers,
>>>
>>> On Thu, Oct 27, 2016 at 5:24 PM, Vincent Rischmann
>>> <m...@vrischmann.me> wrote:
>>>> Thanks for the response.
>>>>
>>>> We do break up repairs between tables, and we also tried our best
>>>> to have no overlap between repair runs. Each repair has 10000
>>>> segments (a purely arbitrary number that seemed to help at the
>>>> time). Some runs have an intensity of 0.4, some as low as 0.05.
>>>>
>>>> Still, sometimes one particular app (which does a lot of
>>>> read/modify/write batches at quorum) gets slowed down to the
>>>> point where we have to stop the repair run.
>>>>
>>>> More annoyingly, for the last 2 to 3 weeks, as I said, runs stop
>>>> progressing after some time. Every time I restart Reaper it
>>>> repairs correctly again, up until it gets stuck. I have no idea
>>>> why that happens now, but it means I have to babysit Reaper, and
>>>> it's becoming annoying.
>>>>
>>>> Thanks for the suggestion about incremental repairs. It would
>>>> probably be a good thing, but it's a little challenging to set
>>>> up, I think. Right now a full repair of all keyspaces (via
>>>> nodetool repair) would take a lot of time, probably 5 days or
>>>> more; we were never able to run one to completion. I'm not sure
>>>> it's a good idea to disable autocompaction for that long.
>>>>
>>>> But maybe I'm wrong. Is it possible to use incremental repairs on
>>>> some column families only?
>>>>
>>>> On Thu, Oct 27, 2016, at 05:02 PM, Alexander Dejanovski wrote:
>>>>> Hi Vincent,
>>>>>
>>>>> most people handle repair with:
>>>>> - pain (running nodetool commands by hand)
>>>>> - cassandra_range_repair:
>>>>>   https://github.com/BrianGallew/cassandra_range_repair
>>>>> - Spotify Reaper
>>>>> - and the OpsCenter repair service for DSE users
>>>>>
>>>>> Reaper is a good option, I think, and you should stick with it.
>>>>> If it cannot do the job here, then no other tool will.
>>>>>
>>>>> You have several options from here:
>>>>> * Try to break up your repairs table by table and see which ones
>>>>>   actually get stuck
>>>>> * Check your logs for any repair/streaming errors
>>>>> * Avoid repairing everything:
>>>>>   * you may have expendable tables
>>>>>   * you may have TTL-only tables with no deletes, accessed at
>>>>>     QUORUM CL only
>>>>> * Try to relieve repair pressure in Reaper by lowering the
>>>>>   repair intensity (on the tables that get stuck)
>>>>> * Try adding steps to your repair process by setting a higher
>>>>>   segment count in Reaper (on the tables that get stuck)
>>>>> * And lastly, you can turn to incremental repair. As you're
>>>>>   familiar with Reaper already, you might want to take a look at
>>>>>   our Reaper fork that handles incremental repair:
>>>>>   https://github.com/thelastpickle/cassandra-reaper
>>>>>   If you go down that road, make sure you first mark all
>>>>>   sstables as repaired before you run your first incremental
>>>>>   repair, otherwise you'll end up in anticompaction hell (a bad,
>>>>>   bad place):
>>>>>   https://docs.datastax.com/en/cassandra/2.1/cassandra/operations/opsRepairNodesMigration.html
>>>>>   Even if people say that's not necessary anymore, it'll save
>>>>>   you from a very bad first experience with incremental repair.
>>>>>   Furthermore, make sure you run repair daily after your first
>>>>>   incremental repair run, in order to keep each run small.
>>>>>
>>>>> Cheers,
>>>>>
>>>>> On Thu, Oct 27, 2016 at 4:27 PM, Vincent Rischmann
>>>>> <m...@vrischmann.me> wrote:
>>>>>> Hi,
>>>>>>
>>>>>> we have two Cassandra 2.1.15 clusters at work and are having
>>>>>> some trouble with repairs.
>>>>>>
>>>>>> Each cluster has 9 nodes, and the amount of data is not
>>>>>> gigantic, but some column families have 300+ GB of data.
>>>>>> We tried to use `nodetool repair` for these tables, but when we
>>>>>> tested it, it put too much load on the whole cluster and
>>>>>> impacted our production apps.
>>>>>>
>>>>>> Next we saw https://github.com/spotify/cassandra-reaper , tried
>>>>>> it, and had some success until recently. For the last 2 to 3
>>>>>> weeks it has never completed a repair run, somehow deadlocking
>>>>>> itself.
>>>>>>
>>>>>> I know DSE includes a repair service, but I'm wondering: how do
>>>>>> other Cassandra users manage repairs?
>>>>>>
>>>>>> Vincent.
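For the migration step discussed above (marking all sstables as
repaired before the first incremental run), a rough sketch using the
sstablerepairedset tool that ships with Cassandra; the data path is a
placeholder, and the node must be stopped while the tool runs:

    # With the node stopped, mark every sstable in the keyspace as repaired.
    find /var/lib/cassandra/data/my_keyspace -name "*-Data.db" \
        | xargs sstablerepairedset --really-set --is-repaired

    # Spot-check one sstable afterwards: "Repaired at" should be non-zero.
    sstablemetadata $(find /var/lib/cassandra/data/my_keyspace -name "*-Data.db" | head -1) \
        | grep "Repaired at"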
>>>>> --
>>>>> -----------------
>>>>> Alexander Dejanovski
>>>>> France
>>>>> @alexanderdeja
>>>>>
>>>>> Consultant
>>>>> Apache Cassandra Consulting
>>>>> http://www.thelastpickle.com
>>>
>>> --
>>> -----------------
>>> Alexander Dejanovski
>>> France
>>> @alexanderdeja
>>>
>>> Consultant
>>> Apache Cassandra Consulting
>>> http://www.thelastpickle.com
>
> --
> -----------------
> Alexander Dejanovski
> France
> @alexanderdeja
>
> Consultant
> Apache Cassandra Consulting
> http://www.thelastpickle.com