Thanks Alexander. Will look into all these.
On Thu, Sep 29, 2016 at 4:39 PM, Alexander Dejanovski <a...@thelastpickle.com> wrote:

> Atul,
>
> Since you're using 3.6, by default you're running incremental repair,
> which doesn't like concurrency very much.
> Validation errors don't occur on a partition or partition-range basis,
> but when you try to run both anticompaction and validation compaction
> on the same SSTable.
>
> As advised to Robert yesterday, if you want to keep running incremental
> repair, I'd suggest the following:
>
> - Run "nodetool tpstats" on all nodes in search of running/pending
>   repair sessions.
> - If you have some, and to be sure you avoid conflicts, rolling-restart
>   your cluster (all nodes).
> - Then run "nodetool repair" on one node.
> - When repair has finished on this node (track messages in the log and
>   nodetool tpstats), check whether other nodes are running anticompactions.
> - If so, wait until they are over.
> - If not, move on to the next node.
>
> You should be able to run concurrent incremental repairs on different
> tables if you wish to speed up the complete repair of the cluster, but do
> not try to repair the same table/full keyspace from two nodes at the same
> time.
>
> If you do not want to keep using incremental repair and want to fall back
> to classic full repair, I think the only way in 3.6 to avoid anticompaction
> is to use subrange repair (Paulo mentioned that in 3.x full repair also
> triggers anticompaction).
>
> You have two options here: cassandra_range_repair
> (https://github.com/BrianGallew/cassandra_range_repair) and Spotify Reaper
> (https://github.com/spotify/cassandra-reaper).
>
> cassandra_range_repair might complain about subrange + incremental not
> being compatible (not sure here), but you can modify the repair_range()
> method by adding a --full switch to the command line used to run repair.
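[Editor's note: the subrange + full repair approach above can be sketched roughly as follows. This is a hypothetical illustration, not the actual cassandra_range_repair code; the keyspace name and step count are made up, and only the `-st`/`-et`/`--full` nodetool flags come from the thread.]

```python
# Hypothetical sketch: split the full Murmur3 token ring into contiguous
# subranges and run a full (non-incremental) repair on each one, which
# avoids anticompaction on Cassandra 3.x.
import subprocess

MIN_TOKEN = -(2 ** 63)       # Murmur3Partitioner minimum token
MAX_TOKEN = 2 ** 63 - 1      # Murmur3Partitioner maximum token


def split_token_ring(steps):
    """Split [MIN_TOKEN, MAX_TOKEN] into `steps` contiguous subranges."""
    width = (MAX_TOKEN - MIN_TOKEN) // steps
    ranges = []
    start = MIN_TOKEN
    for i in range(steps):
        # Last subrange absorbs any rounding remainder.
        end = MAX_TOKEN if i == steps - 1 else start + width
        ranges.append((start, end))
        start = end
    return ranges


def repair_range(keyspace, start, end, dry_run=True):
    """Run a full repair on one token subrange via nodetool."""
    cmd = ["nodetool", "repair", "-st", str(start), "-et", str(end),
           "--full", keyspace]
    if dry_run:
        print(" ".join(cmd))       # show the command instead of running it
    else:
        subprocess.check_call(cmd)


if __name__ == "__main__":
    for st, et in split_token_ring(4):          # step count is arbitrary here
        repair_range("my_keyspace", st, et)     # hypothetical keyspace name
```

In practice a tool like cassandra_range_repair derives the subranges from each node's actual token ownership rather than splitting the whole ring evenly; the even split above just keeps the sketch self-contained.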
>
> We have a fork of Reaper that handles both full subrange repair and
> incremental repair here: https://github.com/thelastpickle/cassandra-reaper
> It comes with a tweaked version of the UI made by Stephan Podkowinski
> (https://github.com/spodkowinski/cassandra-reaper-ui) - which eases
> interactions to schedule, run, and track repairs - and adds fields to run
> incremental repair (accessible via ...:8080/webui/ in your browser).
>
> Cheers,
>
>
> On Thu, Sep 29, 2016 at 12:33 PM Atul Saroha <atul.sar...@snapdeal.com>
> wrote:
>
>> Hi,
>>
>> We are not sure whether this issue is linked to that node or not. Our
>> application does frequent deletes and inserts.
>>
>> Maybe our approach to nodetool repair is not correct. Yes, we generally
>> fire repair on all boxes at the same time. Until now it was manual, with
>> the default configuration (command: "nodetool repair").
>> Yes, we saw a validation error, but that was linked to an already running
>> repair of the same partition range on another box. The error said
>> validation failed, with some IP, as repair was already running for the
>> same SSTable.
>> Just a few days back, we had 2 DCs with 3 nodes each, and the replication
>> factor was also 3. That means all data was on each node.
>>
>> On Thu, Sep 29, 2016 at 2:49 PM, Alexander Dejanovski <
>> a...@thelastpickle.com> wrote:
>>
>>> Hi Atul,
>>>
>>> Could you be more specific on how you are running repair? What's the
>>> precise command line for that, does it run on several nodes at the same
>>> time, etc.?
>>> What is your gc_grace_seconds?
>>> Do you see errors in your logs that would be linked to repairs
>>> (validation failure or failure to create a Merkle tree)?
>>>
>>> You mention a single node that went down, but say the whole cluster
>>> seems to have zombie data.
>>> What is the connection you see between the node that went down and the
>>> fact that deleted data comes back to life?
>>> What is your strategy for cyclic maintenance repair (schedule, command
>>> line or tool, etc.)?
>>>
>>> Thanks,
>>>
>>> On Thu, Sep 29, 2016 at 10:40 AM Atul Saroha <atul.sar...@snapdeal.com>
>>> wrote:
>>>
>>>> Hi,
>>>>
>>>> We have seen weird behaviour in Cassandra 3.6.
>>>> One of our nodes was down for more than 10 hours. After that, we ran
>>>> nodetool repair multiple times, but tombstones are not getting synced
>>>> properly across the cluster. On a day-to-day basis, at the expiry of
>>>> every grace period, deleted records start surfacing again in Cassandra.
>>>>
>>>> It seems nodetool repair is not syncing tombstones across the cluster.
>>>> FYI, we now have 3 data centres.
>>>>
>>>> We just want help on how to verify and debug this issue. Help will be
>>>> appreciated.
>>>>
>>>> --
>>>> Regards,
>>>> Atul Saroha
>>>>
>>>> Lead Software Engineer | CAMS
>>>>
>>>> M: +91 8447784271
>>>> Plot #362, ASF Center - Tower A, 1st Floor, Sec-18,
>>>> Udyog Vihar Phase IV, Gurgaon, Haryana, India
>>>>
>>> --
>>> -----------------
>>> Alexander Dejanovski
>>> France
>>> @alexanderdeja
>>>
>>> Consultant
>>> Apache Cassandra Consulting
>>> http://www.thelastpickle.com
>>
>

--
Regards,
Atul Saroha

Lead Software Engineer | CAMS

M: +91 8447784271
Plot #362, ASF Center - Tower A, 1st Floor, Sec-18,
Udyog Vihar Phase IV, Gurgaon, Haryana, India