Okay, so I'm positively going crazy :)

Increasing gc_grace, running repair, then decreasing gc_grace didn't help. The
columns still reappear after the repair. I checked in cassandra-cli and the
timestamps on these columns are old, not in the future, so that shouldn't be
the reason.
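
(For reference, the timestamp check was roughly the following, shown here in
cqlsh rather than cassandra-cli; "some_column" and the "key" column name are
placeholders for the real schema, and the row key is the one from the log
entry below:

    SELECT some_column, WRITETIME(some_column)
      FROM blackbook.bounces
     WHERE key = '4ed558feba8a483733001d6a';

The WRITETIME values are microseconds since the epoch, and all of them are
well in the past.)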

I also ran a test: I updated one of the columns and it was indeed updated.
Then I deleted it (and it was deleted), ran repair, and its "updated" version
reappeared again! Why won't these columns just go away? Is there any way
I can force their deletion permanently?
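
(In cqlsh/nodetool terms the test was roughly the following; again,
"some_column" and the "key" column name are placeholders:

    UPDATE blackbook.bounces SET some_column = 'new value'
     WHERE key = '4ed558feba8a483733001d6a';

    DELETE some_column FROM blackbook.bounces
     WHERE key = '4ed558feba8a483733001d6a';

    nodetool repair blackbook bounces

After the repair finishes, selecting that column returns the "updated" value
again.)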

I also see this log entry on the node I'm running repair on; it mentions
the row that contains the reappearing columns:

INFO [ValidationExecutor:77] 2015-03-25 20:27:43,936
CompactionController.java (line 192) Compacting large row
blackbook/bounces:4ed558feba8a483733001d6a (279067683 bytes) incrementally

Could this be related to the issue?


On Tue, Mar 24, 2015 at 11:00 AM, Roman Tkachenko <ro...@mailgunhq.com>
wrote:

> Well, as I mentioned in my original email, all machines running Cassandra
> are running NTP. This was one of the first things I verified, and I
> triple-checked that they all show the same time. Is this sufficient to
> ensure clocks are synced between the nodes?
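>
> (In case it helps to be concrete, by "show the same time" I mean roughly
> this on every node:
>
>     ntpq -p
>     date -u
>
> i.e. eyeballing that the ntpq offsets are small and the clocks agree, which
> admittedly only catches drift of a second or more by eye.)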
>
> I have increased gc_grace to 100 days for now and am running repair on
> the affected keyspace; it should be done today. In the meantime, if you
> (or anyone else) have any other ideas or suggestions on how to debug this,
> they're much appreciated.
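>
> (For the record, the change was along these lines, 100 days being 8640000
> seconds, with "bounces" standing in for each affected table:
>
>     ALTER TABLE blackbook.bounces WITH gc_grace_seconds = 8640000;
>
> followed by "nodetool repair blackbook".)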
>
> Thanks for your help!
>
> Roman
>
> On Tue, Mar 24, 2015 at 10:39 AM, Duncan Sands <duncan.sa...@gmail.com>
> wrote:
>
>> Hi Roman,
>>
>> On 24/03/15 18:05, Roman Tkachenko wrote:
>>
>>> Hi Duncan,
>>>
>>> Thanks for the response!
>>>
>>> I can try increasing gc_grace_seconds and running repair on all nodes.
>>> What doesn't make sense, though, is why all *new* deletes I issue (for
>>> the same column that resurrects after repair) are also forgotten after
>>> repair. Doesn't Cassandra insert a new tombstone every time a delete
>>> happens?
>>>
>>
>> it does.  Maybe the data you are trying to delete has a timestamp
>> (writetime) in the future, for example because clocks aren't synchronized
>> between your nodes.
>>
>>
>>> Also, how do I find out the value to set gc_grace_seconds to?
>>>
>>
>> It needs to be big enough that you are sure to repair your entire cluster
>> in less than that time.  For example, observe how long repairing the entire
>> cluster takes and multiply by 3 or 4 (in case a repair fails or is
>> interrupted one day).
>>
>> Once incremental repair is solid, maybe the whole gc_grace thing will
>> eventually go away, e.g. by modifying C* to only drop known repaired
>> tombstones.
>>
>> Ciao, Duncan.
>>
>>
>>> Thanks.
>>>
>>> On Tue, Mar 24, 2015 at 9:38 AM, Duncan Sands <duncan.sa...@gmail.com>
>>> wrote:
>>>
>>>     Hi Roman,
>>>
>>>     On 24/03/15 17:32, Roman Tkachenko wrote:
>>>
>>>         Hey guys,
>>>
>>>         Has anyone seen anything like this behavior or has an
>>> explanation for it? If
>>>         not, I think I'm gonna file a bug report.
>>>
>>>
>>>     this can happen if repair is run after the tombstone gc_grace_period
>>> has
>>>     expired.  I suggest you increase gc_grace_period.
>>>
>>>     Ciao, Duncan.
>>>
>>>
>>>
>>
>
