Hi,

In a distributed system such as Cassandra, things can happen (node down,
stop-the-world GC, hardware issue, ...) that desynchronize replicas. Isn't
repair also needed, at least once a week or once a month, to keep replicas
up to date? It is a strong and reliable process to keep things synced,
isn't it?

I know that read repair and hinted handoff are also there to handle this
kind of issue, but they might fail (I saw a lot of errors in the logs
about hints not being delivered, some people even disable them, and read
repair is often configured to trigger on only 10% of the reads).
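
To put a rough number on the read repair point, here is a minimal sketch. It assumes each read independently triggers a read repair with a fixed probability (10% here, matching the setting mentioned above) and ignores hints and anti-entropy repair entirely. The function name and the independence assumption are illustrative, not something Cassandra exposes.

```python
def prob_never_read_repaired(reads: int, chance: float = 0.10) -> float:
    """Probability that none of `reads` reads triggers a read repair.

    Assumes (for illustration only) that every read triggers a repair
    independently with probability `chance`.
    """
    if reads < 0 or not 0.0 <= chance <= 1.0:
        raise ValueError("reads must be >= 0 and chance within [0, 1]")
    return (1.0 - chance) ** reads


if __name__ == "__main__":
    for n in (1, 10, 50):
        print(f"{n:>3} reads: {prob_never_read_repaired(n):.3f}")
```

Under these assumptions, a row that is read only a few times has a good chance of never being touched by read repair, which is why it cannot be the only synchronization mechanism.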


2014-01-28 14:53 GMT+01:00 Sylvain Lebresne <sylv...@datastax.com>:

>
>> I have actually set up one of our application streams such that the same
>> key is only overwritten with a monotonically increasing ttl.
>>
>> For example, a breaking news item might have an initial ttl of 60
>> seconds, followed in 45 seconds by an update with a ttl of 3000 seconds,
>> followed by an 'ignore me' update in 600 seconds with a ttl of 30 days (our
>> maximum ttl) when the article is published.
>>
>> My understanding is that this case fits the criteria and no 'periodic
>> repair' is needed.
>>
>
> That's correct. The real criterion for not needing repair if you do no
> deletes but only TTL is "update only with monotonically increasing (not
> necessarily strictly) ttl". Always setting the same TTL is just a special
> case of that, but it's the most commonly used one I think, so I tend to
> simplify it to that case.
>
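
The criterion above can be checked mechanically against a write log. The following is a minimal sketch, assuming writes are available as (key, ttl) pairs in write order; the function name and the sample keys are hypothetical, and the TTL values come from the breaking news example quoted above.

```python
from typing import Dict, Iterable, List, Tuple


def keys_with_decreasing_ttl(writes: Iterable[Tuple[str, int]]) -> List[str]:
    """Return keys whose TTL ever drops below the previous TTL for that key.

    `writes` is a sequence of (key, ttl_seconds) pairs in write order.
    Equal TTLs are allowed (monotonic, not strictly increasing).
    """
    last_ttl: Dict[str, int] = {}
    offenders: List[str] = []
    for key, ttl in writes:
        previous = last_ttl.get(key)
        if previous is not None and ttl < previous and key not in offenders:
            offenders.append(key)
        last_ttl[key] = ttl
    return offenders


if __name__ == "__main__":
    THIRTY_DAYS = 30 * 24 * 3600
    writes = [
        ("news:1", 60), ("news:1", 3000), ("news:1", THIRTY_DAYS),  # fine
        ("news:2", 3000), ("news:2", 60),  # shorter TTL overwrites longer
    ]
    print(keys_with_decreasing_ttl(writes))  # ['news:2']
```

Any key reported by this check falls into the "variable ttl" case, where periodic repair may still be needed.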
>
>>
>> I guess another thing I would point out that is easy to miss or forget
>> (if you are a newish user like me), is that ttl's are fine-grained, by
>> column. So we are talking 'fixed' or 'variable' by individual column, not
>> by table. Which means, in my case, that ttl's can vary widely across a
>> table, but as long as I constrain them by key value to be fixed or
>> monotonically increasing, it fits the criteria.
>>
>
> We're talking monotonically increasing ttl "for a given primary key" if
> we're talking about the CQL language and "for a given column" if we're
> talking about the thrift one. Not "by table".
>
> --
> Sylvain
>
>
>
>>
>> Cheers,
>>
>> Michael
>>
>>
>> On Tue, Jan 28, 2014 at 4:18 AM, Sylvain Lebresne
>> <sylv...@datastax.com> wrote:
>>
>>> On Tue, Jan 28, 2014 at 1:05 AM, Edward Capriolo
>>> <edlinuxg...@gmail.com> wrote:
>>>
>>>> If you have only ttl columns, and you never update the column, I would
>>>> not think you need a repair.
>>>>
>>>
>>> Right, no deletes and no updates is Michael's case 1., for which I
>>> think we all agree 'periodic repair to avoid resurrected columns' is not
>>> required.
>>>
>>>
>>>>
>>>> Repair cures lost deletes. If all your writes have a ttl, a lost write
>>>> should not matter since the column was never written to the node and thus
>>>> could never be resurrected on said node.
>>>>
>>>
>>> I'm sure we're all in agreement here, but for the record, this is only
>>> true if you have no updates (overwrites) and/or if all writes have the
>>> *same* ttl. Because in the general case, a column with a relatively short
>>> TTL is basically very close to a delete, while a column with a long TTL is
>>> very close to one that has no TTL. If the former column (with short TTL)
>>> overwrites the latter one (with long TTL), and if one node misses the
>>> overwrite, that node could resurrect the column with the longer TTL (until
>>> that column expires, that is). Hence the separation of the case 2. (fixed
>>> ttl, no repair needed) and 2.a. (variable ttl, repair may be needed).
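
The resurrection scenario described above can be reproduced with a toy model. This is a sketch under stated assumptions, not Cassandra code: the `Cell`, `Replica` and `coordinator_read` names are hypothetical, each replica holds a single cell, the newest timestamp wins, and an expired cell is assumed to be purged exactly gc_grace seconds after it expires.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Cell:
    value: str
    written_at: int  # write timestamp, in seconds
    ttl: int         # time to live, in seconds

    def expired(self, now: int) -> bool:
        return now >= self.written_at + self.ttl


class Replica:
    def __init__(self, gc_grace: int) -> None:
        self.gc_grace = gc_grace
        self.cell: Optional[Cell] = None

    def write(self, cell: Cell) -> None:
        # Last write wins, by timestamp.
        if self.cell is None or cell.written_at > self.cell.written_at:
            self.cell = cell

    def read(self, now: int) -> Optional[Cell]:
        # An expired cell acts as a tombstone until gc_grace has passed,
        # after which this model assumes it has been compacted away.
        c = self.cell
        if c is not None and now >= c.written_at + c.ttl + self.gc_grace:
            self.cell = None
        return self.cell


def coordinator_read(replicas: List[Replica], now: int) -> Optional[str]:
    # Reconcile replica answers: the newest cell wins, expired means no data.
    cells = [c for c in (r.read(now) for r in replicas) if c is not None]
    if not cells:
        return None
    newest = max(cells, key=lambda c: c.written_at)
    return None if newest.expired(now) else newest.value


def demo() -> List[Optional[str]]:
    a, b = Replica(gc_grace=100), Replica(gc_grace=100)
    long_lived = Cell("old", written_at=0, ttl=10_000)
    short_lived = Cell("new", written_at=10, ttl=60)
    a.write(long_lived)
    b.write(long_lived)
    a.write(short_lived)  # replica b misses this overwrite
    return [coordinator_read([a, b], now) for now in (50, 100, 200)]


if __name__ == "__main__":
    print(demo())  # ['new', None, 'old']
```

In this model the read at t=200 returns "old": the short-TTL cell has expired and been purged on replica a, so nothing shadows the long-TTL cell that replica b still holds. A repair run before the purge would have copied the overwrite to replica b.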
>>>
>>> --
>>> Sylvain
>>>
>>>
>>>>
>>>> Unless I am missing something.
>>>>
>>>> On Monday, January 27, 2014, Laing, Michael <michael.la...@nytimes.com>
>>>> wrote:
>>>> > Thanks Sylvain,
>>>> > Your assumption is correct!
>>>> > So I think I actually have 4 classes:
>>>> > 1.    Regular values, no deletes, no overwrites, write heavy,
>>>> variable ttl's to manage size
>>>> > 2.    Regular values, no deletes, some overwrites, read heavy (10 to
>>>> 1), fixed ttl's to manage size
>>>> > 2.a. Regular values, no deletes, some overwrites, read heavy (10 to
>>>> 1), variable ttl's to manage size
>>>> > 3.    Counter values, no deletes, update heavy, rotation/truncation
>>>> to manage size
>>>> > Only 2.a. above requires me to do 'periodic repair'.
>>>> > What I will actually do is change my schema and applications slightly
>>>> to eliminate the need for overwrites on the only table I have in that
>>>> category.
>>>> > And I will set gc_grace_seconds to 0 for the tables in the updated
>>>> schema and drop 'periodic repair' from the schedule.
>>>> > Cheers,
>>>> > Michael
>>>> >
>>>> >
>>>> > On Mon, Jan 27, 2014 at 4:22 AM, Sylvain Lebresne <
>>>> sylv...@datastax.com> wrote:
>>>> >>
>>>> >> By periodic repair, I'll assume you mean "having to run repair every
>>>> gc_grace period to make sure no deleted entries resurrect". With that
>>>> assumption:
>>>> >>
>>>> >>>
>>>> >>> 1. Regular values, no deletes, no overwrites, write heavy, ttl's to
>>>> manage size
>>>> >>
>>>> >> Since 'repair within gc_grace' is about preventing values that have
>>>> been deleted from resurrecting, if you do no deletes or overwrites, you're
>>>> at no risk of that (and don't need to 'repair within gc_grace').
>>>> >>
>>>> >>>
>>>> >>> 2. Regular values, no deletes, some overwrites, read heavy (10 to
>>>> 1), ttl's to manage size
>>>> >>
>>>> >> It depends a bit. In general, if you always set the exact same TTL
>>>> on every insert (implying you always set a TTL), then you have nothing to
>>>> worry about. If the TTL varies (or if you only set a TTL some of the time),
>>>> then you might still need to have some periodic repairs. That being said,
>>>> if there are no deletes but only TTLs, then the TTL kind of lengthens the
>>>> period at which you need to do repair: instead of needing to repair within
>>>> gc_grace, you only need to repair every gc_grace + min(TTL) (where min(TTL)
>>>> is the smallest TTL you set on columns).
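
As a worked example of the gc_grace + min(TTL) rule above, here is a minimal sketch. The function name is hypothetical, the 864000 second value is the default gc_grace_seconds (10 days), and the rule only applies to tables with no explicit deletes.

```python
from typing import Iterable

DEFAULT_GC_GRACE_SECONDS = 864_000  # 10 days, the default gc_grace_seconds


def max_repair_interval(ttls: Iterable[int],
                        gc_grace_seconds: int = DEFAULT_GC_GRACE_SECONDS) -> int:
    """Longest allowed gap between repairs, in seconds, per the rule above.

    Only meaningful when the table sees no explicit deletes, just TTLs.
    """
    ttl_list = list(ttls)
    if not ttl_list:
        raise ValueError("need at least one TTL")
    return gc_grace_seconds + min(ttl_list)


if __name__ == "__main__":
    # TTLs from the breaking news example: 60 s, 3000 s and 30 days.
    interval = max_repair_interval([60, 3000, 30 * 24 * 3600])
    print(interval, "seconds")
```

With a smallest TTL of 60 seconds the gain over plain gc_grace is negligible; the extension only matters when even the shortest TTL is long.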
>>>> >>>
>>>> >>> 3. Counter values, no deletes, update heavy, rotation/truncation to
>>>> manage size
>>>> >>
>>>> >> No deletes and no TTL implies that you're fine (as in, there is no
>>>> need for 'repair within gc_grace').
>>>> >>
>>>> >> --
>>>> >> Sylvain
>>>> >
>>>>
>>>> --
>>>> Sorry this was sent from mobile. Will do less grammar and spell check
>>>> than usual.
>>>>
>>>
>>>
>>
>
