Hi Maxim.

Assuming all your update operations are successful and that you only delete 
data by TTL in that table, then you shouldn’t have to do repairs on it.

You may also consider lowering the gc_grace_seconds value on that table, but 
you should be aware of how this impacts hints and logged batches: 
https://docs.datastax.com/en/cql/3.3/cql/cql_reference/cqlCreateTable.html#tabProp__cqlTableGc_grace_seconds
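
For illustration, lowering the setting is a single ALTER TABLE statement. The
sketch below only builds the CQL string and does not talk to a cluster. The
keyspace and table names (my_ks, daily_refresh) and the one-day value are
placeholders chosen for the example, not a recommendation.

```python
def alter_gc_grace(keyspace: str, table: str, seconds: int) -> str:
    """Build the CQL statement that sets gc_grace_seconds on a table."""
    if seconds < 0:
        raise ValueError("gc_grace_seconds must be non-negative")
    return f"ALTER TABLE {keyspace}.{table} WITH gc_grace_seconds = {seconds};"


# Hypothetical names and value: one day instead of the default 864000 (10 days).
statement = alter_gc_grace("my_ks", "daily_refresh", 86400)
print(statement)
```

The resulting statement can be run from cqlsh or passed to whichever driver
you already use.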

/pelle

From: Maxim Parkachov <lazy.gop...@gmail.com>
Sent: 20 August 2018 08:29
To: user@cassandra.apache.org
Subject: Re: Repair daily refreshed table

Hi Rahul,

I cannot afford to delete and then load, as this would create downtime for the 
records. That's why I'm upserting with a TTL of today()+7 days, as I mentioned 
in my original question. At the moment I don't have an issue with either 
loading or access times. My question is: should I repair such a table or not, 
and if yes, before the load or after it (or does it not matter)?

Thanks,
Maxim.

On Sun, Aug 19, 2018 at 8:52 AM Rahul Singh 
<rahul.xavier.si...@gmail.com> wrote:
If you wanted to be certain that all replicas were acknowledging receipt of the 
data, then you could use ALL (or EACH_QUORUM, which requires a quorum in every 
DC, if you have multiple DCs), but you must really want high consistency if you 
do that.
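
To make the trade-off concrete, here is a small sketch of how many replica
acknowledgements a write needs at each of these levels. The quorum formula
(floor(RF / 2) + 1) is standard; the DC names and replication factors are
hypothetical values chosen for the example.

```python
def quorum(replication_factor: int) -> int:
    """Quorum size for a given replication factor: floor(RF / 2) + 1."""
    return replication_factor // 2 + 1


def required_acks(level: str, rf_per_dc: dict, local_dc: str) -> int:
    """Replica acknowledgements a write must collect before it succeeds."""
    if level == "LOCAL_QUORUM":
        return quorum(rf_per_dc[local_dc])  # quorum in the local DC only
    if level == "EACH_QUORUM":
        return sum(quorum(rf) for rf in rf_per_dc.values())  # quorum in every DC
    if level == "ALL":
        return sum(rf_per_dc.values())  # every replica in every DC
    raise ValueError(f"unsupported level in this sketch: {level}")


# Hypothetical topology: two data centers with RF 3 each.
topology = {"dc1": 3, "dc2": 3}
for level in ("LOCAL_QUORUM", "EACH_QUORUM", "ALL"):
    print(level, required_acks(level, topology, "dc1"))
```

With ALL, a single unavailable replica is enough to make the write fail, which
is why it only makes sense when consistency matters more than availability.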

You should avoid knowingly creating tombstones if possible: they end up 
making reads slower because they need to be accounted for until they are 
compacted / garbage collected out.

Tombstones are created when data is either deleted or set to null. When data is 
written with a TTL, the actual delete is not done until after the TTL has expired.

When you say you are overwriting, are you deleting and then loading? That’s the 
only way you should see tombstones — or maybe you are setting nulls?

Rahul
On Aug 18, 2018, 11:16 PM -0700, Maxim Parkachov 
<lazy.gop...@gmail.com>, wrote:
Hi Rahul,

I'm already using LOCAL_QUORUM in the batch process, and it runs every day. As 
far as I understand, because I'm overwriting the whole table with a new TTL, 
the process creates tons of tombstones, and I'm more concerned about them.

Regards,
Maxim.
On Sun, Aug 19, 2018 at 3:02 AM Rahul Singh 
<rahul.xavier.si...@gmail.com> wrote:
Are you loading using a batch process? What’s the frequency of the data ingest, 
and does it have to be very fast? If it is not too frequent and can be a little 
slower, you may consider a higher consistency level to ensure data is on the 
replicas.

Rahul
On Aug 18, 2018, 2:29 AM -0700, Maxim Parkachov 
<lazy.gop...@gmail.com>, wrote:
Hi community,

I'm currently puzzled by the following challenge. I have a CF with a 7-day TTL 
on all rows. Every day a process loads the current data with a +7 days TTL. 
Thus records that have not been present in any load for the last 7 days expire. 
The number of these expired records is very small, < 1%. I have a daily repair 
process, which takes a considerable amount of time and resources, and a 
snapshot after that. Obviously I'm concerned only with the most recently loaded 
data. Basically, my question is: should I run repair before the load, after the 
load, or maybe I don't need to repair such a table at all?
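
The load pattern described above can be modelled in a few lines. This is a
simplified sketch, not Cassandra code: time is counted in whole days, each
upsert resets the expiry of its key to load day + 7, and the keys ("a", "b")
and the helper name live_keys are made up for the example.

```python
TTL_DAYS = 7


def live_keys(daily_loads, today):
    """Return the keys still readable on day `today`.

    daily_loads[d] is the set of keys upserted on day d; every upsert
    overwrites the previous expiry of that key with d + TTL_DAYS.
    """
    expires_on = {}
    for day, keys in enumerate(daily_loads):
        if day > today:
            break
        for key in keys:
            expires_on[key] = day + TTL_DAYS
    return {key for key, expiry in expires_on.items() if expiry > today}


# Key "b" is loaded only on day 0; key "a" is refreshed every day.
loads = [{"a", "b"}] + [{"a"}] * 7
print(sorted(live_keys(loads, 6)))  # "b" is still within its TTL
print(sorted(live_keys(loads, 7)))  # "b" has expired without any explicit delete
```

In this model a key disappears only because it stopped being reloaded, which
is the "delete by TTL only" situation the question is about.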

Regards,
Maxim.
