Hi Rahul,

I cannot afford to delete and then load, as this would create downtime for the records. That's why I'm upserting with a TTL of today() + 7 days, as I mentioned in my original question. At the moment I don't have an issue with either loading or access times. My question is: should I repair such a table or not, and if yes, before the load or after it (or does it not matter)?

Thanks,
Maxim.

On Sun, Aug 19, 2018 at 8:52 AM Rahul Singh <rahul.xavier.si...@gmail.com> wrote:

> If you wanted to be certain that all replicas were acknowledging receipt
> of the data, then you could use ALL or EACH_QUORUM (if you have multiple
> DCs), but you must really want high consistency if you do that.
>
> You should avoid consciously creating tombstones if possible: it ends up
> making reads slower because they need to be accounted for until they are
> compacted / garbage collected out.
>
> Tombstones are created when data is either deleted or nulled. When
> marking data with a TTL, the actual delete is not done until after the TTL
> has expired.
>
> When you say you are overwriting, are you deleting and then loading?
> That’s the only way you should see tombstones, or maybe you are setting
> nulls?
>
> Rahul
> On Aug 18, 2018, 11:16 PM -0700, Maxim Parkachov <lazy.gop...@gmail.com>,
> wrote:
>
> Hi Rahul,
>
> I'm already using LOCAL_QUORUM in the batch process and it runs every day. As
> far as I understand, because I'm overwriting the whole table with a new TTL,
> the process creates tons of tombstones and I'm more concerned with them.
>
> Regards,
> Maxim.
>
> On Sun, Aug 19, 2018 at 3:02 AM Rahul Singh <rahul.xavier.si...@gmail.com>
> wrote:
>
>> Are you loading using a batch process? What’s the frequency of the data
>> ingest, and does it have to be very fast? If it is not too frequent and can
>> be a little slower, you may consider a higher consistency level to ensure
>> data is on replicas.
>>
>> Rahul
>> On Aug 18, 2018, 2:29 AM -0700, Maxim Parkachov <lazy.gop...@gmail.com>,
>> wrote:
>>
>> Hi community,
>>
>> I'm currently puzzled by the following challenge. I have a CF with a 7-day
>> TTL on all rows. Daily there is a process which loads actual data with a +7
>> days TTL. Thus records which are not present in the last 7 days of loads
>> expire. The number of these expired records is very small (< 1%). I have a
>> daily repair process, which takes a considerable amount of time and
>> resources, and a snapshot after that. Obviously I'm concerned only with the
>> last loaded data. Basically, my question: should I run repair before load,
>> after load, or maybe I don't need to repair such a table at all?
>>
>> Regards,
>> Maxim.
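
For reference, the load pattern described in the original question (a daily upsert that resets a 7-day TTL, so that rows missing from the loads eventually expire) can be sketched in Python. This is only an illustration under stated assumptions: the table name `ks.daily_data` and its columns are hypothetical, and `TtlTable` is a toy in-memory model of TTL expiry, not Cassandra and not any driver API. The only real CQL feature used is the `USING TTL` clause of `INSERT`.

```python
from datetime import datetime, timedelta, timezone

# 7 days expressed in seconds, matching the TTL discussed in the thread.
TTL_SECONDS = 7 * 24 * 60 * 60


def upsert_statement(table, columns, ttl_seconds=TTL_SECONDS):
    """Build a parameterised CQL upsert that sets a TTL on the written cells."""
    cols = ", ".join(columns)
    marks = ", ".join("?" for _ in columns)
    return f"INSERT INTO {table} ({cols}) VALUES ({marks}) USING TTL {ttl_seconds}"


class TtlTable:
    """Toy in-memory model: key -> (value, expiry time). Not Cassandra."""

    def __init__(self):
        self._rows = {}

    def upsert(self, key, value, now, ttl_seconds=TTL_SECONDS):
        # Re-writing a key replaces its value and pushes its expiry forward.
        self._rows[key] = (value, now + timedelta(seconds=ttl_seconds))

    def live_keys(self, now):
        # A row is visible only while its expiry lies in the future.
        return sorted(k for k, (_, expires) in self._rows.items() if expires > now)


def simulate(days=8):
    """Load once per day; key 'c' appears only in the first load."""
    start = datetime(2018, 8, 18, tzinfo=timezone.utc)
    table = TtlTable()
    for day in range(days):
        now = start + timedelta(days=day)
        keys = ["a", "b", "c"] if day == 0 else ["a", "b"]
        for key in keys:
            table.upsert(key, f"load-{day}", now)
    # Inspect the table one day after the final load.
    return table.live_keys(start + timedelta(days=days))


if __name__ == "__main__":
    # Hypothetical table and column names, for illustration only.
    print(upsert_statement("ks.daily_data", ["id", "payload"]))
    print(simulate())
```

In this model, keys that are re-upserted every day never expire, while a key that drops out of the daily loads disappears once its last TTL runs out, which is the behaviour the original question relies on instead of explicit deletes.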