Can I set gc_grace_seconds to 0 in this case? Reappearing deleted data has 
no impact on my business logic; I'm only either creating a new row or 
replacing exactly the same row.
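
To be concrete, the change I have in mind is just the per-table option 
(a sketch, using the table from my schema below):

    ALTER TABLE distinct WITH gc_grace_seconds = 0;
    -- with 0, tombstones are purgeable at the very next compaction, so any
    -- replica that missed a delete could bring that row back later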









---- On Wed, 13 Jun 2018 03:41:51 +0430 Elliott Sims 
<elli...@backblaze.com> wrote ----




If this is data that expires after a certain amount of time, you probably want 
to look into using TWCS and TTLs to minimize the number of tombstones.
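
For example, something along these lines (just a sketch; the one-hour 
window and one-day TTL are placeholders to adapt to your retention):

    ALTER TABLE distinct
      WITH compaction = {
        'class': 'TimeWindowCompactionStrategy',
        'compaction_window_unit': 'HOURS',
        'compaction_window_size': 1 }
      AND default_time_to_live = 86400;   -- e.g. keep rows for one day

Once every cell in an SSTable has expired, TWCS can drop the whole file 
instead of compacting tombstones row by row.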


Decreasing gc_grace_seconds then compacting will reduce the number of 
tombstones, but at the cost of potentially resurrecting deleted data if the 
table hasn't been repaired during the grace interval.  You can also just 
increase the tombstone thresholds, but the queries will be pretty 
expensive/wasteful.
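
For reference, those thresholds live in cassandra.yaml (defaults shown):

    # cassandra.yaml -- raising these only hides the cost: every query
    # still has to read past all of those tombstones
    tombstone_warn_threshold: 1000
    tombstone_failure_threshold: 100000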




On Tue, Jun 12, 2018 at 2:02 AM, onmstester onmstester 
<onmstes...@zoho.com> wrote:








Hi,

I need to save a distinct value per key for each hour. The problem with 
saving everything and computing distincts in memory is that there is too 
much repeated data.

Table schema:

    CREATE TABLE distinct (
        hourNumber int,
        key text,
        distinctValue bigint,            -- CQL's 64-bit integer type
        PRIMARY KEY (hourNumber, key)    -- one partition per hour, one row per key
    );



I want to retrieve the distinct count of all keys in a specific hour, and 
with this data model that is achieved by reading a single partition.
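
That is, a single-partition read, something like:

    SELECT COUNT(*) FROM distinct WHERE hourNumber = ?;
    -- one row per key in the hour's partition, so this is the distinct count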

The problem: I can't read from this table; system.log shows that more than 
100K tombstones were read with no live data among them. gc_grace_seconds is 
at the default (10 days), so I thought of decreasing it to 1 hour and 
running a compaction, but is this the right approach at all? I mean, is the 
whole idea sound: replacing some millions of rows, each about 10 times per 
partition, again and again, creating a lot of tombstones just to achieve 
distinct behavior?
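
The write repeated for every incoming record is essentially this upsert 
(a sketch):

    INSERT INTO distinct (hourNumber, key, distinctValue) VALUES (?, ?, ?);
    -- same (hourNumber, key) overwritten ~10 times; tombstones here would
    -- come from TTLs or null columns, since plain non-null overwrites
    -- only create new cell versions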



Thanks in advance












