Queues can be implemented in Cassandra, even though everyone believes they are an "anti-pattern," if the schema is designed around Cassandra's model.
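One concrete version of designing around Cassandra's model is the soft-delete-plus-TTL approach described below. A minimal sketch in Python, where the table and column names (queue_items, valid, created_hour, bucket, item_id) and the three-day window are illustrative assumptions, not details from this thread:

```python
from datetime import timedelta

# Soft delete: rather than issuing a DELETE (which writes a tombstone
# right away), overwrite the row with an "invalid" marker and a TTL so
# the data expires on its own later.
SOFT_DELETE_TTL = int(timedelta(days=3).total_seconds())  # "3 days from now"

# Hypothetical CQL the application might issue; all names are made up.
soft_delete_cql = (
    f"UPDATE queue_items USING TTL {SOFT_DELETE_TTL} "
    "SET valid = false "
    "WHERE created_hour = ? AND bucket = ? AND item_id = ?"
)

# Readers select the partition as usual and skip rows where valid is
# false (client-side, to avoid needing an index on the marker column).
print(SOFT_DELETE_TTL)  # 259200
```

The point of the marker is that consumers never depend on row absence, so the eventual tombstone from TTL expiry is off the hot read path by the time it is written.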
In this case, I would do a logical / soft delete on the data to invalidate it from any query that accesses it, and put a TTL on the data so that it deletes automatically later. You could have a default TTL, or set a TTL on your actual "delete," which would schedule the delete in the future -- for example, 3 days from now.

Some sources of inspiration on how people have been doing queues on Cassandra:

- cherami by Uber
- CMB by Comcast
- cassieq -- don't remember

--
Rahul Singh
rahul.si...@anant.us
Anant Corporation

On Jun 19, 2018, 12:39 PM -0400, Durity, Sean R <sean_r_dur...@homedepot.com>, wrote:
> This sounds like a queue pattern, which is typically an anti-pattern for
> Cassandra. I would say that it is very difficult to get the access patterns,
> tombstones, and everything else lined up properly to solve a queue problem.
>
> Sean Durity
>
> From: Abhishek Singh <abh23...@gmail.com>
> Sent: Tuesday, June 19, 2018 10:41 AM
> To: user@cassandra.apache.org
> Subject: [EXTERNAL] Re: Tombstone
>
> The partition key is made of a datetime (basically the date truncated to the
> hour) and a bucket. I think your RCA may be correct, since we are deleting
> the partition's rows one by one, not in a batch, so the files may be
> overlapping for that particular partition. A scheduled thread picks the rows
> for a partition based on the current datetime and bucket number, checks
> whether each row's entry is past due, and if so triggers an event and removes
> the entry.
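The partition key scheme described above (date truncated to the hour, plus a bucket) can be sketched roughly as follows; the bucket count and the CRC32 hashing choice are assumptions for illustration, since the thread does not specify them:

```python
import zlib
from datetime import datetime

NUM_BUCKETS = 16  # assumed; the thread does not say how many buckets are used

def partition_key(event_time: datetime, item_id: str) -> tuple:
    """Hour-truncated datetime plus a bucket, as described in the thread.

    Bucketing spreads one hour's writes across NUM_BUCKETS partitions so
    that a single hot partition does not absorb the whole hour's entries.
    """
    hour = event_time.replace(minute=0, second=0, microsecond=0)
    # crc32 is stable across processes, unlike Python's built-in hash()
    bucket = zlib.crc32(item_id.encode()) % NUM_BUCKETS
    return (hour, bucket)

pk = partition_key(datetime(2018, 6, 19, 10, 41, 23), "order-42")
print(pk)  # (datetime(2018, 6, 19, 10, 0), <bucket in 0..15>)
```

A scheduled consumer can then enumerate the current hour's buckets deterministically, which matches the "picks the rows for a partition based on the current datetime and bucket number" behavior described in the thread.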
> > On Tue 19 Jun, 2018, 7:58 PM Jeff Jirsa, <jji...@gmail.com> wrote:
> > The most likely explanation is tombstones in files that won't be collected,
> > as they potentially overlap data in other files with a lower timestamp
> > (especially true if your partition key doesn't change and you're writing
> > and deleting data within a partition).
> >
> > --
> > Jeff Jirsa
> >
> > > On Jun 19, 2018, at 3:28 AM, Abhishek Singh <abh23...@gmail.com> wrote:
> > >
> > > Hi all,
> > > We are using Cassandra to store time-series events for batch processing.
> > > Once a particular hour-based batch is processed, we delete the entries,
> > > but we are left with almost 18% of the deletes still marked as tombstones.
> > > I ran a compaction on the particular CF; the tombstone count didn't
> > > come down.
> > > Can anyone suggest the optimal tuning / recommended practice for the
> > > compaction strategy and GC_grace period, with 100k entries written and
> > > deleted every hour?
> > >
> > > Warm Regards
> > > Abhishek Singh
> >
> > ---------------------------------------------------------------------
> > To unsubscribe, e-mail: user-unsubscr...@cassandra.apache.org
> > For additional commands, e-mail: user-h...@cassandra.apache.org
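On the GC_grace question at the end of the thread, a quick back-of-the-envelope calculation (assuming Cassandra's default gc_grace_seconds of 864000, i.e. 10 days) shows why tombstones from hourly deletes linger even after compaction runs:

```python
# Default gc_grace_seconds in Cassandra is 864000 (10 days).
GC_GRACE_SECONDS = 864_000
DELETE_CYCLE_SECONDS = 3_600      # the thread deletes a batch every hour
ENTRIES_PER_CYCLE = 100_000       # "100k entries and deletes every hour"

# Tombstones cannot be purged until gc_grace has elapsed, so at steady
# state this many delete cycles' worth of tombstones are always
# ineligible for purge, regardless of compaction strategy:
cycles_retained = GC_GRACE_SECONDS // DELETE_CYCLE_SECONDS
tombstones_retained = cycles_retained * ENTRIES_PER_CYCLE

print(cycles_retained)      # 240 hourly cycles
print(tombstones_retained)  # 24000000 tombstones retained at minimum
```

Lowering gc_grace_seconds is the usual lever, but only if repairs reliably complete within the shortened window; otherwise deleted data can resurrect.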