

You wrote that during peak hours you only have a couple hundred inserts per 
node so now I’m not sure why the default settings wouldn’t have worked just 
fine.  I sense there is more to the story.  What else could explain those 


From: Rahul Reddy [mailto:rahulreddy1...@gmail.com] 
Sent: Saturday, February 23, 2019 5:56 PM
To: user@cassandra.apache.org
Subject: Re: Tombstones in memtable


Changing gcgs didn't help


CREATE KEYSPACE ksname WITH replication = {'class': 'NetworkTopologyStrategy', 
'dc1': '3', 'dc2': '3'}  AND durable_writes = true;



```
CREATE TABLE keyspace."table" (
    "column1" text PRIMARY KEY,
    "column2" text
) WITH bloom_filter_fp_chance = 0.01
    AND caching = {'keys': 'ALL', 'rows_per_partition': 'NONE'}
    AND comment = ''
    AND compaction = {'class': 
'max_threshold': '32', 'min_threshold': '4'}
    AND compression = {'chunk_length_in_kb': '64', 'class': 
    AND crc_check_chance = 1.0
    AND dclocal_read_repair_chance = 0.1
    AND default_time_to_live = 18000
    AND gc_grace_seconds = 60
    AND max_index_interval = 2048
    AND memtable_flush_period_in_ms = 0
    AND min_index_interval = 128
    AND read_repair_chance = 0.0
    AND speculative_retry = '99PERCENTILE';
```


Flushed the table and took an sstabledump:

grep -i '"expired" : true' SSTables.txt|wc -l


grep -i '"expired" : false'  SSTables.txt |wc -l
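
As a minimal sketch, the two counts above could be collected in one call. The function name `count_expired` is hypothetical; `SSTables.txt` is the dump file named above, and the search patterns are the same ones used in the grep commands.

```shell
#!/bin/sh
# Hypothetical helper: count expired vs. live entries in an sstabledump text file.
count_expired() {
    dump_file="$1"
    # grep -c prints the number of matching lines. It exits non-zero when the
    # count is 0, so "|| true" keeps the script going if "set -e" is in effect.
    expired=$(grep -ci '"expired" : true' "$dump_file" || true)
    live=$(grep -ci '"expired" : false' "$dump_file" || true)
    echo "expired=$expired live=$live"
}

# Only run against the dump if it is present in the current directory.
if [ -f SSTables.txt ]; then
    count_expired SSTables.txt
fi
```

A high expired count relative to the live count would suggest that most of what remains on disk is data past its TTL.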



ttl is 4 hours.
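
Since CQL expresses TTLs in seconds, a quick conversion shows how the 4-hour insert TTL relates to the table's `default_time_to_live = 18000`. This is only an arithmetic sketch; the helper name `hours_to_seconds` is hypothetical.

```shell
#!/bin/sh
# CQL TTL values are given in seconds; convert whole hours to seconds.
hours_to_seconds() {
    echo $(( $1 * 3600 ))
}

insert_ttl=$(hours_to_seconds 4)   # TTL bound on each insert
table_ttl=$(hours_to_seconds 5)    # corresponds to default_time_to_live = 18000
echo "insert TTL: ${insert_ttl}s, table default TTL: ${table_ttl}s"
```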


INSERT INTO keyspace."TABLE_NAME" ("column1", "column2") VALUES (?, ?) USING 
TTL ?;  -- TTL bound to 4 hours

SELECT * FROM keyspace."TABLE_NAME" WHERE "column1" = ?;


metric to scan tombstones 



During peak hours we have only a couple of hundred inserts and 5-8k reads/s per 
node.




            Read Count: 605231874
            Read Latency: 0.021268529760215503 ms.
            Write Count: 2763352
            Write Latency: 0.027924007871599422 ms.
            Pending Flushes: 0
                        Table: name
                        SSTable count: 1
                        Space used (live): 1413203
                        Space used (total): 1413203
                        Space used by snapshots (total): 0
                        Off heap memory used (total): 28813
                        SSTable Compression Ratio: 0.5015090954531143
                        Number of partitions (estimate): 19568
                        Memtable cell count: 573
                        Memtable data size: 22971
                        Memtable off heap memory used: 0
                        Memtable switch count: 6
                        Local read count: 529868919
                        Local read latency: 0.020 ms
                        Local write count: 2707371
                        Local write latency: 0.024 ms
                        Pending flushes: 0
                        Percent repaired: 0.0
                        Bloom filter false positives: 1
                        Bloom filter false ratio: 0.00000
                        Bloom filter space used: 23888
                        Bloom filter off heap memory used: 23880
                        Index summary off heap memory used: 4717
                        Compression metadata off heap memory used: 216
                        Compacted partition minimum bytes: 73
                        Compacted partition maximum bytes: 124
                        Compacted partition mean bytes: 99
                        Average live cells per slice (last five minutes): 1.0
                        Maximum live cells per slice (last five minutes): 1
                        Average tombstones per slice (last five minutes): 1.0
                        Maximum tombstones per slice (last five minutes): 1
                        Dropped Mutations: 0



Percentile  SSTables     Write Latency      Read Latency    Partition Size        Cell Count
                              (micros)          (micros)           (bytes)
50%             0.00             20.50             17.08                86
75%             0.00             24.60             20.50               124
95%             0.00             35.43             29.52               124
98%             0.00             35.43             42.51               124
99%             0.00             42.51             51.01               124
Min             0.00              8.24              5.72                73
Max             1.00             42.51            152.32               124



The cluster has 3 nodes in dc1 and 3 nodes in dc2, on AWS EC2 m4.xlarge instances.


On Sat, Feb 23, 2019, 7:47 PM Jeff Jirsa <jji...@gmail.com> wrote:

Would also be good to see your schema (anonymized if needed) and the select 
queries you’re running



Jeff Jirsa


On Feb 23, 2019, at 4:37 PM, Rahul Reddy <rahulreddy1...@gmail.com> wrote:

Thanks Jeff,


I have gcgs set to 10 minutes and have also changed the table TTL to 5 hours, 
compared with the insert TTL of 4 hours. With tracing on, the reads don't show 
any tombstone scans, and the log doesn't show tombstone scan alerts either. As 
the reads reach 5-8k per node during peak hours, it shows a tombstone scan 
count of 1M per read.


On Fri, Feb 22, 2019, 11:46 AM Jeff Jirsa <jji...@gmail.com> wrote:

If all of your data is TTL’d and you never explicitly delete a cell without 
using a TTL, you can probably drop your GCGS to 1 hour (or less).


Which compaction strategy are you using? You need a way to clear out those 
tombstones. There exist tombstone compaction sub properties that can help 
encourage compaction to grab sstables just because they’re full of tombstones 
which will probably help you.
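
As a rough illustration of the two suggestions above, the sketch below writes an ALTER TABLE statement to a file so it can be reviewed before anything is applied. Several things here are assumptions rather than facts from the thread: the compaction class (shown as SizeTieredCompactionStrategy), the table name `ksname."table"`, the output file name, and the `tombstone_threshold` and `tombstone_compaction_interval` values, which are placeholders and not tuned recommendations. The 1-hour `gc_grace_seconds` follows the figure mentioned above.

```shell
#!/bin/sh
# Sketch only: writes an ALTER TABLE statement to a file for review.
# Assumptions (not from the thread): the table uses SizeTieredCompactionStrategy,
# and the tombstone threshold/interval values are illustrative placeholders.
write_alter_cql() {
    out_file="$1"
    # The compaction map is replaced as a whole, so the existing
    # min/max thresholds are repeated alongside the tombstone sub-properties.
    cat > "$out_file" <<'EOF'
ALTER TABLE ksname."table"
  WITH gc_grace_seconds = 3600
  AND compaction = {
    'class': 'SizeTieredCompactionStrategy',
    'max_threshold': '32',
    'min_threshold': '4',
    'unchecked_tombstone_compaction': 'true',
    'tombstone_threshold': '0.1',
    'tombstone_compaction_interval': '3600'
  };
EOF
}

write_alter_cql alter_table.cql
# After review, the file could be applied with: cqlsh -f alter_table.cql
```

Writing the statement to a file first keeps the change reviewable; whether these sub-properties actually reduce the tombstone scans would still need to be checked on the cluster.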



Jeff Jirsa


On Feb 22, 2019, at 8:37 AM, Kenneth Brotman <kenbrot...@yahoo.com.invalid> wrote:

Can we see the histogram?  Why wouldn’t you at times have that many tombstones? 
 Makes sense.


Kenneth Brotman


From: Rahul Reddy [mailto:rahulreddy1...@gmail.com] 
Sent: Thursday, February 21, 2019 7:06 AM
To: user@cassandra.apache.org
Subject: Tombstones in memtable


We have a small table; records are about 5k.

All the inserts come with a 4-hour TTL, and we have a table-level TTL of 1 day 
and gc_grace_seconds of 3 hours. We do 5k reads a second during peak load. 
During the peak load we see alerts for the tombstone-scanned histogram 
reaching a million.

Cassandra version is 3.11.1. Please let me know how this tombstone scan can be 
avoided in the memtable.
