Your schema is such that you’ll never read more than one tombstone per select (unless you’re also doing range reads / table scans that you didn’t mention). I’m not quite sure what you’re alerting on, but you’re not going to have tombstone problems with that table / that select.
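To make that concrete: every partition in this table is a single row with one regular column, so an expired cell leaves at most one tombstone in the partition a point read visits. A toy model of that reasoning (hypothetical names throughout — this is not Cassandra's actual read path):

```python
import time

# Toy model of the table: one partition per key, one regular cell per partition.
# expires_at is write time + TTL; a cell past expiry is read as a tombstone.
table = {
    "key-a": {"value": "v1", "expires_at": time.time() + 14400},  # live (4h TTL)
    "key-b": {"value": "v2", "expires_at": time.time() - 60},     # expired -> tombstone
}

def point_read(key):
    """SELECT ... WHERE "column1" = ? visits exactly one partition."""
    cell = table.get(key)
    if cell is None:
        return None, 0
    if cell["expires_at"] <= time.time():
        return None, 1  # the single expired cell is the only tombstone scanned
    return cell["value"], 0

value, tombstones = point_read("key-b")
# A point read on this schema can never scan more than one tombstone.
assert tombstones <= 1
```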
-- 
Jeff Jirsa

> On Feb 23, 2019, at 5:55 PM, Rahul Reddy <rahulreddy1...@gmail.com> wrote:
> 
> Changing gc_grace_seconds didn't help.
> 
> ```
> CREATE KEYSPACE ksname WITH replication = {'class': 'NetworkTopologyStrategy',
>     'dc1': '3', 'dc2': '3'} AND durable_writes = true;
> 
> CREATE TABLE keyspace."table" (
>     "column1" text PRIMARY KEY,
>     "column2" text
> ) WITH bloom_filter_fp_chance = 0.01
>     AND caching = {'keys': 'ALL', 'rows_per_partition': 'NONE'}
>     AND comment = ''
>     AND compaction = {'class': 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy',
>         'max_threshold': '32', 'min_threshold': '4'}
>     AND compression = {'chunk_length_in_kb': '64',
>         'class': 'org.apache.cassandra.io.compress.LZ4Compressor'}
>     AND crc_check_chance = 1.0
>     AND dclocal_read_repair_chance = 0.1
>     AND default_time_to_live = 18000
>     AND gc_grace_seconds = 60
>     AND max_index_interval = 2048
>     AND memtable_flush_period_in_ms = 0
>     AND min_index_interval = 128
>     AND read_repair_chance = 0.0
>     AND speculative_retry = '99PERCENTILE';
> ```
> 
> Flushed the table and took an sstabledump:
> 
> ```
> grep -i '"expired" : true' SSTables.txt | wc -l
> 16439
> grep -i '"expired" : false' SSTables.txt | wc -l
> 2657
> ```
> 
> TTL is 4 hours:
> 
> ```
> INSERT INTO keyspace."TABLE_NAME" ("column1", "column2") VALUES (?, ?) USING TTL ?;
> SELECT * FROM keyspace."TABLE_NAME" WHERE "column1" = ?;
> ```
> 
> Metric used to alert on scanned tombstones:
> 
> ```
> increase(cassandra_Table_TombstoneScannedHistogram{keyspace="mykeyspace",Table="tablename",function="Count"}[5m])
> ```
> 
> During peak hours we only have a couple of hundred inserts and 5-8k reads/s per node.
> 
> ```
> tablestats
> Read Count: 605231874
> Read Latency: 0.021268529760215503 ms
> Write Count: 2763352
> Write Latency: 0.027924007871599422 ms
> Pending Flushes: 0
> Table: name
> SSTable count: 1
> Space used (live): 1413203
> Space used (total): 1413203
> Space used by snapshots (total): 0
> Off heap memory used (total): 28813
> SSTable Compression Ratio: 0.5015090954531143
> Number of partitions (estimate): 19568
> Memtable cell count: 573
> Memtable data size: 22971
> Memtable off heap memory used: 0
> Memtable switch count: 6
> Local read count: 529868919
> Local read latency: 0.020 ms
> Local write count: 2707371
> Local write latency: 0.024 ms
> Pending flushes: 0
> Percent repaired: 0.0
> Bloom filter false positives: 1
> Bloom filter false ratio: 0.00000
> Bloom filter space used: 23888
> Bloom filter off heap memory used: 23880
> Index summary off heap memory used: 4717
> Compression metadata off heap memory used: 216
> Compacted partition minimum bytes: 73
> Compacted partition maximum bytes: 124
> Compacted partition mean bytes: 99
> Average live cells per slice (last five minutes): 1.0
> Maximum live cells per slice (last five minutes): 1
> Average tombstones per slice (last five minutes): 1.0
> Maximum tombstones per slice (last five minutes): 1
> Dropped Mutations: 0
> 
> histograms
> Percentile  SSTables  Write Latency  Read Latency  Partition Size  Cell Count
>                       (micros)       (micros)      (bytes)
> 50%         0.00      20.50          17.08         86              1
> 75%         0.00      24.60          20.50         124             1
> 95%         0.00      35.43          29.52         124             1
> 98%         0.00      35.43          42.51         124             1
> 99%         0.00      42.51          51.01         124             1
> Min         0.00      8.24           5.72          73              0
> Max         1.00      42.51          152.32        124             1
> ```
> 
> 3 nodes in dc1 and 3 nodes in dc2 in the cluster.
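One sanity check on those numbers (my arithmetic, not from the thread): the PromQL above is the 5-minute increase of the histogram's aggregate Count, so at the quoted read rate a 1M alert still works out to less than one tombstone per read — consistent with "Maximum tombstones per slice: 1" in the tablestats:

```python
# Back-of-the-envelope check that a 1M TombstoneScannedHistogram count over
# 5 minutes is an aggregate across all reads, not tombstones per select.
reads_per_sec = 5000                 # low end of the quoted 5-8k reads/s per node
window_sec = 5 * 60                  # the [5m] window in the increase() query
tombstones_in_window = 1_000_000     # the alerting value from the thread

reads_in_window = reads_per_sec * window_sec           # 1,500,000 point reads
tombstones_per_read = tombstones_in_window / reads_in_window

print(f"{tombstones_per_read:.2f} tombstones per read")  # ~0.67
assert tombstones_per_read <= 1.0    # matches "Maximum tombstones per slice: 1"
```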
> With instance type AWS EC2 m4.xlarge.
> 
>> On Sat, Feb 23, 2019, 7:47 PM Jeff Jirsa <jji...@gmail.com> wrote:
>> Would also be good to see your schema (anonymized if needed) and the select queries you’re running.
>> 
>> -- 
>> Jeff Jirsa
>> 
>>> On Feb 23, 2019, at 4:37 PM, Rahul Reddy <rahulreddy1...@gmail.com> wrote:
>>> 
>>> Thanks Jeff,
>>> 
>>> I have gc_grace_seconds set to 10 minutes and changed the table-level TTL to 5 hours, versus the insert TTL of 4 hours. Tracing doesn't show any tombstone scans for the reads, and the log doesn't show tombstone scan warnings either. But while reads are happening at 5-8k per node during peak hours, the metric shows a tombstone scan count reaching 1M.
>>> 
>>>> On Fri, Feb 22, 2019, 11:46 AM Jeff Jirsa <jji...@gmail.com> wrote:
>>>> If all of your data is TTL’d and you never explicitly delete a cell without using a TTL, you can probably drop your GCGS to 1 hour (or less).
>>>> 
>>>> Which compaction strategy are you using? You need a way to clear out those tombstones. There are tombstone compaction sub-properties that can encourage compaction to grab sstables just because they’re full of tombstones, which will probably help you.
>>>> 
>>>> -- 
>>>> Jeff Jirsa
>>>> 
>>>>> On Feb 22, 2019, at 8:37 AM, Kenneth Brotman <kenbrot...@yahoo.com.invalid> wrote:
>>>>> 
>>>>> Can we see the histogram? Why wouldn’t you at times have that many tombstones? Makes sense.
>>>>> 
>>>>> Kenneth Brotman
>>>>> 
>>>>> From: Rahul Reddy [mailto:rahulreddy1...@gmail.com]
>>>>> Sent: Thursday, February 21, 2019 7:06 AM
>>>>> To: user@cassandra.apache.org
>>>>> Subject: Tombstones in memtable
>>>>> 
>>>>> We have a small table; records are about 5k.
>>>>> 
>>>>> All the inserts come with a 4-hour TTL, the table-level TTL is 1 day, and gc_grace_seconds is 3 hours.
>>>>> We do 5k reads a second during peak load, and during peak load we're seeing alerts for the tombstone scanned histogram reaching a million.
>>>>> 
>>>>> Cassandra version is 3.11.1. Please let me know how this tombstone scanning in the memtable can be avoided.
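To tie the GCGS advice upthread to the numbers here: a TTL'd cell is live until its write time plus TTL, is then scanned as a tombstone, and can only be purged by compaction once it is gc_grace_seconds past expiry. A rough timeline sketch (plain arithmetic, not Cassandra internals):

```python
def tombstone_window(write_ts, ttl_sec, gc_grace_sec):
    """Return (expires_at, purgeable_at) for a TTL'd cell.

    The cell is read as data before expires_at, scanned as a tombstone
    between expires_at and whenever compaction drops it, and compaction
    may only drop it after purgeable_at.
    """
    expires_at = write_ts + ttl_sec
    purgeable_at = expires_at + gc_grace_sec
    return expires_at, purgeable_at

TTL = 4 * 3600  # the 4-hour insert TTL from the thread

# gc_grace_seconds = 3 hours (the original setting): tombstones linger ~3h.
_, purge_3h = tombstone_window(0, TTL, 3 * 3600)

# gc_grace_seconds = 60 (the schema above): purgeable almost immediately
# after expiry -- safe only because every write carries a TTL and cells
# are never deleted explicitly, as Jeff notes upthread.
_, purge_60s = tombstone_window(0, TTL, 60)

assert purge_3h - purge_60s == 3 * 3600 - 60
```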