Vidur Malik created CASSANDRA-10294:
---------------------------------------

             Summary: Old SSTables lying around
                 Key: CASSANDRA-10294
                 URL: https://issues.apache.org/jira/browse/CASSANDRA-10294
             Project: Cassandra
          Issue Type: Bug
          Components: Core
         Environment: Stand-alone cluster deployed on AWS EC2 instances
            Reporter: Vidur Malik
         Attachments: Screen Shot 2015-09-09 at 9.32.53 AM.png

We're running a Cassandra 2.2.0 cluster with 8 nodes. We do frequent updates 
to our data and have very few reads, and we are using Leveled Compaction with 
an sstable_size_in_mb of 160. We don't have much data currently since we're 
just testing the cluster.
We are seeing the SSTable count increase linearly (see attached graph; each 
line is a node in the cluster) even though `nodetool compactionhistory` shows 
that compactions have definitely run. When I run `nodetool cfstats`, I get the 
following output:
Table: tender_summaries
SSTable count: 56
SSTables in each level: [1, 0, 0, 0, 0, 0, 0, 0, 0]

Does it make sense that there is such a huge difference between the number of 
SSTables in each level and the total SSTable count? It seems like old 
SSTables are lying around and are never cleaned up or compacted.
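One way to cross-check what cfstats reports against what is actually on disk is to count the Data.db components in the table's data directory. A minimal sketch follows; it simulates the directory in a temp dir so it runs anywhere (the real files for this table would sit under the keyspace data directory and also match `*-Data.db`; the `la-` prefix is the 2.2-era SSTable naming, assumed here for illustration):

```shell
# Simulated stand-in for the table's data directory.
DATA_DIR=$(mktemp -d)

# Create fake 2.2-era SSTable data components (real ones look like la-42-big-Data.db).
for i in 1 2 3 4 5; do
  touch "$DATA_DIR/la-$i-big-Data.db"
done

# Count on-disk SSTables; compare this number with the "SSTable count"
# line from nodetool cfstats.
COUNT=$(ls -1 "$DATA_DIR"/*-Data.db | wc -l)
echo "on-disk SSTables: $COUNT"

rm -r "$DATA_DIR"
```

If the on-disk count matches cfstats (56 here) but the per-level histogram only accounts for one file, the remaining files are apparently live to Cassandra but not tracked in any level.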
The schema is relatively simple:
CREATE TABLE IF NOT EXISTS reporting.tender_summaries (
    organization_id uuid,
    date timestamp,
    year int,
    location_id varchar,
    operation_type varchar,
    reference_id varchar,
    field1 int,
    field2 int,
    field3 int,
    field4 int,
    PRIMARY KEY ((organization_id, year), location_id, date, operation_type, reference_id)
) WITH CLUSTERING ORDER BY (location_id DESC, date DESC)
    AND bloom_filter_fp_chance = 0.01
    AND caching = '{"keys":"ALL", "rows_per_partition":"NONE"}'
    AND comment = ''
    AND compaction = {'sstable_size_in_mb': '160', 'class': 'org.apache.cassandra.db.compaction.LeveledCompactionStrategy'}
    AND compression = {'sstable_compression': 'org.apache.cassandra.io.compress.LZ4Compressor'}
    AND dclocal_read_repair_chance = 0.1
    AND default_time_to_live = 0
    AND gc_grace_seconds = 864000
    AND max_index_interval = 2048
    AND memtable_flush_period_in_ms = 0
    AND min_index_interval = 128
    AND read_repair_chance = 0.0
    AND speculative_retry = '99.0PERCENTILE';



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)