[ https://issues.apache.org/jira/browse/CASSANDRA-14423?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16526070#comment-16526070 ]
Stefan Podkowinski commented on CASSANDRA-14423:
------------------------------------------------

[~KurtG], do you mind going ahead with committing the 2.2/3.0/3.11 patches now and addressing trunk in a separate ticket?

> SSTables stop being compacted
> -----------------------------
>
>                 Key: CASSANDRA-14423
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-14423
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Compaction
>            Reporter: Kurt Greaves
>            Assignee: Kurt Greaves
>            Priority: Blocker
>             Fix For: 2.2.13, 3.0.17, 3.11.3
>
>
> We're seeing a problem in 3.11.0 where SSTables are being lost from the view and
> not being included in compactions or considered as compaction candidates. It gets
> progressively worse until only 1-2 SSTables are left in the view - the most recent
> ones - at which point compactions stop completely for that table.
> The SSTables still seem to be included in reads, just not in compactions.
> The issue can be fixed by restarting C*, as that reloads all SSTables into the
> view, but this is only a temporary fix. User-defined/major compactions still work -
> it's not clear whether their results are added back into the view, but either way
> that's not a good workaround.
> This also results in a discrepancy between the SSTable count and the SSTables in
> each level for any table using LCS.
> {code:java}
> Keyspace : xxx
> Read Count: 57761088
> Read Latency: 0.10527088681224288 ms.
> Write Count: 2513164
> Write Latency: 0.018211106398149903 ms.
> Pending Flushes: 0
> Table: xxx
> SSTable count: 10
> SSTables in each level: [2, 0, 0, 0, 0, 0, 0, 0, 0]
> Space used (live): 894498746
> Space used (total): 894498746
> Space used by snapshots (total): 0
> Off heap memory used (total): 11576197
> SSTable Compression Ratio: 0.6956629530569777
> Number of keys (estimate): 3562207
> Memtable cell count: 0
> Memtable data size: 0
> Memtable off heap memory used: 0
> Memtable switch count: 87
> Local read count: 57761088
> Local read latency: 0.108 ms
> Local write count: 2513164
> Local write latency: NaN ms
> Pending flushes: 0
> Percent repaired: 86.33
> Bloom filter false positives: 43
> Bloom filter false ratio: 0.00000
> Bloom filter space used: 8046104
> Bloom filter off heap memory used: 8046024
> Index summary off heap memory used: 3449005
> Compression metadata off heap memory used: 81168
> Compacted partition minimum bytes: 104
> Compacted partition maximum bytes: 5722
> Compacted partition mean bytes: 175
> Average live cells per slice (last five minutes): 1.0
> Maximum live cells per slice (last five minutes): 1
> Average tombstones per slice (last five minutes): 1.0
> Maximum tombstones per slice (last five minutes): 1
> Dropped Mutations: 0
> {code}
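A quick way to spot that count/level mismatch on other tables is to compare the two figures nodetool reports. A minimal sketch, assuming {{nodetool}} is on the PATH and the 3.11 tablestats output format quoted above; the class name and keyspace.table argument are only illustrative:

{code:java}
import java.io.BufferedReader;
import java.io.InputStreamReader;

// Compares "SSTable count" with the sum of "SSTables in each level"
// for a single keyspace.table passed as the first argument.
public class LevelCountCheck
{
    public static void main(String[] args) throws Exception
    {
        Process p = new ProcessBuilder("nodetool", "tablestats", args[0]).start();
        int sstableCount = -1;
        int levelSum = -1;
        try (BufferedReader r = new BufferedReader(new InputStreamReader(p.getInputStream())))
        {
            String line;
            while ((line = r.readLine()) != null)
            {
                line = line.trim();
                if (line.startsWith("SSTable count:"))
                {
                    sstableCount = Integer.parseInt(line.substring("SSTable count:".length()).trim());
                }
                else if (line.startsWith("SSTables in each level:"))
                {
                    // e.g. "SSTables in each level: [2, 0, 0, 0, 0, 0, 0, 0, 0]"
                    String levels = line.substring(line.indexOf('[') + 1, line.indexOf(']'));
                    levelSum = 0;
                    for (String level : levels.split(","))
                        levelSum += Integer.parseInt(level.trim().split("/")[0]); // "32/4" marks a level over its limit
                }
            }
        }
        p.waitFor();
        if (levelSum < 0)
            System.out.println("No per-level info reported - table probably not using LCS");
        else if (sstableCount != levelSum)
            System.out.println("Mismatch: SSTable count=" + sstableCount + ", levels sum to " + levelSum);
        else
            System.out.println("Counts agree: " + sstableCount);
    }
}
{code}

Run against the table above (e.g. {{java LevelCountCheck xxx.xxx}}) it would flag the 10 vs 2 mismatch shown in the stats.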
> Also, for STCS we've confirmed that the SSTable count differs from the number of
> SSTables reported in the compaction buckets. In the example below there are only
> 3 SSTables in a single bucket - no more are listed for this table. Compaction
> thresholds haven't been modified for this table and it's a very basic KV schema.
> {code:java}
> Keyspace : yyy
> Read Count: 30485
> Read Latency: 0.06708991307200263 ms.
> Write Count: 57044
> Write Latency: 0.02204061776873992 ms.
> Pending Flushes: 0
> Table: yyy
> SSTable count: 19
> Space used (live): 18195482
> Space used (total): 18195482
> Space used by snapshots (total): 0
> Off heap memory used (total): 747376
> SSTable Compression Ratio: 0.7607394576769735
> Number of keys (estimate): 116074
> Memtable cell count: 0
> Memtable data size: 0
> Memtable off heap memory used: 0
> Memtable switch count: 39
> Local read count: 30485
> Local read latency: NaN ms
> Local write count: 57044
> Local write latency: NaN ms
> Pending flushes: 0
> Percent repaired: 79.76
> Bloom filter false positives: 0
> Bloom filter false ratio: 0.00000
> Bloom filter space used: 690912
> Bloom filter off heap memory used: 690760
> Index summary off heap memory used: 54736
> Compression metadata off heap memory used: 1880
> Compacted partition minimum bytes: 73
> Compacted partition maximum bytes: 124
> Compacted partition mean bytes: 96
> Average live cells per slice (last five minutes): NaN
> Maximum live cells per slice (last five minutes): 0
> Average tombstones per slice (last five minutes): NaN
> Maximum tombstones per slice (last five minutes): 0
> Dropped Mutations: 0
> {code}
> {code:java}
> Apr 27 03:10:39 cassandra[9263]: TRACE o.a.c.d.c.SizeTieredCompactionStrategy Compaction buckets are
> [[BigTableReader(path='/var/lib/cassandra/data/yyy/yyy-5f7a2d60e4a811e6868a8fd39a64fd59/mc-67168-big-Data.db'),
> BigTableReader(path='/var/lib/cassandra/data/yyy/yyy-5f7a2d60e4a811e6868a8fd39a64fd59/mc-67167-big-Data.db'),
> BigTableReader(path='/var/lib/cassandra/data/yyy/yyy-5f7a2d60e4a811e6868a8fd39a64fd59/mc-67166-big-Data.db')]]
> {code}
> Also, for every LCS table we're seeing the following warning being spammed (it
> seems to line up with the anticompaction spam):
> {code:java}
> Apr 26 21:30:09 cassandra[9263]: WARN o.a.c.d.c.LeveledCompactionStrategy Live sstable
> /var/lib/cassandra/data/xxx/xxx-8c3ef9e0e3fc11e6868a8fd39a64fd59/mc-79024-big-Data.db
> from level 0 is not on corresponding level in the leveled manifest. This is not a
> problem per se, but may indicate an orphaned sstable due to a failed compaction
> not cleaned up properly.
> {code}
> This is a vnodes cluster with 256 tokens per node, and the only thing that seems
> like it could be causing issues is anticompactions.
> CASSANDRA-14079 might be related but doesn't quite describe the same issue, and in
> this case we're using only a single disk for data. We have yet to reproduce this,
> but figured it was worth reporting here first.
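On the STCS side, the bucketing shown in the TRACE line above only ever considers the sstables the strategy is currently tracking, so anything that has fallen out of the view can never land in a bucket, whatever its size. A simplified sketch of the size-tiered grouping, not the actual implementation; it only mirrors the default bucket_low=0.5, bucket_high=1.5 and min_sstable_size=50MB behaviour, and the sizes in main() are made up:

{code:java}
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

public class SizeTieredSketch
{
    static final double BUCKET_LOW = 0.5;
    static final double BUCKET_HIGH = 1.5;
    static final long MIN_SSTABLE_SIZE = 50L * 1024 * 1024; // 50MB default

    // Groups sstable sizes into size-tiered buckets, smallest first.
    static List<List<Long>> buckets(List<Long> sstableSizes)
    {
        List<Long> sorted = new ArrayList<>(sstableSizes);
        sorted.sort(Comparator.naturalOrder());

        List<List<Long>> buckets = new ArrayList<>();
        List<Double> averages = new ArrayList<>();
        for (long size : sorted)
        {
            boolean placed = false;
            for (int i = 0; i < buckets.size(); i++)
            {
                double avg = averages.get(i);
                // an sstable joins a bucket if its size is within [avg * low, avg * high],
                // or if both it and the bucket average are below the small-sstable threshold
                if ((size >= avg * BUCKET_LOW && size <= avg * BUCKET_HIGH)
                    || (size < MIN_SSTABLE_SIZE && avg < MIN_SSTABLE_SIZE))
                {
                    List<Long> bucket = buckets.get(i);
                    bucket.add(size);
                    averages.set(i, (avg * (bucket.size() - 1) + size) / bucket.size());
                    placed = true;
                    break;
                }
            }
            if (!placed)
            {
                List<Long> bucket = new ArrayList<>();
                bucket.add(size);
                buckets.add(bucket);
                averages.add((double) size);
            }
        }
        return buckets;
    }

    public static void main(String[] args)
    {
        // 19 sstables may exist on disk, but if the strategy only tracks 3 of them,
        // only those 3 can ever show up in a bucket:
        List<Long> trackedSizes = Arrays.asList(6_000_000L, 6_200_000L, 5_900_000L);
        System.out.println(buckets(trackedSizes));
    }
}
{code}

With 19 sstables on disk but only 3 still tracked, a grouping like this can only ever produce the single three-sstable bucket seen in the TRACE output, which is consistent with sstables being lost from the view rather than with any threshold misconfiguration.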