[ 
https://issues.apache.org/jira/browse/CASSANDRA-14423?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16526070#comment-16526070
 ] 

Stefan Podkowinski commented on CASSANDRA-14423:
------------------------------------------------

[~KurtG], do you mind going ahead and committing the 2.2/3.0/3.11 patches now, 
and addressing trunk in a separate ticket?

> SSTables stop being compacted
> -----------------------------
>
>                 Key: CASSANDRA-14423
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-14423
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Compaction
>            Reporter: Kurt Greaves
>            Assignee: Kurt Greaves
>            Priority: Blocker
>             Fix For: 2.2.13, 3.0.17, 3.11.3
>
>
> We're seeing a problem in 3.11.0 where SSTables are being lost from the 
> compaction strategy's view and no longer considered as candidates for 
> compaction. It gets progressively worse until only 1-2 SSTables remain in 
> the view - these happen to be the most recent SSTables - so compactions 
> stop completely for that table.
> The SSTables still appear to be included in reads, just not in compactions.
> The issue can be worked around by restarting C*, which reloads all SSTables 
> into the view, but this is only a temporary fix. User defined/major 
> compactions still work - it's not clear whether their results are added 
> back into the view - so this is not a good workaround either.
> This also results in a discrepancy between the SSTable count and the 
> SSTables reported per level for any table using LCS.
> {code:java}
> Keyspace : xxx
> Read Count: 57761088
> Read Latency: 0.10527088681224288 ms.
> Write Count: 2513164
> Write Latency: 0.018211106398149903 ms.
> Pending Flushes: 0
> Table: xxx
> SSTable count: 10
> SSTables in each level: [2, 0, 0, 0, 0, 0, 0, 0, 0]
> Space used (live): 894498746
> Space used (total): 894498746
> Space used by snapshots (total): 0
> Off heap memory used (total): 11576197
> SSTable Compression Ratio: 0.6956629530569777
> Number of keys (estimate): 3562207
> Memtable cell count: 0
> Memtable data size: 0
> Memtable off heap memory used: 0
> Memtable switch count: 87
> Local read count: 57761088
> Local read latency: 0.108 ms
> Local write count: 2513164
> Local write latency: NaN ms
> Pending flushes: 0
> Percent repaired: 86.33
> Bloom filter false positives: 43
> Bloom filter false ratio: 0.00000
> Bloom filter space used: 8046104
> Bloom filter off heap memory used: 8046024
> Index summary off heap memory used: 3449005
> Compression metadata off heap memory used: 81168
> Compacted partition minimum bytes: 104
> Compacted partition maximum bytes: 5722
> Compacted partition mean bytes: 175
> Average live cells per slice (last five minutes): 1.0
> Maximum live cells per slice (last five minutes): 1
> Average tombstones per slice (last five minutes): 1.0
> Maximum tombstones per slice (last five minutes): 1
> Dropped Mutations: 0
> {code}
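> The discrepancy above (SSTable count: 10, but the per-level counts sum to 
> only 2) can be flagged programmatically. A minimal, hypothetical sketch that 
> checks the two relevant fields from nodetool tablestats; the values are 
> copied from the output above, everything else is illustrative:
> {code:java}
> import java.util.Arrays;
> import java.util.List;
> 
> // Illustrative only: flags when the per-level counts reported by
> // "nodetool tablestats" do not add up to the overall SSTable count.
> public class LevelCountCheck {
>     public static void main(String[] args) {
>         int sstableCount = 10;                                            // "SSTable count"
>         List<Integer> levels = Arrays.asList(2, 0, 0, 0, 0, 0, 0, 0, 0);  // "SSTables in each level"
> 
>         int tracked = levels.stream().mapToInt(Integer::intValue).sum();
>         if (tracked != sstableCount)
>             System.out.printf("LCS tracks %d sstables but the table has %d; "
>                     + "%d are missing from the compaction view%n",
>                     tracked, sstableCount, sstableCount - tracked);
>     }
> }
> {code}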
> We've also confirmed for STCS that the SSTable count differs from the number 
> of SSTables reported in the compaction buckets. In the example below there 
> are only 3 SSTables in a single bucket - no others are listed for this 
> table. Compaction thresholds haven't been modified for this table and it's a 
> very basic KV schema.
> {code:java}
> Keyspace : yyy
>     Read Count: 30485
>     Read Latency: 0.06708991307200263 ms.
>     Write Count: 57044
>     Write Latency: 0.02204061776873992 ms.
>     Pending Flushes: 0
>         Table: yyy
>         SSTable count: 19
>         Space used (live): 18195482
>         Space used (total): 18195482
>         Space used by snapshots (total): 0
>         Off heap memory used (total): 747376
>         SSTable Compression Ratio: 0.7607394576769735
>         Number of keys (estimate): 116074
>         Memtable cell count: 0
>         Memtable data size: 0
>         Memtable off heap memory used: 0
>         Memtable switch count: 39
>         Local read count: 30485
>         Local read latency: NaN ms
>         Local write count: 57044
>         Local write latency: NaN ms
>         Pending flushes: 0
>         Percent repaired: 79.76
>         Bloom filter false positives: 0
>         Bloom filter false ratio: 0.00000
>         Bloom filter space used: 690912
>         Bloom filter off heap memory used: 690760
>         Index summary off heap memory used: 54736
>         Compression metadata off heap memory used: 1880
>         Compacted partition minimum bytes: 73
>         Compacted partition maximum bytes: 124
>         Compacted partition mean bytes: 96
>         Average live cells per slice (last five minutes): NaN
>         Maximum live cells per slice (last five minutes): 0
>         Average tombstones per slice (last five minutes): NaN
>         Maximum tombstones per slice (last five minutes): 0
>         Dropped Mutations: 0 
> {code}
> {code:java}
> Apr 27 03:10:39 cassandra[9263]: TRACE o.a.c.d.c.SizeTieredCompactionStrategy 
> Compaction buckets are 
> [[BigTableReader(path='/var/lib/cassandra/data/yyy/yyy-5f7a2d60e4a811e6868a8fd39a64fd59/mc-67168-big-Data.db'),
>  
> BigTableReader(path='/var/lib/cassandra/data/yyy/yyy-5f7a2d60e4a811e6868a8fd39a64fd59/mc-67167-big-Data.db'),
>  
> BigTableReader(path='/var/lib/cassandra/data/yyy/yyy-5f7a2d60e4a811e6868a8fd39a64fd59/mc-67166-big-Data.db')]]
> {code}
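> For context, STCS forms those buckets by grouping sstables of similar size; 
> the TRACE line above shows a single bucket of 3 files even though the table 
> reports 19 SSTables. A simplified sketch of the bucketing rule, assuming the 
> default options (bucket_low=0.5, bucket_high=1.5, min_sstable_size=50MB); 
> this is an illustration, not the exact Cassandra implementation:
> {code:java}
> import java.util.ArrayList;
> import java.util.Arrays;
> import java.util.Collections;
> import java.util.List;
> 
> // Simplified illustration: an sstable joins a bucket when its size falls
> // within [avg * bucketLow, avg * bucketHigh] of the bucket's average size,
> // or when both the size and the average are below minSSTableSize.
> public class StcsBuckets {
>     static final double BUCKET_LOW = 0.5, BUCKET_HIGH = 1.5;
>     static final long MIN_SSTABLE_SIZE = 50L * 1024 * 1024;
> 
>     static List<List<Long>> buckets(List<Long> sizes) {
>         List<List<Long>> buckets = new ArrayList<>();
>         List<Long> sorted = new ArrayList<>(sizes);
>         Collections.sort(sorted);
>         for (long size : sorted) {
>             List<Long> target = null;
>             for (List<Long> bucket : buckets) {
>                 double avg = bucket.stream().mapToLong(Long::longValue).average().orElse(0);
>                 if ((size > avg * BUCKET_LOW && size < avg * BUCKET_HIGH)
>                         || (size < MIN_SSTABLE_SIZE && avg < MIN_SSTABLE_SIZE)) {
>                     target = bucket;
>                     break;
>                 }
>             }
>             if (target == null)
>                 buckets.add(target = new ArrayList<>());
>             target.add(size);
>         }
>         return buckets;
>     }
> 
>     public static void main(String[] args) {
>         // Hypothetical sizes in bytes; only sstables the strategy can "see" get bucketed.
>         System.out.println(buckets(Arrays.asList(900_000L, 1_000_000L, 1_100_000L, 60_000_000L)));
>     }
> }
> {code}
> Only sstables that are still in the strategy's view are handed to this 
> grouping, which is why a table with 19 live sstables can show a single 
> 3-sstable bucket.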
> For every LCS table we're also seeing the following warning spammed 
> repeatedly (it seems to coincide with the anticompaction log spam):
> {code:java}
> Apr 26 21:30:09 cassandra[9263]: WARN  o.a.c.d.c.LeveledCompactionStrategy 
> Live sstable 
> /var/lib/cassandra/data/xxx/xxx-8c3ef9e0e3fc11e6868a8fd39a64fd59/mc-79024-big-Data.db
>  from level 0 is not on corresponding level in the leveled manifest. This is 
> not a problem per se, but may indicate an orphaned sstable due to a failed 
> compaction not cleaned up properly.{code}
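> That warning is logged when a live sstable's level (from its own metadata) 
> is not reflected in the leveled manifest the strategy maintains. A toy model 
> of that consistency check, using plain types rather than Cassandra's 
> internals, just to illustrate what "not on corresponding level" means:
> {code:java}
> import java.util.Map;
> import java.util.Set;
> 
> // Toy model: each live sstable carries a level; the manifest holds what the
> // strategy actually tracks per level. A live sstable missing from its level
> // is effectively invisible to compaction.
> public class ManifestCheck {
>     record SSTable(String path, int level) {}
> 
>     static void check(Set<SSTable> live, Map<Integer, Set<SSTable>> manifest) {
>         for (SSTable s : live)
>             if (!manifest.getOrDefault(s.level(), Set.of()).contains(s))
>                 System.out.printf("Live sstable %s from level %d is not on the "
>                         + "corresponding level in the leveled manifest%n",
>                         s.path(), s.level());
>     }
> 
>     public static void main(String[] args) {
>         SSTable orphan = new SSTable("mc-79024-big-Data.db", 0);
>         check(Set.of(orphan), Map.of(0, Set.of()));
>     }
> }
> {code}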
> This is a vnodes cluster with 256 tokens per node, and the only thing that 
> seems like it could be causing the issue is anticompaction.
> CASSANDRA-14079 might be related but doesn't quite describe the same 
> problem, and in this case we're using only a single data disk. We have yet 
> to reproduce it, but figured it was worth reporting here first.
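> As a stop-gap until a restart, the user defined compaction mentioned above 
> can be triggered over JMX via the CompactionManager MBean's 
> forceUserDefinedCompaction operation. A minimal client sketch; the host, 
> port and sstable filenames are placeholders for your own node:
> {code:java}
> import javax.management.MBeanServerConnection;
> import javax.management.ObjectName;
> import javax.management.remote.JMXConnector;
> import javax.management.remote.JMXConnectorFactory;
> import javax.management.remote.JMXServiceURL;
> 
> // Sketch: force a user defined compaction on specific sstables via JMX.
> public class ForceUserDefinedCompaction {
>     public static void main(String[] args) throws Exception {
>         // 7199 is the default Cassandra JMX port; adjust host/port as needed.
>         JMXServiceURL url = new JMXServiceURL(
>                 "service:jmx:rmi:///jndi/rmi://127.0.0.1:7199/jmxrmi");
>         try (JMXConnector jmxc = JMXConnectorFactory.connect(url)) {
>             MBeanServerConnection conn = jmxc.getMBeanServerConnection();
>             ObjectName compactionManager =
>                     new ObjectName("org.apache.cassandra.db:type=CompactionManager");
>             // Comma-separated list of -Data.db filenames to compact together.
>             conn.invoke(compactionManager, "forceUserDefinedCompaction",
>                     new Object[]{ "mc-67166-big-Data.db,mc-67167-big-Data.db" },
>                     new String[]{ String.class.getName() });
>         }
>     }
> }
> {code}
> Recent nodetool versions expose the same operation as a user-defined compact 
> option, but as noted above this only papers over the missing-view problem.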


