Hello, everyone,
We have a large cluster with the following info:
Cassandra version: 3.11.6
Multi-DC, with 100 nodes per DC.
We have recently seen many nodes with hundreds of thousands of tiny sstables
being flushed to disk constantly. We can see the following messages in
debug.log:
DEBUG [NativePoolCleaner] <timestamp> ColumnFamilyStore.java:932 - Enqueuing flush of sstable_activity: 0.408KiB (0%) on-heap, 0.154KiB (0%) off-heap
DEBUG [NonPeriodicTasks:1] <timestamp> SSTable.java:105 - Deleting sstable: <sstable name>
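To narrow down which tables produce the tiny flushes, the "Enqueuing flush" messages could be tallied per table. The following is only a rough sketch: the regular expression is inferred from the single log line quoted above, and the function name count_flushes is made up for illustration, so it may need adjusting for other log layouts.

```python
import re
from collections import Counter

# Pattern inferred from the "Enqueuing flush of <table>: ..." line quoted
# above; this is an assumption about the debug.log layout, not an official format.
FLUSH_RE = re.compile(r"Enqueuing flush of (?P<table>[^:\s]+):")


def count_flushes(lines):
    """Return a Counter mapping table name -> number of flush messages seen."""
    counts = Counter()
    for line in lines:
        match = FLUSH_RE.search(line)
        if match:
            counts[match.group("table")] += 1
    return counts
```

It can be fed an open file handle for debug.log (any iterable of lines works), and count_flushes(...).most_common(10) then lists the tables that are flushed most often.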
A node with a large number of sstables comes under high CPU load and becomes
UNREACHABLE to many other live nodes. Restarting Cassandra on these nodes
takes a very long time, around 1 to 2 hours, to drain or to start up,
and many of the above "Deleting sstable" messages are logged, which looks
like a cleanup process clearing those tiny sstables.
Any ideas or advice?
Thanks,
Jiayong Sun