You could toggle off the tombstone compaction to see if that helps, but that should be lower priority than normal compactions.
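If you want to try it, that's just an ALTER on the compaction options -- something like this, assuming "toggle off" means setting unchecked_tombstone_compaction back to its default of false (keyspace/table name made up here, and note that ALTER TABLE replaces the whole compaction map, so keep the rest of your options as posted below):

    -- my_ks.my_ts_table is a placeholder; substitute your actual keyspace/table
    ALTER TABLE my_ks.my_ts_table WITH compaction = {
        'class': 'com.jeffjirsa.cassandra.db.compaction.TimeWindowCompactionStrategy',
        'compaction_window_unit': 'DAYS',
        'compaction_window_size': '1',
        'timestamp_resolution': 'MILLISECONDS',
        'tombstone_compaction_interval': '86400',
        'tombstone_threshold': '0.2',
        'unchecked_tombstone_compaction': 'false'};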
Are the lots-of-little-files from memtable flushes or repair/anticompaction? Do you do normal deletes? Did you try running incremental repair?

--
Jeff Jirsa


> On Aug 7, 2018, at 5:00 PM, Brian Spindler <brian.spind...@gmail.com> wrote:
>
> Hi Jonathan, both I believe.
>
> The window size is 1 day, full settings:
>
> AND compaction = {'timestamp_resolution': 'MILLISECONDS',
>     'unchecked_tombstone_compaction': 'true', 'compaction_window_size': '1',
>     'compaction_window_unit': 'DAYS', 'tombstone_compaction_interval': '86400',
>     'tombstone_threshold': '0.2', 'class':
>     'com.jeffjirsa.cassandra.db.compaction.TimeWindowCompactionStrategy'}
>
> nodetool tpstats
>
> Pool Name                   Active   Pending       Completed   Blocked  All time blocked
> MutationStage                    0         0     68582241832         0                 0
> ReadStage                        0         0       209566303         0                 0
> RequestResponseStage             0         0     44680860850         0                 0
> ReadRepairStage                  0         0        24562722         0                 0
> CounterMutationStage             0         0               0         0                 0
> MiscStage                        0         0               0         0                 0
> HintedHandoff                    1         1             203         0                 0
> GossipStage                      0         0         8471784         0                 0
> CacheCleanupExecutor             0         0             122         0                 0
> InternalResponseStage            0         0          552125         0                 0
> CommitLogArchiver                0         0               0         0                 0
> CompactionExecutor               8        42         1433715         0                 0
> ValidationExecutor               0         0            2521         0                 0
> MigrationStage                   0         0          527549         0                 0
> AntiEntropyStage                 0         0            7697         0                 0
> PendingRangeCalculator           0         0              17         0                 0
> Sampler                          0         0               0         0                 0
> MemtableFlushWriter              0         0          116966         0                 0
> MemtablePostFlush                0         0          209103         0                 0
> MemtableReclaimMemory            0         0          116966         0                 0
> Native-Transport-Requests        1         0      1715937778         0            176262
>
> Message type           Dropped
> READ                         2
> RANGE_SLICE                  0
> _TRACE                       0
> MUTATION                  4390
> COUNTER_MUTATION             0
> BINARY                       0
> REQUEST_RESPONSE          1882
> PAGED_RANGE                  0
> READ_REPAIR                  0
>
>> On Tue, Aug 7, 2018 at 7:57 PM Jonathan Haddad <j...@jonhaddad.com> wrote:
>> What's your window size?
>>
>> When you say backed up, how are you measuring that? Are there pending tasks, or do you just see more files than you expect?
>>
>>> On Tue, Aug 7, 2018 at 4:38 PM Brian Spindler <brian.spind...@gmail.com> wrote:
>>> Hey guys, quick question:
>>>
>>> I've got a v2.1 Cassandra cluster: 12 nodes on AWS i3.2xl, commit log on one drive, data on NVMe. It was working very well; it's a time-series DB and has been accumulating data for about 4 weeks.
>>>
>>> The nodes have increased in load and compaction seems to be falling behind. I used to get about one ~30GB Data.db file per day for this column family. I am now getting hundreds of files per day at 1MB - 50MB.
>>>
>>> How do I recover from this?
>>>
>>> I can scale out to give some breathing room, but will it go back and compact the old days into nicely packed files for the day?
>>>
>>> I tried raising the compaction throughput from 256 to 1000 and it seemed to make things worse for the CPU; it's configured on i3.2xl with 8 compaction threads.
>>>
>>> -B
>>>
>>> Lastly, I have mixed TTLs in this CF and need to run a repair (I think) to get rid of old tombstones. However, running repairs in 2.1 on TWCS column families causes a very large spike in sstable counts due to anticompaction, which is very disruptive. Is there any other way?
>>
>> --
>> Jon Haddad
>> http://www.rustyrazorblade.com
>> twitter: rustyrazorblade
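P.S. On the "pending tasks" question: besides tpstats, nodetool compactionstats reports the pending compaction count directly, and nodetool setcompactionthroughput takes effect without a restart, so it's cheap to move between 256 and 1000 and back while you watch CPU. For example:

    nodetool compactionstats
    nodetool setcompactionthroughput 256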