Everything is TTL’d. I suppose I could use sstablemetadata to see the repaired bit. Could I just set that to unrepaired somehow, and would that fix it?
Thanks!

> On Aug 7, 2018, at 8:12 PM, Jeff Jirsa <jji...@gmail.com> wrote:
>
> May be worth seeing if any of the sstables got promoted to repaired - if so
> they’re not eligible for compaction with unrepaired sstables and that could
> explain some higher counts
>
> Do you actually do deletes or is everything ttl’d?
>
>
> --
> Jeff Jirsa
>
>
>> On Aug 7, 2018, at 5:09 PM, Brian Spindler <brian.spind...@gmail.com> wrote:
>>
>> Hi Jeff, mostly lots of little files, like there will be 4-5 that are
>> 1-1.5gb or so and then many at 5-50MB and many at 40-50MB each.
>>
>> Re incremental repair; Yes one of my engineers started an incremental repair
>> on this column family that we had to abort. In fact, the node that the
>> repair was initiated on ran out of disk space and we ended replacing that
>> node like a dead node.
>>
>> Oddly the new node is experiencing this issue as well.
>>
>> -B
>>
>>
>>> On Tue, Aug 7, 2018 at 8:04 PM Jeff Jirsa <jji...@gmail.com> wrote:
>>> You could toggle off the tombstone compaction to see if that helps, but
>>> that should be lower priority than normal compactions
>>>
>>> Are the lots-of-little-files from memtable flushes or repair/anticompaction?
>>>
>>> Do you do normal deletes? Did you try to run Incremental repair?
>>>
>>> --
>>> Jeff Jirsa
>>>
>>>
>>>> On Aug 7, 2018, at 5:00 PM, Brian Spindler <brian.spind...@gmail.com>
>>>> wrote:
>>>>
>>>> Hi Jonathan, both I believe.
>>>>
>>>> The window size is 1 day, full settings:
>>>> AND compaction = {'timestamp_resolution': 'MILLISECONDS',
>>>> 'unchecked_tombstone_compaction': 'true', 'compaction_window_size': '1',
>>>> 'compaction_window_unit': 'DAYS', 'tombstone_compaction_interval':
>>>> '86400', 'tombstone_threshold': '0.2', 'class':
>>>> 'com.jeffjirsa.cassandra.db.compaction.TimeWindowCompactionStrategy'}
>>>>
>>>>
>>>> nodetool tpstats
>>>>
>>>> Pool Name                   Active   Pending     Completed   Blocked  All time blocked
>>>> MutationStage                    0         0   68582241832         0                 0
>>>> ReadStage                        0         0     209566303         0                 0
>>>> RequestResponseStage             0         0   44680860850         0                 0
>>>> ReadRepairStage                  0         0      24562722         0                 0
>>>> CounterMutationStage             0         0             0         0                 0
>>>> MiscStage                        0         0             0         0                 0
>>>> HintedHandoff                    1         1           203         0                 0
>>>> GossipStage                      0         0       8471784         0                 0
>>>> CacheCleanupExecutor             0         0           122         0                 0
>>>> InternalResponseStage            0         0        552125         0                 0
>>>> CommitLogArchiver                0         0             0         0                 0
>>>> CompactionExecutor               8        42       1433715         0                 0
>>>> ValidationExecutor               0         0          2521         0                 0
>>>> MigrationStage                   0         0        527549         0                 0
>>>> AntiEntropyStage                 0         0          7697         0                 0
>>>> PendingRangeCalculator           0         0            17         0                 0
>>>> Sampler                          0         0             0         0                 0
>>>> MemtableFlushWriter              0         0        116966         0                 0
>>>> MemtablePostFlush                0         0        209103         0                 0
>>>> MemtableReclaimMemory            0         0        116966         0                 0
>>>> Native-Transport-Requests        1         0    1715937778         0            176262
>>>>
>>>> Message type              Dropped
>>>> READ                            2
>>>> RANGE_SLICE                     0
>>>> _TRACE                          0
>>>> MUTATION                     4390
>>>> COUNTER_MUTATION                0
>>>> BINARY                          0
>>>> REQUEST_RESPONSE             1882
>>>> PAGED_RANGE                     0
>>>> READ_REPAIR                     0
>>>>
>>>>
>>>>> On Tue, Aug 7, 2018 at 7:57 PM Jonathan Haddad <j...@jonhaddad.com> wrote:
>>>>> What's your window size?
>>>>>
>>>>> When you say backed up, how are you measuring that? Are there pending
>>>>> tasks or do you just see more files than you expect?
>>>>>
>>>>>> On Tue, Aug 7, 2018 at 4:38 PM Brian Spindler <brian.spind...@gmail.com>
>>>>>> wrote:
>>>>>> Hey guys, quick question:
>>>>>>
>>>>>> I've got a v2.1 cassandra cluster, 12 nodes on aws i3.2xl, commit log on
>>>>>> one drive, data on nvme. That was working very well, it's a ts db and
>>>>>> has been accumulating data for about 4weeks.
>>>>>>
>>>>>> The nodes have increased in load and compaction seems to be falling
>>>>>> behind. I used to get about 1 file per day for this column family,
>>>>>> about ~30GB Data.db file per day. I am now getting hundreds per day at
>>>>>> 1mb - 50mb.
>>>>>>
>>>>>> How to recover from this?
>>>>>>
>>>>>> I can scale out to give some breathing room but will it go back and
>>>>>> compact the old days into nicely packed files for the day?
>>>>>>
>>>>>> I tried setting compaction throughput to 1000 from 256 and it seemed to
>>>>>> make things worse for the CPU, it's configured on i3.2xl with 8
>>>>>> compaction threads.
>>>>>>
>>>>>> -B
>>>>>>
>>>>>> Lastly, I have mixed TTLs in this CF and need to run a repair (I think)
>>>>>> to get rid of old tombstones, however running repairs in 2.1 on TWCS
>>>>>> column families causes a very large spike in sstable counts due to
>>>>>> anti-compaction which causes a lot of disruption, is there any other
>>>>>> way?
>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>> --
>>>>> Jon Haddad
>>>>> http://www.rustyrazorblade.com
>>>>> twitter: rustyrazorblade
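
For reference, the "see the repaired bit" check raised at the top of this thread could be sketched roughly as below. This is only an illustration, not a tested procedure: it assumes each sstablemetadata report has been captured as text and contains a line of the form "Repaired at: <millis>", with 0 meaning unrepaired. The function names, file names and report excerpts are made up for the example.

```python
import re

# Assumption: the report contains a line of the form "Repaired at: <millis>",
# where 0 means the sstable is still unrepaired.
REPAIRED_AT = re.compile(r"^Repaired at:\s*(\d+)\s*$", re.MULTILINE)


def repaired_at(report):
    """Return the 'Repaired at' value from one report, or None if absent."""
    match = REPAIRED_AT.search(report)
    return int(match.group(1)) if match else None


def repaired_sstables(reports):
    """Given {sstable name: report text}, return the names marked repaired, sorted."""
    return sorted(
        name for name, text in reports.items()
        if (repaired_at(text) or 0) > 0
    )


# Hypothetical file names and report excerpts, for illustration only.
sample_reports = {
    "ks-cf-ka-101-Data.db": "SSTable: ks-cf-ka-101\nRepaired at: 0\n",
    "ks-cf-ka-102-Data.db": "SSTable: ks-cf-ka-102\nRepaired at: 1533686400000\n",
}

print(repaired_sstables(sample_reports))
```

Any sstable this lists would be one that got promoted to repaired and so, per Jeff's note, is not eligible for compaction with the unrepaired ones.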