Hm, this MAY somehow relate to the issue I encountered recently: https://issues.apache.org/jira/browse/CASSANDRA-12730. I also made a proposal to mitigate excessive (unnecessary) flushes during repair streams, but unfortunately nobody has commented on it yet. Maybe there are some opinions on it around here?
2016-11-07 20:15 GMT+00:00 Ben Slater <ben.sla...@instaclustr.com>:

> What I've seen happen a number of times is that you get into a negative feedback
> loop: not enough capacity to keep up with compactions (often triggered by repair
> or compaction hitting a large partition) -> more sstables -> more expensive reads
> -> even less capacity to keep up with compactions -> repeat.
>
> The way we deal with this at Instaclustr is typically to take the node offline to
> let it catch up with compactions. We take it offline by running nodetool
> disablegossip + disablethrift + disablebinary, unthrottle compactions (nodetool
> setcompactionthroughput 0), and then leave it to chug through compactions until
> the backlog gets close to zero, then reverse the settings or restart C* to set
> things back to normal. This typically resolves the issues. If you see it happening
> regularly, your cluster probably needs more processing capacity (or other tuning).
>
> Cheers
> Ben
>
> On Tue, 8 Nov 2016 at 02:38 Eiti Kimura <eiti.kim...@movile.com> wrote:
>
>> Hey guys,
>>
>> Do we have any conclusions about this case? Ezra, did you solve your problem?
>> We are facing a very similar problem here: LeveledCompaction with vnodes, and
>> it looks like a node went into a weird state and started to consume a lot of
>> CPU. The compaction process seems to be stuck and the number of SSTables has
>> increased significantly.
>>
>> Do you have any clue about it?
>>
>> Thanks,
>> Eiti
>>
>> J.P. Eiti Kimura
>> Plataformas
>> +55 19 3518 5500
>> +55 19 98232 2792
>> skype: eitikimura
>>
>> 2016-09-11 18:20 GMT-03:00 Jens Rantil <jens.ran...@tink.se>:
>>
>> I just want to chime in and say that we also had issues keeping up with
>> compaction once (with vnodes/SSD disks), and I also want to recommend keeping
>> track of your open file limit, which might bite you.
>>
>> Cheers,
>> Jens
>>
>> On Friday, August 19, 2016, Mark Rose <markr...@markrose.ca> wrote:
>>
>> Hi Ezra,
>>
>> Are you making frequent changes to your rows (including TTL'ed values), or
>> mostly inserting new ones? If you're only inserting new data, it's probable
>> that size-tiered compaction would work better for you. If you are TTL'ing
>> whole rows, consider date-tiered.
>>
>> If leveled compaction is still the best strategy, one way to catch up with
>> compactions is to have less data per node -- in other words, use more
>> machines. Leveled compaction is CPU expensive. You are CPU bottlenecked
>> currently, or, from the other perspective, you have too much data per node
>> for leveled compaction.
>>
>> At this point, compaction is so far behind that you'll likely be getting high
>> latency if you're reading old rows (since dozens to hundreds of uncompacted
>> sstables will likely need to be checked for matching rows). You may be better
>> off with size-tiered compaction, even if it will mean always reading several
>> sstables per read (higher latency than when leveled can keep up).
>>
>> How much data do you have per node? Do you update/insert to/delete rows? Do
>> you TTL?
>>
>> Cheers,
>> Mark
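For anyone wanting to try Mark's suggestion of switching to size-tiered compaction, a minimal sketch follows. The keyspace/table names are taken from Ezra's tablestats output below and the thresholds are just the STCS defaults, so treat them as placeholders rather than a recommendation:

    # Switch the table from LeveledCompactionStrategy to SizeTieredCompactionStrategy.
    # mykeyspace.mytable and the thresholds are placeholders/defaults for illustration.
    cqlsh -e "ALTER TABLE mykeyspace.mytable WITH compaction = {
                'class': 'SizeTieredCompactionStrategy',
                'min_threshold': 4,
                'max_threshold': 32
              };"

    # Confirm the new compaction settings are in place.
    cqlsh -e "DESCRIBE TABLE mykeyspace.mytable;"

After the switch, the existing leveled sstables lose their level significance and STCS will simply merge them by size buckets over time.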
>> On Wed, Aug 17, 2016 at 2:39 PM, Ezra Stuetzel <ezra.stuet...@riskiq.net> wrote:
>>
>> > I have one node in my cluster (2.2.7, just upgraded from 2.2.6 hoping to fix
>> > the issue) which seems to be stuck in a weird state -- with a large number of
>> > pending compactions and sstables. The node is compacting about 500gb/day, and
>> > the number of pending compactions is going up at about 50/day. It is at about
>> > 2300 pending compactions now. I have tried increasing the number of compaction
>> > threads and the compaction throughput, which doesn't seem to help eliminate
>> > the many pending compactions.
>> >
>> > I have tried running 'nodetool cleanup' and 'nodetool compact'. The latter has
>> > fixed the issue in the past, but most recently I was getting OOM errors,
>> > probably due to the large number of sstables. I upgraded to 2.2.7 and am no
>> > longer getting OOM errors, but it also does not resolve the issue. I do see
>> > this message in the logs:
>> >
>> >> INFO [RMI TCP Connection(611)-10.9.2.218] 2016-08-17 01:50:01,985
>> >> CompactionManager.java:610 - Cannot perform a full major compaction as
>> >> repaired and unrepaired sstables cannot be compacted together. These two set
>> >> of sstables will be compacted separately.
>> >
>> > Below is the 'nodetool tablestats' output comparing a normal and the
>> > problematic node. You can see the problematic node has many, many more
>> > sstables, and they are all in level 1. What is the best way to fix this? Can I
>> > just delete those sstables somehow and then run a repair?
>> >
>> >>> Normal node
>> >>>
>> >>> Keyspace: mykeyspace
>> >>>     Read Count: 0
>> >>>     Read Latency: NaN ms.
>> >>>     Write Count: 31905656
>> >>>     Write Latency: 0.051713177939359714 ms.
>> >>>     Pending Flushes: 0
>> >>>     Table: mytable
>> >>>         SSTable count: 1908
>> >>>         SSTables in each level: [11/4, 20/10, 213/100, 1356/1000, 306, 0, 0, 0, 0]
>> >>>         Space used (live): 301894591442
>> >>>         Space used (total): 301894591442
>> >>>
>> >>> Problematic node
>> >>>
>> >>> Keyspace: mykeyspace
>> >>>     Read Count: 0
>> >>>     Read Latency: NaN ms.
>> >>>     Write Count: 30520190
>> >>>     Write Latency: 0.05171286705620116 ms.
>> >>>     Pending Flushes: 0
>> >>>     Table: mytable
>> >>>         SSTable count: 14105
>> >>>         SSTables in each level: [13039/4, 21/10, 206/100, 831, 0, 0, 0, 0, 0]
>> >>>         Space used (live): 561143255289
>> >>>         Space used (total): 561143255289
>> >
>> > Thanks,
>> >
>> > Ezra
>>
>> --
>> Jens Rantil
>> Backend engineer
>> Tink AB
>>
>> Email: jens.ran...@tink.se
>> Phone: +46 708 84 18 32
>> Web: www.tink.se

--
Benjamin Roth
Prokurist

Jaumo GmbH · www.jaumo.com
Wehrstraße 46 · 73035 Göppingen · Germany
Phone +49 7161 304880-6 · Fax +49 7161 304880-1
AG Ulm · HRB 731058 · Managing Director: Jens Kammerer
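For reference, a rough sketch of the "take the node offline and let it catch up" procedure Ben describes above, run on the affected node. The throughput restored at the end assumes the cassandra.yaml default of 16 MB/s; substitute whatever value your cluster normally runs with:

    # Stop serving clients and gossip so all capacity goes to compaction.
    nodetool disablebinary    # stop native protocol (CQL) clients
    nodetool disablethrift    # stop Thrift clients
    nodetool disablegossip    # node appears down to the rest of the cluster

    # Unthrottle compaction and wait for the backlog to drain.
    nodetool setcompactionthroughput 0
    watch -n 60 nodetool compactionstats   # wait until pending tasks is near zero

    # Reverse the settings (or simply restart Cassandra).
    nodetool setcompactionthroughput 16    # assumed default; use your usual value
    nodetool enablegossip
    nodetool enablethrift
    nodetool enablebinary

Keep an eye on how long the node stays offline: if it exceeds the hint window (max_hint_window_in_ms), you will need to run a repair afterwards.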