Hi Brian,

Happy to know that the problem was (temporarily?) solved.

> We're migrating from i2.xl (32GB ram, Local SSD) to m4.xl (16gb, gp2) so
> we have a mix there, Cassandra JVM set to 10GB

To prevent these unpredictable mixes of hardware, I usually update hardware by adding a new data center, switching clients to it, and then removing the old data center. Both kinds of machines are a good fit in most cases though, and should be able to work together nicely. Given the memory / JVM sizes, using CMS rather than G1GC seems to be a good idea (I think this is the default in Cassandra 2.1).
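In case it is useful as a reference, for 2.1 the heap and collector settings live in cassandra-env.sh. The values below are only an illustration for a 16 GB box, not a recommendation, so adjust them to your hardware:

    MAX_HEAP_SIZE="10G"
    HEAP_NEWSIZE="400M"   # rule of thumb for CMS is roughly 100 MB per core

CMS (ParNew + ConcurrentMarkSweep) is what 2.1 ships with by default, so unless the JVM_OPTS were changed in that file you should already be running it.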
> When I did a truncate, Cassandra did create a snapshot which I'm hoping to
> copy over to a developer's machine and find the offending row(s). If it is
> just huge rows, that's probably more of an application leak.

Sounds like a wise approach. If you can find out what happened, you might be able to prevent it from happening again.

> Is 'Compacted partition maximum bytes:' from cfstats the right thing to
> look at?

Yes, definitely. It is commonly said that a partition over 100 MB starts being too much. In any case you don't want partitions in the GB range, as this would create a lot of pressure (GC, compactions, anti-entropy systems...).

> I'll look at cfstats in more detail, we've got some charting from JVM
> metrics yeah.

If you have charts of the GC, you can use as a baseline that spending somewhere around 5 - 15 % of the time in GC pauses is still acceptable (up to roughly 150 ms of GC per second). This range is quite arbitrary (and nothing "bad" happens when you reach 16%), but it should give you an idea.
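In case it helps, this is roughly how I would check both things from a node running 2.1. Keyspace / table names and the log path below are placeholders, adapt them to your setup:

    # largest compacted partition for a given table
    nodetool cfstats my_keyspace.my_table | grep 'Compacted partition maximum bytes'

    # partition size / cell count percentiles for the same table
    nodetool cfhistograms my_keyspace my_table

    # GC pauses long enough for Cassandra to log them
    grep GCInspector /var/log/cassandra/system.log | tail -20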
Hope you'll find out what's wrong there,

C*heers,
-----------------------
Alain Rodriguez - @arodream - al...@thelastpickle.com
France / Spain

The Last Pickle - Apache Cassandra Consulting
http://www.thelastpickle.com

2017-10-06 18:02 GMT+01:00 Brian Spindler <brian.spind...@gmail.com>:

> Hi Alain, thanks for getting back to me. I will read through those
> articles.
>
> The truncate did solve the problem.
> I am using Cassandra 2.1.15
> I'll look at cfstats in more detail, we've got some charting from JVM
> metrics yeah.
> We're migrating from i2.xl (32GB ram, Local SSD) to m4.xl (16gb, gp2) so
> we have a mix there, Cassandra JVM set to 10GB
>
> When I did a truncate, Cassandra did create a snapshot which I'm hoping to
> copy over to a developer's machine and find the offending row(s). If it is
> just huge rows, that's probably more of an application leak.
>
> Is 'Compacted partition maximum bytes:' from cfstats the right thing to
> look at?
>
> Thanks again,
> -B
>
> On Fri, Oct 6, 2017 at 10:40 AM Alain RODRIGUEZ <arodr...@gmail.com>
> wrote:
>
>> Hello Brian.
>>
>> Sorry to hear, looks like a lot of troubles.
>>
>>> I think we should review this column family design so it doesn't generate
>>> so many tombstones? Could that be the cause?
>>
>> It could be indeed, did truncating solved the issue?
>>
>> There so nicer approaches you can try to handle tombstones correctly
>> depending on your use case. I wrote a post and presented a talk about this
>> last year, I hope you'll find what you are looking for.
>>
>> http://thelastpickle.com/blog/2016/07/27/about-deletes-and-tombstones.html
>> https://www.youtube.com/watch?v=lReTEcnzl7Y
>>
>> What else would you recommend?
>>
>> Well we don't have much information to guess. But I will try to give you
>> relevant clues with what you gave us so far:
>>
>>> that one column family had some large/corrupt data and causing OOM's
>>
>> Are you using Cassandra 3.0.x (x < 14)? You might be facing a bug in
>> Cassandra corrupting data after schema changes (
>> https://issues.apache.org/jira/browse/CASSANDRA-13004).
>>
>> You can check large partition using 'nodetool cfstats' or using
>> monitoring and corresponding metric (per table / columnfamily)
>>
>> Other than that what is the memory available, the heap size and GC type
>> and options in use. Do you see some GC pauses in the logs or do you control
>> this value through a chart using JVM metrics?
>>
>> C*heers,
>>
>> -----------------------
>> Alain Rodriguez - @arodream - al...@thelastpickle.com
>> France / Spain
>>
>> The Last Pickle - Apache Cassandra Consulting
>> http://www.thelastpickle.com
>>
>> 2017-10-06 14:48 GMT+01:00 Brian Spindler <brian.spind...@gmail.com>:
>>
>>> Sorry about that. We eventually found that one column family had some
>>> large/corrupt data and causing OOM's
>>>
>>> Luckily it was a pretty ephemeral data set and we were able to just
>>> truncate it. However, it was a guess based on some log messages about
>>> reading a large number of tombstones on that column families. I think we
>>> should review this column family design so it doesn't generate so many
>>> tombstones? Could that be the cause? What else would you recommend?
>>>
>>> Thank you in advance.
>>>
>>> On Fri, Oct 6, 2017 at 6:33 AM Brian Spindler <brian.spind...@gmail.com>
>>> wrote:
>>>
>>>> Hi guys, our cluster - around 18 nodes - just starting having nodes die
>>>> and when restarting them they are dying with OOM. How can we handle this?
>>>> I've tried adding a couple extra gigs on these machines to help but it's
>>>> not.
>>>>
>>>> Help!
>>>> -B