On Fri, Feb 19, 2010 at 7:40 PM, Santal Li <santal...@gmail.com> wrote:
> I hit almost the same thing as you. When I run some benchmark write tests,
> sometimes one Cassandra node will freeze, and the other nodes will consider
> it shut down, then up again 30+ seconds later. I am using 5 nodes, each
> node with 8G of memory for the Java heap.
>
> From my investigation, it was caused by the GC thread: I started JConsole
> and monitored heap usage, and each time a GC happened, heap usage dropped
> from 6G to 1G. Checking the Cassandra log, I found the freezes happened at
> exactly the same times.
With such a big heap, old-generation GCs can definitely take a while. With
just a 1.5 gig heap, and with reasonably efficient parallel collection (on a
multi-core machine), we had trouble keeping collections below 5 seconds. But
this depends a lot on the survival ratio: the less garbage there is (and the
more live objects), the slower things are. And the relationship is
super-linear, too, so processing 6 gigs (or whatever part of that is
old-generation space) can take a long time. It is certainly worth keeping in
mind that more memory generally means longer GC collection times.

But Jonathan is probably right that this alone would not cause the appearance
of a freeze -- GC blocking processing AND new requests accumulating in the
meantime sounds more plausible. It is still good to consider both parts of the
puzzle: preventing the overload that can turn a bad situation into a
catastrophe, and trying to reduce the impact of GC.

> So I think when using huge memory (>2G), maybe we need to use a different
> GC strategy than the default one provided by the Cassandra launch script.
> Doesn't anyone else hit this situation? Can you please provide some
> guidance?

There are many ways to change GC settings, specifically to try to reduce the
impact of old-gen collections (young-generation ones are less often
problematic, although they can be tuned as well). There is often a trade-off
between the frequency and the impact of GC: to simplify, the less often you
configure it to occur (by increasing the heap, say), the more impact it
usually has when it does occur. Concurrent collectors (like the traditional
CMS) are good for steady state, and can keep old-gen GC from occurring for
hours at a time (by doing incremental, concurrent "partial" collections), but
they can also lead to the GC-from-hell when they must fall back to a full,
stop-the-world collection. There is a ton of information out there on how to
deal with GC settings, but unfortunately it is a bit of a black art and very
dependent on your specific use case.
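For concreteness, here is the kind of change I mean, sketched as additions to
the Cassandra launch script -- assuming it builds up a JVM_OPTS variable the
way the stock script does. The flag names are standard HotSpot options; the
specific numbers are illustrative starting points, not recommendations:

```shell
# Sketch: switch the old generation to the concurrent CMS collector.
# Values below are guesses and must be tuned against your own workload.
JVM_OPTS="$JVM_OPTS -XX:+UseConcMarkSweepGC"   # concurrent old-gen collector
JVM_OPTS="$JVM_OPTS -XX:+UseParNewGC"          # parallel young-gen collector that pairs with CMS
JVM_OPTS="$JVM_OPTS -XX:CMSInitiatingOccupancyFraction=75"  # start CMS before old gen fills up
JVM_OPTS="$JVM_OPTS -XX:+UseCMSInitiatingOccupancyOnly"     # don't let the JVM second-guess that
JVM_OPTS="$JVM_OPTS -XX:SurvivorRatio=8"       # young-gen survivor-space sizing

echo "$JVM_OPTS"
```

The point of CMSInitiatingOccupancyFraction is to start the concurrent cycle
early enough that CMS does not lose the race and fall back to a full
stop-the-world collection.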
There being dozens (more than a hundred, I think) of different switches
actually makes it trickier, since you also need to learn which ones matter,
and in what combinations.

One somewhat counter-intuitive suggestion is to reduce the size of the heap,
at least with respect to caching: mostly try to just keep the live working
set in memory, and don't do caching inside the Java process. Operating
systems are pretty good at caching disk pages, and if the storage engine is
out of process (like native BDB), this can significantly reduce GC work.
In-process caches can be really bad for GC activity, because their contents
are potentially long-lived yet relatively transient (that is, neither mostly
live nor mostly garbage, making the GC optimizer try in vain to compact
things). But once again, this may or may not help, and needs to be
experimented with.

Not sure if the above helps, but I hope it gives at least some ideas,

-+ Tatu +-
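P.S. To make the "smaller heap" suggestion concrete: in launch-script terms
it is just lower -Xms/-Xmx values, leaving the rest of RAM to the OS page
cache. The 4G-of-8G split and the young-gen size here are illustrative
guesses for a box like Santal's, not recommendations:

```shell
# Sketch only: cap the heap well below physical RAM and leave the rest
# to the operating system's page cache. Numbers are illustrative guesses.
HEAP_SIZE="4G"
JVM_OPTS="$JVM_OPTS -Xms$HEAP_SIZE -Xmx$HEAP_SIZE"  # fixed size avoids heap-resize pauses
JVM_OPTS="$JVM_OPTS -Xmn800M"                       # modest young gen keeps minor GCs short

echo "$JVM_OPTS"
```

Setting -Xms equal to -Xmx is the usual server-side choice: the heap never
grows or shrinks, so you pay no resizing pauses and GC behavior is more
predictable.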