Are you tweaking the "nice" priority on Cassandra? (Type "man nice" if you don't know much about it.) Improving Cassandra's nice score certainly becomes important when you have other things running on the server, such as scheduled jobs or people logging in to the server and doing things.
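
In case it helps, here is a rough sketch of how you could check the current nice value before changing anything. nice adjusts CPU scheduling priority. The helper name show_nice is made up for this example, and the pgrep pattern in the comments is an assumption about how your Cassandra JVM was started, so adjust it to match your setup.

```shell
# Print the nice value of a process, given its PID.
# show_nice is a hypothetical helper name, not part of Cassandra or procps.
show_nice() {
    ps -o ni= -p "$1" | tr -d ' '
}

# Usage sketch (assumes the JVM command line contains "CassandraDaemon"):
#   pid=$(pgrep -f CassandraDaemon | head -n 1)
#   show_nice "$pid"
#   sudo renice -n -5 -p "$pid"   # negative values require root
```

A lower nice value means higher CPU priority; the -5 above is only an illustrative number, not a recommendation.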
______________________________
Sent from iPhone

> On 19 Feb 2015, at 5:28 am, Michał Łowicki <mlowi...@gmail.com> wrote:
>
> Hi,
>
> Couple of times a day 2 out of 4 members cluster nodes are killed
>
> root@db4:~# dmesg | grep -i oom
> [4811135.792657] [ pid ] uid tgid total_vm rss cpu oom_adj oom_score_adj name
> [6559049.307293] java invoked oom-killer: gfp_mask=0x201da, order=0, oom_adj=0, oom_score_adj=0
>
> Nodes are using 8GB heap (confirmed with *nodetool info*) and aren't using row cache.
>
> Noticed that couple of times a day used RSS is growing really fast within couple of minutes and I see CPU spikes at the same time -
> https://www.dropbox.com/s/khco2kdp4qdzjit/Screenshot%202015-02-18%2015.10.54.png?dl=0.
>
> Could be related to compaction but after compaction is finished used RSS doesn't shrink. Output from pmap when C* process uses 50GB RAM (out of 64GB) is available on http://paste.ofcode.org/ZjLUA2dYVuKvJHAk9T3Hjb. At the time dump was made heap usage is far below 8GB (~3GB) but total RSS is ~50GB.
>
> Any help will be appreciated.
>
> --
> BR,
> Michał Łowicki