Can you check with nodetool cfstats whether bloom filter off-heap memory usage is very large or ramping up before the node gets killed? You could be hitting CASSANDRA-11344.
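A quick way to watch this on a live node (just a sketch; "mykeyspace.mytable" is a placeholder, substitute your own keyspace and table, and note the command is named tablestats instead of cfstats on newer versions):

    # Bloom filter memory is reported per table; watch whether the
    # off-heap figure keeps growing over time.
    watch -n 60 'nodetool cfstats mykeyspace.mytable | grep -i "bloom filter"'

    # nodetool info summarizes the node's total off-heap memory use.
    nodetool info | grep -i "off heap"

If those numbers track the RSS growth you see before oom_killer fires, that would point at bloom filters rather than the heap.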
2016-03-12 19:43 GMT-03:00 Mohamed Lrhazi <mohamed.lrh...@georgetown.edu>:

> In my case, all nodes seem to be constantly logging messages like these:
>
> DEBUG [GossipStage:1] 2016-03-12 17:41:19,123 FailureDetector.java:456 -
> Ignoring interval time of 2000928319 for /10.212.18.170
>
> What does that mean?
>
> Thanks a lot,
> Mohamed.
>
>
> On Sat, Mar 12, 2016 at 5:39 PM, Mohamed Lrhazi <
> mohamed.lrh...@georgetown.edu> wrote:
>
>> Oh wow, similar behavior with a different version altogether!!
>>
>> On Sat, Mar 12, 2016 at 5:28 PM, ssiv...@gmail.com <ssiv...@gmail.com>
>> wrote:
>>
>>> Hi, I'll duplicate here my email with the same issue:
>>>
>>> "I have 7 nodes of C* v2.2.5 running on CentOS 7 and using jemalloc
>>> for dynamic storage allocation. We use only one keyspace and one table
>>> with the Leveled compaction strategy. I've loaded ~500 GB of data into
>>> the cluster with a replication factor of 3 and waited until compaction
>>> finished. But during compaction each of the C* nodes allocates all the
>>> available memory (~128 GB) and its process just stops. Is this a known
>>> bug?"
>>>
>>> On 03/13/2016 12:56 AM, Mohamed Lrhazi wrote:
>>>
>>> Hello,
>>>
>>> We installed DataStax Community Edition on 8 nodes running RHEL 7. We
>>> inserted some 7 billion rows into a pretty simple table. The inserts
>>> seem to have completed without issues, but ever since, we find that
>>> the nodes reliably run out of RAM after a few hours, without any user
>>> activity at all. No reads or writes are sent at all. What should we
>>> look for to try and identify the root cause?
>>>
>>> [root@avesterra-prod-1 ~]# cat /etc/redhat-release
>>> Red Hat Enterprise Linux Server release 7.2 (Maipo)
>>> [root@avesterra-prod-1 ~]# rpm -qa | grep datastax
>>> datastax-ddc-3.2.1-1.noarch
>>> datastax-ddc-tools-3.2.1-1.noarch
>>> [root@avesterra-prod-1 ~]#
>>>
>>> The nodes had 8 GB RAM, which we doubled twice, and now we are trying
>>> with 40 GB... they still manage to consume it all and cause oom_killer
>>> to kick in.
>>>
>>> Pretty much all the settings are the defaults the installation
>>> created.
>>>
>>> Thanks,
>>> Mohamed.
>>>
>>>
>>> --
>>> Thanks,
>>> Serj