OOME on 2-node cluster with visor running repeatedly, Ignite 1.9

tysli2016 Thu, 04 May 2017 01:58:28 -0700

I got "OutOfMemoryError: Java heap space" on a 2-node cluster while `visor`
was being run repeatedly.


Both server nodes run CentOS 7 inside Oracle VirtualBox VMs with the same
configuration:
- 2 vCPUs
- 3.5GB memory
- Oracle JDK 1.8.0_121

`default-config.xml` was modified to use a non-default multicast group and 1
backup:
    <beans xmlns="http://www.springframework.org/schema/beans"
           xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
           xsi:schemaLocation="
           http://www.springframework.org/schema/beans
           http://www.springframework.org/schema/beans/spring-beans.xsd">
        <bean id="grid.cfg" class="org.apache.ignite.configuration.IgniteConfiguration">
            <property name="discoverySpi">
                <bean class="org.apache.ignite.spi.discovery.tcp.TcpDiscoverySpi">
                    <property name="ipFinder">
                        <bean class="org.apache.ignite.spi.discovery.tcp.ipfinder.multicast.TcpDiscoveryMulticastIpFinder">
                            <property name="multicastGroup" value="228.10.10.158"/>
                        </bean>
                    </property>
                </bean>
            </property>
            <property name="cacheConfiguration">
                <bean class="org.apache.ignite.configuration.CacheConfiguration">
                    <property name="backups" value="1"/>
                </bean>
            </property>
        </bean>
    </beans>


`visor` was run repeatedly on one of the nodes by a shell script:
    #!/bin/bash
    IGNITE_HOME=/root/apache-ignite-fabric-1.9.0-bin
    while true
    do
      ${IGNITE_HOME}/bin/ignitevisorcmd.sh -e="'open -cpath=${IGNITE_HOME}/config/default-config.xml;node'"
    done


The OOME was thrown after this setup had been running for 1 day.
I have put the Ignite log, GC log, and heap dump in `dee657c8.tgz`, which can
be downloaded from
https://drive.google.com/drive/folders/0BwY2dxDlRYhBSFJhS0ZWOVBiNk0?usp=sharing.
`507f0201.tgz` contains the Ignite log and GC log from the other node in the
cluster, for reference.

Running `visor` repeatedly is only meant to reproduce the OOME more quickly;
in production we run `visor` once every 10 minutes to monitor the health of
the cluster.
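
For comparison with the tight loop above, the production-style schedule could
be sketched roughly as below. This is not the actual production script: the
names `VISOR_CMD`, `INTERVAL_SECONDS`, `run_visor_once` and `monitor_loop` are
made up for illustration, and only the `ignitevisorcmd.sh` invocation itself is
taken from the script above.

```shell
#!/bin/bash
# Sketch only: VISOR_CMD, INTERVAL_SECONDS and the function names are hypothetical.
IGNITE_HOME="${IGNITE_HOME:-/root/apache-ignite-fabric-1.9.0-bin}"
INTERVAL_SECONDS="${INTERVAL_SECONDS:-600}"   # 10 minutes between runs

# One visor session: open the default config, print the node list, exit.
# VISOR_CMD can be overridden for a dry run without a cluster.
run_visor_once() {
  "${VISOR_CMD:-${IGNITE_HOME}/bin/ignitevisorcmd.sh}" \
    -e="'open -cpath=${IGNITE_HOME}/config/default-config.xml;node'"
}

# Run the check $1 times, sleeping INTERVAL_SECONDS between consecutive runs.
monitor_loop() {
  ml_runs="$1"
  ml_i=1
  while [ "$ml_i" -le "$ml_runs" ]; do
    run_visor_once
    if [ "$ml_i" -lt "$ml_runs" ]; then
      sleep "$INTERVAL_SECONDS"
    fi
    ml_i=$((ml_i + 1))
  done
}
```

Calling `monitor_loop 144` would then cover roughly one day at the 10-minute
interval, against the much larger number of `visor` runs the reproduction loop
performs in the same time.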

Questions:
1. Is anything wrong with the configuration? Can anything be tuned to avoid
the OOME?
2. Are there any other built-in tools for monitoring the cluster? Showing the
number of server nodes would be good enough.



--
View this message in context: 
http://apache-ignite-users.70518.x6.nabble.com/OOME-on-2-node-cluster-with-visor-running-repeatedly-Ignite-1-9-tp12409.html
Sent from the Apache Ignite Users mailing list archive at Nabble.com.
