Re: OOME on 2-node cluster with visor running repeatedly, Ignite 1.9

2017-05-04 Thread Andrey Novikov
 "nc - Total number of nodes in the grid" counts server + client nodes. I
can't find a metric that counts only server nodes.

I checked the heap dump in MAT and found that
TcpCommunicationSpi#recoveryDescs is very large. Does anyone have an idea
why this happened?



--
View this message in context: 
http://apache-ignite-users.70518.x6.nabble.com/OOME-on-2-node-cluster-with-visor-running-repeatedly-Ignite-1-9-tp12409p12446.html
Sent from the Apache Ignite Users mailing list archive at Nabble.com.


Re: OOME on 2-node cluster with visor running repeatedly, Ignite 1.9

2017-05-04 Thread tysli2016
Thanks Andrey, is there an option to monitor the number of server nodes in
the grid?

I found "nc - Total number of nodes in the grid.", which seems to count
server + client nodes, correct?





Re: OOME on 2-node cluster with visor running repeatedly, Ignite 1.9

2017-05-04 Thread tysli2016
Thanks Evgenii!

When I run
`${IGNITE_HOME}/bin/ignitevisorcmd.sh -e="'open -cpath=${IGNITE_HOME}/config/default-config.xml;node'"`,
it shows "Ignite node stopped OK" at the end. Is that an indication that
visor stopped properly?

We use the visor output to check the number of Ignite servers running. This
check is triggered by a cron job + shell script, so it starts a new visor
each time.
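
For a cron-driven check like this, one precaution is to make sure a new visor
is never started while the previous one is still running, and that a hung run
is killed. The following is only a minimal sketch, assuming `timeout` from
coreutils is installed; the lock path, the time limit and the function name
`run_exclusive` are hypothetical. It does not address the memory growth on the
server nodes, it only keeps visor runs from piling up.

```shell
#!/bin/bash
# Hypothetical wrapper: run a command at most once at a time and kill it
# if it exceeds a time limit. Paths and limits below are assumptions.
LOCK_DIR="${LOCK_DIR:-/tmp/ignite-visor-check.lock}"
CHECK_TIMEOUT="${CHECK_TIMEOUT:-120}"   # seconds

run_exclusive() {
  # mkdir is atomic: it fails if a previous run still holds the lock.
  if ! mkdir "$LOCK_DIR" 2>/dev/null; then
    echo "previous check still running, skipping" >&2
    return 75
  fi
  # timeout exits with 124 if the command had to be killed.
  timeout "$CHECK_TIMEOUT" "$@"
  rc=$?
  rmdir "$LOCK_DIR"
  return "$rc"
}
```

The cron script would then call, for example,
`run_exclusive "${IGNITE_HOME}/bin/ignitevisorcmd.sh" -e="'open -cpath=${IGNITE_HOME}/config/default-config.xml;node'"`.
If the wrapper itself is killed, the lock directory is left behind and has to
be removed by hand.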

How could a shell script use an already started visor?





Re: OOME on 2-node cluster with visor running repeatedly, Ignite 1.9

2017-05-04 Thread Andrey Novikov
Hi tysli2016,

You can run visorcmd in the background, connected to the cluster, and
register an alert. When the alert is triggered, it can call a custom user
script.

More info can be found here:
https://apacheignite-tools.readme.io/docs/alerts-configuration
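
As a rough illustration of the "custom user script" side, a handler could
simply record when it was called and with which arguments. This is a sketch
only: the log path and the function name `log_alert` are hypothetical, and
nothing is assumed about which arguments visor passes; they are logged exactly
as received. How to register the alert itself is described on the page linked
above.

```shell
#!/bin/bash
# Hypothetical alert handler: append one timestamped line containing
# whatever arguments the caller passed. The log path is an assumption.
ALERT_LOG="${ALERT_LOG:-/tmp/ignite-alerts.log}"

log_alert() {
  # date -u +%FT%TZ prints a UTC timestamp such as 2017-05-04T08:57:00Z
  printf '%s alert fired: %s\n' "$(date -u +%FT%TZ)" "$*" >> "$ALERT_LOG"
}

# Record this invocation with the arguments the script was started with.
log_alert "$@"
```

A real handler would replace the log line with whatever action is needed, for
example sending a notification.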





Re: OOME on 2-node cluster with visor running repeatedly, Ignite 1.9

2017-05-04 Thread Evgenii Zhuravlev
Hi,

As far as I can see, you run visor in internal mode, so it creates a node
each time. Are you sure that you stop them properly?

Why do you need to start a new visor each time? Just use an already started
visor.

Evgenii




2017-05-04 11:57 GMT+03:00 tysli2016 :

> Got "OutOfMemoryError: Java heap space" with 2-node cluster with a `visor`
> running repeatedly.
>
> The server nodes are running on CentOS 7 inside Oracle VirtualBox VM with
> the same config:
> - 2 vCPUs
> - 3.5GB memory
> - Oracle JDK 1.8.0_121
>
> `default-config.xml` was modified to use non-default multicast group and 1
> backup:
> <beans xmlns="http://www.springframework.org/schema/beans"
>        xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
>        xsi:schemaLocation="
>        http://www.springframework.org/schema/beans
>        http://www.springframework.org/schema/beans/spring-beans.xsd">
>   <bean class="org.apache.ignite.configuration.IgniteConfiguration">
>     <property name="discoverySpi">
>       <bean class="org.apache.ignite.spi.discovery.tcp.TcpDiscoverySpi">
>         <property name="ipFinder">
>           <bean class="org.apache.ignite.spi.discovery.tcp.ipfinder.multicast.TcpDiscoveryMulticastIpFinder">
>             <property name="multicastGroup" value="228.10.10.158"/>
>           </bean>
>         </property>
>       </bean>
>     </property>
>     <property name="cacheConfiguration">
>       <bean class="org.apache.ignite.configuration.CacheConfiguration">
>         <property name="backups" value="1"/>
>       </bean>
>     </property>
>   </bean>
> </beans>
>
>
> The `visor` was running repeatedly in one of the nodes by a shell script:
> #!/bin/bash
> IGNITE_HOME=/root/apache-ignite-fabric-1.9.0-bin
> while true
> do
>   ${IGNITE_HOME}/bin/ignitevisorcmd.sh -e="'open -cpath=${IGNITE_HOME}/config/default-config.xml;node'"
> done
>
>
> The OOME was thrown after running with the above settings for 1 day.
> I have put the ignite log, gc log, and heap dump in `dee657c8.tgz`, which
> can be downloaded from
> https://drive.google.com/drive/folders/0BwY2dxDlRYhBSFJhS0ZWOVBiNk0?usp=sharing.
> `507f0201.tgz` contains the ignite log and gc log from the other node in
> the cluster, for reference.
>
> Running `visor` repeatedly is just to reproduce the OOME more quickly; in
> production we run `visor` once every 10 minutes to monitor the health of
> the cluster.
>
> Questions:
> 1. Is anything wrong with the configuration? Can anything be tuned to avoid
> the OOME?
> 2. Are there any other built-in tools for monitoring the cluster? Showing
> the number of server nodes would be good enough.


OOME on 2-node cluster with visor running repeatedly, Ignite 1.9

2017-05-04 Thread tysli2016
Got "OutOfMemoryError: Java heap space" with 2-node cluster with a `visor`
running repeatedly.

The server nodes are running on CentOS 7 inside Oracle VirtualBox VM with
the same config:
- 2 vCPUs
- 3.5GB memory
- Oracle JDK 1.8.0_121

`default-config.xml` was modified to use non-default multicast group and 1
backup:
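<beans xmlns="http://www.springframework.org/schema/beans"
       xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
       xsi:schemaLocation="
       http://www.springframework.org/schema/beans
       http://www.springframework.org/schema/beans/spring-beans.xsd">
  <bean class="org.apache.ignite.configuration.IgniteConfiguration">
    <property name="discoverySpi">
      <bean class="org.apache.ignite.spi.discovery.tcp.TcpDiscoverySpi">
        <property name="ipFinder">
          <bean class="org.apache.ignite.spi.discovery.tcp.ipfinder.multicast.TcpDiscoveryMulticastIpFinder">
            <property name="multicastGroup" value="228.10.10.158"/>
          </bean>
        </property>
      </bean>
    </property>
    <property name="cacheConfiguration">
      <bean class="org.apache.ignite.configuration.CacheConfiguration">
        <property name="backups" value="1"/>
      </bean>
    </property>
  </bean>
</beans>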

The `visor` was running repeatedly in one of the nodes by a shell script:
#!/bin/bash
IGNITE_HOME=/root/apache-ignite-fabric-1.9.0-bin
while true
do
  ${IGNITE_HOME}/bin/ignitevisorcmd.sh -e="'open -cpath=${IGNITE_HOME}/config/default-config.xml;node'"
done


The OOME was thrown after running with the above settings for 1 day.
I have put the ignite log, gc log, and heap dump in `dee657c8.tgz`, which
can be downloaded from
https://drive.google.com/drive/folders/0BwY2dxDlRYhBSFJhS0ZWOVBiNk0?usp=sharing.
`507f0201.tgz` contains the ignite log and gc log from the other node in
the cluster, for reference.

Running `visor` repeatedly is just to reproduce the OOME more quickly; in
production we run `visor` once every 10 minutes to monitor the health of
the cluster.

Questions:
1. Is anything wrong with the configuration? Can anything be tuned to avoid
the OOME?
2. Are there any other built-in tools for monitoring the cluster? Showing
the number of server nodes would be good enough.


