Re: Ignite Node failure - Node out of topology (SEGMENTED)

2020-04-14 Thread VeenaMithare
Hi Evgenii, Thank you for the reply and suggestion. regards, Veena. -- Sent from: http://apache-ignite-users.70518.x6.nabble.com/

Re: Ignite Node failure - Node out of topology (SEGMENTED)

2020-04-14 Thread Evgenii Zhuravlev
Hi, A segmentation plugin won't help with the issue itself. A long GC pause means that the node is unresponsive for that whole time. If a GC pause lasts longer than 10 seconds (the default), the node will be dropped from the cluster. If you have long GC pauses, probably your load is too big for

RE: Ignite Node failure - Node out of topology (SEGMENTED)

2020-04-14 Thread VeenaMithare
Hi Dmitry, Would having a segmentation plugin help to resolve segmentation due to GC pauses? Or is the best resolution to fix the long GC pauses themselves, so that they stay within the failure detection timeout? regards, Veena. -- Sent from:

Re: Ignite Node failure - Node out of topology (SEGMENTED)

2018-08-27 Thread luqmanahmad
See [1] for a free network segmentation plugin. [1] https://github.com/luqmanahmad/ignite-plugins

RE: Ignite Node failure - Node out of topology (SEGMENTED)

2018-06-27 Thread dkarachentsev
Naresh, GC logs show not only GC pauses but system pauses as well. Try these parameters: -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -XX:+PrintGCDateStamps -XX:+PrintGCApplicationStoppedTime Thanks! -Dmitry
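
As a complement to the logging flags above, stop-the-world pauses of any origin (GC or otherwise) can also be observed from inside the process: a thread sleeps for a short, fixed interval and measures how much longer the sleep actually took. The following is only a minimal sketch of that idea; the class name JvmPauseDetector and its methods are hypothetical and are not part of Ignite or of any post in this thread.

```java
import java.util.concurrent.TimeUnit;

/** Hypothetical helper (not an Ignite API): estimates JVM/system pauses from sleep overshoot. */
final class JvmPauseDetector {

    /** Returns how many milliseconds longer than requested a sleep took; never negative. */
    static long overshootMillis(long requestedSleepMs, long elapsedNanos) {
        long elapsedMs = TimeUnit.NANOSECONDS.toMillis(elapsedNanos);
        return Math.max(0L, elapsedMs - requestedSleepMs);
    }

    /** Sleeps sleepMs for the given number of rounds and returns the worst overshoot seen. */
    static long worstPauseMillis(long sleepMs, int iterations) throws InterruptedException {
        long worst = 0L;
        for (int i = 0; i < iterations; i++) {
            long start = System.nanoTime();
            Thread.sleep(sleepMs);
            long elapsed = System.nanoTime() - start;
            worst = Math.max(worst, overshootMillis(sleepMs, elapsed));
        }
        return worst;
    }

    public static void main(String[] args) throws InterruptedException {
        // Roughly 5 seconds of sampling at 50 ms granularity.
        long worst = worstPauseMillis(50, 100);
        System.out.println("Worst observed pause: " + worst + " ms");
    }
}
```

A worst-case overshoot that approaches the configured failureDetectionTimeout would suggest that the process was frozen long enough to be dropped from the topology, even when the GC logs look clean.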

RE: Ignite Node failure - Node out of topology (SEGMENTED)

2018-06-26 Thread naresh.goty
Thanks for the recommendation, but we already identified and addressed the issues with GC pauses in JVM, and now we could not find any long GC activity during the time of node failure due to network segmentation. (please find the attached screenshot of GC activity from dynatrace agent). From the

RE: Ignite Node failure - Node out of topology (SEGMENTED)

2018-06-26 Thread dkarachentsev
Hi Naresh, Actually, any JVM process hang could lead to segmentation. If a node is unresponsive for longer than failureDetectionTimeout, it will be kicked out of the cluster to prevent grid-wide performance degradation. It works as follows. Let's say we have 3 nodes in a

RE: Ignite Node failure - Node out of topology (SEGMENTED)

2018-06-25 Thread naresh.goty
Hi Dmitry, We are again seeing a segmentation failure in one of the nodes of our prod env. This time we did not run jmap, but the node still failed. -> CPU, memory utilization and network are in an optimal state. We observed that there are page faults in memory at the time of the segmentation failure,

RE: Ignite Node failure - Node out of topology (SEGMENTED)

2018-06-18 Thread naresh.goty
Thanks Dmitry

RE: Ignite Node failure - Node out of topology (SEGMENTED)

2018-06-13 Thread dkarachentsev
Hi Naresh, The recommendation is the same: increase failureDetectionTimeout until nodes stop segmenting, or use gdb (or remove the "live" option from the jmap command to skip the full GC). Thanks! -Dmitry
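
For an embedded node, an in-process analogue of dropping the "live" option from jmap is the JDK's HotSpotDiagnosticMXBean, whose dumpHeap method takes a live flag. The sketch below is an illustration only, assuming a HotSpot-based JVM; the class HeapDumper and the file name node.hprof are hypothetical, and this approach is not something proposed in the thread. Writing the dump is still likely to pause the JVM for a while, so it does not replace raising failureDetectionTimeout.

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import java.io.IOException;
import java.lang.management.ManagementFactory;
import java.nio.file.Files;
import java.nio.file.Path;

/** Hypothetical helper (not an Ignite API): writes a heap dump from inside the JVM. */
final class HeapDumper {

    /** Dumps all objects, reachable or not, to the given file, which must not exist yet. */
    static Path dumpAllObjects(Path target) throws IOException {
        HotSpotDiagnosticMXBean bean =
            ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
        // live=false asks for every object rather than only reachable ones,
        // which corresponds to running jmap -dump without the "live" option.
        bean.dumpHeap(target.toAbsolutePath().toString(), false);
        return target;
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("heapdump");
        Path file = dumpAllObjects(dir.resolve("node.hprof"));
        System.out.println("Wrote " + Files.size(file) + " bytes to " + file);
    }
}
```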

RE: Ignite Node failure - Node out of topology (SEGMENTED)

2018-06-13 Thread naresh.goty
Thanks, Stan. We have enabled actionable alerts to generate dumps when memory utilization reaches a certain threshold on all cache nodes. Whenever the alert is triggered, the cache node gets segmented. So, essentially, we cannot take dumps on a live node. Even increasing socket timeout may not

RE: Ignite Node failure - Node out of topology (SEGMENTED)

2018-06-11 Thread Stanislav Lukyanov
Node failure - Node out of topology (SEGMENTED) Hi All, We found that when the jmap utility is triggered to generate heap dumps in the application node, a NODE_SEGMENTATION event is fired from the node. Can someone please let us know how to safely take heap dumps in a live node with ignite cache running

Re: Ignite Node failure - Node out of topology (SEGMENTED)

2018-06-11 Thread naresh.goty
Hi All, We found that when the jmap utility is triggered to generate heap dumps in the application node, a NODE_SEGMENTATION event is fired from the node. Can someone please let us know how to safely take heap dumps in a live node with ignite cache running in embedded node without crashing the node due

RE: Ignite Node failure - Node out of topology (SEGMENTED)

2018-06-11 Thread Stanislav Lukyanov
and getting any CPU time. Stan From: naresh.goty Sent: June 9, 2018, 18:16 To: user@ignite.apache.org Subject: Re: Ignite Node failure - Node out of topology (SEGMENTED) We are still seeing the NODE SEGMENTATION issue happening to one of the nodes in our production even after JVM option

Re: Ignite Node failure - Node out of topology (SEGMENTED)

2018-06-09 Thread naresh.goty
We are still seeing the NODE SEGMENTATION issue happening to one of the nodes in our production environment even after the JVM option is enabled (-Djava.net.preferIPv4Stack=true). We don't see any activity reported in the logs for a period of ~30 min after the node failed. The logs below are from the failed node, and

Re: Ignite Node failure - Node out of topology (SEGMENTED)

2018-06-07 Thread Andrey Mashenkov
Hi, Seems, there is a bug with IPv6 usage [1]. This has to be investigated. Also, there is a discussion [2]. [1] https://issues.apache.org/jira/browse/IGNITE-6503 [2] http://apache-ignite-developers.2346864.n4.nabble.com/Issues-if-Djava-net-preferIPv4Stack-true-is-not-set-td22372.html On Wed,

Re: Ignite Node failure - Node out of topology (SEGMENTED)

2018-06-06 Thread naresh.goty
Thanks. We have enabled the IPv4 JVM option in our non-production environment and found no segmentation issues reported. Our main concern is that the issue is happening only in production, and we are very much interested in finding the real root cause (we can rule out - GC pauses, CPU spikes, network

Re: Ignite Node failure - Node out of topology (SEGMENTED)

2018-04-27 Thread Andrey Mashenkov
Hi, Try to disable IPv6 on all nodes via JVM option -Djava.net.preferIPv4Stack=true [1] as using both IPv4 and IPv6 can cause grid segmentation. [1] https://stackoverflow.com/questions/11850655/how-can-i-disable-ipv6-stack-use-for-ipv4-ips-on-jre On Fri, Apr 27, 2018 at 8:52 AM, naresh.goty
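
Because the option has to be set on every node, it can help to verify at startup that it actually reached the JVM. The check below is a minimal sketch, assuming it is called before the node is started; the class Ipv4StackCheck is hypothetical and not part of Ignite.

```java
/** Hypothetical startup check (not an Ignite API) for -Djava.net.preferIPv4Stack=true. */
final class Ipv4StackCheck {

    /** Returns true only when the property value is "true" (case-insensitive); null means unset. */
    static boolean prefersIpv4(String propertyValue) {
        return Boolean.parseBoolean(propertyValue);
    }

    public static void main(String[] args) {
        String value = System.getProperty("java.net.preferIPv4Stack");
        if (prefersIpv4(value)) {
            System.out.println("java.net.preferIPv4Stack is enabled.");
        } else {
            System.err.println("WARNING: java.net.preferIPv4Stack is not set to true; "
                + "start the JVM with -Djava.net.preferIPv4Stack=true.");
        }
    }
}
```

Run it with `java -Djava.net.preferIPv4Stack=true Ipv4StackCheck.java` (Java 11 or later) to see the positive branch.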