Hello Igniters,

We are seeing frequent disconnections between Ignite instances. We have an IP-based cluster with the following configuration:
Ignite version - 1.7.0

<property name="discoverySpi">
    <bean class="org.apache.ignite.spi.discovery.tcp.TcpDiscoverySpi">
        <property name="ipFinder">
            <bean class="org.apache.ignite.spi.discovery.tcp.ipfinder.vm.TcpDiscoveryVmIpFinder">
                <property name="addresses">
                    <list>
                        <value>HOST_IP1:47500..47509</value>
                        <value>HOST_IP2:47500..47509</value>
                    </list>
                </property>
            </bean>
        </property>
    </bean>
</property>

See the error log -

[09:53:46,139][WARN ][tcp-disco-msg-worker-#2%WebGrid%][TcpDiscoverySpi] Local node has detected failed nodes and started cluster-wide procedure. To speed up failure detection please see 'Failure Detection' section under javadoc for 'TcpDiscoverySpi'
[09:54:56,060][WARN ][exchange-worker-#54%WebGrid%][GridCachePartitionExchangeManager] Failed to wait for partition map exchange [topVer=AffinityTopologyVersion [topVer=22132, minorTopVer=0], node=d3719fe1-84cf-4fe5-91dd-2d10abb1b3d2]. Dumping pending objects that might be the cause:
[09:54:56,060][WARN ][exchange-worker-#54%WebGrid%][GridCachePartitionExchangeManager] Ready affinity version: AffinityTopologyVersion [topVer=22131, minorTopVer=0]
[09:54:56,062][WARN ][exchange-worker-#54%WebGrid%][GridCachePartitionExchangeManager] Last exchange future: GridDhtPartitionsExchangeFuture [dummy=false, forcePreload=false, reassign=false, discoEvt=DiscoveryEvent [evtNode=TcpDiscoveryNode [id=d2ffb86c-5305-4cb3-96a0-874be73d610a, addrs=[0:0:0:0:0:0:0:1%lo, 127.0.0.1, host_ip2], sockAddrs=[host2/host_ip2:47501, 0:0:0:0:0:0:0:1%lo:47501, /127.0.0.1:47501], discPort=47501, order=22131, intOrder=11068, lastExchangeTime=1499352867440, loc=false, ver=1.7.0#20160801-sha1:383273e3, isClient=false], topVer=22132, nodeId8=d3719fe1, msg=Node left: TcpDiscoveryNode [id=d2ffb86c-5305-4cb3-96a0-874be73d610a, addrs=[0:0:0:0:0:0:0:1%lo, 127.0.0.1, host_ip2], sockAddrs=[host2/host_ip2:47501, 0:0:0:0:0:0:0:1%lo:47501, /127.0.0.1:47501], discPort=47501, order=22131, intOrder=11068, lastExchangeTime=1499352867440, loc=false, ver=1.7.0#20160801-sha1:383273e3, isClient=false], type=NODE_LEFT, tstamp=1499352886042], crd=TcpDiscoveryNode [id=64ce302c-9743-47bc-bf27-641015a37b81, addrs=[127.0.0.1, host_ip1], sockAddrs=[/127.0.0.1:47500, host1/host_ip1:47500], discPort=47500, order=1, intOrder=1, lastExchangeTime=1498849915139, loc=false, ver=1.7.0#20160801-sha1:383273e3, isClient=false], exchId=GridDhtPartitionExchangeId [topVer=AffinityTopologyVersion [topVer=22132, minorTopVer=0], nodeId=d2ffb86c, evt=NODE_LEFT], added=true, initFut=GridFutureAdapter [resFlag=2, res=true, startTime=1499352886042, endTime=1499352886052, ignoreInterrupts=false, state=DONE],init=true, topSnapshot=null, lastVer=null, partReleaseFut=GridCompoundFuture [rdc=null, initFlag=1, lsnrCalls=3, done=true, cancelled=false, err=null, futs=[true, true, true]], affChangeMsg=null, skipPreload=false, clientOnlyExchange=false, initTs=1499352886042, centralizedAff=true, evtLatch=0, remaining=[64ce302c-9743-47bc-bf27-641015a37b81], srvNodes=[TcpDiscoveryNode [id=64ce302c-9743-47bc-bf27-641015a37b81, addrs=[127.0.0.1, host_ip1], sockAddrs=[/127.0.0.1:47500, host1/host_ip1:47500], discPort=47500, order=1, intOrder=1, lastExchangeTime=1498849915139, loc=false, ver=1.7.0#20160801-sha1:383273e3, isClient=false], TcpDiscoveryNode [id=d3719fe1-84cf-4fe5-91dd-2d10abb1b3d2, addrs=[127.0.0.1, host_ip2], sockAddrs=[/127.0.0.1:47500, host2/host_ip2:47500], discPort=47500, order=4, intOrder=3, lastExchangeTime=1499352895809, loc=true, ver=1.7.0#20160801-sha1:383273e3, isClient=false]], super=GridFutureAdapter [resFlag=0, res=null, startTime=1499352886042, endTime=0, ignoreInterrupts=false, state=INIT]]
[10:08:37,232][WARN ][exchange-worker-#54%WebGrid%][GridCachePartitionExchangeManager] Failed to wait for partition map exchange [topVer=AffinityTopologyVersion [topVer=22134, minorTopVer=0], node=d3719fe1-84cf-4fe5-91dd-2d10abb1b3d2].
Dumping pending objects that might be the cause:
[10:08:47,287][WARN ][exchange-worker-#54%WebGrid%][GridCachePartitionExchangeManager] Failed to wait for partition map exchange [topVer=AffinityTopologyVersion [topVer=22134, minorTopVer=0], node=d3719fe1-84cf-4fe5-91dd-2d10abb1b3d2]. Dumping pending objects that might be the cause:
class org.apache.ignite.IgniteException: Failed to wait for affinity ready future for topology version: AffinityTopologyVersion [topVer=22134, minorTopVer=0]
    at org.apache.ignite.internal.processors.affinity.GridAffinityAssignmentCache.awaitTopologyVersion(GridAffinityAssignmentCache.java:526)
    at org.apache.ignite.internal.processors.affinity.GridAffinityAssignmentCache.cachedAffinity(GridAffinityAssignmentCache.java:434)
    at org.apache.ignite.internal.processors.affinity.GridAffinityAssignmentCache.assignments(GridAffinityAssignmentCache.java:331)
    at org.apache.ignite.internal.processors.cache.GridCacheAffinityManager.assignments(GridCacheAffinityManager.java:165)
    at org.apache.ignite.internal.processors.cache.distributed.dht.GridDhtPartitionTopologyImpl.initPartitions0(GridDhtPartitionTopologyImpl.java:373)
    at org.apache.ignite.internal.processors.cache.distributed.dht.GridDhtPartitionTopologyImpl.initPartitions(GridDhtPartitionTopologyImpl.java:340)
    at org.apache.ignite.internal.processors.cache.distributed.dht.preloader.GridDhtPartitionsExchangeFuture.onDone(GridDhtPartitionsExchangeFuture.java:1057)
    at org.apache.ignite.internal.processors.cache.distributed.dht.preloader.GridDhtPartitionsExchangeFuture.onDone(GridDhtPartitionsExchangeFuture.java:86)
    at org.apache.ignite.internal.util.future.GridFutureAdapter.onDone(GridFutureAdapter.java:324)
    at org.apache.ignite.internal.processors.cache.distributed.dht.preloader.GridDhtPartitionsExchangeFuture.processMessage(GridDhtPartitionsExchangeFuture.java:1400)
    at org.apache.ignite.internal.processors.cache.distributed.dht.preloader.GridDhtPartitionsExchangeFuture.access$400(GridDhtPartitionsExchangeFuture.java:86)
    at org.apache.ignite.internal.processors.cache.distributed.dht.preloader.GridDhtPartitionsExchangeFuture$4.apply(GridDhtPartitionsExchangeFuture.java:1369)
    at org.apache.ignite.internal.processors.cache.distributed.dht.preloader.GridDhtPartitionsExchangeFuture$4.apply(GridDhtPartitionsExchangeFuture.java:1357)
    at org.apache.ignite.internal.util.future.GridFutureAdapter.notifyListener(GridFutureAdapter.java:263)
    at org.apache.ignite.internal.util.future.GridFutureAdapter.listen(GridFutureAdapter.java:226)
    at org.apache.ignite.internal.processors.cache.distributed.dht.preloader.GridDhtPartitionsExchangeFuture.onReceive(GridDhtPartitionsExchangeFuture.java:1357)
    at org.apache.ignite.internal.processors.cache.GridCachePartitionExchangeManager.processFullPartitionUpdate(GridCachePartitionExchangeManager.java:1030)
    at org.apache.ignite.internal.processors.cache.GridCachePartitionExchangeManager.access$1200(GridCachePartitionExchangeManager.java:112)
    at org.apache.ignite.internal.processors.cache.GridCachePartitionExchangeManager$3.onMessage(GridCachePartitionExchangeManager.java:316)
    at org.apache.ignite.internal.processors.cache.GridCachePartitionExchangeManager$3.onMessage(GridCachePartitionExchangeManager.java:314)
    at org.apache.ignite.internal.processors.cache.GridCachePartitionExchangeManager$MessageHandler.apply(GridCachePartitionExchangeManager.java:1807)
    at org.apache.ignite.internal.processors.cache.GridCachePartitionExchangeManager$MessageHandler.apply(GridCachePartitionExchangeManager.java:1789)
    at org.apache.ignite.internal.processors.cache.GridCacheIoManager.processMessage(GridCacheIoManager.java:748)
    at org.apache.ignite.internal.processors.cache.GridCacheIoManager.onMessage0(GridCacheIoManager.java:353)
    at org.apache.ignite.internal.processors.cache.GridCacheIoManager.handleMessage(GridCacheIoManager.java:277)
    at org.apache.ignite.internal.processors.cache.GridCacheIoManager.access$000(GridCacheIoManager.java:88)
    at org.apache.ignite.internal.processors.cache.GridCacheIoManager$1.onMessage(GridCacheIoManager.java:231)
    at org.apache.ignite.internal.managers.communication.GridIoManager.invokeListener(GridIoManager.java:1238)
    at org.apache.ignite.internal.managers.communication.GridIoManager.processRegularMessage0(GridIoManager.java:866)
    at org.apache.ignite.internal.managers.communication.GridIoManager.access$1700(GridIoManager.java:106)
    at org.apache.ignite.internal.managers.communication.GridIoManager$5.run(GridIoManager.java:829)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
Caused by: class org.apache.ignite.IgniteCheckedException: Failed to wait for topology update, cache (or node) is stopping.
    at org.apache.ignite.internal.processors.cache.GridCacheAffinityManager.cancelFutures(GridCacheAffinityManager.java:92)
    at org.apache.ignite.internal.processors.cache.GridCacheProcessor.onKernalStop(GridCacheProcessor.java:904)
    at org.apache.ignite.internal.IgniteKernal.stop0(IgniteKernal.java:1914)
    at org.apache.ignite.internal.IgniteKernal.stop(IgniteKernal.java:1860)
    at org.apache.ignite.internal.IgnitionEx$IgniteNamedInstance.stop0(IgnitionEx.java:2266)
    at org.apache.ignite.internal.IgnitionEx$IgniteNamedInstance.stop(IgnitionEx.java:2229)
    at org.apache.ignite.internal.IgnitionEx.stop(IgnitionEx.java:323)
    at org.apache.ignite.Ignition.stop(Ignition.java:224)
    at org.apache.ignite.internal.managers.discovery.GridDiscoveryManager$8.run(GridDiscoveryManager.java:1946)
    ... 1 more

Can anyone advise what configuration tuning is required?
I have also noticed that CPU usage and JVM memory usage rise gradually over several days on the Ignite servers.

Thanks for all your help.

Rishi

--
View this message in context: http://apache-ignite-users.70518.x6.nabble.com/frequet-disconnection-in-ignite-cluster-tp14411.html
Sent from the Apache Ignite Users mailing list archive at Nabble.com.
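
Note on the first WARN line: the 'Failure Detection' section it mentions in the TcpDiscoverySpi javadoc describes the failureDetectionTimeout property of IgniteConfiguration (default 10000 ms). The fragment below is only a sketch of where that property would sit relative to the discoverySpi block quoted above. The enclosing IgniteConfiguration bean is an assumption, since the original snippet does not show it, and 30000 is an illustrative value that has not been tested on this cluster.

```xml
<!-- Assumed enclosing bean; the post only shows the discoverySpi property. -->
<bean class="org.apache.ignite.configuration.IgniteConfiguration">
    <!-- Illustrative value only (milliseconds); the default is 10000. -->
    <property name="failureDetectionTimeout" value="30000"/>

    <property name="discoverySpi">
        <bean class="org.apache.ignite.spi.discovery.tcp.TcpDiscoverySpi">
            <property name="ipFinder">
                <bean class="org.apache.ignite.spi.discovery.tcp.ipfinder.vm.TcpDiscoveryVmIpFinder">
                    <property name="addresses">
                        <list>
                            <value>HOST_IP1:47500..47509</value>
                            <value>HOST_IP2:47500..47509</value>
                        </list>
                    </property>
                </bean>
            </property>
        </bean>
    </property>
</bean>
```

According to the same javadoc section, this timeout is ignored if socketTimeout, ackTimeout, maxAckTimeout, or reconnectCount is set explicitly on TcpDiscoverySpi. Whether the timeout is actually related to the disconnections above is not established by the log; long GC pauses, given the rising JVM memory, would be another thing to rule out.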