Hello! In our project we are currently using Apache Ignite 2.8.1 with ZooKeeper discovery. Over the last couple of days, some of our Ignite server nodes have shut down unexpectedly.
Please find the logs below. We have two questions:

1) Why can such long JVM/GC pauses occur, although the preceding metrics in the log do not indicate that, in my opinion?
2) We have the following timeouts set for the server nodes. Which of them influences how a node is handled after such a long GC pause, so that a restart of the node can be avoided?

Thanks in advance for your help!

Configs:

<bean class="org.apache.ignite.configuration.IgniteConfiguration">
    <property name="peerClassLoadingEnabled" value="true" />
    <property name="failureDetectionTimeout" value="600000" />
    <property name="systemWorkerBlockedTimeout" value="600000" />
    <property name="discoverySpi">
        <bean class="org.apache.ignite.spi.discovery.zk.ZookeeperDiscoverySpi">
            <property name="zkConnectionString" value="${ZOOKEEPER_CONNECT}"/>
            <property name="sessionTimeout" value="30000"/>
            <property name="zkRootPath" value="/apacheIgnite"/>
            <property name="joinTimeout" value="10000"/>
        </bean>
    </property>
    <property name="communicationSpi">
        <bean class="org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi">
            <property name="socketWriteTimeout" value="30000" />
        </bean>
    </property>
    ....
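To make question 2 more concrete, here is a minimal sketch in plain Java (no Ignite dependency) that pulls the "Possible too long JVM pause" warnings out of the log and lists which of the configured timeouts are shorter than the longest pause. The class and method names (PauseVsTimeouts, longestPauseMs, timeoutsShorterThan) are made up for illustration. The sketch only compares numbers; it does not tell which timeout Ignite or ZooKeeper actually applies when deciding to drop a node, which is exactly what we are asking.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Ignite: compares logged JVM pauses with configured timeouts.
public class PauseVsTimeouts {
    // Matches the warning text printed by the jvm-pause-detector-worker in the logs below.
    private static final Pattern PAUSE =
        Pattern.compile("Possible too long JVM pause: (\\d+) milliseconds");

    // Returns the longest pause (in ms) found in the given log lines, or 0 if none is found.
    static long longestPauseMs(List<String> logLines) {
        long max = 0;
        for (String line : logLines) {
            Matcher m = PAUSE.matcher(line);
            if (m.find())
                max = Math.max(max, Long.parseLong(m.group(1)));
        }
        return max;
    }

    // Returns the names of all timeouts whose value is strictly shorter than the pause.
    static List<String> timeoutsShorterThan(long pauseMs, Map<String, Long> timeouts) {
        List<String> res = new ArrayList<>();
        for (Map.Entry<String, Long> e : timeouts.entrySet()) {
            if (e.getValue() < pauseMs)
                res.add(e.getKey());
        }
        return res;
    }

    public static void main(String[] args) {
        // The two warnings copied from our log.
        List<String> log = List.of(
            "[12:54:43,809][WARNING][jvm-pause-detector-worker][IgniteKernal] Possible too long JVM pause: 1042 milliseconds.",
            "[12:55:06,263][WARNING][jvm-pause-detector-worker][IgniteKernal] Possible too long JVM pause: 22404 milliseconds.");

        // Timeout values copied from our Spring XML config (all in milliseconds).
        Map<String, Long> timeouts = new LinkedHashMap<>();
        timeouts.put("failureDetectionTimeout", 600000L);
        timeouts.put("systemWorkerBlockedTimeout", 600000L);
        timeouts.put("sessionTimeout", 30000L);
        timeouts.put("joinTimeout", 10000L);
        timeouts.put("socketWriteTimeout", 30000L);

        long pause = longestPauseMs(log);
        System.out.println("Longest pause: " + pause + " ms");
        System.out.println("Timeouts shorter than the pause: " + timeoutsShorterThan(pause, timeouts));
    }
}
```

With our values, only joinTimeout (10000 ms) is shorter than the 22404 ms pause, while sessionTimeout and socketWriteTimeout (30000 ms each) are longer. That is why we do not understand why the node dropped out of the topology.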
LOGs:

[12:46:21,142][INFO][grid-timeout-worker-#35][IgniteKernal] FreeList [name=Default_Region##FreeList, buckets=256, dataPages=287347, reusePages=3169711]
[12:47:21,146][INFO][grid-timeout-worker-#35][IgniteKernal]
Metrics for local node (to disable set 'metricsLogFrequency' to 0)
    ^-- Node [id=3f58f4f5, uptime=9 days, 20:56:18.016]
    ^-- H/N/C [hosts=96, nodes=96, CPUs=1082]
    ^-- CPU [cur=-100%, avg=-100%, GC=0%]
    ^-- PageMemory [pages=16626106]
    ^-- Heap [used=20318MB, free=44.88%, comm=36864MB]
    ^-- Off-heap [used=65326MB, free=9.12%, comm=71760MB]
    ^--   sysMemPlc region [used=0MB, free=99.21%, comm=40MB]
    ^--   TxLog region [used=0MB, free=100%, comm=40MB]
    ^--   Default_Region region [used=65325MB, free=8.87%, comm=71680MB]
    ^-- Outbound messages queue [size=0]
    ^-- Public thread pool [active=0, idle=0, qSize=0]
    ^-- System thread pool [active=0, idle=14, qSize=0]
[12:47:21,146][INFO][grid-timeout-worker-#35][IgniteKernal] FreeList [name=Default_Region##FreeList, buckets=256, dataPages=287347, reusePages=3169711]
[12:48:21,154][INFO][grid-timeout-worker-#35][IgniteKernal]
Metrics for local node (to disable set 'metricsLogFrequency' to 0)
    ^-- Node [id=3f58f4f5, uptime=9 days, 20:57:18.025]
    ^-- H/N/C [hosts=96, nodes=96, CPUs=1082]
    ^-- CPU [cur=-100%, avg=-100%, GC=0%]
    ^-- PageMemory [pages=16626106]
    ^-- Heap [used=13057MB, free=64.58%, comm=36864MB]
    ^-- Off-heap [used=65326MB, free=9.12%, comm=71760MB]
    ^--   sysMemPlc region [used=0MB, free=99.21%, comm=40MB]
    ^--   TxLog region [used=0MB, free=100%, comm=40MB]
    ^--   Default_Region region [used=65325MB, free=8.87%, comm=71680MB]
    ^-- Outbound messages queue [size=0]
    ^-- Public thread pool [active=0, idle=0, qSize=0]
    ^-- System thread pool [active=0, idle=14, qSize=0]
[12:48:21,154][INFO][grid-timeout-worker-#35][IgniteKernal] FreeList [name=Default_Region##FreeList, buckets=256, dataPages=287347, reusePages=3169711]
[12:49:21,162][INFO][grid-timeout-worker-#35][IgniteKernal]
Metrics for local node (to disable set 'metricsLogFrequency' to 0)
    ^-- Node [id=3f58f4f5, uptime=9 days, 20:58:18.029]
    ^-- H/N/C [hosts=96, nodes=96, CPUs=1082]
    ^-- CPU [cur=-100%, avg=-100%, GC=0%]
    ^-- PageMemory [pages=16626106]
    ^-- Heap [used=8768MB, free=76.21%, comm=36864MB]
    ^-- Off-heap [used=65326MB, free=9.12%, comm=71760MB]
    ^--   sysMemPlc region [used=0MB, free=99.21%, comm=40MB]
    ^--   TxLog region [used=0MB, free=100%, comm=40MB]
    ^--   Default_Region region [used=65325MB, free=8.87%, comm=71680MB]
    ^-- Outbound messages queue [size=0]
    ^-- Public thread pool [active=0, idle=14, qSize=0]
    ^-- System thread pool [active=0, idle=14, qSize=0]
[12:49:21,162][INFO][grid-timeout-worker-#35][IgniteKernal] FreeList [name=Default_Region##FreeList, buckets=256, dataPages=287347, reusePages=3169711]
[12:50:21,163][INFO][grid-timeout-worker-#35][IgniteKernal]
Metrics for local node (to disable set 'metricsLogFrequency' to 0)
    ^-- Node [id=3f58f4f5, uptime=9 days, 20:59:18.031]
    ^-- H/N/C [hosts=96, nodes=96, CPUs=1082]
    ^-- CPU [cur=-100%, avg=-100%, GC=0.03%]
    ^-- PageMemory [pages=16626106]
    ^-- Heap [used=7632MB, free=79.3%, comm=36864MB]
    ^-- Off-heap [used=65326MB, free=9.12%, comm=71760MB]
    ^--   sysMemPlc region [used=0MB, free=99.21%, comm=40MB]
    ^--   TxLog region [used=0MB, free=100%, comm=40MB]
    ^--   Default_Region region [used=65325MB, free=8.87%, comm=71680MB]
    ^-- Outbound messages queue [size=0]
    ^-- Public thread pool [active=0, idle=0, qSize=0]
    ^-- System thread pool [active=0, idle=14, qSize=0]
[12:50:21,163][INFO][grid-timeout-worker-#35][IgniteKernal] FreeList [name=Default_Region##FreeList, buckets=256, dataPages=287347, reusePages=3169711]
[12:51:21,168][INFO][grid-timeout-worker-#35][IgniteKernal]
Metrics for local node (to disable set 'metricsLogFrequency' to 0)
    ^-- Node [id=3f58f4f5, uptime=9 days, 21:00:18.038]
    ^-- H/N/C [hosts=96, nodes=96, CPUs=1082]
    ^-- CPU [cur=-100%, avg=-100%, GC=0%]
    ^-- PageMemory [pages=16626106]
    ^-- Heap [used=27712MB, free=24.82%, comm=36864MB]
    ^-- Off-heap [used=65326MB, free=9.12%, comm=71760MB]
    ^--   sysMemPlc region [used=0MB, free=99.21%, comm=40MB]
    ^--   TxLog region [used=0MB, free=100%, comm=40MB]
    ^--   Default_Region region [used=65325MB, free=8.87%, comm=71680MB]
    ^-- Outbound messages queue [size=0]
    ^-- Public thread pool [active=0, idle=0, qSize=0]
    ^-- System thread pool [active=0, idle=14, qSize=0]
[12:51:21,168][INFO][grid-timeout-worker-#35][IgniteKernal] FreeList [name=Default_Region##FreeList, buckets=256, dataPages=287347, reusePages=3169711]
[12:52:21,174][INFO][grid-timeout-worker-#35][IgniteKernal]
Metrics for local node (to disable set 'metricsLogFrequency' to 0)
    ^-- Node [id=3f58f4f5, uptime=9 days, 21:01:18.045]
    ^-- H/N/C [hosts=96, nodes=96, CPUs=1082]
    ^-- CPU [cur=-100%, avg=-100%, GC=0%]
    ^-- PageMemory [pages=16626106]
    ^-- Heap [used=27118MB, free=26.44%, comm=36864MB]
    ^-- Off-heap [used=65326MB, free=9.12%, comm=71760MB]
    ^--   sysMemPlc region [used=0MB, free=99.21%, comm=40MB]
    ^--   TxLog region [used=0MB, free=100%, comm=40MB]
    ^--   Default_Region region [used=65325MB, free=8.87%, comm=71680MB]
    ^-- Outbound messages queue [size=0]
    ^-- Public thread pool [active=0, idle=0, qSize=0]
    ^-- System thread pool [active=0, idle=14, qSize=0]
[12:52:21,174][INFO][grid-timeout-worker-#35][IgniteKernal] FreeList [name=Default_Region##FreeList, buckets=256, dataPages=287347, reusePages=3169711]
[12:53:21,183][INFO][grid-timeout-worker-#35][IgniteKernal]
Metrics for local node (to disable set 'metricsLogFrequency' to 0)
    ^-- Node [id=3f58f4f5, uptime=9 days, 21:02:18.048]
    ^-- H/N/C [hosts=96, nodes=96, CPUs=1082]
    ^-- CPU [cur=-100%, avg=-100%, GC=0%]
    ^-- PageMemory [pages=16626106]
    ^-- Heap [used=20510MB, free=44.36%, comm=36864MB]
    ^-- Off-heap [used=65326MB, free=9.12%, comm=71760MB]
    ^--   sysMemPlc region [used=0MB, free=99.21%, comm=40MB]
    ^--   TxLog region [used=0MB, free=100%, comm=40MB]
    ^--   Default_Region region [used=65325MB, free=8.87%, comm=71680MB]
    ^-- Outbound messages queue [size=0]
    ^-- Public thread pool [active=0, idle=0, qSize=0]
    ^-- System thread pool [active=0, idle=14, qSize=0]
[12:53:21,183][INFO][grid-timeout-worker-#35][IgniteKernal] FreeList [name=Default_Region##FreeList, buckets=256, dataPages=287347, reusePages=3169711]
[12:54:21,186][INFO][grid-timeout-worker-#35][IgniteKernal]
Metrics for local node (to disable set 'metricsLogFrequency' to 0)
    ^-- Node [id=3f58f4f5, uptime=9 days, 21:03:18.055]
    ^-- H/N/C [hosts=96, nodes=96, CPUs=1082]
    ^-- CPU [cur=-100%, avg=-100%, GC=0%]
    ^-- PageMemory [pages=16626106]
    ^-- Heap [used=14928MB, free=59.51%, comm=36864MB]
    ^-- Off-heap [used=65326MB, free=9.12%, comm=71760MB]
    ^--   sysMemPlc region [used=0MB, free=99.21%, comm=40MB]
    ^--   TxLog region [used=0MB, free=100%, comm=40MB]
    ^--   Default_Region region [used=65325MB, free=8.87%, comm=71680MB]
    ^-- Outbound messages queue [size=0]
    ^-- Public thread pool [active=0, idle=0, qSize=0]
    ^-- System thread pool [active=0, idle=14, qSize=0]
[12:54:21,186][INFO][grid-timeout-worker-#35][IgniteKernal] FreeList [name=Default_Region##FreeList, buckets=256, dataPages=287347, reusePages=3169711]
[12:54:43,809][WARNING][jvm-pause-detector-worker][IgniteKernal] Possible too long JVM pause: 1042 milliseconds.
[12:55:06,263][WARNING][jvm-pause-detector-worker][IgniteKernal] Possible too long JVM pause: 22404 milliseconds.
[12:55:07,081][INFO][zk-null-EventThread][ZookeeperClient] ZooKeeper client state changed [prevState=Connected, newState=Disconnected]
[12:55:07,631][SEVERE][grid-nio-worker-tcp-comm-1-#37][TcpCommunicationSpi] Failed to process selector key [ses=GridSelectorNioSessionImpl [worker=DirectNioClientWorker [super=AbstractNioClientWorker [idx=1, bytesRcvd=1070545017136, bytesSent=76240610573, bytesRcvd0=864051, bytesSent0=19236, select=true, super=GridWorker [name=grid-nio-worker-tcp-comm-1, igniteInstanceName=null, finished=false, heartbeatTs=1604062506627, hashCode=1206603371, interrupted=false, runner=grid-nio-worker-tcp-comm-1-#37]]], writeBuf=java.nio.DirectByteBuffer[pos=0 lim=32768 cap=32768], readBuf=java.nio.DirectByteBuffer[pos=0 lim=32768 cap=32768], inRecovery=GridNioRecoveryDescriptor [acked=120544, resendCnt=0, rcvCnt=115641, sentCnt=120546, reserved=true, lastAck=115616, nodeLeft=false, node=ZookeeperClusterNode [id=0123dd90-265e-4bbb-a4e8-ec9d33dd0ce5, addrs=[127.0.0.1, 10.251.19.248], order=117, loc=false, client=false], connected=false, connectCnt=198, queueLimit=4096, reserveCnt=253, pairedConnections=false], outRecovery=GridNioRecoveryDescriptor [acked=120544, resendCnt=0, rcvCnt=115641, sentCnt=120546, reserved=true, lastAck=115616, nodeLeft=false, node=ZookeeperClusterNode [id=0123dd90-265e-4bbb-a4e8-ec9d33dd0ce5, addrs=[127.0.0.1, 10.251.19.248], order=117, loc=false, client=false], connected=false, connectCnt=198, queueLimit=4096, reserveCnt=253, pairedConnections=false], closeSocket=true, outboundMessagesQueueSizeMetric=o.a.i.i.processors.metric.impl.LongAdderMetric@69a257d1, super=GridNioSessionImpl [locAddr=/10.251.20.44:40114, rmtAddr=/10.251.19.248:47100, createTime=1604058723468, closeTime=0, bytesSent=17735247, bytesRcvd=1550895977, bytesSent0=19236, bytesRcvd0=864051, sndSchedTime=1604058723468, lastSndTime=1604062506627, lastRcvTime=1604062481469, readsPaused=false, filterChain=FilterChain[filters=[GridNioCodecFilter [parser=o.a.i.i.util.nio.GridDirectParser@3973847c, directMode=true], GridConnectionBytesVerifyFilter], accepted=false, markedForClose=false]]]
java.io.IOException: Connection reset by peer
    at java.base/sun.nio.ch.FileDispatcherImpl.read0(Native Method)
    at java.base/sun.nio.ch.SocketDispatcher.read(Unknown Source)
    at java.base/sun.nio.ch.IOUtil.readIntoNativeBuffer(Unknown Source)
    at java.base/sun.nio.ch.IOUtil.read(Unknown Source)
    at java.base/sun.nio.ch.IOUtil.read(Unknown Source)
    at java.base/sun.nio.ch.SocketChannelImpl.read(Unknown Source)
    at org.apache.ignite.internal.util.nio.GridNioServer$DirectNioClientWorker.processRead(GridNioServer.java:1324)
    at org.apache.ignite.internal.util.nio.GridNioServer$AbstractNioClientWorker.processSelectedKeysOptimized(GridNioServer.java:2449)
    at org.apache.ignite.internal.util.nio.GridNioServer$AbstractNioClientWorker.bodyInternal(GridNioServer.java:2216)
    at org.apache.ignite.internal.util.nio.GridNioServer$AbstractNioClientWorker.body(GridNioServer.java:1857)
    at org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:120)
    at java.base/java.lang.Thread.run(Unknown Source)
[12:55:07,631][WARNING][grid-nio-worker-tcp-comm-1-#37][TcpCommunicationSpi] Client disconnected abruptly due to network connection loss or because the connection was left open on application shutdown. [cls=class o.a.i.i.util.nio.GridNioException, msg=Connection reset by peer]
[12:55:08,215][SEVERE][grid-nio-worker-tcp-comm-0-#36][TcpCommunicationSpi] Failed to read data from remote connection (will wait for 2000ms).
class org.apache.ignite.IgniteCheckedException: Failed to select events on selector.
    at org.apache.ignite.internal.util.nio.GridNioServer$AbstractNioClientWorker.bodyInternal(GridNioServer.java:2245)
    at org.apache.ignite.internal.util.nio.GridNioServer$AbstractNioClientWorker.body(GridNioServer.java:1857)
    at org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:120)
    at java.base/java.lang.Thread.run(Unknown Source)
Caused by: java.nio.channels.ClosedChannelException
    at java.base/java.nio.channels.spi.AbstractSelectableChannel.register(Unknown Source)
    at org.apache.ignite.internal.util.nio.GridNioServer$AbstractNioClientWorker.bodyInternal(GridNioServer.java:2060)
    ... 3 more
[12:55:08,688][SEVERE][sys-#63][TcpCommunicationSpi] Failed to send message to remote node [node=ZookeeperClusterNode [id=0123dd90-265e-4bbb-a4e8-ec9d33dd0ce5, addrs=[127.0.0.1, 10.251.19.248], order=117, loc=false, client=false], msg=GridIoMessage [plc=2, topic=TOPIC_METRICS, topicOrd=29, ordered=false, timeout=0, skipOnTimeout=false, msg=ClusterMetricsUpdateMessage []]]
class org.apache.ignite.internal.cluster.ClusterTopologyCheckedException: Remote node does not observe current node in topology : 0123dd90-265e-4bbb-a4e8-ec9d33dd0ce5
    at org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi.createNioSession(TcpCommunicationSpi.java:3622)
    at org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi.createTcpClient(TcpCommunicationSpi.java:3458)
    at org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi.createCommunicationClient(TcpCommunicationSpi.java:3198)
    at org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi.reserveClient(TcpCommunicationSpi.java:3078)
    at org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi.sendMessage0(TcpCommunicationSpi.java:2918)
    at org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi.sendMessage(TcpCommunicationSpi.java:2877)
    at org.apache.ignite.internal.managers.communication.GridIoManager.send(GridIoManager.java:2035)
    at org.apache.ignite.internal.managers.communication.GridIoManager.sendToGridTopic(GridIoManager.java:2132)
    at org.apache.ignite.internal.processors.cluster.ClusterProcessor.updateMetrics(ClusterProcessor.java:509)
    at org.apache.ignite.internal.processors.cluster.ClusterProcessor.access$2200(ClusterProcessor.java:85)
    at org.apache.ignite.internal.processors.cluster.ClusterProcessor$MetricsUpdateTimeoutObject.run(ClusterProcessor.java:788)
    at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
    at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
    at java.base/java.lang.Thread.run(Unknown Source)
[12:55:10,727][SEVERE][sys-#59][TcpCommunicationSpi] Failed to send message to remote node [node=ZookeeperClusterNode [id=0123dd90-265e-4bbb-a4e8-ec9d33dd0ce5, addrs=[127.0.0.1, 10.251.19.248], order=117, loc=false, client=false], msg=GridIoMessage [plc=2, topic=TOPIC_METRICS, topicOrd=29, ordered=false, timeout=0, skipOnTimeout=false, msg=ClusterMetricsUpdateMessage []]]
class org.apache.ignite.internal.cluster.ClusterTopologyCheckedException: Remote node does not observe current node in topology : 0123dd90-265e-4bbb-a4e8-ec9d33dd0ce5
    at org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi.createNioSession(TcpCommunicationSpi.java:3622)
    at org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi.createTcpClient(TcpCommunicationSpi.java:3458)
    at org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi.createCommunicationClient(TcpCommunicationSpi.java:3198)
    at org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi.reserveClient(TcpCommunicationSpi.java:3078)
    at org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi.sendMessage0(TcpCommunicationSpi.java:2918)
    at org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi.sendMessage(TcpCommunicationSpi.java:2877)
    at org.apache.ignite.internal.managers.communication.GridIoManager.send(GridIoManager.java:2035)
    at org.apache.ignite.internal.managers.communication.GridIoManager.sendToGridTopic(GridIoManager.java:2132)
    at org.apache.ignite.internal.processors.cluster.ClusterProcessor.updateMetrics(ClusterProcessor.java:509)
    at org.apache.ignite.internal.processors.cluster.ClusterProcessor.access$2200(ClusterProcessor.java:85)
    at org.apache.ignite.internal.processors.cluster.ClusterProcessor$MetricsUpdateTimeoutObject.run(ClusterProcessor.java:788)
    at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
    at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
    at java.base/java.lang.Thread.run(Unknown Source)
[12:55:12,772][SEVERE][sys-#69][TcpCommunicationSpi] Failed to send message to remote node [node=ZookeeperClusterNode [id=0123dd90-265e-4bbb-a4e8-ec9d33dd0ce5, addrs=[127.0.0.1, 10.251.19.248], order=117, loc=false, client=false], msg=GridIoMessage [plc=2, topic=TOPIC_METRICS, topicOrd=29, ordered=false, timeout=0, skipOnTimeout=false, msg=ClusterMetricsUpdateMessage []]]
class org.apache.ignite.internal.cluster.ClusterTopologyCheckedException: Remote node does not observe current node in topology : 0123dd90-265e-4bbb-a4e8-ec9d33dd0ce5
    at org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi.createNioSession(TcpCommunicationSpi.java:3622)
    at org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi.createTcpClient(TcpCommunicationSpi.java:3458)
    at org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi.createCommunicationClient(TcpCommunicationSpi.java:3198)
    at org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi.reserveClient(TcpCommunicationSpi.java:3078)
    at org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi.sendMessage0(TcpCommunicationSpi.java:2918)
    at org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi.sendMessage(TcpCommunicationSpi.java:2877)
    at org.apache.ignite.internal.managers.communication.GridIoManager.send(GridIoManager.java:2035)
    at org.apache.ignite.internal.managers.communication.GridIoManager.sendToGridTopic(GridIoManager.java:2132)
    at org.apache.ignite.internal.processors.cluster.ClusterProcessor.updateMetrics(ClusterProcessor.java:509)
    at org.apache.ignite.internal.processors.cluster.ClusterProcessor.access$2200(ClusterProcessor.java:85)
    at org.apache.ignite.internal.processors.cluster.ClusterProcessor$MetricsUpdateTimeoutObject.run(ClusterProcessor.java:788)
    at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
    at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
    at java.base/java.lang.Thread.run(Unknown Source)
[12:55:14,819][SEVERE][sys-#70][TcpCommunicationSpi] Failed to send message to remote node [node=ZookeeperClusterNode [id=0123dd90-265e-4bbb-a4e8-ec9d33dd0ce5, addrs=[127.0.0.1, 10.251.19.248], order=117, loc=false, client=false], msg=GridIoMessage [plc=2, topic=TOPIC_METRICS, topicOrd=29, ordered=false, timeout=0, skipOnTimeout=false, msg=ClusterMetricsUpdateMessage []]]
class org.apache.ignite.internal.cluster.ClusterTopologyCheckedException: Remote node does not observe current node in topology : 0123dd90-265e-4bbb-a4e8-ec9d33dd0ce5
    at org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi.createNioSession(TcpCommunicationSpi.java:3622)
    at org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi.createTcpClient(TcpCommunicationSpi.java:3458)
    at org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi.createCommunicationClient(TcpCommunicationSpi.java:3198)
    at org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi.reserveClient(TcpCommunicationSpi.java:3078)
    at org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi.sendMessage0(TcpCommunicationSpi.java:2918)
    at org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi.sendMessage(TcpCommunicationSpi.java:2877)
    at org.apache.ignite.internal.managers.communication.GridIoManager.send(GridIoManager.java:2035)
    at org.apache.ignite.internal.managers.communication.GridIoManager.sendToGridTopic(GridIoManager.java:2132)
    at org.apache.ignite.internal.processors.cluster.ClusterProcessor.updateMetrics(ClusterProcessor.java:509)
    at org.apache.ignite.internal.processors.cluster.ClusterProcessor.access$2200(ClusterProcessor.java:85)
    at org.apache.ignite.internal.processors.cluster.ClusterProcessor$MetricsUpdateTimeoutObject.run(ClusterProcessor.java:788)
    at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
    at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
    at java.base/java.lang.Thread.run(Unknown Source)
[12:55:16,853][SEVERE][sys-#62][TcpCommunicationSpi] Failed to send message to remote node [node=ZookeeperClusterNode [id=0123dd90-265e-4bbb-a4e8-ec9d33dd0ce5, addrs=[127.0.0.1, 10.251.19.248], order=117, loc=false, client=false], msg=GridIoMessage [plc=2, topic=TOPIC_METRICS, topicOrd=29, ordered=false, timeout=0, skipOnTimeout=false, msg=ClusterMetricsUpdateMessage []]]
class org.apache.ignite.internal.cluster.ClusterTopologyCheckedException: Remote node does not observe current node in topology : 0123dd90-265e-4bbb-a4e8-ec9d33dd0ce5
    at org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi.createNioSession(TcpCommunicationSpi.java:3622)
    at org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi.createTcpClient(TcpCommunicationSpi.java:3458)
    at org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi.createCommunicationClient(TcpCommunicationSpi.java:3198)
    at org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi.reserveClient(TcpCommunicationSpi.java:3078)
    at org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi.sendMessage0(TcpCommunicationSpi.java:2918)
    at org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi.sendMessage(TcpCommunicationSpi.java:2877)
    at org.apache.ignite.internal.managers.communication.GridIoManager.send(GridIoManager.java:2035)
    at org.apache.ignite.internal.managers.communication.GridIoManager.sendToGridTopic(GridIoManager.java:2132)
    at org.apache.ignite.internal.processors.cluster.ClusterProcessor.updateMetrics(ClusterProcessor.java:509)
    at org.apache.ignite.internal.processors.cluster.ClusterProcessor.access$2200(ClusterProcessor.java:85)
    at org.apache.ignite.internal.processors.cluster.ClusterProcessor$MetricsUpdateTimeoutObject.run(ClusterProcessor.java:788)
    at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
    at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
    at java.base/java.lang.Thread.run(Unknown Source)
[12:55:18,913][SEVERE][sys-#57][TcpCommunicationSpi] Failed to send message to remote node [node=ZookeeperClusterNode [id=0123dd90-265e-4bbb-a4e8-ec9d33dd0ce5, addrs=[127.0.0.1, 10.251.19.248], order=117, loc=false, client=false], msg=GridIoMessage [plc=2, topic=TOPIC_METRICS, topicOrd=29, ordered=false, timeout=0, skipOnTimeout=false, msg=ClusterMetricsUpdateMessage []]]
class org.apache.ignite.internal.cluster.ClusterTopologyCheckedException: Remote node does not observe current node in topology : 0123dd90-265e-4bbb-a4e8-ec9d33dd0ce5
    at org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi.createNioSession(TcpCommunicationSpi.java:3622)
    at org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi.createTcpClient(TcpCommunicationSpi.java:3458)
    at org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi.createCommunicationClient(TcpCommunicationSpi.java:3198)
    at org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi.reserveClient(TcpCommunicationSpi.java:3078)
    at org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi.sendMessage0(TcpCommunicationSpi.java:2918)
    at org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi.sendMessage(TcpCommunicationSpi.java:2877)
    at org.apache.ignite.internal.managers.communication.GridIoManager.send(GridIoManager.java:2035)
    at org.apache.ignite.internal.managers.communication.GridIoManager.sendToGridTopic(GridIoManager.java:2132)
    at org.apache.ignite.internal.processors.cluster.ClusterProcessor.updateMetrics(ClusterProcessor.java:509)
    at org.apache.ignite.internal.processors.cluster.ClusterProcessor.access$2200(ClusterProcessor.java:85)
    at org.apache.ignite.internal.processors.cluster.ClusterProcessor$MetricsUpdateTimeoutObject.run(ClusterProcessor.java:788)
    at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
    at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
    at java.base/java.lang.Thread.run(Unknown Source)
[12:55:20,951][SEVERE][sys-#58][TcpCommunicationSpi] Failed to send message to remote node [node=ZookeeperClusterNode [id=0123dd90-265e-4bbb-a4e8-ec9d33dd0ce5, addrs=[127.0.0.1, 10.251.19.248], order=117, loc=false, client=false], msg=GridIoMessage [plc=2, topic=TOPIC_METRICS, topicOrd=29, ordered=false, timeout=0, skipOnTimeout=false, msg=ClusterMetricsUpdateMessage []]]
class org.apache.ignite.internal.cluster.ClusterTopologyCheckedException: Remote node does not observe current node in topology : 0123dd90-265e-4bbb-a4e8-ec9d33dd0ce5
    at org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi.createNioSession(TcpCommunicationSpi.java:3622)
    at org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi.createTcpClient(TcpCommunicationSpi.java:3458)
    at org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi.createCommunicationClient(TcpCommunicationSpi.java:3198)
    at org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi.reserveClient(TcpCommunicationSpi.java:3078)
    at org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi.sendMessage0(TcpCommunicationSpi.java:2918)
    at org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi.sendMessage(TcpCommunicationSpi.java:2877)
    at org.apache.ignite.internal.managers.communication.GridIoManager.send(GridIoManager.java:2035)
    at org.apache.ignite.internal.managers.communication.GridIoManager.sendToGridTopic(GridIoManager.java:2132)
    at org.apache.ignite.internal.processors.cluster.ClusterProcessor.updateMetrics(ClusterProcessor.java:509)
    at org.apache.ignite.internal.processors.cluster.ClusterProcessor.access$2200(ClusterProcessor.java:85)
    at org.apache.ignite.internal.processors.cluster.ClusterProcessor$MetricsUpdateTimeoutObject.run(ClusterProcessor.java:788)
    at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
    at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
    at java.base/java.lang.Thread.run(Unknown Source)
[12:55:21,186][INFO][grid-timeout-worker-#35][IgniteKernal]
Metrics for local node (to disable set 'metricsLogFrequency' to 0)
    ^-- Node [id=3f58f4f5, uptime=9 days, 21:04:18.056]
    ^-- H/N/C [hosts=96, nodes=96, CPUs=1082]
    ^-- CPU [cur=-100%, avg=-100%, GC=0%]
    ^-- PageMemory [pages=16626106]
    ^-- Heap [used=23856MB, free=35.29%, comm=36864MB]
    ^-- Off-heap [used=65326MB, free=9.12%, comm=71760MB]
    ^--   sysMemPlc region [used=0MB, free=99.21%, comm=40MB]
    ^--   TxLog region [used=0MB, free=100%, comm=40MB]
    ^--   Default_Region region [used=65325MB, free=8.87%, comm=71680MB]
    ^-- Outbound messages queue [size=0]
    ^-- Public thread pool [active=0, idle=0, qSize=0]
    ^-- System thread pool [active=0, idle=14, qSize=0]
[12:55:21,186][INFO][grid-timeout-worker-#35][IgniteKernal] FreeList [name=Default_Region##FreeList, buckets=256, dataPages=287347, reusePages=3169711]
[12:55:23,002][SEVERE][sys-#66][TcpCommunicationSpi] Failed to send message to remote node [node=ZookeeperClusterNode [id=0123dd90-265e-4bbb-a4e8-ec9d33dd0ce5, addrs=[127.0.0.1, 10.251.19.248], order=117, loc=false, client=false], msg=GridIoMessage [plc=2, topic=TOPIC_METRICS, topicOrd=29, ordered=false, timeout=0, skipOnTimeout=false, msg=ClusterMetricsUpdateMessage []]]
class org.apache.ignite.internal.cluster.ClusterTopologyCheckedException: Remote node does not observe current node in topology : 0123dd90-265e-4bbb-a4e8-ec9d33dd0ce5
    at org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi.createNioSession(TcpCommunicationSpi.java:3622)
    at org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi.createTcpClient(TcpCommunicationSpi.java:3458)
    at org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi.createCommunicationClient(TcpCommunicationSpi.java:3198)
    at org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi.reserveClient(TcpCommunicationSpi.java:3078)
    at org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi.sendMessage0(TcpCommunicationSpi.java:2918)
    at org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi.sendMessage(TcpCommunicationSpi.java:2877)
    at org.apache.ignite.internal.managers.communication.GridIoManager.send(GridIoManager.java:2035)
    at org.apache.ignite.internal.managers.communication.GridIoManager.sendToGridTopic(GridIoManager.java:2132)
    at org.apache.ignite.internal.processors.cluster.ClusterProcessor.updateMetrics(ClusterProcessor.java:509)
    at org.apache.ignite.internal.processors.cluster.ClusterProcessor.access$2200(ClusterProcessor.java:85)
    at org.apache.ignite.internal.processors.cluster.ClusterProcessor$MetricsUpdateTimeoutObject.run(ClusterProcessor.java:788)
    at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
    at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
    at java.base/java.lang.Thread.run(Unknown Source)
[12:55:25,049][SEVERE][sys-#61][TcpCommunicationSpi] Failed to send message to remote node [node=ZookeeperClusterNode [id=0123dd90-265e-4bbb-a4e8-ec9d33dd0ce5, addrs=[127.0.0.1, 10.251.19.248], order=117, loc=false, client=false], msg=GridIoMessage [plc=2, topic=TOPIC_METRICS, topicOrd=29, ordered=false, timeout=0, skipOnTimeout=false, msg=ClusterMetricsUpdateMessage []]]
class org.apache.ignite.internal.cluster.ClusterTopologyCheckedException: Remote node does not observe current node in topology : 0123dd90-265e-4bbb-a4e8-ec9d33dd0ce5
    at org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi.createNioSession(TcpCommunicationSpi.java:3622)
    at org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi.createTcpClient(TcpCommunicationSpi.java:3458)
    at org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi.createCommunicationClient(TcpCommunicationSpi.java:3198)
    at org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi.reserveClient(TcpCommunicationSpi.java:3078)
    at org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi.sendMessage0(TcpCommunicationSpi.java:2918)
    at org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi.sendMessage(TcpCommunicationSpi.java:2877)
    at org.apache.ignite.internal.managers.communication.GridIoManager.send(GridIoManager.java:2035)
    at org.apache.ignite.internal.managers.communication.GridIoManager.sendToGridTopic(GridIoManager.java:2132)
    at org.apache.ignite.internal.processors.cluster.ClusterProcessor.updateMetrics(ClusterProcessor.java:509)
    at org.apache.ignite.internal.processors.cluster.ClusterProcessor.access$2200(ClusterProcessor.java:85)
    at org.apache.ignite.internal.processors.cluster.ClusterProcessor$MetricsUpdateTimeoutObject.run(ClusterProcessor.java:788)
    at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
    at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
    at java.base/java.lang.Thread.run(Unknown Source)
[12:55:27,097][SEVERE][sys-#65][TcpCommunicationSpi] Failed to send message to remote node [node=ZookeeperClusterNode [id=0123dd90-265e-4bbb-a4e8-ec9d33dd0ce5, addrs=[127.0.0.1, 10.251.19.248], order=117, loc=false, client=false], msg=GridIoMessage [plc=2, topic=TOPIC_METRICS, topicOrd=29, ordered=false, timeout=0, skipOnTimeout=false, msg=ClusterMetricsUpdateMessage []]]
class org.apache.ignite.internal.cluster.ClusterTopologyCheckedException: Remote node does not observe current node in topology : 0123dd90-265e-4bbb-a4e8-ec9d33dd0ce5
    at org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi.createNioSession(TcpCommunicationSpi.java:3622)
    at org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi.createTcpClient(TcpCommunicationSpi.java:3458)
    at org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi.createCommunicationClient(TcpCommunicationSpi.java:3198)
    at org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi.reserveClient(TcpCommunicationSpi.java:3078)
    at org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi.sendMessage0(TcpCommunicationSpi.java:2918)
    at org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi.sendMessage(TcpCommunicationSpi.java:2877)
    at org.apache.ignite.internal.managers.communication.GridIoManager.send(GridIoManager.java:2035)
    at org.apache.ignite.internal.managers.communication.GridIoManager.sendToGridTopic(GridIoManager.java:2132)
    at org.apache.ignite.internal.processors.cluster.ClusterProcessor.updateMetrics(ClusterProcessor.java:509)
    at org.apache.ignite.internal.processors.cluster.ClusterProcessor.access$2200(ClusterProcessor.java:85)
    at org.apache.ignite.internal.processors.cluster.ClusterProcessor$MetricsUpdateTimeoutObject.run(ClusterProcessor.java:788)
    at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
    at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
    at java.base/java.lang.Thread.run(Unknown Source)
[12:55:29,165][SEVERE][sys-#60][TcpCommunicationSpi] Failed to send message to remote node [node=ZookeeperClusterNode [id=0123dd90-265e-4bbb-a4e8-ec9d33dd0ce5, addrs=[127.0.0.1, 10.251.19.248], order=117, loc=false, client=false], msg=GridIoMessage [plc=2, topic=TOPIC_METRICS, topicOrd=29, ordered=false, timeout=0, skipOnTimeout=false, msg=ClusterMetricsUpdateMessage []]]
class org.apache.ignite.internal.cluster.ClusterTopologyCheckedException: Remote node does not observe current node in topology : 0123dd90-265e-4bbb-a4e8-ec9d33dd0ce5
    at org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi.createNioSession(TcpCommunicationSpi.java:3622)
    at org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi.createTcpClient(TcpCommunicationSpi.java:3458)
    at org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi.createCommunicationClient(TcpCommunicationSpi.java:3198)
    at org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi.reserveClient(TcpCommunicationSpi.java:3078)
    at org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi.sendMessage0(TcpCommunicationSpi.java:2918)
    at org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi.sendMessage(TcpCommunicationSpi.java:2877)
    at org.apache.ignite.internal.managers.communication.GridIoManager.send(GridIoManager.java:2035)
    at org.apache.ignite.internal.managers.communication.GridIoManager.sendToGridTopic(GridIoManager.java:2132)
    at org.apache.ignite.internal.processors.cluster.ClusterProcessor.updateMetrics(ClusterProcessor.java:509)
    at org.apache.ignite.internal.processors.cluster.ClusterProcessor.access$2200(ClusterProcessor.java:85)
    at org.apache.ignite.internal.processors.cluster.ClusterProcessor$MetricsUpdateTimeoutObject.run(ClusterProcessor.java:788)
    at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
    at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
    at java.base/java.lang.Thread.run(Unknown Source)
[12:55:31,207][SEVERE][sys-#67][TcpCommunicationSpi] Failed to send message to remote node [node=ZookeeperClusterNode [id=0123dd90-265e-4bbb-a4e8-ec9d33dd0ce5, addrs=[127.0.0.1, 10.251.19.248], order=117, loc=false, client=false], msg=GridIoMessage [plc=2, topic=TOPIC_METRICS, topicOrd=29, ordered=false, timeout=0, skipOnTimeout=false, msg=ClusterMetricsUpdateMessage []]]
class org.apache.ignite.internal.cluster.ClusterTopologyCheckedException: Remote node does not observe current node in topology : 0123dd90-265e-4bbb-a4e8-ec9d33dd0ce5
    at org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi.createNioSession(TcpCommunicationSpi.java:3622)
    at org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi.createTcpClient(TcpCommunicationSpi.java:3458)
    at org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi.createCommunicationClient(TcpCommunicationSpi.java:3198)
    at org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi.reserveClient(TcpCommunicationSpi.java:3078)
    at org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi.sendMessage0(TcpCommunicationSpi.java:2918)
    at org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi.sendMessage(TcpCommunicationSpi.java:2877)
    at org.apache.ignite.internal.managers.communication.GridIoManager.send(GridIoManager.java:2035)
    at org.apache.ignite.internal.managers.communication.GridIoManager.sendToGridTopic(GridIoManager.java:2132)
    at org.apache.ignite.internal.processors.cluster.ClusterProcessor.updateMetrics(ClusterProcessor.java:509)
    at org.apache.ignite.internal.processors.cluster.ClusterProcessor.access$2200(ClusterProcessor.java:85)
    at org.apache.ignite.internal.processors.cluster.ClusterProcessor$MetricsUpdateTimeoutObject.run(ClusterProcessor.java:788)
    at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
    at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
    at java.base/java.lang.Thread.run(Unknown Source)
[12:55:33,241][SEVERE][sys-#63][TcpCommunicationSpi] Failed to send message to remote node [node=ZookeeperClusterNode [id=0123dd90-265e-4bbb-a4e8-ec9d33dd0ce5, addrs=[127.0.0.1, 10.251.19.248], order=117, loc=false, client=false], msg=GridIoMessage [plc=2, topic=TOPIC_METRICS, topicOrd=29, ordered=false, timeout=0, skipOnTimeout=false, msg=ClusterMetricsUpdateMessage []]]
class org.apache.ignite.internal.cluster.ClusterTopologyCheckedException: Remote node does not observe current node in topology : 0123dd90-265e-4bbb-a4e8-ec9d33dd0ce5
    at
org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi.createNioSession(TcpCommunicationSpi.java:3622) at org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi.createTcpClient(TcpCommunicationSpi.java:3458) at org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi.createCommunicationClient(TcpCommunicationSpi.java:3198) at org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi.reserveClient(TcpCommunicationSpi.java:3078) at org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi.sendMessage0(TcpCommunicationSpi.java:2918) at org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi.sendMessage(TcpCommunicationSpi.java:2877) at org.apache.ignite.internal.managers.communication.GridIoManager.send(GridIoManager.java:2035) at org.apache.ignite.internal.managers.communication.GridIoManager.sendToGridTopic(GridIoManager.java:2132) at org.apache.ignite.internal.processors.cluster.ClusterProcessor.updateMetrics(ClusterProcessor.java:509) at org.apache.ignite.internal.processors.cluster.ClusterProcessor.access$2200(ClusterProcessor.java:85) at org.apache.ignite.internal.processors.cluster.ClusterProcessor$MetricsUpdateTimeoutObject.run(ClusterProcessor.java:788) at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source) at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source) at java.base/java.lang.Thread.run(Unknown Source) [12:55:35,278][SEVERE][sys-#59][TcpCommunicationSpi] Failed to send message to remote node [node=ZookeeperClusterNode [id=0123dd90-265e-4bbb-a4e8-ec9d33dd0ce5, addrs=[127.0.0.1, 10.251.19.248], order=117, loc=false, client=false], msg=GridIoMessage [plc=2, topic=TOPIC_METRICS, topicOrd=29, ordered=false, timeout=0, skipOnTimeout=false, msg=ClusterMetricsUpdateMessage []]] class org.apache.ignite.internal.cluster.ClusterTopologyCheckedException: Remote node does not observe current node in topology : 0123dd90-265e-4bbb-a4e8-ec9d33dd0ce5 at 
org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi.createNioSession(TcpCommunicationSpi.java:3622) at org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi.createTcpClient(TcpCommunicationSpi.java:3458) at org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi.createCommunicationClient(TcpCommunicationSpi.java:3198) at org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi.reserveClient(TcpCommunicationSpi.java:3078) at org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi.sendMessage0(TcpCommunicationSpi.java:2918) at org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi.sendMessage(TcpCommunicationSpi.java:2877) at org.apache.ignite.internal.managers.communication.GridIoManager.send(GridIoManager.java:2035) at org.apache.ignite.internal.managers.communication.GridIoManager.sendToGridTopic(GridIoManager.java:2132) at org.apache.ignite.internal.processors.cluster.ClusterProcessor.updateMetrics(ClusterProcessor.java:509) at org.apache.ignite.internal.processors.cluster.ClusterProcessor.access$2200(ClusterProcessor.java:85) at org.apache.ignite.internal.processors.cluster.ClusterProcessor$MetricsUpdateTimeoutObject.run(ClusterProcessor.java:788) at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source) at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source) at java.base/java.lang.Thread.run(Unknown Source) [12:55:37,081][WARNING][zk-client-timer-null][ZookeeperClient] Failed to establish ZooKeeper connection, close client [timeout=30000] [12:55:37,082][WARNING][zk-client-timer-null][ZookeeperDiscoveryImpl] Connection to Zookeeper server is lost, local node SEGMENTED. [12:55:37,083][WARNING][disco-event-worker-#71][GridDiscoveryManager] Local node SEGMENTED: ZookeeperClusterNode [id=3f58f4f5-bb5a-4650-91f1-ebc3e3a40dac, addrs=[10.251.20.44, 127.0.0.1], order=257, loc=true, client=false] [12:55:37,107][SEVERE][disco-event-worker-#71][] Critical system error detected. 
Will be handled accordingly to configured handler [hnd=StopNodeFailureHandler [super=AbstractFailureHandler [ignoredFailureTypes=UnmodifiableSet [SYSTEM_WORKER_BLOCKED, SYSTEM_CRITICAL_OPERATION_TIMEOUT]]], failureCtx=FailureContext [type=SEGMENTATION, err=null]] [12:55:37,114][WARNING][disco-event-worker-#71][CacheDiagnosticManager] Page locks dump: Thread=[name=data-streamer-stripe-0-#15, id=30], state=WAITING Locked pages = [] Locked pages log: name=data-streamer-stripe-0-#15 time=(1604062537108, 2020-10-30 12:55:37.108) Thread=[name=data-streamer-stripe-1-#16, id=31], state=WAITING Locked pages = [] Locked pages log: name=data-streamer-stripe-1-#16 time=(1604062537108, 2020-10-30 12:55:37.108) Thread=[name=data-streamer-stripe-10-#25, id=40], state=WAITING Locked pages = [] Locked pages log: name=data-streamer-stripe-10-#25 time=(1604062537108, 2020-10-30 12:55:37.108) Thread=[name=data-streamer-stripe-11-#26, id=41], state=WAITING Locked pages = [] Locked pages log: name=data-streamer-stripe-11-#26 time=(1604062537108, 2020-10-30 12:55:37.108) Thread=[name=data-streamer-stripe-12-#27, id=42], state=WAITING Locked pages = [] Locked pages log: name=data-streamer-stripe-12-#27 time=(1604062537108, 2020-10-30 12:55:37.108) Thread=[name=data-streamer-stripe-13-#28, id=43], state=WAITING Locked pages = [] Locked pages log: name=data-streamer-stripe-13-#28 time=(1604062537108, 2020-10-30 12:55:37.108) Thread=[name=data-streamer-stripe-2-#17, id=32], state=WAITING Locked pages = [] Locked pages log: name=data-streamer-stripe-2-#17 time=(1604062537108, 2020-10-30 12:55:37.108) Thread=[name=data-streamer-stripe-3-#18, id=33], state=WAITING Locked pages = [] Locked pages log: name=data-streamer-stripe-3-#18 time=(1604062537108, 2020-10-30 12:55:37.108) Thread=[name=data-streamer-stripe-4-#19, id=34], state=WAITING Locked pages = [] Locked pages log: name=data-streamer-stripe-4-#19 time=(1604062537108, 2020-10-30 12:55:37.108) Thread=[name=data-streamer-stripe-5-#20, 
id=35], state=WAITING Locked pages = [] Locked pages log: name=data-streamer-stripe-5-#20 time=(1604062537108, 2020-10-30 12:55:37.108) Thread=[name=data-streamer-stripe-6-#21, id=36], state=WAITING Locked pages = [] Locked pages log: name=data-streamer-stripe-6-#21 time=(1604062537108, 2020-10-30 12:55:37.108) Thread=[name=data-streamer-stripe-7-#22, id=37], state=WAITING Locked pages = [] Locked pages log: name=data-streamer-stripe-7-#22 time=(1604062537108, 2020-10-30 12:55:37.108) Thread=[name=data-streamer-stripe-8-#23, id=38], state=WAITING Locked pages = [] Locked pages log: name=data-streamer-stripe-8-#23 time=(1604062537108, 2020-10-30 12:55:37.108) Thread=[name=data-streamer-stripe-9-#24, id=39], state=WAITING Locked pages = [] Locked pages log: name=data-streamer-stripe-9-#24 time=(1604062537108, 2020-10-30 12:55:37.108) Thread=[name=exchange-worker-#72, id=119], state=TIMED_WAITING Locked pages = [] Locked pages log: name=exchange-worker-#72 time=(1604062537108, 2020-10-30 12:55:37.108) Thread=[name=sys-#57, id=102], state=TIMED_WAITING Locked pages = [] Locked pages log: name=sys-#57 time=(1604062537108, 2020-10-30 12:55:37.108) Thread=[name=sys-#58, id=103], state=TIMED_WAITING Locked pages = [] Locked pages log: name=sys-#58 time=(1604062537108, 2020-10-30 12:55:37.108) Thread=[name=sys-#59, id=104], state=TIMED_WAITING Locked pages = [] Locked pages log: name=sys-#59 time=(1604062537108, 2020-10-30 12:55:37.108) Thread=[name=sys-#60, id=105], state=TIMED_WAITING Locked pages = [] Locked pages log: name=sys-#60 time=(1604062537108, 2020-10-30 12:55:37.108) Thread=[name=sys-#61, id=106], state=TIMED_WAITING Locked pages = [] Locked pages log: name=sys-#61 time=(1604062537108, 2020-10-30 12:55:37.108) Thread=[name=sys-#62, id=107], state=TIMED_WAITING Locked pages = [] Locked pages log: name=sys-#62 time=(1604062537108, 2020-10-30 12:55:37.108) Thread=[name=sys-#63, id=108], state=TIMED_WAITING Locked pages = [] Locked pages log: name=sys-#63 
time=(1604062537108, 2020-10-30 12:55:37.108) Thread=[name=sys-#64, id=109], state=TIMED_WAITING Locked pages = [] Locked pages log: name=sys-#64 time=(1604062537108, 2020-10-30 12:55:37.108) Thread=[name=sys-#65, id=110], state=TIMED_WAITING Locked pages = [] Locked pages log: name=sys-#65 time=(1604062537108, 2020-10-30 12:55:37.108) Thread=[name=sys-#66, id=111], state=TIMED_WAITING Locked pages = [] Locked pages log: name=sys-#66 time=(1604062537108, 2020-10-30 12:55:37.108) Thread=[name=sys-#67, id=112], state=TIMED_WAITING Locked pages = [] Locked pages log: name=sys-#67 time=(1604062537108, 2020-10-30 12:55:37.108) Thread=[name=sys-#68, id=113], state=TIMED_WAITING Locked pages = [] Locked pages log: name=sys-#68 time=(1604062537108, 2020-10-30 12:55:37.108) Thread=[name=sys-#69, id=114], state=TIMED_WAITING Locked pages = [] Locked pages log: name=sys-#69 time=(1604062537108, 2020-10-30 12:55:37.108) Thread=[name=sys-#70, id=115], state=TIMED_WAITING Locked pages = [] Locked pages log: name=sys-#70 time=(1604062537108, 2020-10-30 12:55:37.108) Thread=[name=sys-stripe-0-#1, id=16], state=WAITING Locked pages = [] Locked pages log: name=sys-stripe-0-#1 time=(1604062537107, 2020-10-30 12:55:37.107) Thread=[name=sys-stripe-1-#2, id=17], state=WAITING Locked pages = [] Locked pages log: name=sys-stripe-1-#2 time=(1604062537108, 2020-10-30 12:55:37.108) Thread=[name=sys-stripe-10-#11, id=26], state=WAITING Locked pages = [] Locked pages log: name=sys-stripe-10-#11 time=(1604062537108, 2020-10-30 12:55:37.108) Thread=[name=sys-stripe-11-#12, id=27], state=WAITING Locked pages = [] Locked pages log: name=sys-stripe-11-#12 time=(1604062537108, 2020-10-30 12:55:37.108) Thread=[name=sys-stripe-12-#13, id=28], state=WAITING Locked pages = [] Locked pages log: name=sys-stripe-12-#13 time=(1604062537108, 2020-10-30 12:55:37.108) Thread=[name=sys-stripe-13-#14, id=29], state=WAITING Locked pages = [] Locked pages log: name=sys-stripe-13-#14 time=(1604062537108, 2020-10-30 
12:55:37.108) Thread=[name=sys-stripe-2-#3, id=18], state=WAITING Locked pages = [] Locked pages log: name=sys-stripe-2-#3 time=(1604062537108, 2020-10-30 12:55:37.108) Thread=[name=sys-stripe-3-#4, id=19], state=WAITING Locked pages = [] Locked pages log: name=sys-stripe-3-#4 time=(1604062537108, 2020-10-30 12:55:37.108) Thread=[name=sys-stripe-4-#5, id=20], state=WAITING Locked pages = [] Locked pages log: name=sys-stripe-4-#5 time=(1604062537108, 2020-10-30 12:55:37.108) Thread=[name=sys-stripe-5-#6, id=21], state=WAITING Locked pages = [] Locked pages log: name=sys-stripe-5-#6 time=(1604062537108, 2020-10-30 12:55:37.108) Thread=[name=sys-stripe-6-#7, id=22], state=WAITING Locked pages = [] Locked pages log: name=sys-stripe-6-#7 time=(1604062537108, 2020-10-30 12:55:37.108) Thread=[name=sys-stripe-7-#8, id=23], state=WAITING Locked pages = [] Locked pages log: name=sys-stripe-7-#8 time=(1604062537108, 2020-10-30 12:55:37.108) Thread=[name=sys-stripe-8-#9, id=24], state=WAITING Locked pages = [] Locked pages log: name=sys-stripe-8-#9 time=(1604062537108, 2020-10-30 12:55:37.108) Thread=[name=sys-stripe-9-#10, id=25], state=WAITING Locked pages = [] Locked pages log: name=sys-stripe-9-#10 time=(1604062537108, 2020-10-30 12:55:37.108) [12:55:37,115][SEVERE][disco-event-worker-#71][FailureProcessor] Ignite node is in invalid state due to a critical failure. [12:55:37,115][SEVERE][node-stopper][] Stopping local node on Ignite failure: [failureCtx=FailureContext [type=SEGMENTATION, err=null]] [12:55:37,118][INFO][node-stopper][GridTcpRestProtocol] Command protocol successfully stopped: TCP binary [12:55:37,126][INFO][node-stopper][GridJettyRestProtocol] Command protocol successfully stopped: Jetty REST [12:55:37,189][INFO][node-stopper][GridCacheProcessor] Stopped cache [cacheName=ignite-sys-cache] ... 974 caches are stopped in this section ... 
[12:55:37,519][INFO][node-stopper][GridCacheProcessor] Stopped cache [cacheName=CVAR1-RE] [12:55:43,703][INFO][node-stopper][GridDeploymentPerVersionStore] Class was undeployed in SHARED or CONTINUOUS mode [cls=class c.a.f.s.a.ignite.IgnitePartialResultComputation, alias=c.a.f.s.a.ignite.IgnitePartialResultComputation] [12:55:43,703][INFO][node-stopper][GridDeploymentPerVersionStore] Class was undeployed in SHARED or CONTINUOUS mode [cls=class c.a.f.s.a.ignite.IgnitePartialResultComputation, alias=c.a.f.s.a.ignite.IgnitePartialResultComputation] [12:55:43,703][INFO][node-stopper][GridDeploymentPerVersionStore] Class was undeployed in SHARED or CONTINUOUS mode [cls=class c.a.f.s.a.ignite.IgnitePositionResultComputation, alias=c.a.f.s.a.ignite.IgnitePositionResultComputation] [12:55:43,703][INFO][node-stopper][GridDeploymentPerVersionStore] Class was undeployed in SHARED or CONTINUOUS mode [cls=class c.a.f.s.a.ignite.IgnitePartialResultComputation, alias=c.a.f.s.a.ignite.IgnitePartialResultComputation] [12:55:43,704][INFO][node-stopper][GridDeploymentPerVersionStore] Class was undeployed in SHARED or CONTINUOUS mode [cls=class c.a.f.s.a.ignite.IgnitePositionResultComputation, alias=c.a.f.s.a.ignite.IgnitePositionResultComputation] [12:55:43,704][INFO][node-stopper][GridDeploymentPerVersionStore] Class was undeployed in SHARED or CONTINUOUS mode [cls=class c.a.f.s.a.ignite.IgnitePartialResultComputation, alias=c.a.f.s.a.ignite.IgnitePartialResultComputation] [12:55:43,704][INFO][node-stopper][GridDeploymentPerVersionStore] Class was undeployed in SHARED or CONTINUOUS mode [cls=class c.a.f.s.a.ignite.IgnitePositionResultComputation, alias=c.a.f.s.a.ignite.IgnitePositionResultComputation] [12:55:43,704][INFO][node-stopper][GridDeploymentPerVersionStore] Class was undeployed in SHARED or CONTINUOUS mode [cls=class c.a.f.s.a.ignite.IgnitePartialResultComputation, alias=c.a.f.s.a.ignite.IgnitePartialResultComputation] 
[12:55:43,704][INFO][node-stopper][GridDeploymentPerVersionStore] Class was undeployed in SHARED or CONTINUOUS mode [cls=class c.a.f.s.a.ignite.IgnitePositionResultComputation, alias=c.a.f.s.a.ignite.IgnitePositionResultComputation] [12:55:43,704][INFO][node-stopper][GridDeploymentPerVersionStore] Class was undeployed in SHARED or CONTINUOUS mode [cls=class c.a.f.s.a.ignite.IgnitePartialResultComputation, alias=c.a.f.s.a.ignite.IgnitePartialResultComputation] [12:55:43,704][INFO][node-stopper][GridDeploymentPerVersionStore] Class was undeployed in SHARED or CONTINUOUS mode [cls=class c.a.f.s.a.ignite.IgnitePositionResultComputation, alias=c.a.f.s.a.ignite.IgnitePositionResultComputation] [12:55:43,704][INFO][node-stopper][GridDeploymentPerVersionStore] Class was undeployed in SHARED or CONTINUOUS mode [cls=class c.a.f.s.a.ignite.IgnitePartialResultComputation, alias=c.a.f.s.a.ignite.IgnitePartialResultComputation] [12:55:43,704][INFO][node-stopper][GridDeploymentPerVersionStore] Class was undeployed in SHARED or CONTINUOUS mode [cls=class c.a.f.s.a.ignite.IgnitePartialResultComputation, alias=c.a.f.s.a.ignite.IgnitePartialResultComputation] [12:55:43,704][INFO][node-stopper][GridDeploymentPerVersionStore] Class was undeployed in SHARED or CONTINUOUS mode [cls=class c.a.f.s.a.ignite.IgnitePartialResultComputation, alias=c.a.f.s.a.ignite.IgnitePartialResultComputation] [12:55:43,704][INFO][node-stopper][GridDeploymentPerVersionStore] Class was undeployed in SHARED or CONTINUOUS mode [cls=class c.a.f.s.a.ignite.IgnitePositionResultComputation, alias=c.a.f.s.a.ignite.IgnitePositionResultComputation] [12:55:43,704][INFO][node-stopper][GridDeploymentPerVersionStore] Class was undeployed in SHARED or CONTINUOUS mode [cls=class c.a.f.s.a.ignite.IgnitePartialResultComputation, alias=c.a.f.s.a.ignite.IgnitePartialResultComputation] [12:55:43,704][INFO][node-stopper][GridDeploymentPerVersionStore] Class was undeployed in SHARED or CONTINUOUS mode [cls=class 
c.a.f.s.a.ignite.IgnitePositionResultComputation, alias=c.a.f.s.a.ignite.IgnitePositionResultComputation] [12:55:43,704][INFO][node-stopper][GridDeploymentPerVersionStore] Class was undeployed in SHARED or CONTINUOUS mode [cls=class c.a.f.s.a.ignite.IgnitePartialResultComputation, alias=c.a.f.s.a.ignite.IgnitePartialResultComputation] [12:55:43,704][INFO][node-stopper][GridDeploymentPerVersionStore] Class was undeployed in SHARED or CONTINUOUS mode [cls=class c.a.f.s.a.ignite.IgnitePartialResultComputation, alias=c.a.f.s.a.ignite.IgnitePartialResultComputation] [12:55:43,704][INFO][node-stopper][GridDeploymentPerVersionStore] Class was undeployed in SHARED or CONTINUOUS mode [cls=class c.a.f.s.a.ignite.IgnitePositionResultComputation, alias=c.a.f.s.a.ignite.IgnitePositionResultComputation] [12:55:43,704][INFO][node-stopper][GridDeploymentPerVersionStore] Class was undeployed in SHARED or CONTINUOUS mode [cls=class c.a.f.s.a.ignite.IgnitePartialResultComputation, alias=c.a.f.s.a.ignite.IgnitePartialResultComputation] [12:55:43,704][INFO][node-stopper][GridDeploymentPerVersionStore] Class was undeployed in SHARED or CONTINUOUS mode [cls=class c.a.f.s.a.ignite.IgnitePositionResultComputation, alias=c.a.f.s.a.ignite.IgnitePositionResultComputation] [12:55:43,704][INFO][node-stopper][GridDeploymentPerVersionStore] Class was undeployed in SHARED or CONTINUOUS mode [cls=class c.a.f.s.a.ignite.IgnitePartialResultComputation, alias=c.a.f.s.a.ignite.IgnitePartialResultComputation] [12:55:43,704][INFO][node-stopper][GridDeploymentPerVersionStore] Class was undeployed in SHARED or CONTINUOUS mode [cls=class c.a.f.s.a.ignite.IgnitePartialResultComputation, alias=c.a.f.s.a.ignite.IgnitePartialResultComputation] [12:55:43,704][INFO][node-stopper][GridDeploymentPerVersionStore] Class was undeployed in SHARED or CONTINUOUS mode [cls=class c.a.f.s.a.ignite.IgnitePositionResultComputation, alias=c.a.f.s.a.ignite.IgnitePositionResultComputation] 
[12:55:43,704][INFO][node-stopper][GridDeploymentPerVersionStore] Class was undeployed in SHARED or CONTINUOUS mode [cls=class c.a.f.s.a.ignite.IgnitePartialResultComputation, alias=c.a.f.s.a.ignite.IgnitePartialResultComputation] [12:55:43,704][INFO][node-stopper][GridDeploymentPerVersionStore] Class was undeployed in SHARED or CONTINUOUS mode [cls=class c.a.f.s.a.ignite.IgnitePartialResultComputation, alias=c.a.f.s.a.ignite.IgnitePartialResultComputation] [12:55:43,704][INFO][node-stopper][GridDeploymentPerVersionStore] Class was undeployed in SHARED or CONTINUOUS mode [cls=class c.a.f.s.a.ignite.IgnitePartialResultComputation, alias=c.a.f.s.a.ignite.IgnitePartialResultComputation] [12:55:43,704][INFO][node-stopper][GridDeploymentPerVersionStore] Class was undeployed in SHARED or CONTINUOUS mode [cls=class c.a.f.s.a.ignite.IgnitePartialResultComputation, alias=c.a.f.s.a.ignite.IgnitePartialResultComputation] [12:55:43,704][INFO][node-stopper][GridDeploymentPerVersionStore] Class was undeployed in SHARED or CONTINUOUS mode [cls=class c.a.f.s.a.ignite.IgniteCacheService$IgniteComputeClusterGroupSizes, alias=c.a.f.s.a.ignite.IgniteCacheService$IgniteComputeClusterGroupSizes] [12:55:43,705][INFO][node-stopper][GridDeploymentLocalStore] Removed undeployed class: GridDeployment [ts=1603209014896, depMode=SHARED, clsLdr=jdk.internal.loader.ClassLoaders$AppClassLoader@6a2f6f80, clsLdrId=5f324b64571-3f58f4f5-bb5a-4650-91f1-ebc3e3a40dac, userVer=0, loc=true, sampleClsName=org.apache.ignite.internal.processors.continuous.GridContinuousProcessor, pendingUndeploy=false, undeployed=true, usage=0] [12:55:43,711][INFO][node-stopper][IgniteKernal] >>> +---------------------------------------------------------------------------------+ >>> Ignite ver. 
2.8.1#20200521-sha1:864220966caa4157c4fee8a1bc85171623963604 >>> stopped OK >>> +---------------------------------------------------------------------------------+ >>> Grid uptime: 9 days, 21:04:40.583