Hello,
We have one client node and one server node, and we are using Ignite
version 2.9.1.
Our application is scheduled to run the same jobs every day. It runs
without any errors for about two weeks, and then we start getting the
errors shown in the log below. This recurs roughly every two weeks.
I hope you can help me solve this problem. Thanks and best regards.
2021-02-14 02:07:34 WARN tcp-client-disco-reconnector-#7-#77756
TcpDiscoverySpi:576 - Failed to connect to any address from IP finder (will
retry to join topology every 2000 ms; change 'reconnectDelay' to configure
the frequency of retries): [/127.0.0.1:47500, /127.0.0.1:47501,
/127.0.0.1:47502, /127.0.0.1:47503, /127.0.0.1:47504, /127.0.0.1:47505,
/127.0.0.1:47506, /127.0.0.1:47507, /127.0.0.1:47508, /127.0.0.1:47509]
2021-02-14 02:07:37 INFO grid-timeout-worker-#206 IgniteKernal:566 -
Metrics for local node (to disable set 'metricsLogFrequency' to 0)
^-- Node [id=2fefd66f, uptime=4 days, 13:33:34.341]
^-- Cluster [hosts=1, CPUs=16, servers=1, clients=1, topVer=2,
minorTopVer=18985]
^-- Network [addrs=[10.86.26.180, 127.0.0.1], discoPort=0,
commPort=47101]
^-- CPU [CPUs=16, curLoad=1.07%, avgLoad=0.05%, GC=0.1%]
^-- Heap [used=865MB, free=92.96%, comm=12274MB]
^-- Off-heap memory [used=0MB, free=100%, allocated=0MB]
^-- Page memory [pages=0]
^-- sysMemPlc region [type=internal, persistence=false,
lazyAlloc=false,
... initCfg=40MB, maxCfg=100MB, usedRam=0MB, freeRam=100%,
allocRam=0MB]
^-- TxLog region [type=internal, persistence=false, lazyAlloc=false,
... initCfg=40MB, maxCfg=100MB, usedRam=0MB, freeRam=100%,
allocRam=0MB]
^-- Default_Region region [type=default, persistence=false,
lazyAlloc=true,
... initCfg=256MB, maxCfg=32768MB, usedRam=0MB, freeRam=100%,
allocRam=0MB]
^-- Outbound messages queue [size=0]
^-- Public thread pool [active=0, idle=0, qSize=0]
^-- System thread pool [active=0, idle=81, qSize=0]
2021-02-14 02:07:38 ERROR tcp-client-disco-sock-writer-#2-#230
TcpDiscoverySpi:586 - Failed to send message: null
java.io.IOException: Failed to get acknowledge for message:
TcpDiscoveryClientMetricsUpdateMessage [super=TcpDiscoveryAbstractMessage
[sndNodeId=null, id=1d467368771-2fefd66f-0954-45dd-aa32-a33e58567950,
verifierNodeId=null, topVer=0, pendingIdx=0, failedNodes=null,
isClient=true]]
at
org.apache.ignite.spi.discovery.tcp.ClientImpl$SocketWriter.body(ClientImpl.java:1471)
at org.apache.ignite.spi.IgniteSpiThread.run(IgniteSpiThread.java:58)
2021-02-14 02:07:44 WARN tcp-comm-worker-#1-#216 TcpCommunicationSpi:576 -
Handshake timed out (will stop attempts to perform the handshake)
[node=6953d599-d606-4781-a6ba-43de7aff59e4,
connTimeoutStrategy=ExponentialBackoffTimeoutStrategy [maxTimeout=600000,
totalTimeout=10000, startNanos=1671033974906026, currTimeout=600000],
err=Operation timed out [timeoutStrategy= ExponentialBackoffTimeoutStrategy
[maxTimeout=600000, totalTimeout=10000, startNanos=1671033974906026,
currTimeout=600000]], addr=/127.0.0.1:47100,
failureDetectionTimeoutEnabled=true, timeout=0]
2021-02-14 02:07:54 WARN tcp-comm-worker-#1-#216 TcpCommunicationSpi:576 -
Handshake timed out (will stop attempts to perform the handshake)
[node=6953d599-d606-4781-a6ba-43de7aff59e4,
connTimeoutStrategy=ExponentialBackoffTimeoutStrategy [maxTimeout=600000,
totalTimeout=10000, startNanos=1671044002786218, currTimeout=600000],
err=Operation timed out [timeoutStrategy= ExponentialBackoffTimeoutStrategy
[maxTimeout=600000, totalTimeout=10000, startNanos=1671044002786218,
currTimeout=600000]], addr=dwccatp01/10.86.26.180:47100,
failureDetectionTimeoutEnabled=true, timeout=0]
2021-02-14 02:08:06 ERROR grid-timeout-worker-#206 G:581 - Blocked
system-critical thread has been detected. This can lead to cluster-wide
undefined behaviour [workerName=tcp-comm-worker,
threadName=tcp-comm-worker-#1-#216, blockedFor=11s]
2021-02-14 02:08:06 WARN grid-timeout-worker-#206 root:576 - Possible
failure suppressed accordingly to a configured handler
[hnd=StopNodeOrHaltFailureHandler [tryStop=false, timeout=0,
super=AbstractFailureHandler [ignoredFailureTypes=UnmodifiableSet
[SYSTEM_WORKER_BLOCKED, SYSTEM_CRITICAL_OPERATION_TIMEOUT]]],
failureCtx=FailureContext [type=SYSTEM_WORKER_BLOCKED, err=class
o.a.i.IgniteException: GridWorker [name=tcp-comm-worker,
igniteInstanceName=null, finished=false, heartbeatTs=1613257674823]]]
class org.apache.ignite.IgniteException: GridWorker [name=tcp-comm-worker,
igniteInstanceName=null, finished=false, heartbeatTs=1613257674823]
at sun.misc.Unsafe.park(Native Method)
at java.util.concurrent.locks.LockSupport.park(LockSupport.java:304)
at
org.apache.ignite.internal.util.future.GridFutureAdapter.get0(GridFutureAdapter.java:178)
at
org.apache.ignite.internal.util.future.GridFutureAdapter.get(GridFutureAdapter.java:141)
at
org.apache.ignite.spi.discovery.tcp.ClientImpl.pingNode(ClientImpl.java:449)
at
org.apache.ignite.spi.discovery.tcp.TcpDiscoverySpi.pingNode(TcpDiscoverySpi.java:493)
at
org.apache.ignite.internal.managers.discovery.GridDiscoveryManager.pingNode(GridDiscoveryManager.java:1688)
at
org.apache.ignite.internal.managers.GridManagerAdapter$1.pingNode(GridManagerAdapter.java:409)
at
org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi$CommunicationWorker.processDisconnect(TcpCommunicationSpi.java:5165)
at
org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi$CommunicationWorker.body(TcpCommunicationSpi.java:4951)
at
org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:120)
at
org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi$5.body(TcpCommunicationSpi.java:2503)
at org.apache.ignite.spi.IgniteSpiThread.run(IgniteSpiThread.java:58)
[02:08:06] Possible failure suppressed accordingly to a configured handler
[hnd=StopNodeOrHaltFailureHandler [tryStop=false, timeout=0,
super=AbstractFailureHandler [ignoredFailureTypes=UnmodifiableSet
[SYSTEM_WORKER_BLOCKED, SYSTEM_CRITICAL_OPERATION_TIMEOUT]]],
failureCtx=FailureContext [type=SYSTEM_WORKER_BLOCKED, err=class
o.a.i.IgniteException: GridWorker [name=tcp-comm-worker,
igniteInstanceName=null, finished=false, heartbeatTs=1613257674823]]]
2021-02-14 02:08:07 WARN grid-timeout-worker-#206
CacheDiagnosticManager:571 - Page locks dump:
2021-02-14 02:08:16 ERROR grid-timeout-worker-#206 G:581 - Blocked
system-critical thread has been detected. This can lead to cluster-wide
undefined behaviour [workerName=tcp-comm-worker,
threadName=tcp-comm-worker-#1-#216, blockedFor=21s]
2021-02-14 02:08:16 WARN grid-timeout-worker-#206 root:576 - Possible
failure suppressed accordingly to a configured handler
[hnd=StopNodeOrHaltFailureHandler [tryStop=false, timeout=0,
super=AbstractFailureHandler [ignoredFailureTypes=UnmodifiableSet
[SYSTEM_WORKER_BLOCKED, SYSTEM_CRITICAL_OPERATION_TIMEOUT]]],
failureCtx=FailureContext [type=SYSTEM_WORKER_BLOCKED, err=class
o.a.i.IgniteException: GridWorker [name=tcp-comm-worker,
igniteInstanceName=null, finished=false, heartbeatTs=1613257674823]]]
class org.apache.ignite.IgniteException: GridWorker [name=tcp-comm-worker,
igniteInstanceName=null, finished=false, heartbeatTs=1613257674823]
at sun.misc.Unsafe.park(Native Method)
at java.util.concurrent.locks.LockSupport.park(LockSupport.java:304)
at
org.apache.ignite.internal.util.future.GridFutureAdapter.get0(GridFutureAdapter.java:178)
at
org.apache.ignite.internal.util.future.GridFutureAdapter.get(GridFutureAdapter.java:141)
at
org.apache.ignite.spi.discovery.tcp.ClientImpl.pingNode(ClientImpl.java:449)
at
org.apache.ignite.spi.discovery.tcp.TcpDiscoverySpi.pingNode(TcpDiscoverySpi.java:493)
at
org.apache.ignite.internal.managers.discovery.GridDiscoveryManager.pingNode(GridDiscoveryManager.java:1688)
at
org.apache.ignite.internal.managers.GridManagerAdapter$1.pingNode(GridManagerAdapter.java:409)
at
org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi$CommunicationWorker.processDisconnect(TcpCommunicationSpi.java:5165)
at
org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi$CommunicationWorker.body(TcpCommunicationSpi.java:4951)
at
org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:120)
at
org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi$5.body(TcpCommunicationSpi.java:2503)
at org.apache.ignite.spi.IgniteSpiThread.run(IgniteSpiThread.java:58)
[02:08:16] Possible failure suppressed accordingly to a configured handler
[hnd=StopNodeOrHaltFailureHandler [tryStop=false, timeout=0,
super=AbstractFailureHandler [ignoredFailureTypes=UnmodifiableSet
[SYSTEM_WORKER_BLOCKED, SYSTEM_CRITICAL_OPERATION_TIMEOUT]]],
failureCtx=FailureContext [type=SYSTEM_WORKER_BLOCKED, err=class
o.a.i.IgniteException: GridWorker [name=tcp-comm-worker,
igniteInstanceName=null, finished=false, heartbeatTs=1613257674823]]]
2021-02-14 02:08:16 WARN grid-timeout-worker-#206
CacheDiagnosticManager:571 - Page locks dump:
2021-02-14 02:08:28 ERROR grid-timeout-worker-#206 G:581 - Blocked
system-critical thread has been detected. This can lead to cluster-wide
undefined behaviour [workerName=tcp-comm-worker,
threadName=tcp-comm-worker-#1-#216, blockedFor=33s]
2021-02-14 02:08:28 WARN grid-timeout-worker-#206 root:576 - Possible
failure suppressed accordingly to a configured handler
[hnd=StopNodeOrHaltFailureHandler [tryStop=false, timeout=0,
super=AbstractFailureHandler [ignoredFailureTypes=UnmodifiableSet
[SYSTEM_WORKER_BLOCKED, SYSTEM_CRITICAL_OPERATION_TIMEOUT]]],
failureCtx=FailureContext [type=SYSTEM_WORKER_BLOCKED, err=class
o.a.i.IgniteException: GridWorker [name=tcp-comm-worker,
igniteInstanceName=null, finished=false, heartbeatTs=1613257674823]]]
class org.apache.ignite.IgniteException: GridWorker [name=tcp-comm-worker,
igniteInstanceName=null, finished=false, heartbeatTs=1613257674823]
at sun.misc.Unsafe.park(Native Method)
at java.util.concurrent.locks.LockSupport.park(LockSupport.java:304)
at
org.apache.ignite.internal.util.future.GridFutureAdapter.get0(GridFutureAdapter.java:178)
at
org.apache.ignite.internal.util.future.GridFutureAdapter.get(GridFutureAdapter.java:141)
at
org.apache.ignite.spi.discovery.tcp.ClientImpl.pingNode(ClientImpl.java:449)
at
org.apache.ignite.spi.discovery.tcp.TcpDiscoverySpi.pingNode(TcpDiscoverySpi.java:493)
at
org.apache.ignite.internal.managers.discovery.GridDiscoveryManager.pingNode(GridDiscoveryManager.java:1688)
at
org.apache.ignite.internal.managers.GridManagerAdapter$1.pingNode(GridManagerAdapter.java:409)
at
org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi$CommunicationWorker.processDisconnect(TcpCommunicationSpi.java:5165)
at
org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi$CommunicationWorker.body(TcpCommunicationSpi.java:4951)
at
org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:120)
at
org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi$5.body(TcpCommunicationSpi.java:2503)
at org.apache.ignite.spi.IgniteSpiThread.run(IgniteSpiThread.java:58)
[02:08:28] Possible failure suppressed accordingly to a configured handler
[hnd=StopNodeOrHaltFailureHandler [tryStop=false, timeout=0,
super=AbstractFailureHandler [ignoredFailureTypes=UnmodifiableSet
[SYSTEM_WORKER_BLOCKED, SYSTEM_CRITICAL_OPERATION_TIMEOUT]]],
failureCtx=FailureContext [type=SYSTEM_WORKER_BLOCKED, err=class
o.a.i.IgniteException: GridWorker [name=tcp-comm-worker,
igniteInstanceName=null, finished=false, heartbeatTs=1613257674823]]]
2021-02-14 02:08:28 WARN grid-timeout-worker-#206
CacheDiagnosticManager:571 - Page locks dump:
2021-02-14 02:08:32 WARN http-nio-8082-exec-5 TcpCommunicationSpi:576 -
Handshake timed out (will stop attempts to perform the handshake)
[node=6953d599-d606-4781-a6ba-43de7aff59e4,
connTimeoutStrategy=ExponentialBackoffTimeoutStrategy [maxTimeout=600000,
totalTimeout=10000, startNanos=1671081715938786, currTimeout=600000],
err=Operation timed out [timeoutStrategy= ExponentialBackoffTimeoutStrategy
[maxTimeout=600000, totalTimeout=10000, startNanos=1671081715938786,
currTimeout=600000]], addr=/127.0.0.1:47100,
failureDetectionTimeoutEnabled=true, timeout=0]
2021-02-14 02:08:37 ERROR grid-timeout-worker-#206 G:581 - Blocked
system-critical thread has been detected. This can lead to cluster-wide
undefined behaviour [workerName=tcp-comm-worker,
threadName=tcp-comm-worker-#1-#216, blockedFor=42s]
2021-02-14 02:08:37 WARN grid-timeout-worker-#206 root:576 - Possible
failure suppressed accordingly to a configured handler
[hnd=StopNodeOrHaltFailureHandler [tryStop=false, timeout=0,
super=AbstractFailureHandler [ignoredFailureTypes=UnmodifiableSet
[SYSTEM_WORKER_BLOCKED, SYSTEM_CRITICAL_OPERATION_TIMEOUT]]],
failureCtx=FailureContext [type=SYSTEM_WORKER_BLOCKED, err=class
o.a.i.IgniteException: GridWorker [name=tcp-comm-worker,
igniteInstanceName=null, finished=false, heartbeatTs=1613257674823]]]
class org.apache.ignite.IgniteException: GridWorker [name=tcp-comm-worker,
igniteInstanceName=null, finished=false, heartbeatTs=1613257674823]
at sun.misc.Unsafe.park(Native Method)
at java.util.concurrent.locks.LockSupport.park(LockSupport.java:304)
at
org.apache.ignite.internal.util.future.GridFutureAdapter.get0(GridFutureAdapter.java:178)
at
org.apache.ignite.internal.util.future.GridFutureAdapter.get(GridFutureAdapter.java:141)
at
org.apache.ignite.spi.discovery.tcp.ClientImpl.pingNode(ClientImpl.java:449)
at
org.apache.ignite.spi.discovery.tcp.TcpDiscoverySpi.pingNode(TcpDiscoverySpi.java:493)
at
org.apache.ignite.internal.managers.discovery.GridDiscoveryManager.pingNode(GridDiscoveryManager.java:1688)
at
org.apache.ignite.internal.managers.GridManagerAdapter$1.pingNode(GridManagerAdapter.java:409)
at
org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi$CommunicationWorker.processDisconnect(TcpCommunicationSpi.java:5165)
at
org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi$CommunicationWorker.body(TcpCommunicationSpi.java:4951)
at
org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:120)
at
org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi$5.body(TcpCommunicationSpi.java:2503)
at org.apache.ignite.spi.IgniteSpiThread.run(IgniteSpiThread.java:58)
[02:08:37] Possible failure suppressed accordingly to a configured handler
[hnd=StopNodeOrHaltFailureHandler [tryStop=false, timeout=0,
super=AbstractFailureHandler [ignoredFailureTypes=UnmodifiableSet
[SYSTEM_WORKER_BLOCKED, SYSTEM_CRITICAL_OPERATION_TIMEOUT]]],
failureCtx=FailureContext [type=SYSTEM_WORKER_BLOCKED, err=class
o.a.i.IgniteException: GridWorker [name=tcp-comm-worker,
igniteInstanceName=null, finished=false, heartbeatTs=1613257674823]]]
2021-02-14 02:08:37 WARN grid-timeout-worker-#206
CacheDiagnosticManager:571 - Page locks dump:
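
As a reading aid for the log above, here is a minimal Python sketch (not part of our application). It assumes that heartbeatTs is milliseconds since the Unix epoch, and it pulls the blockedFor values out of the log text. The function names are made up for illustration. The result is in UTC; the log does not state its time zone, so matching it against the 02:07:54 handshake line relies on the assumption that the log timestamps are local time at UTC+3.

```python
import re
from datetime import datetime, timedelta, timezone
from typing import List

# Value copied from the GridWorker lines in the log above.
# Assumption: it is milliseconds since the Unix epoch.
HEARTBEAT_TS_MS = 1613257674823


def heartbeat_utc(ts_ms: int) -> datetime:
    """Convert epoch milliseconds to a timezone-aware UTC datetime."""
    # Integer arithmetic avoids float rounding on the millisecond part.
    seconds, millis = divmod(ts_ms, 1000)
    base = datetime.fromtimestamp(seconds, tz=timezone.utc)
    return base + timedelta(milliseconds=millis)


def blocked_seconds(log_text: str) -> List[int]:
    """Return every 'blockedFor=<n>s' value found in the log text, in order."""
    return [int(n) for n in re.findall(r"blockedFor=(\d+)s", log_text)]


if __name__ == "__main__":
    print(heartbeat_utc(HEARTBEAT_TS_MS).isoformat())
    sample = "threadName=tcp-comm-worker-#1-#216, blockedFor=11s]"
    print(blocked_seconds(sample))
```

The same heartbeatTs appears in every trace while blockedFor keeps growing (11s, 21s, 33s, 42s), which suggests the tcp-comm-worker thread has not updated its heartbeat since that single moment.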
--
Sent from: http://apache-ignite-users.70518.x6.nabble.com/