Mikhail Petrov created IGNITE-24945:
---------------------------------------
Summary: Node failure with
java.nio.channels.CancelledKeyException
Key: IGNITE-24945
URL: https://issues.apache.org/jira/browse/IGNITE-24945
Project: Ignite
Issue Type: Bug
Reporter: Mikhail Petrov
Exception:
{code:java}
2025-02-18 13:46:36.468 [ERROR][tcp-comm-worker-#1-#138][] Critical system
error detected. Will be handled accordingly to configured handler
[hnd=StopNodeOrHaltFailureHandler [tryStop=false, timeout=0,
super=AbstractFailureHandler [ignoredFailureTypes=UnmodifiableSet
[SYSTEM_WORKER_BLOCKED, SYSTEM_CRITICAL_OPERATION_TIMEOUT]]],
failureCtx=FailureContext [type=SYSTEM_WORKER_TERMINATION,
err=java.nio.channels.CancelledKeyException]]
java.nio.channels.CancelledKeyException: null
at
java.base/sun.nio.ch.SelectionKeyImpl.ensureValid(SelectionKeyImpl.java:71)
~[?:?]
at
java.base/sun.nio.ch.SelectionKeyImpl.interestOps(SelectionKeyImpl.java:90)
~[?:?]
at
org.apache.ignite.internal.util.nio.GridNioServer$AbstractNioClientWorker.registerWrite(GridNioServer.java:2352)
~[classes/:?]
at
org.apache.ignite.internal.util.nio.GridNioServer$HeadFilter.onSessionWrite(GridNioServer.java:3738)
~[classes/:?]
at
org.apache.ignite.internal.util.nio.GridNioFilterAdapter.proceedSessionWrite(GridNioFilterAdapter.java:121)
~[classes/:?]
at
org.apache.ignite.internal.util.nio.ssl.GridNioSslHandler.writeNetBuffer(GridNioSslHandler.java:502)
~[classes/:?]
at
org.apache.ignite.internal.util.nio.ssl.GridNioSslFilter.shutdownSession(GridNioSslFilter.java:454)
~[classes/:?]
at
org.apache.ignite.internal.util.nio.ssl.GridNioSslFilter.onSessionClose(GridNioSslFilter.java:434)
~[classes/:?]
at
org.apache.ignite.internal.util.nio.GridNioFilterAdapter.proceedSessionClose(GridNioFilterAdapter.java:128)
~[classes/:?]
at
org.apache.ignite.internal.util.nio.GridConnectionBytesVerifyFilter.onSessionClose(GridConnectionBytesVerifyFilter.java:138)
~[classes/:?]
at
org.apache.ignite.internal.util.nio.GridNioFilterAdapter.proceedSessionClose(GridNioFilterAdapter.java:128)
~[classes/:?]
at
org.apache.ignite.internal.util.nio.GridNioCodecFilter.onSessionClose(GridNioCodecFilter.java:137)
~[classes/:?]
at
org.apache.ignite.internal.util.nio.GridNioFilterAdapter.proceedSessionClose(GridNioFilterAdapter.java:128)
~[classes/:?]
at
org.apache.ignite.internal.util.nio.GridNioFilterChain$TailFilter.onSessionClose(GridNioFilterChain.java:274)
~[classes/:?]
at
org.apache.ignite.internal.util.nio.GridNioFilterChain.onSessionClose(GridNioFilterChain.java:203)
~[classes/:?]
at
org.apache.ignite.internal.util.nio.GridNioSessionImpl.close(GridNioSessionImpl.java:169)
~[classes/:?]
at
org.apache.ignite.internal.util.nio.GridSelectorNioSessionImpl.close(GridSelectorNioSessionImpl.java:498)
~[classes/:?]
at
org.apache.ignite.internal.util.nio.GridTcpNioCommunicationClient.close(GridTcpNioCommunicationClient.java:81)
~[classes/:?]
at
org.apache.ignite.spi.communication.tcp.internal.CommunicationWorker.processIdle(CommunicationWorker.java:278)
~[classes/:?]
at
org.apache.ignite.spi.communication.tcp.internal.CommunicationWorker.body(CommunicationWorker.java:176)
~[classes/:?]
at
org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:125)
~[classes/:?]
at
org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi$3.body(TcpCommunicationSpi.java:848)
~[classes/:?]
at org.apache.ignite.spi.IgniteSpiThread.run(IgniteSpiThread.java:58)
~[classes/:?]
{code}
Reason:
Multiple threads can access the same SelectionKey concurrently. In this
particular case they are tcp-comm-worker and grid-nio-worker:
1. tcp-comm-worker checks that the key has not been cancelled by invoking
SelectionKey#isValid().
2. grid-nio-worker cancels (closes) the SelectionKey.
3. tcp-comm-worker invokes SelectionKey#interestOps, which throws
CancelledKeyException. The exception terminates the worker, which in turn
leads to the node failure.
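
The check-then-act sequence above can be replayed deterministically in a single thread with plain java.nio. The sketch below is only an illustration, not Ignite code and not a proposed patch: the class and method names (CancelledKeyRaceSketch, replayRace, tryRegisterWrite) are hypothetical, and a Pipe stands in for the real socket channel. The call to key.cancel() plays the role of grid-nio-worker. The second method shows one possible way to tolerate the race, by treating CancelledKeyException as "session already closed"; whether that is the right fix for GridNioServer is an assumption.

```java
import java.io.IOException;
import java.nio.channels.CancelledKeyException;
import java.nio.channels.Pipe;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;

final class CancelledKeyRaceSketch {
    /** Replays steps 1-3 in order; returns true if interestOps threw CancelledKeyException. */
    static boolean replayRace() throws IOException {
        try (Selector selector = Selector.open()) {
            Pipe pipe = Pipe.open();
            try {
                pipe.sink().configureBlocking(false);
                SelectionKey key = pipe.sink().register(selector, 0);

                // Step 1 (tcp-comm-worker): the key still looks valid.
                boolean valid = key.isValid();

                // Step 2 (stand-in for grid-nio-worker): the key is cancelled.
                key.cancel();

                // Step 3 (tcp-comm-worker): acts on the stale result of step 1.
                try {
                    if (valid)
                        key.interestOps(SelectionKey.OP_WRITE);
                    return false;
                } catch (CancelledKeyException e) {
                    return true;
                }
            } finally {
                pipe.sink().close();
                pipe.source().close();
            }
        }
    }

    /**
     * Hypothetical guard: an isValid() pre-check cannot close the window,
     * so the exception itself is handled and reported as "not registered".
     */
    static boolean tryRegisterWrite(SelectionKey key) {
        try {
            key.interestOps(key.interestOps() | SelectionKey.OP_WRITE);
            return true;
        } catch (CancelledKeyException e) {
            return false; // Key was cancelled concurrently; nothing to write to.
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println("race reproduced: " + replayRace());
    }
}
```

The point of the sketch is that SelectionKey#isValid() gives no guarantee about the next call on the same key, so any caller that may run concurrently with a cancel has to be prepared for CancelledKeyException regardless of the pre-check.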