Hi Apache MINA team,
I'm seeking advice on how to handle an apparent race condition encountered
using MINA 2.0.27: whether my analysis is correct, and whether this is a known
issue (I haven't identified an open one in
https://issues.apache.org/jira/projects/DIRMINA/issues/DIRMINA-1186?filter=allopenissues).
Issue Description
Our application server uses MINA for NIO between a data cache and several
client applications. Recently we have seen an increased incidence of the
following warning-level message in the data cache logs:
Create a new selector. Selected is 0, delta = 0
I found that this is logged from AbstractPollingIoProcessor and is accompanied
by a comment block explaining that there is an epoll race condition that can
cause file descriptors not to be considered available, so the selector must be
closed and a new one registered in its place to prevent 100% CPU use. It's not
entirely clear whether that issue lies in MINA, the JDK or the operating
system, but either way the workaround seems fine, and most of our application
instances encounter no further problem.
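To check my understanding, the workaround as I read it has roughly the
following shape (my own sketch of the general pattern only, not MINA's actual
code):

import java.io.IOException;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;

// Sketch only: open a fresh selector, re-register the keys from the suspect
// selector with their interest ops and attachments, then close the old one.
final class SelectorRebuildSketch {
    static Selector rebuild(Selector oldSelector) throws IOException {
        Selector newSelector = Selector.open();
        for (SelectionKey key : oldSelector.keys()) {
            if (key.isValid()) {
                // A channel may be registered with more than one selector, so it
                // can be registered with the replacement before the old selector
                // is closed.
                key.channel().register(newSelector, key.interestOps(), key.attachment());
            }
        }
        // Closing the old selector cancels all of its remaining keys.
        oldSelector.close();
        return newSelector;
    }
}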
On one production instance of our application it looks like a second race
condition occurs when the selector is closed and replaced. The more frequently
the "Create a new selector. Selected is 0, delta = 0" message appears, the more
likely we are to encounter a ClosedSelectorException on another thread. Full
stack trace, with the log messages immediately before and after:
WARN 2025-09-10 21:58:29,212 [NioProcessor-2]-service.IoProcessor: Create a new selector. Selected is 0, delta = 0
WARN 2025-09-10 21:58:29,220 [pool-3-thread-2]-server.DsrvServerIoHandler: null
java.nio.channels.ClosedSelectorException: null
    at sun.nio.ch.EPollSelectorImpl.ensureOpen(EPollSelectorImpl.java:98) ~[?:?]
    at sun.nio.ch.EPollSelectorImpl.setEventOps(EPollSelectorImpl.java:243) ~[?:?]
    at sun.nio.ch.SelectionKeyImpl.interestOps(SelectionKeyImpl.java:101) ~[?:?]
    at org.apache.mina.transport.socket.nio.NioProcessor.setInterestedInWrite(NioProcessor.java:371) ~[mina-core-2.0.27.jar:?]
    at org.apache.mina.transport.socket.nio.NioProcessor.setInterestedInWrite(NioProcessor.java:47) ~[mina-core-2.0.27.jar:?]
    at org.apache.mina.core.polling.AbstractPollingIoProcessor.updateTrafficControl(AbstractPollingIoProcessor.java:585) [mina-core-2.0.27.jar:?]
    at org.apache.mina.core.polling.AbstractPollingIoProcessor.updateTrafficControl(AbstractPollingIoProcessor.java:68) [mina-core-2.0.27.jar:?]
    at org.apache.mina.core.service.SimpleIoProcessorPool.updateTrafficControl(SimpleIoProcessorPool.java:294) [mina-core-2.0.27.jar:?]
    at org.apache.mina.core.service.SimpleIoProcessorPool.updateTrafficControl(SimpleIoProcessorPool.java:80) [mina-core-2.0.27.jar:?]
    at org.apache.mina.core.session.AbstractIoSession.resumeRead(AbstractIoSession.java:748) [mina-core-2.0.27.jar:?]
    at j4sf.connect.dsrv.endpoint.IoSessionEndPoint.resumeRead(IoSessionEndPoint.java:73) [DataServer-11.0.2.jar:?]
    at j4sf.connect.dsrv.server.SerialExecutor.scheduleNext(SerialExecutor.java:94) [DataServer-11.0.2.jar:?]
    at j4sf.connect.dsrv.server.SerialExecutor$LocalTask.run(SerialExecutor.java:140) [DataServer-11.0.2.jar:?]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) [?:?]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) [?:?]
    at java.lang.Thread.run(Thread.java:829) [?:?]
INFO 2025-09-10 21:58:29,235 [NioProcessor-2]-server.DsrvServerIoHandler: Session Closed
Our code in j4sf.connect.dsrv.server.SerialExecutor and
j4sf.connect.dsrv.endpoint.IoSessionEndPoint handles reading incoming messages
on the IoSession. There is a message processing queue with a high and a low
watermark. At some point we exceeded the high watermark and called
suspendRead() on the IoSession. The ClosedSelectorException above was thrown
when we called ioSession.resumeRead().
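For clarity, the flow control is roughly like this (a simplified illustration,
not our exact code; the class name, queue and watermark values below are made
up):

import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;
import org.apache.mina.core.session.IoSession;

// Simplified back-pressure handling: suspend reads above the high watermark,
// resume them again once the backlog drains below the low watermark.
final class WatermarkFlowControl {
    private static final int HIGH_WATERMARK = 1000;
    private static final int LOW_WATERMARK = 100;

    private final Queue<Object> pending = new ConcurrentLinkedQueue<>();

    void onMessageReceived(IoSession session, Object message) {
        pending.add(message);
        if (pending.size() >= HIGH_WATERMARK) {
            session.suspendRead();   // stop MINA reading from this session
        }
    }

    void onMessageProcessed(IoSession session) {
        pending.poll();
        if (pending.size() <= LOW_WATERMARK) {
            session.resumeRead();    // this is the call that hit ClosedSelectorException
        }
    }
}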
Our DsrvServerIoHandler extends org.apache.mina.core.service.IoHandlerAdapter;
the exception is caught where we override
IoHandlerAdapter.exceptionCaught(IoSession, Throwable). At that point we log
the exception and close the session, which produces the final log message above
once the session has closed.
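In outline, the handler does something like the following (a simplified sketch
rather than the real DsrvServerIoHandler; the logging framework shown is an
assumption):

import org.apache.mina.core.service.IoHandlerAdapter;
import org.apache.mina.core.session.IoSession;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

// Simplified: log the throwable and close the session, which is what produces
// the WARN ("null") and the later "Session Closed" INFO in the log excerpt above.
public class DsrvServerIoHandler extends IoHandlerAdapter {
    private static final Logger LOG = LoggerFactory.getLogger(DsrvServerIoHandler.class);

    @Override
    public void exceptionCaught(IoSession session, Throwable cause) {
        LOG.warn(cause.getMessage(), cause); // getMessage() is null for ClosedSelectorException
        session.closeNow();
    }

    @Override
    public void sessionClosed(IoSession session) {
        LOG.info("Session Closed");
    }
}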
We've been unable to reproduce this on a hardware/software platform under our
control.
I don't see any mention of ClosedSelectorException in your mail archive since
the handling of them was fixed in DIRMINA-978
(https://issues.apache.org/jira/browse/DIRMINA-978).
Questions:
1. Could this be a race condition in which resumeRead() causes the NioProcessor
to retrieve the old selection key just before the selector is closed, and then
attempt to set its interest ops after the close?
2. The IoSession is already suspended when this happens - is there any reason
inherent in MINA why we shouldn't ignore this exception, at least once, and
simply attempt resumeRead() again (see the rough sketch after these questions)?
If I could reproduce the issue outside prod, I would of course try this myself.
If not, do you have any other advice on how the issue might be worked around?
3. Might upgrading to MINA 2.1 or 2.2 help? I see nothing obvious in the
release notes that suggests it would. We have already upgraded from 2.0.21 to
2.0.27, which seems to have reduced the frequency of the issue - perhaps due to
the inclusion of DIRMINA-1169
(https://issues.apache.org/jira/browse/DIRMINA-1169) in 2.0.24.
4. Finally, do you know anything about the likely causes of the initial epoll
race condition the MINA code is working around? If it's caused by Java, is
there an OpenJDK issue open related to it? I couldn't find anything that seems
to match. FWIW the system the issue occurs on is running RHEL 9.5 and the JDK
is also Red Hat's, Java 11: Red_Hat-11.0.17.0.8-2.el7openjdkportable
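For question 2, the workaround I have in mind is roughly the following
(untested, since I can't reproduce the problem outside production; the single
blind retry is purely an assumption on my part):

import java.nio.channels.ClosedSelectorException;
import org.apache.mina.core.session.IoSession;

// Untested idea: swallow one ClosedSelectorException from resumeRead() and retry,
// on the assumption that the processor has registered its replacement selector
// by the time the retry runs.
final class ResumeReadRetry {
    static void resumeReadOrRetryOnce(IoSession session) {
        try {
            session.resumeRead();
        } catch (ClosedSelectorException first) {
            session.resumeRead(); // if this also fails, fall back to closing the session
        }
    }
}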
Thanks!
Adam