On 26/07/2017 at 13:59, Christoph John wrote:
> Hi,
>
> I am a developer and maintainer of the QuickFIX/J project
> (https://github.com/quickfix-j/quickfixj) and I have a question
> regarding NioSocketConnectors.
>
> We are facing a problem when there is a process that constantly (every
> 30 seconds) tries to connect to a counterparty and the connection is
> established but dropped shortly after. Then sometimes the
> NioProcessors/NioSocketConnectors are not cleaned up properly. In the
> stack trace we see them hanging in a call to dispose:
>
> "NioProcessor-1140" #239 prio=5 os_prio=0 tid=0x0000000001fe1800 nid=0x2523 runnable [0x00007f9c67e8f000]
>    java.lang.Thread.State: RUNNABLE
>         at sun.nio.ch.EPollArrayWrapper.epollWait(Native Method)
>         at sun.nio.ch.EPollArrayWrapper.poll(EPollArrayWrapper.java:269)
>         at sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:93)
>         at sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:86)
>         - locked <0x00000000f6699e60> (a sun.nio.ch.Util$3)
>         - locked <0x00000000f6699e50> (a java.util.Collections$UnmodifiableSet)
>         - locked <0x00000000f6699c18> (a sun.nio.ch.EPollSelectorImpl)
>         at sun.nio.ch.SelectorImpl.select(SelectorImpl.java:97)
>         at org.apache.mina.transport.socket.nio.NioProcessor.select(NioProcessor.java:98)
>         at org.apache.mina.core.polling.AbstractPollingIoProcessor$Processor.run(AbstractPollingIoProcessor.java:1075)
>         at org.apache.mina.util.NamePreservingRunnable.run(NamePreservingRunnable.java:64)
>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>         at java.lang.Thread.run(Thread.java:748)
>
> "NioSocketConnector-68" #238 prio=5 os_prio=0 tid=0x00007f9c70caf000 nid=0x2522 in Object.wait() [0x00007f9c6af9f000]
>    java.lang.Thread.State: TIMED_WAITING (on object monitor)
>         at java.lang.Object.wait(Native Method)
>         at org.apache.mina.core.future.DefaultIoFuture.await0(DefaultIoFuture.java:209)
>         - locked <0x00000000f66ac718> (a org.apache.mina.core.future.DefaultIoFuture)
>         at org.apache.mina.core.future.DefaultIoFuture.awaitUninterruptibly(DefaultIoFuture.java:141)
>         at org.apache.mina.core.polling.AbstractPollingIoProcessor.dispose(AbstractPollingIoProcessor.java:188)
>         at org.apache.mina.core.service.SimpleIoProcessorPool.dispose(SimpleIoProcessorPool.java:329)
>         - locked <0x00000000f66ac750> (a java.lang.Object)
>         at org.apache.mina.core.polling.AbstractPollingIoConnector$Connector.run(AbstractPollingIoConnector.java:582)
>         at org.apache.mina.util.NamePreservingRunnable.run(NamePreservingRunnable.java:64)
>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>         at java.lang.Thread.run(Thread.java:748)
>
> It does not happen very often: about 5% of the connection attempts
> leave a NioSocketConnector hanging.
> It only seems to happen though when the connection is disconnected by
> "javax.net.ssl.SSLHandshakeException: SSL handshake failed". Although
> there are cases when there is no leak even on an SSLHandshakeException.
> If the connection was reset "normally" by "java.io.IOException:
> Connection reset by peer" then the leak does not seem to occur. It
> also does not occur when the connection is refused right away.
>
> Since this seems to be related to SSL connections: is there something
> that we need to take care of when using the SSL filter?
>
> The code for the IoSessionInitiator can be found here:
> https://github.com/quickfix-j/quickfixj/blob/master/quickfixj-core/src/main/java/quickfix/mina/initiator/IoSessionInitiator.java
> I have added some comments in this gist (starting with "chrjohn"):
> https://gist.github.com/chrjohn/2671f06d80e8d917d9061b573477ec5b
>
> I cannot rule out that we might be doing something wrong here, so any
> pointer is appreciated. :)

I see in your code that you wait 2s for the connection to be established and, if this timeout is reached, you try again, up to the point where you bail out. In this case, the connection is not cleaned up, AFAICT. Is that correct?

OTOH, it does not necessarily make a lot of sense to poll the connector: as MINA is fully asynchronous, you'll be informed when the connection is established, and if not, you can use the idle event to know that your connection is idling (an idle event is generated every second, so waiting for, say, 30 idle events will let you manage a 30s timeout, for instance). If your connection idles for too long, simply dispose of it.

> Thanks in advance for your help and best regards,
> Chris.

-- 
Emmanuel Lecharny

Symas.com
directory.apache.org
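
To make the idle-event suggestion above concrete, here is a minimal sketch in plain Java of the counting logic only. `IdleTimeoutTracker` is a hypothetical helper, not part of MINA or QuickFIX/J. The assumption is that `onIdle()` would be called from the handler's `sessionIdle(IoSession, IdleStatus)` callback, and that the caller closes the session and disposes of the connector once it returns true.

```java
import java.util.concurrent.atomic.AtomicInteger;

/**
 * Hypothetical helper (not a MINA class): counts idle notifications and
 * reports when the configured number of them has been reached.
 */
public class IdleTimeoutTracker {
    private final int maxIdleEvents;
    private final AtomicInteger idleEvents = new AtomicInteger();

    /** @param maxIdleEvents e.g. 30 for a 30s timeout if one idle event fires per second */
    public IdleTimeoutTracker(int maxIdleEvents) {
        if (maxIdleEvents <= 0) {
            throw new IllegalArgumentException("maxIdleEvents must be positive");
        }
        this.maxIdleEvents = maxIdleEvents;
    }

    /**
     * Assumed to be called from the handler's idle callback.
     * Returns true once the limit is reached; the caller would then
     * close the session and dispose of the connector.
     */
    public boolean onIdle() {
        return idleEvents.incrementAndGet() >= maxIdleEvents;
    }

    /** Assumed to be called when traffic is seen, to restart the countdown. */
    public void onActivity() {
        idleEvents.set(0);
    }

    /** Number of idle events seen since the last activity. */
    public int idleCount() {
        return idleEvents.get();
    }
}
```

With a one-second idle interval, `new IdleTimeoutTracker(30)` would approximate the 30s timeout mentioned above; the actual interval depends on how the session's idle time is configured.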