Which version of MINA are you using, or are you building from Git? Please pull tag/2.0.16 from Git and apply the attached patch. Let me know if that fixes your problem. Sorry about the excess changes in the patch; the Java code formatter made a lot of changes. If this works, we can create a JIRA bug.

On Tue, Oct 10, 2017 at 4:49 AM, Christoph John <christoph.j...@macd.com> wrote:

> Hi,
>
> thanks for your reply.
> In fact it is hanging forever, i.e. until the process stops. I have
> attached the original message I sent to the mailing list. It only
> occurs sometimes for SSL connections with a failing handshake.
> Unfortunately I have no reproducible example for MINA itself. I could
> probably put something together for QuickFIX/J (the open source project I
> am working on).
>
> My OS is Ubuntu 14.04.5 with JDK1.8_144. The problem does not appear very
> often on my machine but almost every time on the TravisCI build server
> (https://travis-ci.org/quickfix-j/quickfixj/builds/283210509). As a
> result, some of the SSL-related tests are failing. TravisCI has a nearly
> identical setup with JDK1.8_144 and Debian Linux.
>
> What would be a good starting point to create a test? I see that there is
> an SslTest in the mina-core module. So I probably have to change that test
> to repeatedly connect and get a handshake exception every time and then take
> a number of stack traces.
>
> Thanks,
> Chris.
>
> On 09/10/17 14:51, Jonathan Valliere wrote:
>
> What OS / Java Version / etc; Do you have a reproducible example?
>
> On Mon, Oct 9, 2017 at 8:34 AM, Jonathan Valliere <jon.valli...@emoten.com> wrote:
>
>> Let me know if it's hanging more than 1s
>>
>> On Mon, Oct 9, 2017 at 5:08 AM, Christoph John <christoph.j...@macd.com>
>> wrote:
>>
>>> Hi,
>>>
>>> I have another question regarding this one. There is
>>> https://issues.apache.org/jira/browse/DIRMINA-1060 which also sounds a
>>> little like the problem I'm having. When the connectors are hanging in the
>>> call to dispose(), there is always an accompanying NioProcessor which
>>> is hanging in select().
>>>
>>> Example:
>>> "NioProcessor-60" #100328 prio=5 os_prio=0 tid=0x00007f2a10003000 nid=0x2e71 runnable [0x00007f2a388b1000]
>>>    java.lang.Thread.State: RUNNABLE
>>>         at sun.nio.ch.EPollArrayWrapper.epollWait(Native Method)
>>>         at sun.nio.ch.EPollArrayWrapper.poll(EPollArrayWrapper.java:269)
>>>         at sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:93)
>>>         at sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:86)
>>>         - locked <0x00000000e239c118> (a sun.nio.ch.Util$3)
>>>         - locked <0x00000000e239c108> (a java.util.Collections$UnmodifiableSet)
>>>         - locked <0x00000000e239bed0> (a sun.nio.ch.EPollSelectorImpl)
>>>         at sun.nio.ch.SelectorImpl.select(SelectorImpl.java:97)
>>>         at org.apache.mina.transport.socket.nio.NioProcessor.select(NioProcessor.java:98)
>>>         at org.apache.mina.core.polling.AbstractPollingIoProcessor$Processor.run(AbstractPollingIoProcessor.java:1075)
>>>         at org.apache.mina.util.NamePreservingRunnable.run(NamePreservingRunnable.java:64)
>>>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>>>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>>>         at java.lang.Thread.run(Thread.java:748)
>>>
>>> "NioSocketConnector-38" #100326 prio=5 os_prio=0 tid=0x00007f2a3001d800 nid=0x2e6f in Object.wait() [0x00007f2a1f2d3000]
>>>    java.lang.Thread.State: TIMED_WAITING (on object monitor)
>>>         at java.lang.Object.wait(Native Method)
>>>         at org.apache.mina.core.future.DefaultIoFuture.await0(DefaultIoFuture.java:209)
>>>         - locked <0x00000000e246ae08> (a org.apache.mina.core.future.DefaultIoFuture)
>>>         at org.apache.mina.core.future.DefaultIoFuture.awaitUninterruptibly(DefaultIoFuture.java:141)
>>>         at org.apache.mina.core.polling.AbstractPollingIoProcessor.dispose(AbstractPollingIoProcessor.java:188)
>>>         at org.apache.mina.core.service.SimpleIoProcessorPool.dispose(SimpleIoProcessorPool.java:329)
>>>         - locked <0x00000000e246ae40> (a java.lang.Object)
>>>         at org.apache.mina.core.polling.AbstractPollingIoConnector$Connector.run(AbstractPollingIoConnector.java:582)
>>>         at org.apache.mina.util.NamePreservingRunnable.run(NamePreservingRunnable.java:64)
>>>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>>>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>>>         at java.lang.Thread.run(Thread.java:748)
>>>
>>> At first I thought that this was related to
>>> https://issues.apache.org/jira/browse/DIRMINA-1059. In that ticket the
>>> synchronization was improved. However, I am also running into the problem
>>> with a build of 2.0.17-SNAPSHOT where DIRMINA-1059 was solved.
>>>
>>> So my only hope was DIRMINA-1060 ;) Could this improve the situation?
>>>
>>> Thanks,
>>> Chris.
>>>
>>> --
>>> Christoph John
>>> Development & Support
>>> Direct: +49 241 557080-28
>>> Mailto:christoph.j...@macd.com
>>>
>>> http://www.macd.com <http://www.macd.com/>
>>> ------------------------------------------------------------
>>> MACD GmbH
>>> Oppenhoffallee 103
>>> <https://maps.google.com/?q=Oppenhoffallee+103&entry=gmail&source=g>
>>> D-52066 Aachen
>>> Tel: +49 241 557080-0 | Fax: +49 241 557080-10
>>> Amtsgericht Aachen: HRB 8151
>>> Ust.-Id: DE 813021663
>>>
>>> Geschäftsführer: George Macdonald
>>> ------------------------------------------------------------
>>> take care of the environment - print only if necessary
>
> ---------- Forwarded message ----------
> From: Christoph John <christoph.j...@macd.com>
> To: dev@mina.apache.org
> Cc:
> Bcc:
> Date: Wed, 26 Jul 2017 13:59:58 +0200
> Subject: leaking NioProcessors/NioSocketConnectors hanging in call to dispose
>
> Hi,
>
> I am a developer and maintainer of the QuickFIX/J project
> (https://github.com/quickfix-j/quickfixj) and I have a question regarding
> NioSocketConnectors.
>
> We are facing a problem when there is a process that constantly (every 30
> seconds) tries to connect to a counterparty and the connection is
> established but dropped shortly after. Then sometimes the
> NioProcessors/NioSocketConnectors are not cleaned up properly. In the
> stack trace we see them hanging in a call to dispose:
>
> "NioProcessor-1140" #239 prio=5 os_prio=0 tid=0x0000000001fe1800 nid=0x2523 runnable [0x00007f9c67e8f000]
>    java.lang.Thread.State: RUNNABLE
>         at sun.nio.ch.EPollArrayWrapper.epollWait(Native Method)
>         at sun.nio.ch.EPollArrayWrapper.poll(EPollArrayWrapper.java:269)
>         at sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:93)
>         at sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:86)
>         - locked <0x00000000f6699e60> (a sun.nio.ch.Util$3)
>         - locked <0x00000000f6699e50> (a java.util.Collections$UnmodifiableSet)
>         - locked <0x00000000f6699c18> (a sun.nio.ch.EPollSelectorImpl)
>         at sun.nio.ch.SelectorImpl.select(SelectorImpl.java:97)
>         at org.apache.mina.transport.socket.nio.NioProcessor.select(NioProcessor.java:98)
>         at org.apache.mina.core.polling.AbstractPollingIoProcessor$Processor.run(AbstractPollingIoProcessor.java:1075)
>         at org.apache.mina.util.NamePreservingRunnable.run(NamePreservingRunnable.java:64)
>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>         at java.lang.Thread.run(Thread.java:748)
>
> "NioSocketConnector-68" #238 prio=5 os_prio=0 tid=0x00007f9c70caf000 nid=0x2522 in Object.wait() [0x00007f9c6af9f000]
>    java.lang.Thread.State: TIMED_WAITING (on object monitor)
>         at java.lang.Object.wait(Native Method)
>         at org.apache.mina.core.future.DefaultIoFuture.await0(DefaultIoFuture.java:209)
>         - locked <0x00000000f66ac718> (a org.apache.mina.core.future.DefaultIoFuture)
>         at org.apache.mina.core.future.DefaultIoFuture.awaitUninterruptibly(DefaultIoFuture.java:141)
>         at org.apache.mina.core.polling.AbstractPollingIoProcessor.dispose(AbstractPollingIoProcessor.java:188)
>         at org.apache.mina.core.service.SimpleIoProcessorPool.dispose(SimpleIoProcessorPool.java:329)
>         - locked <0x00000000f66ac750> (a java.lang.Object)
>         at org.apache.mina.core.polling.AbstractPollingIoConnector$Connector.run(AbstractPollingIoConnector.java:582)
>         at org.apache.mina.util.NamePreservingRunnable.run(NamePreservingRunnable.java:64)
>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>         at java.lang.Thread.run(Thread.java:748)
>
> It does not happen very often: about 5% of the connection attempts leave a
> NioSocketConnector hanging.
> It only seems to happen when the connection is disconnected by
> "javax.net.ssl.SSLHandshakeException: SSL handshake failed", although
> there are cases when there is no leak even on an SSLHandshakeException.
> If the connection was reset "normally" by "java.io.IOException: Connection
> reset by peer" then the leak does not seem to occur. It also does not occur
> when the connection is refused right away.
>
> Since this seems to be related to SSL connections: is there something that
> we need to take care of when using the SSL filter?
>
> The code for the IoSessionInitiator can be found here:
> https://github.com/quickfix-j/quickfixj/blob/master/quickfixj-core/src/main/java/quickfix/mina/initiator/IoSessionInitiator.java
> I have added some comments in this gist (starting with "chrjohn"):
> https://gist.github.com/chrjohn/2671f06d80e8d917d9061b573477ec5b
>
> I cannot rule out that we might be doing something wrong here, so any
> pointer is appreciated. :)
>
> Thanks in advance for your help and best regards,
> Chris.
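
On the question above about taking a number of stack traces from a test: the following is only a rough sketch of how the sampling could be automated with plain JDK calls, so that a looping test can check for threads sitting in the frame shown in the traces. The class name StuckThreadScan and the method findThreads are made up for this illustration and are not part of MINA or QuickFIX/J. The thread-name prefix and the AbstractPollingIoProcessor.dispose frame are taken from the traces in this thread.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical helper (not part of MINA): samples live threads and reports
// those whose stack contains a given frame.
final class StuckThreadScan {

    /**
     * Returns "name [state]" for every live thread whose name starts with
     * namePrefix and whose current stack contains className.methodName.
     */
    static List<String> findThreads(String namePrefix, String className, String methodName) {
        List<String> hits = new ArrayList<>();
        for (Map.Entry<Thread, StackTraceElement[]> e : Thread.getAllStackTraces().entrySet()) {
            Thread t = e.getKey();
            if (!t.getName().startsWith(namePrefix)) {
                continue;
            }
            for (StackTraceElement frame : e.getValue()) {
                if (frame.getClassName().equals(className)
                        && frame.getMethodName().equals(methodName)) {
                    hits.add(t.getName() + " [" + t.getState() + "]");
                    break; // one matching frame per thread is enough
                }
            }
        }
        return hits;
    }

    public static void main(String[] args) throws InterruptedException {
        // Take a few samples one second apart. A connector thread that shows
        // up in every sample is a candidate for the hang described above.
        for (int i = 0; i < 3; i++) {
            List<String> stuck = findThreads(
                    "NioSocketConnector-",
                    "org.apache.mina.core.polling.AbstractPollingIoProcessor",
                    "dispose");
            System.out.println("sample " + i + ": " + stuck);
            Thread.sleep(1000);
        }
    }
}
```

A single sample proves little, because a healthy dispose() also passes through this frame briefly. Comparing several samples taken over a few seconds, as main does, is what separates a hang from a normal shutdown.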