Which version of MINA are you using, or are you building from Git?
Please pull tag/2.0.16 from Git and apply the attached patch. Let me
know if that fixes your problem. Sorry about the excess changes in
the patch; the Java code formatter made a lot of changes. If this
works, then we can create a JIRA bug.
On Tue, Oct 10, 2017 at 4:49 AM, Christoph John
<christoph.j...@macd.com> wrote:
Hi,
thanks for your reply.
In fact it is hanging forever, i.e. until the process stops. I
have attached the original message I sent to the mailing list.
It only occurs sometimes, for SSL connections with a failing
handshake.
Unfortunately I have no reproducible example for MINA itself. I
could probably put something together for QuickFIX/J (the open
source project I am working on).
My OS is Ubuntu 14.04.5 with JDK1.8_144. The problem does not
appear very often on my machine, but it appears almost every time
on the Travis CI build server
(https://travis-ci.org/quickfix-j/quickfixj/builds/283210509).
As a result, some of the SSL-related tests are failing. Travis CI
has a very similar setup with JDK1.8_144 and Debian Linux.
What would be a good starting point to create a test? I see that
there is an SslTest in the mina-core module. So I probably have to
change that test to connect repeatedly, get a handshake exception
every time, and then take a number of stack traces.
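One way to collect those stack traces from inside a test, rather than
by hand, might be to ask the JVM for them after each failed connect.
The sketch below uses only the JDK (Thread.getAllStackTraces). The
class StuckThreadFinder and its methods are hypothetical helpers, not
part of MINA or QuickFIX/J; the thread-name prefix and the frame it
looks for are simply copied from the traces quoted in this thread.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

/** Hypothetical test helper; not part of MINA or QuickFIX/J. */
public class StuckThreadFinder {

    /** Returns true if any frame in the stack belongs to the given class and method. */
    public static boolean hasFrame(StackTraceElement[] stack, String className, String methodName) {
        for (StackTraceElement frame : stack) {
            if (frame.getClassName().equals(className)
                    && frame.getMethodName().equals(methodName)) {
                return true;
            }
        }
        return false;
    }

    /** Names of live threads with the given name prefix that are currently inside className.methodName. */
    public static List<String> find(String threadNamePrefix, String className, String methodName) {
        List<String> names = new ArrayList<>();
        for (Map.Entry<Thread, StackTraceElement[]> e : Thread.getAllStackTraces().entrySet()) {
            Thread t = e.getKey();
            if (t.getName().startsWith(threadNamePrefix)
                    && hasFrame(e.getValue(), className, methodName)) {
                names.add(t.getName());
            }
        }
        return names;
    }

    public static void main(String[] args) {
        // Thread-name prefix and frame as they appear in the quoted traces (assumed stable).
        List<String> stuck = find("NioSocketConnector-",
                "org.apache.mina.core.polling.AbstractPollingIoProcessor", "dispose");
        System.out.println("Connector threads inside dispose(): " + stuck);
    }
}
```

Calling find(...) twice, a few seconds apart, and seeing the same
thread name in both results would suggest a hang rather than a
dispose that is merely slow.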
Thanks,
Chris.
On 09/10/17 14:51, Jonathan Valliere wrote:
What OS, Java version, etc.? Do you have a reproducible example?
On Mon, Oct 9, 2017 at 8:34 AM, Jonathan Valliere
<jon.valli...@emoten.com> wrote:
Let me know if it's hanging for more than 1s.
On Mon, Oct 9, 2017 at 5:08 AM, Christoph John
<christoph.j...@macd.com> wrote:
Hi,
I have another question regarding this one. There is
https://issues.apache.org/jira/browse/DIRMINA-1060
which also sounds a little like the problem I'm having. When the
connectors are hanging in the call to dispose(), there is always
an accompanying NioProcessor which is hanging in select().
Example:
"NioProcessor-60" #100328 prio=5 os_prio=0 tid=0x00007f2a10003000 nid=0x2e71 runnable [0x00007f2a388b1000]
   java.lang.Thread.State: RUNNABLE
    at sun.nio.ch.EPollArrayWrapper.epollWait(Native Method)
    at sun.nio.ch.EPollArrayWrapper.poll(EPollArrayWrapper.java:269)
    at sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:93)
    at sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:86)
    - locked <0x00000000e239c118> (a sun.nio.ch.Util$3)
    - locked <0x00000000e239c108> (a java.util.Collections$UnmodifiableSet)
    - locked <0x00000000e239bed0> (a sun.nio.ch.EPollSelectorImpl)
    at sun.nio.ch.SelectorImpl.select(SelectorImpl.java:97)
    at org.apache.mina.transport.socket.nio.NioProcessor.select(NioProcessor.java:98)
    at org.apache.mina.core.polling.AbstractPollingIoProcessor$Processor.run(AbstractPollingIoProcessor.java:1075)
    at org.apache.mina.util.NamePreservingRunnable.run(NamePreservingRunnable.java:64)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)
"NioSocketConnector-38" #100326 prio=5 os_prio=0 tid=0x00007f2a3001d800 nid=0x2e6f in Object.wait() [0x00007f2a1f2d3000]
   java.lang.Thread.State: TIMED_WAITING (on object monitor)
    at java.lang.Object.wait(Native Method)
    at org.apache.mina.core.future.DefaultIoFuture.await0(DefaultIoFuture.java:209)
    - locked <0x00000000e246ae08> (a org.apache.mina.core.future.DefaultIoFuture)
    at org.apache.mina.core.future.DefaultIoFuture.awaitUninterruptibly(DefaultIoFuture.java:141)
    at org.apache.mina.core.polling.AbstractPollingIoProcessor.dispose(AbstractPollingIoProcessor.java:188)
    at org.apache.mina.core.service.SimpleIoProcessorPool.dispose(SimpleIoProcessorPool.java:329)
    - locked <0x00000000e246ae40> (a java.lang.Object)
    at org.apache.mina.core.polling.AbstractPollingIoConnector$Connector.run(AbstractPollingIoConnector.java:582)
    at org.apache.mina.util.NamePreservingRunnable.run(NamePreservingRunnable.java:64)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)
At first I thought that this was related to
https://issues.apache.org/jira/browse/DIRMINA-1059.
In that ticket the synchronization was improved. However, I am
also running into the problem with a build of 2.0.17-SNAPSHOT
where DIRMINA-1059 was solved. So my only hope was DIRMINA-1060 ;)
Could this improve the situation?
Thanks,
Chris.
-- Christoph John
Development & Support
Direct: +49 241 557080-28
Mailto:christoph.j...@macd.com
http://www.macd.com
----------------------------------------------------------------------------------------------------
MACD GmbH
Oppenhoffallee 103
D-52066 Aachen
Tel: +49 241 557080-0 | Fax: +49 241 557080-10
Amtsgericht Aachen: HRB 8151
Ust.-Id: DE 813021663
Geschäftsführer: George Macdonald
----------------------------------------------------------------------------------------------------
take care of the environment - print only if necessary
---------- Forwarded message ----------
From: Christoph John <christoph.j...@macd.com>
To: dev@mina.apache.org
Cc:
Bcc:
Date: Wed, 26 Jul 2017 13:59:58 +0200
Subject: leaking NioProcessors/NioSocketConnectors hanging in call to dispose
Hi,
I am a developer and maintainer of the QuickFIX/J project
(https://github.com/quickfix-j/quickfixj) and I have
a question regarding NioSocketConnectors.
We are facing a problem when there is a process that constantly
(every 30 seconds) tries to
connect to a counterparty and the connection is established but
dropped shortly after. Then
sometimes the NioProcessors/NioSocketConnectors are not cleaned
up properly. In the stack
trace we see them hanging in a call to dispose:
"NioProcessor-1140" #239 prio=5 os_prio=0 tid=0x0000000001fe1800 nid=0x2523 runnable [0x00007f9c67e8f000]
   java.lang.Thread.State: RUNNABLE
    at sun.nio.ch.EPollArrayWrapper.epollWait(Native Method)
    at sun.nio.ch.EPollArrayWrapper.poll(EPollArrayWrapper.java:269)
    at sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:93)
    at sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:86)
    - locked <0x00000000f6699e60> (a sun.nio.ch.Util$3)
    - locked <0x00000000f6699e50> (a java.util.Collections$UnmodifiableSet)
    - locked <0x00000000f6699c18> (a sun.nio.ch.EPollSelectorImpl)
    at sun.nio.ch.SelectorImpl.select(SelectorImpl.java:97)
    at org.apache.mina.transport.socket.nio.NioProcessor.select(NioProcessor.java:98)
    at org.apache.mina.core.polling.AbstractPollingIoProcessor$Processor.run(AbstractPollingIoProcessor.java:1075)
    at org.apache.mina.util.NamePreservingRunnable.run(NamePreservingRunnable.java:64)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:748)
"NioSocketConnector-68" #238 prio=5 os_prio=0 tid=0x00007f9c70caf000 nid=0x2522 in Object.wait() [0x00007f9c6af9f000]
   java.lang.Thread.State: TIMED_WAITING (on object monitor)
    at java.lang.Object.wait(Native Method)
    at org.apache.mina.core.future.DefaultIoFuture.await0(DefaultIoFuture.java:209)
    - locked <0x00000000f66ac718> (a org.apache.mina.core.future.DefaultIoFuture)
    at org.apache.mina.core.future.DefaultIoFuture.awaitUninterruptibly(DefaultIoFuture.java:141)
    at org.apache.mina.core.polling.AbstractPollingIoProcessor.dispose(AbstractPollingIoProcessor.java:188)
    at org.apache.mina.core.service.SimpleIoProcessorPool.dispose(SimpleIoProcessorPool.java:329)
    - locked <0x00000000f66ac750> (a java.lang.Object)
    at org.apache.mina.core.polling.AbstractPollingIoConnector$Connector.run(AbstractPollingIoConnector.java:582)
    at org.apache.mina.util.NamePreservingRunnable.run(NamePreservingRunnable.java:64)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:748)
It does not happen very often: about 5% of the connection
attempts leave a NioSocketConnector
hanging.
It only seems to happen when the connection is terminated by
"javax.net.ssl.SSLHandshakeException: SSL handshake failed".
However, there are also cases where no leak occurs even on an
SSLHandshakeException.
If the connection was reset "normally" by "java.io.IOException:
Connection reset by peer" then
the leak does not seem to occur. It also does not occur when the
connection is refused right away.
Since this seems to be related to SSL connections: is there
something that we need to take
care of when using the SSL filter?
The code for the IoSessionInitiator can be found here:
https://github.com/quickfix-j/quickfixj/blob/master/quickfixj-core/src/main/java/quickfix/mina/initiator/IoSessionInitiator.java
I have added some comments in this gist (starting with "chrjohn"):
https://gist.github.com/chrjohn/2671f06d80e8d917d9061b573477ec5b
I cannot rule out that we might be doing something wrong here, so
any pointer is appreciated. :)
Thanks in advance for your help and best regards,
Chris.