Thanks, I will try it. Was this patch made against 2.0.16? The line numbers do not match, so I added the statement manually at line 1162 and removed it from line 1164.

Cheers,
Chris.

On 11/10/17 14:19, Emmanuel Lécharny wrote:
Hi,


Can you test with this patch?


diff --git a/mina-core/src/main/java/org/apache/mina/core/polling/AbstractPollingIoProcessor.java b/mina-core/src/main/java/org/apache/mina/core/polling/AbstractPollingIoProcessor.java
index 50ebd4e..575b2f4 100644
--- a/mina-core/src/main/java/org/apache/mina/core/polling/AbstractPollingIoProcessor.java
+++ b/mina-core/src/main/java/org/apache/mina/core/polling/AbstractPollingIoProcessor.java
@@ -695,8 +695,9 @@
                          for (Iterator<S> i = allSessions(); i.hasNext();) {
                              IoSession session = i.next();
+                            scheduleRemove((S) session);
+
                              if (session.isActive()) {
-                                scheduleRemove((S) session);
                                  hasKeys = true;
                              }
                          }
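
To make the effect of the diff concrete, here is a minimal sketch of the loop before and after the change. It is only a reading of the diff above, not MINA code: `FakeSession`, `scheduledBefore` and `scheduledAfter` are hypothetical names, and the real `scheduleRemove` is replaced by collecting session names in a list.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class DisposeSketch {
    /** Hypothetical stand-in for an IoSession; it carries only the flag the patch looks at. */
    static final class FakeSession {
        final String name;
        final boolean active;

        FakeSession(String name, boolean active) {
            this.name = name;
            this.active = active;
        }
    }

    /** Before the patch: only sessions reporting isActive() are scheduled for removal. */
    static List<String> scheduledBefore(List<FakeSession> sessions) {
        List<String> scheduled = new ArrayList<>();
        for (FakeSession s : sessions) {
            if (s.active) {
                scheduled.add(s.name);
            }
        }
        return scheduled;
    }

    /** After the patch: every session is scheduled for removal, active or not. */
    static List<String> scheduledAfter(List<FakeSession> sessions) {
        List<String> scheduled = new ArrayList<>();
        for (FakeSession s : sessions) {
            scheduled.add(s.name);
        }
        return scheduled;
    }

    public static void main(String[] args) {
        List<FakeSession> sessions = Arrays.asList(
                new FakeSession("active", true),
                new FakeSession("inactive", false));
        System.out.println("before: " + scheduledBefore(sessions));
        System.out.println("after:  " + scheduledAfter(sessions));
    }
}
```

With one active and one inactive session, the "before" variant schedules only the active one, while the "after" variant schedules both. Whether an inactive session left behind is what keeps dispose() waiting is the hypothesis the patch is meant to test.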


On 11/10/2017 at 12:04, Christoph John wrote:
Hi,

thanks for the patch. I am using 2.0.16.
Oddly enough, when I run all MINA tests, ConnectorTest hangs on my
machine in the testTCPWithSSL method. I don't know whether this is
related (stack trace attached).
However, I will try out your patch and let you know.

Thanks again,
Chris.


On 10/10/17 20:47, Jonathan Valliere wrote:
Which version of MINA are you using, or are you building from Git?

Please pull tag/2.0.16 from Git and apply the attached patch. Let me
know if that fixes your problem. Sorry about the excess changes in
the patch; the Java code formatter made a lot of changes. If this
works then we can create a JIRA bug.

On Tue, Oct 10, 2017 at 4:49 AM, Christoph John <christoph.j...@macd.com> wrote:

     Hi,

     thanks for your reply.
     In fact it is hanging forever, i.e. until the process stops. I have attached the original
     message I sent to the mailing list. It only occurs sometimes, for SSL connections with a
     failing handshake.
     Unfortunately I have no reproducible example for MINA itself. I could probably put something
     together for QuickFIX/J (the open source project I am working on).

     My OS is Ubuntu 14.04.5 with JDK1.8_144. The problem does not appear very often on my machine
     but almost every time on the TravisCI build server
     (https://travis-ci.org/quickfix-j/quickfixj/builds/283210509). As a result, some of the
     SSL-related tests are failing. TravisCI has a very similar setup with JDK1.8_144 and
     Debian Linux.

     What would be a good starting point to create a test? I see that there is an SslTest in the
     mina-core module. So I would probably have to change that test to repeatedly connect, get a
     handshake exception every time, and then take a number of stack traces.
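
As a rough idea of how "taking a number of stack traces" could be automated inside such a test, the sketch below scans all live threads for a frame in a given method, using only JDK calls. The class and helper names (`DisposeHangScanner`, `isIn`, `threadsIn`) are made up for illustration; the MINA class and method names are the ones that appear in the thread dumps in this thread.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

public class DisposeHangScanner {
    /** True if any frame of the given stack is a call to className.methodName. */
    static boolean isIn(StackTraceElement[] stack, String className, String methodName) {
        for (StackTraceElement frame : stack) {
            if (frame.getClassName().equals(className)
                    && frame.getMethodName().equals(methodName)) {
                return true;
            }
        }
        return false;
    }

    /** Names of live threads that currently have className.methodName on their stack. */
    static List<String> threadsIn(String className, String methodName) {
        List<String> names = new ArrayList<>();
        for (Map.Entry<Thread, StackTraceElement[]> e : Thread.getAllStackTraces().entrySet()) {
            if (isIn(e.getValue(), className, methodName)) {
                names.add(e.getKey().getName());
            }
        }
        return names;
    }

    public static void main(String[] args) {
        // Class and method taken from the NioSocketConnector stack trace quoted in this thread.
        List<String> stuck = threadsIn(
                "org.apache.mina.core.polling.AbstractPollingIoProcessor", "dispose");
        System.out.println("threads in dispose(): " + stuck);
    }
}
```

A test could call `threadsIn(...)` a few seconds after each failed handshake and fail if the same thread name keeps showing up, which would be a hint (not proof) that a connector is stuck in dispose().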

     Thanks,
     Chris.





     On 09/10/17 14:51, Jonathan Valliere wrote:
     What OS, Java version, etc. are you using? Do you have a reproducible example?

     On Mon, Oct 9, 2017 at 8:34 AM, Jonathan Valliere <jon.valli...@emoten.com> wrote:

         Let me know if it's hanging for more than 1s.

         On Mon, Oct 9, 2017 at 5:08 AM, Christoph John <christoph.j...@macd.com> wrote:

             Hi,

             I have another question regarding this one. There is
             https://issues.apache.org/jira/browse/DIRMINA-1060, which also sounds a little like
             the problem I'm having. When the connectors are hanging in the call to dispose(),
             there is always an accompanying NioProcessor that is hanging in select().

             Example:
             "NioProcessor-60" #100328 prio=5 os_prio=0 tid=0x00007f2a10003000 nid=0x2e71 runnable [0x00007f2a388b1000]
                java.lang.Thread.State: RUNNABLE
                     at sun.nio.ch.EPollArrayWrapper.epollWait(Native Method)
                     at sun.nio.ch.EPollArrayWrapper.poll(EPollArrayWrapper.java:269)
                     at sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:93)
                     at sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:86)
                     - locked <0x00000000e239c118> (a sun.nio.ch.Util$3)
                     - locked <0x00000000e239c108> (a java.util.Collections$UnmodifiableSet)
                     - locked <0x00000000e239bed0> (a sun.nio.ch.EPollSelectorImpl)
                     at sun.nio.ch.SelectorImpl.select(SelectorImpl.java:97)
                     at org.apache.mina.transport.socket.nio.NioProcessor.select(NioProcessor.java:98)
                     at org.apache.mina.core.polling.AbstractPollingIoProcessor$Processor.run(AbstractPollingIoProcessor.java:1075)
                     at org.apache.mina.util.NamePreservingRunnable.run(NamePreservingRunnable.java:64)
                     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
                     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
                     at java.lang.Thread.run(Thread.java:748)


             "NioSocketConnector-38" #100326 prio=5 os_prio=0 tid=0x00007f2a3001d800 nid=0x2e6f in Object.wait() [0x00007f2a1f2d3000]
                java.lang.Thread.State: TIMED_WAITING (on object monitor)
                     at java.lang.Object.wait(Native Method)
                     at org.apache.mina.core.future.DefaultIoFuture.await0(DefaultIoFuture.java:209)
                     - locked <0x00000000e246ae08> (a org.apache.mina.core.future.DefaultIoFuture)
                     at org.apache.mina.core.future.DefaultIoFuture.awaitUninterruptibly(DefaultIoFuture.java:141)
                     at org.apache.mina.core.polling.AbstractPollingIoProcessor.dispose(AbstractPollingIoProcessor.java:188)
                     at org.apache.mina.core.service.SimpleIoProcessorPool.dispose(SimpleIoProcessorPool.java:329)
                     - locked <0x00000000e246ae40> (a java.lang.Object)
                     at org.apache.mina.core.polling.AbstractPollingIoConnector$Connector.run(AbstractPollingIoConnector.java:582)
                     at org.apache.mina.util.NamePreservingRunnable.run(NamePreservingRunnable.java:64)
                     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
                     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
                     at java.lang.Thread.run(Thread.java:748)


             At first I thought that this was related to
             https://issues.apache.org/jira/browse/DIRMINA-1059. In that ticket the
             synchronization was improved. However, I am also running into the problem with a
             build of 2.0.17-SNAPSHOT where DIRMINA-1059 was solved.

             So my only hope was DIRMINA-1060 ;) Could this improve the situation?

             Thanks,
             Chris.


             --
             Christoph John
             Development & Support
             Direct: +49 241 557080-28
             Mailto:christoph.j...@macd.com

             http://www.macd.com
             ----------------------------------------------------------------------------------------------------
             MACD GmbH
             Oppenhoffallee 103
             D-52066 Aachen
             Tel: +49 241 557080-0 | Fax: +49 241 557080-10
             Amtsgericht Aachen: HRB 8151
             VAT ID: DE 813021663

             Managing Director: George Macdonald
             ----------------------------------------------------------------------------------------------------

             take care of the environment - print only if necessary





     ---------- Forwarded message ----------
     From: Christoph John <christoph.j...@macd.com>
     To: dev@mina.apache.org
     Cc:
     Bcc:
     Date: Wed, 26 Jul 2017 13:59:58 +0200
     Subject: leaking NioProcessors/NioSocketConnectors hanging in call to dispose
     Hi,

     I am a developer and maintainer of the QuickFIX/J project
     (https://github.com/quickfix-j/quickfixj) and I have
     a question regarding NioSocketConnectors.

     We are facing a problem when there is a process that constantly (every 30 seconds) tries to
     connect to a counterparty and the connection is established but dropped shortly after. Then
     sometimes the NioProcessors/NioSocketConnectors are not cleaned up properly. In the stack
     trace we see them hanging in a call to dispose:

     "NioProcessor-1140" #239 prio=5 os_prio=0 tid=0x0000000001fe1800 nid=0x2523 runnable [0x00007f9c67e8f000]
        java.lang.Thread.State: RUNNABLE
             at sun.nio.ch.EPollArrayWrapper.epollWait(Native Method)
             at sun.nio.ch.EPollArrayWrapper.poll(EPollArrayWrapper.java:269)
             at sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:93)
             at sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:86)
             - locked <0x00000000f6699e60> (a sun.nio.ch.Util$3)
             - locked <0x00000000f6699e50> (a java.util.Collections$UnmodifiableSet)
             - locked <0x00000000f6699c18> (a sun.nio.ch.EPollSelectorImpl)
             at sun.nio.ch.SelectorImpl.select(SelectorImpl.java:97)
             at org.apache.mina.transport.socket.nio.NioProcessor.select(NioProcessor.java:98)
             at org.apache.mina.core.polling.AbstractPollingIoProcessor$Processor.run(AbstractPollingIoProcessor.java:1075)
             at org.apache.mina.util.NamePreservingRunnable.run(NamePreservingRunnable.java:64)
             at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
             at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
             at java.lang.Thread.run(Thread.java:748)

     "NioSocketConnector-68" #238 prio=5 os_prio=0 tid=0x00007f9c70caf000 nid=0x2522 in Object.wait() [0x00007f9c6af9f000]
        java.lang.Thread.State: TIMED_WAITING (on object monitor)
             at java.lang.Object.wait(Native Method)
             at org.apache.mina.core.future.DefaultIoFuture.await0(DefaultIoFuture.java:209)
             - locked <0x00000000f66ac718> (a org.apache.mina.core.future.DefaultIoFuture)
             at org.apache.mina.core.future.DefaultIoFuture.awaitUninterruptibly(DefaultIoFuture.java:141)
             at org.apache.mina.core.polling.AbstractPollingIoProcessor.dispose(AbstractPollingIoProcessor.java:188)
             at org.apache.mina.core.service.SimpleIoProcessorPool.dispose(SimpleIoProcessorPool.java:329)
             - locked <0x00000000f66ac750> (a java.lang.Object)
             at org.apache.mina.core.polling.AbstractPollingIoConnector$Connector.run(AbstractPollingIoConnector.java:582)
             at org.apache.mina.util.NamePreservingRunnable.run(NamePreservingRunnable.java:64)
             at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
             at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
             at java.lang.Thread.run(Thread.java:748)

     It does not happen very often: about 5% of the connection attempts leave a NioSocketConnector
     hanging.
     It only seems to happen, though, when the connection is disconnected by
     "javax.net.ssl.SSLHandshakeException: SSL handshake failed". Although there are cases when
     there is no leak even on an SSLHandshakeException.
     If the connection was reset "normally" by "java.io.IOException: Connection reset by peer" then
     the leak does not seem to occur. It also does not occur when the connection is refused right away.

     Since this seems to be related to SSL connections: is there something that we need to take
     care of when using the SSL filter?

     The code for the IoSessionInitiator can be found here:
     https://github.com/quickfix-j/quickfixj/blob/master/quickfixj-core/src/main/java/quickfix/mina/initiator/IoSessionInitiator.java
     I have added some comments in this gist (starting with "chrjohn"):
     https://gist.github.com/chrjohn/2671f06d80e8d917d9061b573477ec5b

     I cannot rule out that we might be doing something wrong here, so any pointer is appreciated. :)

     Thanks in advance for your help and best regards,
     Chris.




