[jira] [Created] (DIRMINA-915) 404 Not Found page displayed when accessing a lot of the Apache MINA links
Victor Baxan created DIRMINA-915:

Summary: 404 Not Found page displayed when accessing a lot of the Apache MINA links
Key: DIRMINA-915
URL: https://issues.apache.org/jira/browse/DIRMINA-915
Project: MINA
Issue Type: Bug
Components: Web Site / Documentation
Affects Versions: 2.0.7
Reporter: Victor Baxan
Priority: Critical

Following on from the same issue filed on StackOverflow (http://stackoverflow.com/questions/13412818/404-not-found-page-displayed-when-accessing-a-lot-of-the-apache-mina-links), it was stated about a day ago that the issue is fixed and that this should become visible within a couple of hours, once the mirrors update. The issue is still reproducible, though. Could you please look into this again? Am I missing something?

Thanks,
Victor

--
This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (DIRMINA-645) SslFilter should start initiating handshake from sesionCreated() rather than from onPostAdd()
[ https://issues.apache.org/jira/browse/DIRMINA-645?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13467527#comment-13467527 ]

Victor N commented on DIRMINA-645:
--

Currently I have implemented a workaround for StartTLS using reflection:

{code}
class PatchedSslFilter extends SslFilter {
    private Method method;

    public PatchedSslFilter(SSLContext sslContext) {
        super(sslContext, true);
        try {
            method = SslFilter.class.getDeclaredMethod("initiateHandshake",
                    NextFilter.class, IoSession.class);
            method.setAccessible(true);
        } catch (Exception ex) {
            throw new RuntimeException(ex);
        }
    }

    @Override
    public void onPostAdd(IoFilterChain parent, String name, NextFilter nextFilter)
            throws SSLException {
        try {
            method.invoke(this, nextFilter, parent.getSession());
        } catch (Exception ex) {
            throw new RuntimeException(ex);
        }
    }
}
{code}

I think a better way is to make SslFilter more flexible and support both cases: starting in 'session created' and in 'onPostAdd'. In fact, there is an 'autoStart' flag there already. Also, the 'initiateHandshake' method could be protected rather than private.

SslFilter should start initiating handshake from sesionCreated() rather than from onPostAdd()
-

Key: DIRMINA-645
URL: https://issues.apache.org/jira/browse/DIRMINA-645
Project: MINA
Issue Type: Improvement
Components: Filter
Affects Versions: 2.0.0-M3
Reporter: Dan Mihai Dumitriu
Fix For: 2.0.5
Original Estimate: 1h
Remaining Estimate: 1h

Here's the situation I needed to get working. We want to make a secure connection through a SOCKS 5 proxy. So one would think that just using the SslFilter and the ProxyConnector (which adds the ProxyFilter at the bottom of the stack) would do it. Unfortunately, it does not work quite right out of the box. The ProxyFilter only fully initializes itself after the sessionCreated() method is called. Meanwhile, the SslFilter tries to start the handshake (i.e. calls initiateHandshake()) from the onPostAdd() method, which occurs before sessionCreated() is called. Moving initiateHandshake() from onPostAdd() to sessionCreated() in SslFilter, as shown below, seems to fix the problem.

{code}
@Override
public void onPostAdd(IoFilterChain parent, String name, NextFilter nextFilter)
        throws SSLException {
    //if (autoStart) {
    //    initiateHandshake(nextFilter, parent.getSession());
    //}
}

@Override
public void sessionCreated(NextFilter nextFilter, IoSession session) throws Exception {
    super.sessionCreated(nextFilter, session);
    if (autoStart) {
        initiateHandshake(nextFilter, session);
    }
}
{code}
[jira] [Commented] (DIRMINA-645) SslFilter should start initiating handshake from sesionCreated() rather than from onPostAdd()
[ https://issues.apache.org/jira/browse/DIRMINA-645?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13463684#comment-13463684 ]

Victor N commented on DIRMINA-645:
--

> I'm not sure that it works if the SslFilter is not in the chain when the server is started

This is exactly my case; it is typical for StartTLS to create the SslFilter only when needed. I have tested with 2.0.5 and it does not work (calling startTLS with or without the autoStart flag does not help).
[jira] [Commented] (DIRMINA-645) SslFilter should start initiating handshake from sesionCreated() rather than from onPostAdd()
[ https://issues.apache.org/jira/browse/DIRMINA-645?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13462733#comment-13462733 ]

Victor N commented on DIRMINA-645:
--

Hi Emmanuel, how do we use StartTLS with the changed SslFilter? The 'session created' event will have been missed (the session was already started), so 'onPostAdd' should be used in this case. I tried to override SslFilter.onPostAdd() in a MySslFilter class, but I cannot call the _private_ method 'initiateHandshake'.
[jira] [Commented] (ASYNCWEB-39) Parsing of fragmented request/response data is broken with MINA-2.0.0 or later
[ https://issues.apache.org/jira/browse/ASYNCWEB-39?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13164707#comment-13164707 ]

Victor N commented on ASYNCWEB-39:
--

Victor, thank you for the patch! It works; I tested both with and without connection pooling. Chunked requests also work.

P.S. MINA 2.0.2, AsyncWeb 2.x (taken from trunk some months ago)
[jira] [Commented] (DIRMINA-827) NioSocketConnector leaks the last open NioSocketSession after close
[ https://issues.apache.org/jira/browse/DIRMINA-827?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13018196#comment-13018196 ]

Victor N commented on DIRMINA-827:
--

I can confirm that the problem exists; we are using a connector too.

NioSocketConnector leaks the last open NioSocketSession after close
---

Key: DIRMINA-827
URL: https://issues.apache.org/jira/browse/DIRMINA-827
Project: MINA
Issue Type: Bug
Components: Core
Affects Versions: 2.0.2
Environment: Windows 7/Solaris SPARC/Solaris x86, Java versions 6u18 and 6u24
Reporter: Matt Yates
Priority: Critical
Labels: connector, mina, session
Attachments: MinaMain-2011-04-10.hprof, sessionGcRoot.jpg

My company's program uses MINA to make multiple (usually) simultaneous connections to various other machines, while reusing the same NioSocketConnector for each new connection. For better or worse, we store various objects in the IoSession's attributes, which we expect to be released on close. This is not always the case, however, as said session remains in memory until either a new connection is made or the IoConnector is disposed. After writing the simplest Connector program I could (I have several servers available on my network, so I did not write a matching Acceptor) and performing some profiling and debugging, I was able to confirm the leak and identify the issue.
Below is my Connector test program:

{code}
public class MinaMain {
    private static final Logger LOGGER = LoggerFactory.getLogger(MinaMain.class);

    public static void main(String[] args) throws InterruptedException {
        LOGGER.trace("Waiting for YourKit to start");
        Thread.sleep(15000);

        NioSocketConnector connector = new NioSocketConnector();
        connector.setHandler(new IoHandlerAdapter());

        closeSession(getConnectedSession(connector));

        //LOGGER.info("Creating and closing 5 sessions in series");
        //
        //for (int x = 0; x < 5; x++) {
        //    IoSession session = getConnectedSession(connector);
        //
        //    if (session == null) {
        //        continue;
        //    }
        //
        //    closeSession(session);
        //}
        //
        //LOGGER.info("Creating 5 sessions and then closing 5 sessions");
        //IoSession[] sessions = new IoSession[5];
        //
        //for (int x = 0; x < 5; x++) {
        //    sessions[x] = getConnectedSession(connector);
        //}
        //
        //for (int x = 0; x < 5; x++) {
        //    IoSession session = sessions[x];
        //
        //    if (session != null) {
        //        closeSession(session);
        //        sessions[x] = null;
        //    }
        //}

        LOGGER.info("Test complete. Sleeping for 60s");
        Thread.sleep(60000);
    }

    private static IoSession getConnectedSession(IoConnector connector) throws InterruptedException {
        IoSession session = null;
        try {
            ConnectFuture future = connector.connect(new InetSocketAddress("134.64.37.183", 11109));
            future.addListener(new IoFutureListener<ConnectFuture>() {
                @Override
                public void operationComplete(ConnectFuture future) {
                    LOGGER.debug("Connection completed callback for session " + future.getSession().getId());
                }
            });
            future.awaitUninterruptibly();
            LOGGER.debug("Connection created for session " + future.getSession().getId());
            session = future.getSession();
            Thread.sleep(15000);
        } catch (Exception e) {
            LOGGER.error("Failed to connect", e);
        }
        return session;
    }

    private static void closeSession(IoSession session) throws InterruptedException {
        try {
            CloseFuture closeFuture = session.getCloseFuture();
            closeFuture.addListener(new IoFutureListener<CloseFuture>() {
                @Override
                public void operationComplete(CloseFuture future) {
                    LOGGER.debug("Session closed callback for session " + future.getSession().getId());
                }
            });
            LOGGER.debug("Attempting to close session " + session.getId());
            session.close(false);
            LOGGER.debug("IoSession.close(false) returned. Awaiting uninterruptibly for session " + session.getId());
            closeFuture.awaitUninterruptibly();
            LOGGER.debug("Close completed for session " + session.getId());
        } catch (Exception e) {
            LOGGER.error("Failed to close session " + session.getId());
            session.close(true);
        }
        Thread.sleep(15000);
    }
}
{code}

Attached
[jira] [Commented] (DIRMINA-678) NioProcessor 100% CPU usage on Linux (epoll selector bug)
[ https://issues.apache.org/jira/browse/DIRMINA-678?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13018197#comment-13018197 ]

Victor N commented on DIRMINA-678:
--

It seems I have never seen the problem with JDK 1.6.0_22 and MINA 2.0.2; we have been using it in production on 12 Linux servers for several months.

NioProcessor 100% CPU usage on Linux (epoll selector bug)
-

Key: DIRMINA-678
URL: https://issues.apache.org/jira/browse/DIRMINA-678
Project: MINA
Issue Type: Bug
Components: Core
Affects Versions: 2.0.0-M4
Environment: CentOS 5.x, 32/64-bit, 32/64-bit Sun JDK 1.6.0_12, also _11/_10/_09 and Sun JDK 1.7.0 b50, Kernel 2.6.18-92.1.22.el5 and also older versions
Reporter: Serge Baranov
Fix For: 2.0.3
Attachments: mina-2.0.3.diff, snap973.png, snap974.png

It's the same bug as described at http://jira.codehaus.org/browse/JETTY-937 , but affecting MINA in a very similar way. NioProcessor threads start to eat 100% of a CPU's resources. After 10-30 minutes of running, depending on the load (sometimes after several hours), one of the NioProcessor threads starts to consume all the available CPU resources, probably spinning in the epoll select loop. Later, more threads can be affected by the same issue, thus loading all the available CPU cores to 100%.

Sample trace:

NioProcessor-10 [RUNNABLE] CPU time: 5:15
sun.nio.ch.EPollArrayWrapper.epollWait(long, int, long, int)
sun.nio.ch.EPollArrayWrapper.poll(long)
sun.nio.ch.EPollSelectorImpl.doSelect(long)
sun.nio.ch.SelectorImpl.lockAndDoSelect(long)
sun.nio.ch.SelectorImpl.select(long)
org.apache.mina.transport.socket.nio.NioProcessor.select(long)
org.apache.mina.core.polling.AbstractPollingIoProcessor$Processor.run()
org.apache.mina.util.NamePreservingRunnable.run()
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor$Worker)
java.util.concurrent.ThreadPoolExecutor$Worker.run()
java.lang.Thread.run()

It seems to affect any NIO-based Java server application running in the specified environment. Some projects provide workarounds for similar JDK bugs; probably MINA can also think about a workaround. As far as I know, there are at least 3 users who experience this issue with Jetty, and all of them are running CentOS (is some distribution default setting a trigger?). As for MINA, I'm not aware of similar reports yet.
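Projects that hit this JDK epoll bug typically work around it by detecting a spinning select() loop (select() returning 0 immediately, over and over) and rebuilding the Selector. A minimal plain-java.nio sketch of the rebuild step is below; the class name and threshold are illustrative, not MINA code:

```java
import java.io.IOException;
import java.nio.channels.SelectableChannel;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;

/**
 * Hypothetical helper sketching the classic epoll-spin workaround:
 * when select() keeps returning 0 without blocking, replace the broken
 * Selector and move every valid registration onto a fresh one.
 */
public class SelectorRebuilder {

    /** How many consecutive zero-result wakeups to tolerate before rebuilding. */
    public static final int SPIN_THRESHOLD = 10;

    public static Selector rebuild(Selector broken) throws IOException {
        Selector fresh = Selector.open();
        for (SelectionKey key : broken.keys()) {
            if (!key.isValid()) {
                continue;
            }
            SelectableChannel ch = key.channel();
            int ops = key.interestOps();   // read before cancel: a cancelled key throws
            Object att = key.attachment();
            key.cancel();                  // detach from the broken selector
            ch.register(fresh, ops, att);  // re-register on the new one
        }
        broken.close();
        return fresh;
    }
}
```

A processor loop would count consecutive select() calls that return 0 with no elapsed time and call rebuild() once the count passes SPIN_THRESHOLD, then continue selecting on the returned Selector.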
[jira] Created: (ASYNCWEB-39) Parsing of fragmented request/response data is broken with MINA-2.0.0 or later
Parsing of fragmented request/response data is broken with MINA-2.0.0 or later
--

Key: ASYNCWEB-39
URL: https://issues.apache.org/jira/browse/ASYNCWEB-39
Project: Asyncweb
Issue Type: Bug
Components: Common
Reporter: Victor Antonovich

MINA-2.0.0 and later versions contain the DIRMINA-749 fix ( https://issues.apache.org/jira/browse/DIRMINA-749 ), which broke AsyncWeb request/response parsing in the case of fragmented network data. The HttpCodecFactory class does not track decoders per session, so each subsequent network data fragment in a session is parsed with a new instance of the decoder.

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
[jira] Updated: (ASYNCWEB-39) Parsing of fragmented request/response data is broken with MINA-2.0.0 or later
[ https://issues.apache.org/jira/browse/ASYNCWEB-39?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Victor Antonovich updated ASYNCWEB-39:
--

Attachment: HttpCodecFactory_MINA-2.0.0_fix.patch

Proposed patch.
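The fix ASYNCWEB-39 calls for amounts to caching one decoder per session instead of creating a fresh one per network fragment, so parser state survives across fragments. A minimal sketch of that idea in plain Java follows; the Decoder interface and PerSessionDecoders class here are hypothetical stand-ins, not the AsyncWeb or MINA API:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Supplier;

// Hypothetical stand-in for a stateful protocol decoder.
interface Decoder {
    void decode(byte[] fragment);
}

/**
 * Hands out the same Decoder instance for a given session key, so state
 * accumulated while parsing one fragment is still there for the next one.
 */
class PerSessionDecoders {
    private final Map<Object, Decoder> decoders = new ConcurrentHashMap<>();
    private final Supplier<Decoder> factory;

    PerSessionDecoders(Supplier<Decoder> factory) {
        this.factory = factory;
    }

    Decoder decoderFor(Object sessionKey) {
        // computeIfAbsent builds the decoder only on the session's first fragment
        return decoders.computeIfAbsent(sessionKey, k -> factory.get());
    }

    void sessionClosed(Object sessionKey) {
        decoders.remove(sessionKey); // avoid leaking decoders for dead sessions
    }
}
```

The sessionClosed() cleanup matters as much as the caching: without it, a long-lived factory would leak one decoder per closed connection.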
[jira] Commented: (DIRMINA-678) NioProcessor 100% CPU usage on Linux (epoll selector bug)
[ https://issues.apache.org/jira/browse/DIRMINA-678?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12973743#action_12973743 ]

Victor N commented on DIRMINA-678:
--

Hi Emmanuel, the fixes you are trying to make: do they concern the Acceptor only, or the Connector too? I am not sure yet, but it seems the problem is reproducible in our production system with 2.0.2. I will try to understand whether this is the Selector bug or not.
[jira] Commented: (DIRMINA-678) NioProcessor 100% CPU usage on Linux (epoll selector bug)
[ https://issues.apache.org/jira/browse/DIRMINA-678?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12973974#action_12973974 ]

Victor N commented on DIRMINA-678:
--

I have found a bug in our system; it was an endless loop in our code and NOT in MINA! We are using JDK 1.6.0_23 now (but it has some strange changes: it is difficult to monitor via jstack or jvisualvm, so maybe we will go back to 1.6.0_22 or _21). I will watch how it behaves now and whether the bug reappears. We use both acceptors and connectors and have tens of thousands of IoSessions per server. If the problem is fixed in the JDK itself (which build?), we do not need any patches, of course! Is there any proof/link for this fix in the JDK?
[jira] Commented: (DIRMINA-29) JMX integration
[ https://issues.apache.org/jira/browse/DIRMINA-29?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12969282#action_12969282 ]

Victor N commented on DIRMINA-29:
-

I would propose adding information about queue usage in the thread pool to JMX monitoring. In fact, I usually patch the ExecutorFilter class and add the following method:

{code}
public int getWorkQueueSize() {
    try {
        if (executor instanceof ThreadPoolExecutor) {
            return ((ThreadPoolExecutor) executor).getQueue().size();
        }
    } catch (Exception ex) {
        // not a problem here
    }
    return -1;
}
{code}

JMX integration
---

Key: DIRMINA-29
URL: https://issues.apache.org/jira/browse/DIRMINA-29
Project: MINA
Issue Type: New Feature
Affects Versions: 0.7.0, 0.8.0, 0.9.0, 1.0.0
Reporter: Trustin Lee
Assignee: Julien Vermillard
Fix For: 1.0.0
Attachments: ByteBuffer.java, ByteBuffer.java, ByteBuffer.java, ByteBufferManager.java, ByteBufferManagerMBean.java, jconsole.png, minaJMX.tar.gz, SessionManager.java, SessionManager.java, SessionManagerMBean.java, SessionManagerMBean.java, SessionManagerMBean.tar.gz
[jira] Commented: (DIRMINA-805) No cipher suites and protocols in SslFilter
[ https://issues.apache.org/jira/browse/DIRMINA-805?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12928181#action_12928181 ]

Victor N commented on DIRMINA-805:
--

I think it would be better to fix it in the MINA code, so it works out of the box.

No cipher suites and protocols in SslFilter
---

Key: DIRMINA-805
URL: https://issues.apache.org/jira/browse/DIRMINA-805
Project: MINA
Issue Type: Bug
Components: Filter
Affects Versions: 2.0.0-M6
Environment: Windows XP, jdk1.5.0_05
Reporter: Chathura Randika
Fix For: 3.0.0-M1
Original Estimate: 48h
Remaining Estimate: 48h

When trying to call getEnabledProtocols and getEnabledCipherSuites on SslFilter, they return null. SslFilter allows setEnabledProtocols and setEnabledCipherSuites to be called, but at run time it throws the following exceptions. This may be caused by MINA's SslFilter not implementing the same functionality as SSLServerSocket.

org.apache.mina.util.DefaultExceptionMonitor exceptionCaught
WARNING: Unexpected exception.
org.apache.mina.core.filterchain.IoFilterLifeCycleException: onPreAdd(): ssl:SslFilter in (0x0002: nio socket, server,
    at org.apache.mina.core.filterchain.DefaultIoFilterChain.register(DefaultIoFilterChain.java:279)
    at org.apache.mina.core.filterchain.DefaultIoFilterChain.addLast(DefaultIoFilterChain.java:174)
    at org.apache.mina.core.filterchain.DefaultIoFilterChainBuilder.buildFilterChain(DefaultIoFilterChainBuilder.java:452)
    at org.apache.mina.core.polling.AbstractPollingIoProcessor.addNow(AbstractPollingIoProcessor.java:530)
    at org.apache.mina.core.polling.AbstractPollingIoProcessor.handleNewSessions(AbstractPollingIoProcessor.java:503)
    at org.apache.mina.core.polling.AbstractPollingIoProcessor.access$300(AbstractPollingIoProcessor.java:64)
    at org.apache.mina.core.polling.AbstractPollingIoProcessor$Processor.run(AbstractPollingIoProcessor.java:1109)
    at org.apache.mina.util.NamePreservingRunnable.run(NamePreservingRunnable.java:64)
Caused by: java.lang.IllegalArgumentException: TLSv1.1
    at com.sun.net.ssl.internal.ssl.ProtocolVersion.valueOf(ProtocolVersion.java:114)
    at com.sun.net.ssl.internal.ssl.ProtocolList.init(ProtocolList.java:36)
    at com.sun.net.ssl.internal.ssl.SSLEngineImpl.setEnabledProtocols(SSLEngineImpl.java:1719)
    at org.apache.mina.filter.ssl.SslHandler.init(SslHandler.java:170)
    at org.apache.mina.filter.ssl.SslFilter.onPreAdd(SslFilter.java:417)
    at org.apache.mina.core.filterchain.DefaultIoFilterChain.register(DefaultIoFilterChain.java:277)
[jira] Commented: (DIRMINA-805) No cipher suites and protocols in SslFilter
[ https://issues.apache.org/jira/browse/DIRMINA-805?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12927853#action_12927853 ]

Victor N commented on DIRMINA-805:
--

I found a workaround in our code: when adding the SslFilter, do the following:

{code}
sslFilter.setEnabledCipherSuites(sslContext.getSupportedSSLParameters().getCipherSuites());
{code}

(sslContext is a javax.net.ssl.SSLContext instance.) Or you can prepare an array of suite names manually and pass it without using SSLContext :)
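The suite-array part of that workaround can be shown self-contained, using only the JDK's javax.net.ssl API; the final SslFilter call is left as a comment because it needs MINA on the classpath:

```java
import javax.net.ssl.SSLContext;

public class CipherSuitesDemo {
    public static void main(String[] args) throws Exception {
        // Ask the default SSLContext which cipher suites it supports...
        SSLContext sslContext = SSLContext.getDefault();
        String[] suites = sslContext.getSupportedSSLParameters().getCipherSuites();

        // ...then that array is what the workaround passes to the filter,
        // so the filter's enabled suites are never left null, e.g.:
        // sslFilter.setEnabledCipherSuites(suites);
        System.out.println(suites.length + " supported cipher suites");
    }
}
```

Any JDK with a default SSL provider returns a non-empty array here, which is why the one-liner in the comment above works out of the box.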
Re: Re : ConnectFuture confusion
I agree with Edouard, much people not even waited but used (!) mina 2.0 in production. Our servers are working on mina 2.0 during ~1.5 years - we use both connectors and acceptors. In fact, 2.0 is not so bad as it seems to be ;) We just have some bugs which should be fixed in 2.0. Otherwise people may think that mina will never be released ;-( Victor N Edouard De Oliveira wrote: There are months i'm thinking the same without daring to say so ... The problem is that 1.x branches have been abandonned long time ago and much people are indeed waiting for the release of this 2.0 how bad would it be to release it for the early adopters with a 'use it at your own risk' warning and invite new comers to wait for a 3.0 preview ? would it be acceptable for the community to say that we won't support it extensively as our efforts will be concentrated on 3.0 ? Maybe it's the right time to shake the anthill or maybe not ... my 2 cents Cordialement, Regards, -Edouard De Oliveira- - Message d'origine De : Emmanuel Lecharny elecha...@gmail.com À : dev@mina.apache.org Envoyé le : Lun 1 Mars 2010, 18 h 45 min 02 s Objet : Re: ConnectFuture confusion On 3/1/10 6:30 PM, Alan D. Cabrera wrote: On Mar 1, 2010, at 9:20 AM, Emmanuel Lecharny wrote: On 3/1/10 6:10 PM, Alan D. Cabrera wrote: On Mar 1, 2010, at 8:04 AM, Emmanuel Lecharny wrote: On 3/1/10 4:38 PM, Alan D. Cabrera wrote: On Feb 26, 2010, at 9:03 AM, Ashish wrote: Thoughts ? Unless it breaks the system, i would say lets not loose our sleep over this. While I share the same opinion about the IoFuture hierarchy as you I have the same sentiments as Ashish. I'm afraid that we might have to fix the issue in 2.0 Trust me, i'm not pleased with this ! Fixing a bug is one thing. Reorganizing a code base a few days after an attempted vote on its initial release is another. I know :/ This is why I created a branch, in a desesperate attempt to get rid of all those futures, instead of doing that in trunk. 
Now, it was the end of a long and painful week chasing many bugs in many places, and I was going in circles. I *wish* we could fix the bug without having to rewrite this part. Another alternative is to totally abandon 2.x; it was never officially released. Just leave it as it is and work on the new 2.x. I'm also considering this option...
Re: Performance increase in MINA 2.0
Good news, Emmanuel! I believe that any project has its own problems (not visible at first sight!) - mina, netty, grizzly... Maybe in different situations ;) I will stay with mina and will try to help debug and speed it up wherever possible. For now, our servers do not require high throughput; it is not critical for us. But we are trying to handle more client sessions per server, so one day throughput will be important for me too! As for architecture - yes, we should clarify and improve it in mina 3.0, but now it's time to finalize mina 2.0 - this version should be reliable enough and fast. After that we can think about additional config params, I/O throttling, etc. - maybe in 2.5? ;) I am thinking about what we could call 3.0 - maybe Mina 3.0 Reloaded? ;) Victor N Emmanuel Lecharny wrote: Hi guys, today I spent some time debugging MINA and doing some perf tests (minimal ones). I was chasing a bug in the way messages are written back to the client. It appeared that in order to send the messageSent event, we had to reset the buffer we just sent (otherwise the filters have no way to get the content of this buffer). The problem is that the buffer remained reset, and MINA thought that it should be processed again (though not sent again, fortunately). By repositioning the buffer to its previous position, I was able to speed up the server by 5%. Not too bad... I'm sure there are many areas where we can improve the server. Let's do that when 2.0 has been released!
Re: Dealing with potential DDOS with MINA 2.0
Emmanuel, I think it should be implemented in some abstract way, giving us a way to tune it however a specific application may desire (I mentioned our example in Jira - a video streaming server capable of reducing the frame rate for slow clients). I do not know yet how this abstract IoThrottlingStrategy should look... But in 90% of situations we could propose a predefined SimpleIoThrottlingStrategy as you suggest - i.e. we may configure some limits per IoSession and mina will check them. First, we need to know all the places in mina where OOM can occur for fast-writing, slow-reading and frequently-connecting clients. I can remember some of them: - AbstractPollingIoProcessor: newSessions (and maybe) removingSessions, flushingSessions, trafficControllingSessions - AbstractPollingIoAcceptor: (not sure if this may have an impact) registerQueue, cancelQueue, boundHandles - AbstractIoSession: writeRequestQueue - OrderedThreadPoolExecutor: waitingSessions and SessionTasksQueue (DIRMINA-723) Victor N Emmanuel Lecharny wrote: Hi, today we discussed DIRMINA-764 and solutions for dealing with rogue clients (which are not necessarily malevolent). The problem is that a client which sends a lot of messages and does not read the responses fast enough will impact the server in a very bad way: at some point, you'll be hit by an OOM. So the question arose of how to deal with such a situation. There are many things we can control: - number of clients per server - number of messages accepted for a client per unit of time - number of messages a client can have on the writing queue before we stop accepting new requests - size of messages we accept for a client - number of messages in the writing queue - size of messages being processed globally All those parameters (and I may have missed some) have an impact on the server. The problem here is that we are at the limit between configuration and protection. 
If we decide we accept up to 100,000 clients on a MINA server, then how do we set the other limits? What size should we allocate to handle the load? Another problem is that if we limit the global number of messages being processed, or the global size, then we will have to select which clients to block. Also, limiting the writeQueue size might slow down the processing. Right now, in order to avoid a situation where the server simply dies, I suggest implementing a very simple strategy on the server: we add a parameter in the session config indicating the maximum number of messages allowed in the writeQueue for a specific session, before this session blocks new incoming messages. This is easy to implement, and will protect us a bit from fast writers that are slow readers. We can think more about those typical use cases in MINA 3. Thoughts?
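Emmanuel's "maximum messages in the writeQueue, then block incoming" strategy can be sketched without MINA at all. Everything below is a hypothetical stand-in: the readSuspended flag plays the role of IoSession.suspendRead(), and the bounded queue plays the role of the per-session writeRequestQueue.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

/** Hypothetical sketch of a per-session write queue with a hard capacity. */
public class BoundedWriteQueue {
    private final BlockingQueue<byte[]> queue;
    private volatile boolean readSuspended; // stands in for IoSession.suspendRead()

    public BoundedWriteQueue(int capacity) {
        this.queue = new ArrayBlockingQueue<>(capacity);
    }

    /** Enqueue a pending write; suspend reads (backpressure) once the cap is hit. */
    public boolean offerWrite(byte[] message) {
        boolean accepted = queue.offer(message); // non-blocking; false when full
        if (!accepted) {
            readSuspended = true; // stop reading from this client until we drain
        }
        return accepted;
    }

    /** Called by the flusher after writing to the socket. */
    public byte[] pollWrite() {
        byte[] m = queue.poll();
        if (queue.isEmpty()) {
            readSuspended = false; // drained: resume reading
        }
        return m;
    }

    public boolean isReadSuspended() {
        return readSuspended;
    }
}
```

The point of the sketch is the coupling: the write side filling up is what pauses the read side, so a slow reader throttles itself instead of exhausting the heap.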
[jira] Commented: (DIRMINA-764) DDOS possible in only a few seconds...
[ https://issues.apache.org/jira/browse/DIRMINA-764?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12833804#action_12833804 ] Victor N commented on DIRMINA-764: -- Emmanuel, are your clients in this test fast enough to read at the speed proposed by the server? Also, is the network between the server and the client fast enough? Maybe the read buffer is too small in the client? I do not see it configured in the stress client. I would say that this is typical - when a server is writing into a socket so quickly that the client cannot read at that speed, the server will die with an OutOfMemory :) You need to throttle/limit the write speed somehow. As far as I know, in mina, writeRequestQueue is unlimited in IoSession :( DDOS possible in only a few seconds... -- Key: DIRMINA-764 URL: https://issues.apache.org/jira/browse/DIRMINA-764 Project: MINA Issue Type: Bug Affects Versions: 2.0.0-RC1 Reporter: Emmanuel Lecharny Priority: Blocker Fix For: 2.0.0 Attachments: screenshot-1.jpg, screenshot-2.jpg We can kill a server in just a few seconds using the stress test found in DIRMINA-762. If we inject messages with no delay, using 50 threads to do that, the ProtocolCodecFilter$MessageWriteRequest is stuffed with hundreds of thousands of messages waiting to be written back to the client, with no success. On the client side, we receive almost no messages : 0 messages/sec (total messages received 1) 2 messages/sec (total messages received 11) 8 messages/sec (total messages received 55) 8 messages/sec (total messages received 95) 9 messages/sec (total messages received 144) 3 messages/sec (total messages received 162) 1 messages/sec (total messages received 169) ... 
On the server side, the memory is totally swamped in 20 seconds, with no way to recover: Exception in thread pool-1-thread-1 java.lang.OutOfMemoryError: Java heap space (see graph attached) On the server, ConcurrentLinkedQueue contains the messages to be written (in my case, 724,499 Node objects are present). There are also 361,629 DefaultWriteRequests, 361,628 DefaultWriteFutures, 361,625 SimpleBuffers, 361,618 ProtocolCodecFilter$MessageWriteRequests and 361,614 ProtocolCodecFilter$EncodedWriteRequests. That means we don't flush them to the client at all. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (DIRMINA-764) DDOS possible in only a few seconds...
[ https://issues.apache.org/jira/browse/DIRMINA-764?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12833869#action_12833869 ] Victor N commented on DIRMINA-764: -- I am not 100% sure, but IMHO when you run the stress clients and the server on the same host, so that CPU and I/O activity are high, testing can be misleading. I would propose running the same test in a LAN environment - all clients on a separate machine, or even multiple machines. As for TCP buffers, they do not depend on how you use your socket - via blocking or non-blocking I/O, locally or remotely. If your client works slowly (under high load on your computer), it will read slowly; in addition, if it has a small TCP buffer for reading, the whole TCP transmission stalls and the server will not send to the socket anymore (remember how TCP's congestion control algorithm works?). Of course, maybe this is not the case in your test, so it would be useful to compare with another mina build before you start digging into the code ;) DDOS possible in only a few seconds... -- Key: DIRMINA-764 URL: https://issues.apache.org/jira/browse/DIRMINA-764 -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (DIRMINA-764) DDOS possible in only a few seconds...
[ https://issues.apache.org/jira/browse/DIRMINA-764?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12833888#action_12833888 ] Victor N commented on DIRMINA-764: -- A malevolent client can kill a mina server in a matter of seconds. This has to be fixed. In fact, this is not a mina-specific problem; it is common in the networking world. But I agree, we should propose some solutions, e.g.: 1) writeRequestQueue may be bounded - somewhere we could configure its size and a policy for what to do when the queue is full (like in Executors) 2) some kind of write throttling (optional) - as I remember, mina already has the IoEventQueueThrottle class, but I never used it and I do not know if it is up to date. If some client (an IoSession) is slow, that is, there are many events waiting for socket write, it is the server application's responsibility to decide what to do - ignore new events, send some kind of warning to the client (hey, mister, your network is too slow, you risk being disconnected!), maybe even disconnect the client after some time, etc. If client and server can negotiate in this situation, everything will work well. We did something like this for Flash clients using the Red5 server (based on mina) - we checked writeRequestQueue (or calculated the number of pending write requests, maybe) and tuned the frame rate of the video stream; sometimes we sent a warning to the client :) Of course, there may be bad clients trying to do a DDOS - this way we can also handle such situations. DDOS possible in only a few seconds... -- Key: DIRMINA-764 URL: https://issues.apache.org/jira/browse/DIRMINA-764 -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
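Victor's Red5 trick of tuning the video frame rate from the pending-write count could look roughly like this. It is a MINA-free sketch: pendingWrites stands in for what MINA exposes as IoSession.getScheduledWriteMessages(), and every name here is made up for illustration.

```java
/** Hypothetical sketch: shed video frames for clients whose write backlog grows. */
public class FrameRateThrottle {
    private final int softLimit; // backlog above which we start dropping frames
    private final int hardLimit; // backlog above which we give up on the client

    public FrameRateThrottle(int softLimit, int hardLimit) {
        this.softLimit = softLimit;
        this.hardLimit = hardLimit;
    }

    public enum Action { SEND, DROP, DISCONNECT }

    /**
     * Decide what to do with a frame, given the session's pending write count.
     * Key frames are kept as long as possible so the stream stays decodable.
     */
    public Action decide(int pendingWrites, boolean keyFrame) {
        if (pendingWrites >= hardLimit) {
            return Action.DISCONNECT; // client is hopelessly behind
        }
        if (pendingWrites >= softLimit && !keyFrame) {
            return Action.DROP; // shed load, but keep key frames
        }
        return Action.SEND;
    }
}
```

This is exactly the "negotiation" Victor describes: the server degrades service for one slow session instead of letting that session's queue eat the whole heap.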
[jira] Commented: (DIRMINA-764) DDOS possible in only a few seconds...
[ https://issues.apache.org/jira/browse/DIRMINA-764?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12833903#action_12833903 ] Victor N commented on DIRMINA-764: -- I found this on Netty's documentation page: # No more OutOfMemoryError due to fast, slow or overloaded connection. # No more unfair read / write ratio often found in a NIO application under high speed network This is what we should implement in mina 2.0 - protect ourselves from clients writing too quickly or reading too slowly. Emmanuel, it seems that an unfair read/write ratio is what you have seen in your test! DDOS possible in only a few seconds... -- Key: DIRMINA-764 URL: https://issues.apache.org/jira/browse/DIRMINA-764 -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (DIRMINA-678) NioProcessor 100% CPU usage on Linux (epoll selector bug)
[ https://issues.apache.org/jira/browse/DIRMINA-678?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12831370#action_12831370 ] Victor N commented on DIRMINA-678: -- Is there a confirmation that this issue was fixed in the JDK? I agree, this fix can be optional and/or check the operating system and JDK version. NioProcessor 100% CPU usage on Linux (epoll selector bug) - Key: DIRMINA-678 URL: https://issues.apache.org/jira/browse/DIRMINA-678 Project: MINA Issue Type: Bug Components: Core Affects Versions: 2.0.0-M4 Environment: CentOS 5.x, 32/64-bit, 32/64-bit Sun JDK 1.6.0_12, also _11/_10/_09 and Sun JDK 1.7.0 b50, Kernel 2.6.18-92.1.22.el5 and also older versions Reporter: Serge Baranov Fix For: 2.0.0-RC2 Attachments: snap973.png, snap974.png It's the same bug as described at http://jira.codehaus.org/browse/JETTY-937 , but affecting MINA in a very similar way. NioProcessor threads start to eat 100% of a CPU each. After 10-30 minutes of running, depending on the load (sometimes after several hours), one of the NioProcessors starts to consume all the available CPU resources, probably spinning in the epoll select loop. Later, more threads can be affected by the same issue, thus loading all the available CPU cores at 100%. Sample trace:
NioProcessor-10 [RUNNABLE] CPU time: 5:15
sun.nio.ch.EPollArrayWrapper.epollWait(long, int, long, int)
sun.nio.ch.EPollArrayWrapper.poll(long)
sun.nio.ch.EPollSelectorImpl.doSelect(long)
sun.nio.ch.SelectorImpl.lockAndDoSelect(long)
sun.nio.ch.SelectorImpl.select(long)
org.apache.mina.transport.socket.nio.NioProcessor.select(long)
org.apache.mina.core.polling.AbstractPollingIoProcessor$Processor.run()
org.apache.mina.util.NamePreservingRunnable.run()
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor$Worker)
java.util.concurrent.ThreadPoolExecutor$Worker.run()
java.lang.Thread.run()
It seems to affect any NIO-based Java server application running in the specified environment. 
Some projects provide workarounds for similar JDK bugs, probably MINA can also think about a workaround. As far as I know, there are at least 3 users who experience this issue with Jetty and all of them are running CentOS (some distribution default setting is a trigger?). As for MINA, I'm not aware of similar reports yet. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (DIRMINA-678) NioProcessor 100% CPU usage on Linux (epoll selector bug)
[ https://issues.apache.org/jira/browse/DIRMINA-678?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12831388#action_12831388 ] Victor N commented on DIRMINA-678: -- Sergey, the patch you are talking about - can it be shared here, or is it still for testers only? It is almost 1 year old ;) Did you send your feedback to Sun? Maybe you (or one of the mina developers) could ask Sun about posting the patch here, or just ask when this bug fix will be publicly available? Also, we could try to look into OpenJDK - maybe the patch is already there ;) Also, it is interesting how the 2 bugs correlate with each other: http://bugs.sun.com/view_bug.do?bug_id=6693490 http://bugs.sun.com/view_bug.do?bug_id=6670302 NioProcessor 100% CPU usage on Linux (epoll selector bug) - Key: DIRMINA-678 URL: https://issues.apache.org/jira/browse/DIRMINA-678 -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
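The workaround other NIO frameworks adopted for this JDK bug (and that MINA itself later shipped) is to detect a spinning selector - select() returning 0 immediately, over and over - and transplant all registrations onto a fresh Selector. A minimal sketch of the transplant step, using only java.nio; the spin-detection counter that decides *when* to call this is omitted:

```java
import java.io.IOException;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;

public class SelectorRebuilder {
    /**
     * Re-register every valid key from the (possibly broken) old selector
     * onto a fresh one, preserving interest ops and attachments, then close
     * the old selector so its spinning epoll instance is discarded.
     */
    public static Selector rebuild(Selector old) throws IOException {
        Selector fresh = Selector.open();
        for (SelectionKey key : old.keys()) {
            if (!key.isValid()) {
                continue;
            }
            int ops = key.interestOps();
            Object attachment = key.attachment();
            key.cancel(); // detach from the spinning selector
            // A channel may be registered with several selectors at once,
            // so this is legal even before the cancellation is flushed.
            key.channel().register(fresh, ops, attachment);
        }
        old.close();
        return fresh;
    }
}
```

In a real processor loop, the caller would swap its selector field to the returned instance and keep selecting.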
Re: [jira] Updated: (DIRMINA-762) WARN org.apache.mina.core.service.IoProcessor - Create a new selector. Selected is 0, delta = 0
[I tried to add this to Jira, but Jira throws me an error :) ] Emmanuel, I remember a bug where ConcurrentLinkedQueue$Node was the main actor :) It is DIRMINA-709. When I investigated it, I saw that GC was busy all the time and there were tens of millions of ConcurrentLinkedQueue$Node objects; they were allocated and released frequently. I tried to profile our server with the YourKit profiler... without success, because of high load (it was in production). Then I prepared my own profiling tool for this specific problem. It uses AspectJ - I added an aspect for Queue.offer() and Collection.add() executions and grabbed the most popular stack traces from which these methods were called. If necessary, I can share my tool here. Emmanuel Lecharny (JIRA) wrote: [ https://issues.apache.org/jira/browse/DIRMINA-762?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Emmanuel Lecharny updated DIRMINA-762: -- Attachment: Screen shot 2010-02-02 at 7.49.18 PM.png Screen shot 2010-02-02 at 7.49.13 PM.png Here are 5 snapshots: 3 for the server, 2 for the client. They expose: - Memory + CPU over time - Threads at the end of the test (green = running) - and allocated objects on the server One thing is strange: after having sent around 350,000 messages, the client and the server suddenly slow down. CPU, which topped at 85% on my system, goes down to 30%, and this is what we can see on both graphs. Also we have a giant number of ConcurrentLinkedQueue$Node objects, which get garbage collected later. I guess that at some point, we are simply pushing too many messages into the queues. WARN org.apache.mina.core.service.IoProcessor - Create a new selector. Selected is 0, delta = 0 Key: DIRMINA-762 URL: https://issues.apache.org/jira/browse/DIRMINA-762 Project: MINA Issue Type: Bug Environment: Linux (2.6.26-2-amd64), java version 1.6.0_12 and also 1.6.0_18. 
Reporter: Omry Yadan Priority: Critical Fix For: 2.0.0-RC2 Attachments: BufferCodec.java, Screen shot 2010-02-02 at 7.48.39 PM.png, Screen shot 2010-02-02 at 7.48.46 PM.png, Screen shot 2010-02-02 at 7.48.59 PM.png, Screen shot 2010-02-02 at 7.49.13 PM.png, Screen shot 2010-02-02 at 7.49.18 PM.png, Server.java, StressClient.java Mina server gets into a bad state where it constantly prints : WARN org.apache.mina.core.service.IoProcessor - Create a new selector. Selected is 0, delta = 0 when this happens, server throughput drops significantly. to reproduce run the attached server and client for a short while (30 seconds on my box).
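Victor's AspectJ tool intercepted Queue.offer() to find which call sites were flooding the queues. The same idea can be sketched without bytecode weaving: a wrapper queue that records the class and method of each caller of offer(). All names below are hypothetical; a production version would sample rather than record every call.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;

/** Weaving-free sketch of Victor's idea: count which call sites feed a queue. */
public class ProfiledQueue<E> {
    private final Queue<E> delegate = new ConcurrentLinkedQueue<>();
    private final Map<String, Integer> callSites = new HashMap<>();

    public synchronized boolean offer(E e) {
        // Frame 0 = getStackTrace, 1 = this method, 2 = the caller we want.
        StackTraceElement caller = Thread.currentThread().getStackTrace()[2];
        String site = caller.getClassName() + "." + caller.getMethodName();
        callSites.merge(site, 1, Integer::sum);
        return delegate.offer(e);
    }

    public E poll() {
        return delegate.poll();
    }

    /** Snapshot of offer() call-site counts, hottest producers included. */
    public synchronized Map<String, Integer> hotSpots() {
        return new HashMap<>(callSites);
    }
}
```

Capturing a stack trace per offer() is expensive, which is why this belongs behind a JMX toggle exactly as Victor used his tool: on for a few seconds while the bug is live, then off.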
Re: [jira] Updated: (DIRMINA-762) WARN org.apache.mina.core.service.IoProcessor - Create a new selector. Selected is 0, delta = 0
Emmanuel, of course, AspectJ is not good for everyday monitoring in production; I used it as a one-time debugging tool because the bug was critical and reproduced only in production. I looked at our server's behavior and turned on the tool (via JMX) for several seconds (when the bug occurred) and then turned it off. All the stats (stack traces, etc.) were collected in log files. But maybe this is not needed in your case, because you can reproduce it with tests and use a profiler. Victor Emmanuel Lecharny wrote: Hi Victor, Victor wrote: [I tried to add this to Jira, but Jira throws me an error :) ] Hmm, Jira is tired this morning. I also got some errors. You have to insist. Emmanuel, I remember a bug where ConcurrentLinkedQueue$Node was the main actor :) It is DIRMINA-709. When I investigated it, I saw that GC was busy all the time and there were tens of millions of ConcurrentLinkedQueue$Node objects; they were allocated and released frequently. I tried to profile our server with the YourKit profiler... without success, because of high load (it was in production). I see the CLQ$Node objects accumulating in the test, but they get garbage collected when the GC kicks in. However, I don't know why they are present, as they should have been removed as soon as they were handled. Then I prepared my own profiling tool for this specific problem. It uses AspectJ - I added an aspect for Queue.offer() and Collection.add() executions and grabbed the most popular stack traces from which these methods were called. If necessary, I can share my tool here. I'm not sure we will go with AspectJ in MINA, but I'm wondering if those are not good candidates for JMX counters. Anyway, DIRMINA-762 seems to me a different beast. Further investigation I did last evening was quite interesting, and puzzling too: - after a while running the client, even if it's an infinite loop, it looks like only 3 threads receive data while all the 61 others are just doing nothing. It's like they are dead, but in RUNNABLE state! - another interesting thing: as I only have 3 NioProcessors to process all the load, I added an ExecutorFilter to the chain, and what I see is absolutely scary: every time you launch some new clients, as many threads are created on the server *and never removed or reused*. Even if you stop the clients. It's like those threads are dead and useless. OK, I may need some coffee here; I have to rerun the tests now that I got some sleep, but I find those things a bit annoying. I will investigate more today.
[jira] Commented: (DIRMINA-682) We need a better documentation for the ExecutorFilter [was: Writing more than one message will block until the MessageReceived has been fully processed]
[ https://issues.apache.org/jira/browse/DIRMINA-682?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12829179#action_12829179 ] Victor N commented on DIRMINA-682: -- I agree, it would be great to document ExecutorFilter, and especially OrderedThreadPool, in more detail! In test (2) - what was your filter chain? Didn't you use the same OrderedThreadPoolExecutor for both MESSAGE_RECEIVED and WRITE operations? If I understand correctly, this could lead to a deadlock, because we were already running MESSAGE_RECEIVED and tried to do WRITE. We need a better documentation for the ExecutorFilter [was: Writing more than one message will block until the MessageReceived has been fully processed] -- Key: DIRMINA-682 URL: https://issues.apache.org/jira/browse/DIRMINA-682 Project: MINA Issue Type: Improvement Affects Versions: 2.0.0-M4 Reporter: Emmanuel Lecharny Priority: Critical Fix For: 2.0.0-RC2 When a message generates more than one response, the written responses will be sent to the client only when the initial message has been totally processed. Suppose that we receive one message M; it will be handled by an IoProcessor in the process() method and go through the chain to the IoHandler.messageReceived() method. Now, if one wants to write more than one response (session.write( R )), those responses will be enqueued until we are back in the process() method. The issue is due to the fact that the write is done using the IoProcessor associated with the current session, leading to a problem: we can't ask the IoProcessor instance to dequeue the written messages until it is done with the current processing (it's running in one single thread). 
The consequences are painful: - if one tries to write two responses, waiting for the first response to be written, this will end in a deadlock, as we are waiting on the processor we are holding - if we don't care about waiting for the write to be done, then all the responses will be enqueued and stored in memory until the IoProcessor exits the read processing and starts processing the writes, leading to an OOM exception. One solution would be to have two sets of IoProcessors, one for the reads and one for the writes. Or to pick a random processor to process the writes, as long as it is not the same as the one processing the reads. Here is a sample exhibiting the problem. Just launch it, and use 'telnet localhost 8080' in a console; type something - it should write the typed message twice, but it just generates an exception (see further) and writes the message back once. Removing the wait will work, but the messages will be sent only when the read has been processed in the AbstractPollingIoProcessor.process(T session) method: /** * Deal with sessions ready for the read or write operations, or both. 
*/
private void process(T session) {
    // Process reads
    if (isReadable(session) && !session.isReadSuspended()) {
        read(session);
    }
    // Process writes
    if (isWritable(session) && !session.isWriteSuspended()) {
        scheduleFlush(session);
    }
}
The sample code:
package org.apache.mina.real.life;

import java.net.InetSocketAddress;

import org.apache.mina.core.buffer.IoBuffer;
import org.apache.mina.core.future.WriteFuture;
import org.apache.mina.core.service.IoHandlerAdapter;
import org.apache.mina.core.session.IoSession;
import org.apache.mina.filter.logging.LoggingFilter;
import org.apache.mina.transport.socket.SocketAcceptor;
import org.apache.mina.transport.socket.nio.NioSocketAcceptor;

/**
 * (Entry point) Echo server
 *
 * @author The Apache MINA Project (dev@mina.apache.org)
 */
public class Main {
    private static class EchoProtocolHandler extends IoHandlerAdapter {
        public void messageReceived(IoSession session, Object message) throws Exception {
            System.out.println(new String(((IoBuffer) message).array()));
            // Write the received data back to the remote peer
            WriteFuture wf = session.write(((IoBuffer) message).duplicate());
            // Here, we will get a deadlock detection
            wf.awaitUninterruptibly();
            // Do a second write
            session.write(((IoBuffer) message).duplicate());
        }
    }

    /** Choose your favorite port number. */
    private static final int PORT = 8080;

    public static void main(String[] args) throws Exception {
        SocketAcceptor acceptor = new NioSocketAcceptor();
        // Add a logging filter
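The deadlock shape described in this issue - awaiting a write on the same single-threaded processor that must perform it - can be reproduced without MINA at all, using a single-threaded executor as a stand-in for the IoProcessor. The names here are illustrative only:

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

/** MINA-free reproduction of the DIRMINA-682 deadlock shape. */
public class SingleProcessorDeadlock {
    /** Returns true if the inner "write" completed, false if it starved (the deadlock). */
    public static boolean readThenAwaitWrite(long timeoutMs) throws Exception {
        ExecutorService ioProcessor = Executors.newSingleThreadExecutor();
        try {
            Future<Boolean> read = ioProcessor.submit(() -> {
                // Inside "messageReceived": schedule a write on the same processor...
                Future<?> write = ioProcessor.submit(() -> { /* flush to socket */ });
                try {
                    // ...and wait for it. The single thread is busy right here,
                    // so the write can never run: this is the deadlock. We use a
                    // timeout only so the demonstration terminates.
                    write.get(timeoutMs, TimeUnit.MILLISECONDS);
                    return true;
                } catch (TimeoutException e) {
                    return false;
                }
            });
            return read.get();
        } finally {
            ioProcessor.shutdownNow();
        }
    }
}
```

This is also why the issue's suggested fixes (two processor sets, or a different processor for writes) work: the blocked waiter and the pending write no longer compete for the same thread.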
Re: [MINA 3.0] Thoughts on IoAcceptors
Interesting tool, thanks! But I did not find how to draw UML sequence diagrams... Victor N Emmanuel Lecharny wrote: Ashish wrote: Emm! Which tool have you used to create the class diagrams? Yed. http://www.yworks.com/en/products_yed_about.html Free and great
Re: FYI: epoll issue fixed in b54
Hello Jean Francois, it seems JDK 6 update 16 is out; is the epoll bug fixed in it? Victor N Jeanfrancois Arcand wrote: Salut, Emmanuel Lecharny wrote: Jeanfrancois Arcand wrote: Salut, The fix for the epoll 'File exists' and spinning Selector issues this community has experienced will be available with JDK 7 build 54. I recommend you test it and report any failure, as this fix will be backported to JDK 6 eventually (no idea which version... will report as soon as I know). Great! Nothing expected for Java 5? No, unfortunately. The fix will be available in 6u16 and in no lower version. A+ -- Jeanfrancois
[jira] Commented: (DIRMINA-709) PENDING session is removed and added endlessly -- garbage allocation and high CPU usage problem
[ https://issues.apache.org/jira/browse/DIRMINA-709?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12734021#action_12734021 ] Victor N commented on DIRMINA-709: -- Edouard, your patch works well, but the ClosedChannelException still occurs. It is not critical anyway. PENDING session is removed and added endlessly -- garbage allocation and high CPU usage problem Key: DIRMINA-709 URL: https://issues.apache.org/jira/browse/DIRMINA-709 Project: MINA Issue Type: Bug Components: Core Affects Versions: 2.0.0-M4 Environment: Debian Linux, kernel 2.6.24 Reporter: Victor N Assignee: Edouard De Oliveira Fix For: 2.0.0-M7 (This problem was discussed in the mailing lists; I will copy it here.) It seems I have found a bug with IoSession - I can see that a PREPARING session is not removed correctly from the queue. When a session is in the PREPARING state, it is removed from the removingSessions queue, but right after that it is added to this queue again! So this session is added to the queue and removed from it forever. As a result, this gives us significant garbage allocation, so the CPU spends most of its time in the garbage collector (I can see this in JConsole). I see comments in the AbstractPollingIoProcessor class: private int remove() { ... case PREPARING: // Retry later if session is not yet fully initialized. // (In case that Session.close() is called before addSession() is processed) scheduleRemove(session); return removedSessions; ... } I have added logging to this code, and I can see that the SAME session is removed and added again and again. Can somebody explain this logic, please? Why don't we remove the PENDING session? Or maybe there is a workaround for this. Sorry, I cannot provide a test for this issue, but it is reproduced almost every day on our production servers under some load. Maybe you can reproduce it by adding a delay in addSession() and then closing the session during this delay. -- This message is automatically generated by JIRA. 
- You can reply to this email to add a comment to the issue online.
[jira] Commented: (DIRMINA-642) Disposing an IoConnector blocks forever
[ https://issues.apache.org/jira/browse/DIRMINA-642?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12726513#action_12726513 ]

Victor N commented on DIRMINA-642:
----------------------------------

I can confirm that the problem still exists in MINA 2.0 M6. Not sure whether this is related to DIRMINA-632 or not. I see the following stack trace:

    "main" prio=10 tid=0x4995c000 nid=0x4a5c waiting on condition [0x407cf000]
       java.lang.Thread.State: TIMED_WAITING (parking)
            at sun.misc.Unsafe.park(Native Method)
            - parking to wait for <0x2aaab45d9ec0> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
            at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:198)
            at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:1963)
            at java.util.concurrent.ThreadPoolExecutor.awaitTermination(ThreadPoolExecutor.java:1245)
            at org.apache.mina.core.service.AbstractIoService.dispose(AbstractIoService.java:305)

So we are blocked in ThreadPoolExecutor.awaitTermination. I tried waiting 15 minutes without success - always the same stack trace. For me, the problem occurs when I shut down Tomcat. In the log file I see that the connector continues handling I/O events as if it were not stopped. As a workaround I would propose adding another dispose method to AbstractIoService:

    void dispose(boolean awaitTermination);

This is something like IoSession.close(boolean immediately).

Disposing an IoConnector blocks forever
---------------------------------------

Key: DIRMINA-642
URL: https://issues.apache.org/jira/browse/DIRMINA-642
Project: MINA
Issue Type: Bug
Components: Core
Affects Versions: 2.0.0-M3
Environment: Linux
Reporter: Thomas Berger
Assignee: Edouard De Oliveira

nHandles in the Worker class (AbstractPollingIoConnector) becomes negative: -1 in my case. Then the dispose() method blocks forever, as the while loop in this worker class never breaks. Reproducing this error is pretty hard, as it usually happens after several hours.
I will do a quick fix and check whether nHandles = 0. I think it may be related to http://issues.apache.org/jira/browse/DIRMINA-632, as I never block for a write in my app. However, I get some WriteToClosedSessionException. Is it possible that this causes nHandles to become negative?
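The `dispose(boolean awaitTermination)` method Victor proposes maps onto the standard `java.util.concurrent` shutdown idiom: bound the wait instead of blocking forever, then force termination. A minimal sketch of that shape, using plain `ExecutorService` rather than MINA's `AbstractIoService` (the class name and timeout are illustrative, not MINA API):

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class BoundedDispose {
    // Shut a pool down, but give up after a timeout instead of blocking forever,
    // forcing termination with shutdownNow() if tasks are still running.
    static boolean dispose(ExecutorService pool, long timeoutMs) throws InterruptedException {
        pool.shutdown();                          // stop accepting new tasks
        if (pool.awaitTermination(timeoutMs, TimeUnit.MILLISECONDS)) {
            return true;                          // clean shutdown
        }
        pool.shutdownNow();                       // interrupt stuck workers
        return false;
    }

    public static void main(String[] args) throws Exception {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        CountDownLatch started = new CountDownLatch(1);
        pool.execute(() -> {
            started.countDown();
            try { Thread.sleep(60_000); } catch (InterruptedException e) { /* expected on shutdownNow */ }
        });
        started.await();
        // The worker is sleeping, so a short bounded wait fails but does not hang.
        boolean clean = dispose(pool, 100);
        System.out.println(clean ? "clean" : "forced"); // prints "forced"
    }
}
```

The key difference from the reported hang: `awaitTermination` is always called with a finite timeout, so a stuck Worker thread degrades the shutdown to "forced" instead of blocking the caller forever.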
[jira] Created: (DIRMINA-723) OrderedThreadPoolExecutor behavior: configurable queue size, corePoolSize, maximumPoolSize
OrderedThreadPoolExecutor behavior: configurable queue size, corePoolSize, maximumPoolSize
------------------------------------------------------------------------------------------

Key: DIRMINA-723
URL: https://issues.apache.org/jira/browse/DIRMINA-723
Project: MINA
Issue Type: Improvement
Components: Core
Affects Versions: 2.0.0-M6
Environment: Ubuntu Linux, kernel 2.6.x
Reporter: Victor N
Priority: Minor

The problem was discussed with Emmanuel Lecharny in the mailing lists: http://www.nabble.com/OrderedThreadPoolExecutor%3A-limited-workQueue-td24275973.html

If you compare OrderedThreadPoolExecutor with the standard ThreadPoolExecutor, you can see that ThreadPoolExecutor has useful parameters:
- core pool size
- maximum pool size
- work queue size

If you use unbounded thread pools and queues with a MINA Acceptor or Connector, you may get an OutOfMemoryError under critical load because Java creates too many threads. With ThreadPoolExecutor you can limit the number of threads (maximumPoolSize) and use a bounded queue (e.g. a LinkedBlockingQueue of limited capacity). Unfortunately, this does not work with OrderedThreadPoolExecutor - both waitingSessions and sessionTasksQueue allow neither configuring their size nor passing a different queue implementation. Even though OrderedThreadPoolExecutor extends ThreadPoolExecutor, it overrides the behavior significantly - it seems that its meaning of corePoolSize and maximumPoolSize is different.
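For the plain (unordered) case, the bounded configuration this report asks for already exists in `java.util.concurrent`. A sketch of the setup that OrderedThreadPoolExecutor cannot currently express - the pool sizes, queue capacity, and rejection policy here are illustrative choices, not values from the report:

```java
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class BoundedPool {
    public static ThreadPoolExecutor create() {
        // 4 core threads, at most 16, and a work queue capped at 1000 tasks.
        // When the queue is full and all 16 threads are busy, CallerRunsPolicy
        // pushes back on the submitter instead of growing memory without bound.
        return new ThreadPoolExecutor(
                4, 16,
                60L, TimeUnit.SECONDS,
                new LinkedBlockingQueue<>(1000),
                new ThreadPoolExecutor.CallerRunsPolicy());
    }

    public static void main(String[] args) {
        ThreadPoolExecutor pool = create();
        System.out.println(pool.getMaximumPoolSize());            // 16
        System.out.println(pool.getQueue().remainingCapacity());  // 1000
        pool.shutdown();
    }
}
```

Both limits bound memory: maximumPoolSize caps thread stacks, the queue capacity caps pending tasks, and the rejection policy decides what happens once both are exhausted - exactly the three knobs the issue says OrderedThreadPoolExecutor is missing.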
OrderedThreadPoolExecutor: limited workQueue
Hello, I am using MINA 2.0 M6; I just wonder if there is any way to use a LinkedBlockingQueue of limited size with OrderedThreadPoolExecutor, like in the standard ThreadPoolExecutor (I mean workQueue)? Even though OrderedThreadPoolExecutor extends ThreadPoolExecutor, it seems to ignore the parent's workQueue (I tried to pass a queue to the parent's constructor). My goal is to limit the number of threads in OrderedThreadPoolExecutor in critical situations (under high load); otherwise new threads are created constantly and I get an OutOfMemoryError. So I think I could configure a small corePoolSize and a big workQueue to minimize CPU usage and context switching. Thanks, Victor N
JMX: show workQueue size in ExecutorFilter / thread pool
Hello, I think it may be useful to add some info about the work queue to JMX. Currently, we can see how many threads are working, but cannot see the queue size. First, I tried to add it to ObjectMBean... without success, but then I added the following method to ExecutorFilter and it works:

    public int getWorkQueueSize() {
        if (executor instanceof ThreadPoolExecutor) {
            return ((ThreadPoolExecutor) executor).getQueue().size();
        }
        return 0;
    }

If you want it to work with OrderedThreadPoolExecutor, you should also comment out the OrderedThreadPoolExecutor.getQueue() method (which throws UnsupportedOperationException). But it seems that OrderedThreadPoolExecutor has its own specific implementation and ignores workQueue. Victor N
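The guard in that method can be exercised outside MINA. A small check, using plain `java.util.concurrent` rather than the ExecutorFilter itself, showing that `getQueue().size()` reports tasks that are queued but not yet running (the class and method names here are just for the demo):

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Executor;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class QueueSizeDemo {
    // Mirrors the getWorkQueueSize() guard above: only ThreadPoolExecutor
    // exposes its queue, so any other Executor reports 0.
    static int workQueueSize(Executor executor) {
        if (executor instanceof ThreadPoolExecutor) {
            return ((ThreadPoolExecutor) executor).getQueue().size();
        }
        return 0;
    }

    public static void main(String[] args) throws Exception {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                1, 1, 0L, TimeUnit.MILLISECONDS, new LinkedBlockingQueue<>());
        CountDownLatch started = new CountDownLatch(1);
        CountDownLatch release = new CountDownLatch(1);
        pool.execute(() -> {
            started.countDown();
            try { release.await(); } catch (InterruptedException e) { }
        });
        started.await();                          // the single worker is now busy
        pool.execute(() -> { });                  // these two can only wait in the queue
        pool.execute(() -> { });
        System.out.println(workQueueSize(pool));  // 2
        release.countDown();
        pool.shutdown();
    }
}
```

This is the value worth exporting over JMX: a steadily growing number here is the early warning for the unbounded-queue OutOfMemoryError discussed in the thread above.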
Re: OrderedThreadPoolExecutor: limited workQueue
Emmanuel, no, there is no bug with the thread limit! I am talking about workQueue - in a typical ThreadPoolExecutor you can configure a work queue of any size (limited or not), and this queue can be used to minimize the number of running threads in the thread pool. Just look at the javadoc of the ThreadPoolExecutor class in Java 6: "If corePoolSize or more threads are running, the Executor always prefers queuing a request rather than adding a new thread."

Victor N

Emmanuel Lecharny wrote:
Victor wrote:
Hello, I am using MINA 2.0 M6; I just wonder if there is any way to use a LinkedBlockingQueue of limited size with OrderedThreadPoolExecutor, like in the standard ThreadPoolExecutor (I mean workQueue)? Even though OrderedThreadPoolExecutor extends ThreadPoolExecutor, it seems to ignore the parent's workQueue (I tried to pass a queue to the parent's constructor). My goal is to limit the number of threads in OrderedThreadPoolExecutor in critical situations (under high load); otherwise new threads are created constantly and I get an OutOfMemoryError. So I think I could configure a small corePoolSize and a big workQueue to minimize CPU usage and context switching.
It's strange, because we have an OrderedThreadPoolExecutor(int maximumPoolSize) constructor, which can be used to limit the thread pool:

    public OrderedThreadPoolExecutor(int maximumPoolSize) {
        this(DEFAULT_INITIAL_THREAD_POOL_SIZE, maximumPoolSize, DEFAULT_KEEP_ALIVE, TimeUnit.SECONDS,
                Executors.defaultThreadFactory(), null);
    }

    public OrderedThreadPoolExecutor(
            int corePoolSize, int maximumPoolSize,
            long keepAliveTime, TimeUnit unit,
            ThreadFactory threadFactory, IoEventQueueHandler eventQueueHandler) {
        super(DEFAULT_INITIAL_THREAD_POOL_SIZE, 1, keepAliveTime, unit,
                new SynchronousQueue<Runnable>(), threadFactory, new AbortPolicy());

        if (corePoolSize < DEFAULT_INITIAL_THREAD_POOL_SIZE) {
            throw new IllegalArgumentException("corePoolSize: " + corePoolSize);
        }

        if ((maximumPoolSize == 0) || (maximumPoolSize < corePoolSize)) {
            throw new IllegalArgumentException("maximumPoolSize: " + maximumPoolSize);
        }

        // Now, we can setup the pool sizes
        super.setCorePoolSize( corePoolSize );
        super.setMaximumPoolSize( maximumPoolSize );

So unless there is a bad bug in the ThreadPoolExecutor class, I don't see how the number of created threads can go above the limit...
Re: OrderedThreadPoolExecutor: limited workQueue
Emmanuel, I looked into the OrderedThreadPoolExecutor code in more detail - it seems we need to limit both the waitingSessions queue and the sessionTasksQueue, which is stored in the IoSession. Both queues are unlimited now, so this can lead to OutOfMemoryError under high load.

One more question about corePoolSize and maximumPoolSize - are they used in the same manner as in ThreadPoolExecutor? I mean the following: "When a new task is submitted in method ThreadPoolExecutor.execute, and fewer than corePoolSize threads are running, a new thread is created to handle the request, even if other worker threads are idle. If there are more than corePoolSize but less than maximumPoolSize threads running, a new thread will be created only if the queue is full." Especially the 2nd part, where the queue is used instead of creating new threads.

Victor N

Emmanuel Lecharny wrote:
Victor wrote:
Emmanuel, no, there is no bug with the thread limit! I am talking about workQueue - in a typical ThreadPoolExecutor you can configure a work queue of any size (limited or not), and this queue can be used to minimize the number of running threads in the thread pool. Just look at the javadoc of the ThreadPoolExecutor class in Java 6: "If corePoolSize or more threads are running, the Executor always prefers queuing a request rather than adding a new thread."

Sorry, I was focusing on your last sentence: "My goal is to *limit the number of threads* in OrderedThreadPoolExecutor in critical situations (under high load), otherwise *new threads are created constantly* and I get OutOfMemory." So you get OOM because the working queue is unbounded: then I'm afraid this queue cannot be modified... probably worth a JIRA at this point.

[...]
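The javadoc rule quoted in this exchange - queue first, extra threads only when the queue is full - can be observed directly in a standard ThreadPoolExecutor. A small demonstration with deliberately tiny, illustrative sizes (core=1, max=2, queue capacity 1):

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class CoreMaxDemo {
    public static void main(String[] args) throws Exception {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                1, 2, 60L, TimeUnit.SECONDS, new ArrayBlockingQueue<>(1));
        CountDownLatch release = new CountDownLatch(1);
        Runnable blocker = () -> {
            try { release.await(); } catch (InterruptedException e) { }
        };

        pool.execute(blocker);                  // occupies the single core thread
        pool.execute(blocker);                  // queued: the queue is preferred over a new thread
        System.out.println(pool.getPoolSize()); // 1
        pool.execute(blocker);                  // queue is full -> a second thread is created
        System.out.println(pool.getPoolSize()); // 2
        release.countDown();
        pool.shutdown();
    }
}
```

This is the behavior Victor is asking about: beyond corePoolSize, the pool only grows when the work queue rejects the task, so a bounded queue is what makes maximumPoolSize meaningful as a back-pressure mechanism.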
Re: OrderedThreadPoolExecutor: limited workQueue
One more question about corePoolSize and maximumPoolSize - are they used in the same manner as in ThreadPoolExecutor?
The way the executor is created, those values are just passed to the underlying Java class. We don't do anything with those values inside MINA code.
As I can see, OrderedThreadPoolExecutor overrides most of the parent's methods, even execute() and Worker.run(). That's why I ask. Victor N
Re: OrderedThreadPoolExecutor: limited workQueue
I will post this problem to JIRA - we will all think about a solution ;)

Victor N

Emmanuel Lecharny wrote:
Victor wrote:
One more question about corePoolSize and maximumPoolSize - are they used in the same manner as in ThreadPoolExecutor? The way the executor is created, those values are just passed to the underlying Java class. We don't do anything with those values inside MINA code. As I can see, OrderedThreadPoolExecutor overrides most of the parent's methods, even execute() and Worker.run(). That's why I ask.
yeah, it's a bit messy :/ When you look at this method:

    /**
     * Add a new thread to execute a task, if needed and possible.
     * It depends on the current pool size. If it's full, we do nothing.
     */
    private void addWorker() {
        synchronized (workers) {
            if (workers.size() >= super.getMaximumPoolSize()) {
                return;
            }
            ...

you see that the maximum pool size is the one stored in the inherited class. But it does not make a lot of sense, as we create the worker locally :/

Victor N
[jira] Commented: (DIRMINA-709) PENDING session is removed and added endlessly -- garbage allocation and high CPU usage problem
[ https://issues.apache.org/jira/browse/DIRMINA-709?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12715810#action_12715810 ]

Victor N commented on DIRMINA-709:
----------------------------------

Our production servers have been running with this fix for ~1 month; everything seems to work. The only side effect is mentioned in my comment above - sometimes, rarely, I can see this warning in the log files, but everything is working. I forgot to say that I disabled MdcInjectionFilterTest when building MINA with the patch, because this test failed. Are there any plans to review this fix and include it in the next build - maybe M7?

PENDING session is removed and added endlessly -- garbage allocation and high CPU usage problem
Key: DIRMINA-709 URL: https://issues.apache.org/jira/browse/DIRMINA-709
[jira] Created: (DIRMINA-709) PENDING session is removed and added endlessly -- garbage allocation and high CPU usage problem
PENDING session is removed and added endlessly -- garbage allocation and high CPU usage problem

Key: DIRMINA-709
URL: https://issues.apache.org/jira/browse/DIRMINA-709
Project: MINA
Issue Type: Bug
Components: Core
Affects Versions: 2.0.0-M4
Environment: Debian Linux, kernel 2.6.24
Reporter: Victor N

(This problem was discussed in the mailing lists; I will copy it here.)

Seems I have found a bug with IoSession - I can see that a PREPARING session is not removed correctly from the queue. When a session is in the PREPARING state, it is removed from the removingSessions queue, but right after that it is added to this queue again! So this session is added to the queue and removed from it forever. As a result, this gives us significant garbage allocation, so the CPU spends most of its time in the garbage collector (I can see this in JConsole). I see these comments in the AbstractPollingIoProcessor class:

    private int remove() {
        ...
        case PREPARING:
            // Retry later if session is not yet fully initialized.
            // (In case that Session.close() is called before addSession() is processed)
            scheduleRemove(session);
            return removedSessions;
        ...
    }

I have added logging to this code, and I can see that the SAME session is removed and added again and again. Can somebody explain this logic, please? Why don't we remove the PENDING session? Or is there a workaround for this?

Sorry, I cannot provide a test for this issue, but it is reproduced almost every day on our production servers under some load. Maybe you can reproduce it by adding a delay in addSession() and then closing the session during this delay.
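The loop the report describes can be reduced to a toy model: a removal queue in which a PREPARING session is always re-scheduled makes no progress, while the patched handling discussed later in this thread (remove it like an OPEN session) terminates. This is a hypothetical simulation of the scheduling logic, not MINA code; all names are invented for the demo:

```java
import java.util.ArrayDeque;
import java.util.Queue;

public class RemoveLoopModel {
    enum State { PREPARING, OPEN, CLOSED }

    // Original behavior: PREPARING sessions are re-queued, so a session stuck
    // in PREPARING is polled and re-added forever. We cap the number of polls
    // to demonstrate the livelock without actually spinning.
    static int drain(Queue<State> removing, boolean patched, int maxPolls) {
        int removed = 0;
        int polls = 0;
        while (!removing.isEmpty() && polls++ < maxPolls) {
            State s = removing.poll();
            switch (s) {
                case OPEN:
                    removed++;                    // normal removal
                    break;
                case PREPARING:
                    if (patched) { removed++; }   // patched: remove it now
                    else { removing.add(s); }     // original: "retry later", forever
                    break;
                case CLOSED:
                    break;                        // already gone, skip
            }
        }
        return removed;
    }

    public static void main(String[] args) {
        Queue<State> q = new ArrayDeque<>();
        q.add(State.PREPARING);
        System.out.println(drain(new ArrayDeque<>(q), false, 1000)); // 0: cap hit, nothing removed
        System.out.println(drain(new ArrayDeque<>(q), true, 1000));  // 1: removed on first poll
    }
}
```

The model also shows why the symptom is garbage pressure rather than a hang: each re-add allocates a queue node, and the processor loop keeps running, so the CPU time goes to allocation and collection instead of useful work.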
[jira] Commented: (DIRMINA-709) PENDING session is removed and added endlessly -- garbage allocation and high CPU usage problem
[ https://issues.apache.org/jira/browse/DIRMINA-709?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12705995#action_12705995 ]

Victor N commented on DIRMINA-709:
----------------------------------

If I do a simple workaround - call removeNow() for such sessions - can something go wrong this way?

PENDING session is removed and added endlessly -- garbage allocation and high CPU usage problem
Key: DIRMINA-709 URL: https://issues.apache.org/jira/browse/DIRMINA-709
[jira] Commented: (DIRMINA-709) PENDING session is removed and added endlessly -- garbage allocation and high CPU usage problem
[ https://issues.apache.org/jira/browse/DIRMINA-709?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12705996#action_12705996 ]

Victor N commented on DIRMINA-709:
----------------------------------

Seems that my patch is working - I can see that several pending sessions were removed during the last 3 days on our servers. I hope that it does not have an impact on other functions of MINA :) Anyway, it would be great to see comments from the MINA creators!

PENDING session is removed and added endlessly -- garbage allocation and high CPU usage problem
Key: DIRMINA-709 URL: https://issues.apache.org/jira/browse/DIRMINA-709
[jira] Commented: (DIRMINA-709) PENDING session is removed and added endlessly -- garbage allocation and high CPU usage problem
[ https://issues.apache.org/jira/browse/DIRMINA-709?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12705997#action_12705997 ]

Victor N commented on DIRMINA-709:
----------------------------------

My patch is the following - I have changed the AbstractPollingIoProcessor.remove() method this way:

    switch (state) {
    case OPEN:
    case PREPARING:
        if (removeNow(session)) {
            removedSessions++;
        }
        break;
    case CLOSED:
        // Skip if channel is already closed
        break;

I am not sure whether the problem occurred for every PREPARING session or not, but the patch seems to work.

PENDING session is removed and added endlessly -- garbage allocation and high CPU usage problem
Key: DIRMINA-709 URL: https://issues.apache.org/jira/browse/DIRMINA-709
[jira] Commented: (DIRMINA-709) PENDING session is removed and added endlessly -- garbage allocation and high CPU usage problem
[ https://issues.apache.org/jira/browse/DIRMINA-709?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12705998#action_12705998 ]

Victor N commented on DIRMINA-709:
----------------------------------

The only side effect I see is that sometimes after the remove() method there is an exception:

    04/05/2009 14:25:18| WARN | org.apache.mina.util.DefaultExceptionMonitor.exceptionCaught(): Unexpected exception.
    java.nio.channels.ClosedChannelException
        at java.nio.channels.spi.AbstractSelectableChannel.configureBlocking(AbstractSelectableChannel.java:252)
        at org.apache.mina.transport.socket.nio.NioProcessor.init(NioProcessor.java:100)
        at org.apache.mina.transport.socket.nio.NioProcessor.init(NioProcessor.java:42)
        at org.apache.mina.core.polling.AbstractPollingIoProcessor.addNow(AbstractPollingIoProcessor.java:417)
        at org.apache.mina.core.polling.AbstractPollingIoProcessor.add(AbstractPollingIoProcessor.java:403)
        at org.apache.mina.core.polling.AbstractPollingIoProcessor.access$200(AbstractPollingIoProcessor.java:59)
        at org.apache.mina.core.polling.AbstractPollingIoProcessor$Processor.run(AbstractPollingIoProcessor.java:878)
        at org.apache.mina.util.NamePreservingRunnable.run(NamePreservingRunnable.java:65)
        at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
        at java.lang.Thread.run(Thread.java:619)

Maybe this is somehow related to my patch, I do not know. Maybe we could think of a better patch.

PENDING session is removed and added endlessly -- garbage allocation and high CPU usage problem
Key: DIRMINA-709 URL: https://issues.apache.org/jira/browse/DIRMINA-709
Re: IoSession in PREPARING state is not removed from queue
Thanks Emmanuel! I have added it to JIRA: https://issues.apache.org/jira/browse/DIRMINA-709

Victor

Emmanuel Lecharny wrote:
Hi Victor, first, thanks for your investigation and the proposed solution. Could you create a JIRA issue with all the elements you provided, so that it doesn't get lost forever in the mailing list? We will probably be able to process this later (all of us are quite busy these days), but if it's in JIRA, it won't get forgotten. Many thanks!

2009/5/4 Victor dreamland_sk...@mail333.com:
Seems that my patch is working - I can see that several pending sessions were removed during the last 3 days on our servers. I hope that it does not have an impact on other functions of MINA :) Anyway, it would be great to see comments from the MINA creators!

[...]
Re: IoSession in PREPARING state is not removed from queue
Seems that my patch is working - I can see that several pending sessions were removed during the last 3 days on our servers. I hope that it does not have an impact on other functions of MINA :) Anyway, it would be great to see comments from the MINA creators!

Victor

Victor wrote:
Sorry, I forgot to say that I use MINA 2.0 M4 (unfortunately, our server does not work with M5 yet - something changed). If I do a simple workaround - call removeNow() for such sessions - can something go wrong this way?

[...]
Re: IoSession in PREPARING state is not removed from queue
Sorry, I forgot to say that I use MINA 2.0 M4 (unfortunately, our server does not work with M5 yet - something changed). If I do a simple workaround - call removeNow() for such sessions - can something go wrong this way?

Thanks
Victor

[...]
IoSession in PREPARING state is not removed from queue
Hello MINA developers! It seems I have found a bug with IoSession (or I am doing something wrong :) ) - I can see that a session in the PREPARING state is not removed correctly from the queue. When a session is in the PREPARING state, it is removed from the removingSessions queue, but right after that it is added to this queue again! So the session is added to the queue and removed from it forever. As a result, this gives us significant garbage allocation, so the CPU spends most of its time in the garbage collector (I can see this in JConsole). I see these comments in the AbstractPollingIoProcessor class:

private int remove() {
    ...
    case PREPARING:
        // Retry later if session is not yet fully initialized.
        // (In case that Session.close() is called before addSession() is processed)
        scheduleRemove(session);
        return removedSessions;
    ...
}

I have added logging to this code, and I can see that the SAME session is removed and added again and again. Can somebody explain this logic please? Why don't we remove the PREPARING session? Or maybe there is a workaround for this. Sorry, I can not provide a test for this issue, but it is reproduced almost every day on our production servers under some load. Maybe you can reproduce it by adding a delay in addSession() and then closing the session during this delay. Thanks for any ideas and suggestions, Victor
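The cycle described above can be sketched as a minimal, self-contained simulation. All names below (RemoveLoopSketch, removePass, scheduledCount) are hypothetical simplifications of the AbstractPollingIoProcessor logic, not MINA API; the sketch only shows why a session whose state never leaves PREPARING is re-queued on every pass:

```java
import java.util.ArrayDeque;
import java.util.Queue;

// Minimal sketch of the removal loop under discussion. Not MINA code:
// names are illustrative stand-ins for AbstractPollingIoProcessor internals.
public class RemoveLoopSketch {
    enum State { PREPARING, OPENED, CLOSING }

    static class Session {
        State state = State.PREPARING;
        int scheduledCount = 0; // how often scheduleRemove() saw this session
    }

    static final Queue<Session> removingSessions = new ArrayDeque<>();

    static void scheduleRemove(Session s) {
        s.scheduledCount++;
        removingSessions.add(s);
    }

    // One pass over the removal queue: a PREPARING session is immediately
    // re-queued, so if it never leaves that state it cycles forever.
    static int removePass() {
        int removed = 0;
        Session s;
        while ((s = removingSessions.poll()) != null) {
            if (s.state == State.PREPARING) {
                scheduleRemove(s); // the "retry later" branch from the comment
                return removed;
            }
            removed++;
        }
        return removed;
    }

    // Schedule one stuck session, run the given number of passes, and report
    // how many times it was (re-)scheduled.
    static int run(int passes) {
        removingSessions.clear();
        Session stuck = new Session();
        scheduleRemove(stuck);
        for (int i = 0; i < passes; i++) {
            removePass();
        }
        return stuck.scheduledCount;
    }

    public static void main(String[] args) {
        System.out.println("re-schedules after 1000 passes: " + run(1000));
    }
}
```

Each pass polls and re-queues the same session once, which matches the "SAME session is removed and added again and again" observation and explains the queue-node garbage seen in JConsole.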
[jira] Commented: (DIRMINA-678) NioProcessor 100% CPU usage on Linux (epoll selector bug)
[ https://issues.apache.org/jira/browse/DIRMINA-678?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12696711#action_12696711 ] Victor N commented on DIRMINA-678: -- Just a question: was this high CPU usage accompanied by rapid memory allocation and subsequent garbage collection (occurring constantly)? I see many ConcurrentLinkedQueue$Node objects (almost all of them are EMPTY queue nodes) in our server which are created and garbage-collected in such cases. Since they are dead (unreferenced) objects, I can not tell where they were used - in MINA or not. I will try to get allocation stack traces with a profiler, but this can be difficult on a production system. NioProcessor 100% CPU usage on Linux (epoll selector bug) - Key: DIRMINA-678 URL: https://issues.apache.org/jira/browse/DIRMINA-678 Project: MINA Issue Type: Bug Components: Core Affects Versions: 2.0.0-M4 Environment: CentOS 5.x, 32/64-bit, 32/64-bit Sun JDK 1.6.0_12, also _11/_10/_09 and Sun JDK 1.7.0 b50, Kernel 2.6.18-92.1.22.el5 and also older versions, Reporter: Serge Baranov Fix For: 2.0.0 Attachments: snap973.png, snap974.png It's the same bug as described at http://jira.codehaus.org/browse/JETTY-937 , but affecting MINA in a very similar way. NioProcessor threads start to eat 100% resources per CPU. After 10-30 minutes of running, depending on the load (sometimes after several hours), one of the NioProcessor threads starts to consume all the available CPU resources, probably spinning in the epoll select loop. Later, more threads can be affected by the same issue, thus 100% loading all the available CPU cores.
Sample trace:

NioProcessor-10 [RUNNABLE] CPU time: 5:15
sun.nio.ch.EPollArrayWrapper.epollWait(long, int, long, int)
sun.nio.ch.EPollArrayWrapper.poll(long)
sun.nio.ch.EPollSelectorImpl.doSelect(long)
sun.nio.ch.SelectorImpl.lockAndDoSelect(long)
sun.nio.ch.SelectorImpl.select(long)
org.apache.mina.transport.socket.nio.NioProcessor.select(long)
org.apache.mina.core.polling.AbstractPollingIoProcessor$Processor.run()
org.apache.mina.util.NamePreservingRunnable.run()
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor$Worker)
java.util.concurrent.ThreadPoolExecutor$Worker.run()
java.lang.Thread.run()

It seems to affect any NIO-based Java server application running in the specified environment. Some projects provide workarounds for similar JDK bugs; probably MINA can also consider a workaround. As far as I know, there are at least 3 users who experience this issue with Jetty, and all of them are running CentOS (is some distribution default setting a trigger?). As for MINA, I'm not aware of similar reports yet. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
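The workarounds the comment alludes to (as adopted by projects such as Jetty and Netty for this JDK epoll bug) typically detect the spin and rebuild the Selector. A minimal sketch of the detection heuristic is below; the class name, threshold, and method signatures are illustrative assumptions, not MINA or JDK API:

```java
// Sketch of a spin-detection heuristic: if select() keeps returning 0 ready
// keys well before its timeout elapses, assume the selector is broken and
// signal that it should be rebuilt (channels re-registered on a new Selector).
// All names and the default threshold are illustrative, not a real API.
public class SpinDetector {
    private final int threshold;     // tolerated premature wakeups in a row
    private final long timeoutNanos; // the timeout that was passed to select()
    private int prematureReturns = 0;

    public SpinDetector(int threshold, long timeoutMillis) {
        this.threshold = threshold;
        this.timeoutNanos = timeoutMillis * 1_000_000L;
    }

    // Call after every select(). Returns true when the caller should replace
    // the Selector instead of looping again.
    public boolean shouldRebuild(int selectedKeys, long elapsedNanos) {
        if (selectedKeys > 0 || elapsedNanos >= timeoutNanos) {
            prematureReturns = 0; // normal wakeup: keys ready or timeout hit
            return false;
        }
        // select() returned 0 before the timeout elapsed: suspicious spin.
        return ++prematureReturns >= threshold;
    }

    public static void main(String[] args) {
        SpinDetector d = new SpinDetector(3, 1000);
        // Three consecutive 1 ms "empty" returns against a 1000 ms timeout
        // trip the detector on the third call.
        System.out.println(d.shouldRebuild(0, 1_000_000L)); // first premature
        System.out.println(d.shouldRebuild(0, 1_000_000L)); // second premature
        System.out.println(d.shouldRebuild(0, 1_000_000L)); // third: rebuild
    }
}
```

In a real processor loop the rebuild step would open a fresh Selector and re-register every SelectionKey's channel on it; the sketch only covers the detection side.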
[jira] Commented: (DIRMINA-678) NioProcessor 100% CPU usage on Linux (epoll selector bug)
[ https://issues.apache.org/jira/browse/DIRMINA-678?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12694177#action_12694177 ] Victor N commented on DIRMINA-678: -- I see similar behavior on our production servers, with the same picture of CPU usage after several hours. But at the same time I can see that our server begins intensive garbage allocation, so I am not sure yet whether this is the Selector problem or something wrong in our code. I will come back later to clarify the result. This CPU usage problem occurs in our latest server version, where we use NioSocketConnector and MinaSocketAcceptor. In previous versions we used only NioSocketAcceptor and it worked well for weeks.
Re: Connector: sessionCreated is not called (unstable behavior)
Hello, I have finally found where the problem was. It was in my code: I called ConnectionRequest.cancel() for a connection request that was already in the done state. Now I have added the check:

if (!connectionRequest.isDone()) {
    connectionRequest.cancel();
}

I think it would be great to add this check to the ConnectionRequest class itself. Otherwise MINA adds the request to cancelQueue, but in fact there is nothing to cancel - that's why the Connector's run() loop works incorrectly. What do you think? Thanks, Victor
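The suggestion above - making cancel() itself a no-op for completed requests - can be sketched as follows. The ConnectionRequest class here is a minimal hypothetical stand-in with a done flag, not the MINA class:

```java
// Sketch of the proposed guard: cancelling an already-completed request does
// nothing, so nothing bogus is ever added to cancelQueue. 'ConnectionRequest'
// is a simplified stand-in for illustration, not the MINA implementation.
public class CancelGuardSketch {

    static class ConnectionRequest {
        private boolean done;
        private boolean cancelQueued;

        void markDone()            { done = true; }
        boolean isDone()           { return done; }
        boolean wasCancelQueued()  { return cancelQueued; }

        // The check from the message above, moved inside cancel() itself.
        boolean cancel() {
            if (done) {
                return false; // completed request: nothing to cancel
            }
            cancelQueued = true; // real code would add to cancelQueue here
            return true;
        }
    }

    public static void main(String[] args) {
        ConnectionRequest pending = new ConnectionRequest();
        System.out.println("pending cancelled:  " + pending.cancel());

        ConnectionRequest finished = new ConnectionRequest();
        finished.markDone();
        System.out.println("finished cancelled: " + finished.cancel());
    }
}
```

With the guard inside cancel(), callers no longer need to remember the isDone() check themselves, which is the usual argument for putting such invariants in the class rather than at every call site.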
Re: Connector: sessionCreated is not called (unstable behavior)
Just a question: why does the Connector.run() thread exit when nHandles == 0? Why not continue executing select() in the loop indefinitely? Thanks, Victor
Re: Connector: sessionCreated is not called (unstable behavior)
I have added a similar fix in AbstractPollingIoProcessor.Processor.run() - nSessions is now initialized inside the loop, on each iteration. It seems to work now - I can not reproduce the bug - but I will continue testing. I am not sure whether this patch is adequate or not. Any comments? Victor N
Re: Connector: sessionCreated is not called (unstable behavior)
Unfortunately, even with the patch in the Connector and Processor, the problem still reproduces randomly - most of the time everything works, but sometimes I see that sessionCreated is not called. But if I start a new connection, the old pending connection (for which I am waiting for the sessionCreated callback) completes instantly, so I get sessionCreated. Victor N
Re: Connector: sessionCreated is not called (unstable behavior)
I tried to play with my fix - at first glance it seemed to work: the connection was established and sessionCreated was called, but at the second attempt the problem reproduced. Any ideas what is wrong here? Thanks, Victor N
Connector: sessionCreated is not called (unstable behavior)
Hello, I have a problem with the NIO connector in MINA 2.0 M4 when handling many simultaneous connections - sometimes after calling connect() I do not receive the sessionCreated() callback. I am trying to prepare a small test, but I can not reproduce the problem in it yet, so I tried to investigate the MINA sources myself using our working system. I can see that a new Channel is always created, channel.connect() is called, SYN, SYN-ACK, ACK are sent, and then AbstractPollingIoConnector.Connector registers the new channel in the Selector - but after that AbstractPollingIoConnector.Connector is stopped, because nHandles == 0 and connectQueue is already empty (no more connection requests). I mean the following code in AbstractPollingIoConnector.Connector.run():

if (nHandles == 0) {
    synchronized (lock) {
        if (connectQueue.isEmpty()) {
            connector = null;
            break;
        }
    }
}

After that, select() is not called on the Selector anymore for this session. One reason for this behavior (which I can assume for now) is the 'nHandles' variable being initialized outside of the 'while' loop. Maybe it should be initialized on each iteration, so that every time we calculate the number of new sessions, cancelled sessions, etc.? In my situation nHandles = -1; then, after registering the new session, nHandles becomes 0 and Connector.run() is stopped. Victor N
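The arithmetic behind the premature exit described above can be isolated in a few lines. This is a self-contained illustration with hypothetical names, not the MINA source: a stale nHandles of -1 carried in from a previous pass plus one fresh registration yields 0, so the exit condition fires even though that channel has never been select()ed, whereas starting the iteration from 0 keeps the loop alive:

```java
// Illustration of the stale-counter scenario from the message above.
// 'wouldExit' models only the exit condition in Connector.run(); it is a
// simplified stand-in, not AbstractPollingIoConnector API.
public class StaleHandleCount {

    /**
     * @param staleStart       value of nHandles carried in from a previous pass
     *                         (0 if recomputed each iteration, as proposed)
     * @param newRegistrations channels registered during this iteration
     * @param connectQueueEmpty whether connectQueue is empty afterwards
     * @return true if the Connector thread would break out of its loop
     */
    static boolean wouldExit(int staleStart, int newRegistrations,
                             boolean connectQueueEmpty) {
        int nHandles = staleStart + newRegistrations;
        return nHandles == 0 && connectQueueEmpty;
    }

    public static void main(String[] args) {
        // Reported scenario: stale nHandles == -1, one fresh registration,
        // empty connect queue -> the thread exits and the new channel is
        // never select()ed, so sessionCreated() never fires.
        System.out.println("stale counter exits: " + wouldExit(-1, 1, true));

        // Proposed fix: the iteration starts from 0, so one registered
        // handle keeps the loop running.
        System.out.println("reset counter exits: " + wouldExit(0, 1, true));
    }
}
```

This matches the observation that starting a second connection "unsticks" the first one: the new request re-spawns the Connector thread, which then selects the previously registered channel as well.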