[jira] Commented: (DIRMINA-762) WARN org.apache.mina.core.service.IoProcessor - Create a new selector. Selected is 0, delta = 0

2010-02-10 Thread Serge Baranov (JIRA)

[ 
https://issues.apache.org/jira/browse/DIRMINA-762?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12832023#action_12832023
 ] 

Serge Baranov commented on DIRMINA-762:
---

Emmanuel, that's exactly what I found yesterday; you should have started from 
that commit right away. See 
http://issues.apache.org/jira/browse/DIRMINA-678?focusedCommentId=12831297&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12831297 .

> WARN org.apache.mina.core.service.IoProcessor  - Create a new selector. 
> Selected is 0, delta = 0
> 
>
> Key: DIRMINA-762
> URL: https://issues.apache.org/jira/browse/DIRMINA-762
> Project: MINA
>  Issue Type: Bug
> Environment: Linux (2.6.26-2-amd64),  java version 1.6.0_12 and also 
> 1.6.0_18.
>Reporter: Omry Yadan
>Priority: Critical
> Fix For: 2.0.0-RC2
>
> Attachments: BufferCodec.java, NettyTestServer.java, 
> RateCounter.java, Screen shot 2010-02-02 at 7.48.39 PM.png, Screen shot 
> 2010-02-02 at 7.48.46 PM.png, Screen shot 2010-02-02 at 7.48.59 PM.png, 
> Screen shot 2010-02-02 at 7.49.13 PM.png, Screen shot 2010-02-02 at 7.49.18 
> PM.png, Server.java, StressClient.java
>
>
> Mina server gets into a bad state where it constantly prints:
> WARN org.apache.mina.core.service.IoProcessor  - Create a new selector. 
> Selected is 0, delta = 0
> When this happens, server throughput drops significantly.
> To reproduce, run the attached server and client for a short while (30 seconds 
> on my box).
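
The warning quoted above comes from a spin-detection workaround: when select() 
keeps returning immediately with nothing selected (selected is 0 and no time has 
elapsed), the processor assumes the epoll selector is broken and replaces it. 
A minimal sketch of that general pattern, not MINA's actual implementation 
(SPIN_THRESHOLD and the class name are made up, and a positive timeout is assumed):

import java.io.IOException;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;

// Illustrative sketch only: detect a spinning selector and rebuild it.
final class SelectorSpinGuard {
    private static final int SPIN_THRESHOLD = 10;   // consecutive empty selects tolerated
    private int zeroSelects = 0;

    Selector guard(Selector selector, long timeoutMillis) throws IOException {
        long before = System.currentTimeMillis();
        int selected = selector.select(timeoutMillis);
        long delta = System.currentTimeMillis() - before;

        if (selected == 0 && delta < timeoutMillis) {
            // select() returned at once with nothing ready: possible epoll spin.
            if (++zeroSelects > SPIN_THRESHOLD) {
                zeroSelects = 0;
                return rebuild(selector);           // "Create a new selector."
            }
        } else {
            zeroSelects = 0;
        }
        return selector;
    }

    private Selector rebuild(Selector broken) throws IOException {
        Selector fresh = Selector.open();
        for (SelectionKey key : broken.keys()) {
            if (key.isValid()) {
                // Move each channel, with its interest set and attachment, to the new selector.
                key.channel().register(fresh, key.interestOps(), key.attachment());
                key.cancel();
            }
        }
        broken.close();
        return fresh;
    }
}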

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (DIRMINA-678) NioProcessor 100% CPU usage on Linux (epoll selector bug)

2010-02-09 Thread Serge Baranov (JIRA)

[ 
https://issues.apache.org/jira/browse/DIRMINA-678?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12831418#action_12831418
 ] 

Serge Baranov commented on DIRMINA-678:
---

Victor, I've sent you a patch by e-mail. It's just a zip with classes which 
replace the default JDK implementation (run with -Xbootclasspath/p:patch.zip 
command line option).
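
For reference, prepending such a patch uses the standard JVM flag; only patch.zip 
comes from the comment above, the jar and main class below are placeholders:

java -Xbootclasspath/p:patch.zip -cp server.jar com.example.MinaServer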

Emmanuel, it probably should not impact Windows in any case, but currently 
connections hang with no CPU usage when running the latest MINA revision, while 
everything works fine with the same revision minus the selector code submitted 
as 42. That's why I've reported it. Sorry if it was not clear. Your fix breaks 
MINA on Windows, at least for my application.

> NioProcessor 100% CPU usage on Linux (epoll selector bug)
> -
>
> Key: DIRMINA-678
> URL: https://issues.apache.org/jira/browse/DIRMINA-678
> Project: MINA
>  Issue Type: Bug
>  Components: Core
>Affects Versions: 2.0.0-M4
> Environment: CentOS 5.x, 32/64-bit, 32/64-bit Sun JDK 1.6.0_12, also 
> _11/_10/_09 and Sun JDK 1.7.0 b50, Kernel 2.6.18-92.1.22.el5 and also older 
> versions,
>Reporter: Serge Baranov
> Fix For: 2.0.0-RC2
>
> Attachments: snap973.png, snap974.png
>
>
> It's the same bug as described at http://jira.codehaus.org/browse/JETTY-937 , 
> but affecting MINA in the very similar way.
> NioProcessor threads start to eat 100% resources per CPU. After 10-30 minutes 
> of running depending on the load (sometimes after several hours) one of the 
> NioProcessor starts to consume all the available CPU resources probably 
> spinning in the epoll select loop. Later, more threads can be affected by the 
> same issue, thus 100% loading all the available CPU cores.
> Sample trace:
> NioProcessor-10 [RUNNABLE] CPU time: 5:15
> sun.nio.ch.EPollArrayWrapper.epollWait(long, int, long, int)
> sun.nio.ch.EPollArrayWrapper.poll(long)
> sun.nio.ch.EPollSelectorImpl.doSelect(long)
> sun.nio.ch.SelectorImpl.lockAndDoSelect(long)
> sun.nio.ch.SelectorImpl.select(long)
> org.apache.mina.transport.socket.nio.NioProcessor.select(long)
> org.apache.mina.core.polling.AbstractPollingIoProcessor$Processor.run()
> org.apache.mina.util.NamePreservingRunnable.run()
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor$Worker)
> java.util.concurrent.ThreadPoolExecutor$Worker.run()
> java.lang.Thread.run()
> It seems to affect any NIO based Java server applications running in the 
> specified environment.
> Some projects provide workarounds for similar JDK bugs, probably MINA can 
> also think about a workaround.
> As far as I know, there are at least 3 users who experience this issue with 
> Jetty and all of them are running CentOS (some distribution default setting 
> is a trigger?). As for MINA, I'm not aware of similar reports yet.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (DIRMINA-678) NioProcessor 100% CPU usage on Linux (epoll selector bug)

2010-02-09 Thread Serge Baranov (JIRA)

[ 
https://issues.apache.org/jira/browse/DIRMINA-678?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12831382#action_12831382
 ] 

Serge Baranov commented on DIRMINA-678:
---

I'll stay on 1.6.0_12 + Sun patch then. As for MINA, I've updated to the recent 
revision and reverted just the selector patch revisions, it seems to work fine 
now (at least passes the tests, didn't try in the production yet).

I still see no reason for this code to be active on the Windows platform, as it 
was never affected by the Linux selector bug. An OS check, plus an option to 
disable the workaround on any platform until a properly working patch is 
provided, would be nice.
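
A rough illustration of that suggestion (the property name and the helper method 
are hypothetical, not part of MINA):

// Sketch: skip the epoll workaround on Windows, and let users disable it
// anywhere via a system property. "mina.disable.selector.workaround" is made up.
static boolean selectorWorkaroundEnabled() {
    if (Boolean.getBoolean("mina.disable.selector.workaround")) {
        return false;
    }
    String os = System.getProperty("os.name", "").toLowerCase();
    return !os.startsWith("windows");   // the spinning-selector bug is Linux/epoll specific
}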

> NioProcessor 100% CPU usage on Linux (epoll selector bug)
> -
>
> Key: DIRMINA-678
> URL: https://issues.apache.org/jira/browse/DIRMINA-678
> Project: MINA
>  Issue Type: Bug
>  Components: Core
>Affects Versions: 2.0.0-M4
> Environment: CentOS 5.x, 32/64-bit, 32/64-bit Sun JDK 1.6.0_12, also 
> _11/_10/_09 and Sun JDK 1.7.0 b50, Kernel 2.6.18-92.1.22.el5 and also older 
> versions,
>Reporter: Serge Baranov
> Fix For: 2.0.0-RC2
>
> Attachments: snap973.png, snap974.png
>
>
> It's the same bug as described at http://jira.codehaus.org/browse/JETTY-937 , 
> but affecting MINA in the very similar way.
> NioProcessor threads start to eat 100% resources per CPU. After 10-30 minutes 
> of running depending on the load (sometimes after several hours) one of the 
> NioProcessor starts to consume all the available CPU resources probably 
> spinning in the epoll select loop. Later, more threads can be affected by the 
> same issue, thus 100% loading all the available CPU cores.
> Sample trace:
> NioProcessor-10 [RUNNABLE] CPU time: 5:15
> sun.nio.ch.EPollArrayWrapper.epollWait(long, int, long, int)
> sun.nio.ch.EPollArrayWrapper.poll(long)
> sun.nio.ch.EPollSelectorImpl.doSelect(long)
> sun.nio.ch.SelectorImpl.lockAndDoSelect(long)
> sun.nio.ch.SelectorImpl.select(long)
> org.apache.mina.transport.socket.nio.NioProcessor.select(long)
> org.apache.mina.core.polling.AbstractPollingIoProcessor$Processor.run()
> org.apache.mina.util.NamePreservingRunnable.run()
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor$Worker)
> java.util.concurrent.ThreadPoolExecutor$Worker.run()
> java.lang.Thread.run()
> It seems to affect any NIO based Java server applications running in the 
> specified environment.
> Some projects provide workarounds for similar JDK bugs, probably MINA can 
> also think about a workaround.
> As far as I know, there are at least 3 users who experience this issue with 
> Jetty and all of them are running CentOS (some distribution default setting 
> is a trigger?). As for MINA, I'm not aware of similar reports yet.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (DIRMINA-678) NioProcessor 100% CPU usage on Linux (epoll selector bug)

2010-02-09 Thread Serge Baranov (JIRA)

[ 
https://issues.apache.org/jira/browse/DIRMINA-678?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12831374#action_12831374
 ] 

Serge Baranov commented on DIRMINA-678:
---

Interesting, I've been running with a patch from Sun under JDK 1.6.0_12 on 
Linux for almost a year with no spinning selector bug. This patch should be in 
1.6.0_18, as http://bugs.sun.com/view_bug.do?bug_id=6693490 is fixed in that 
version.

Maybe DIRMINA-762 is about another unfixed bug.

> NioProcessor 100% CPU usage on Linux (epoll selector bug)
> -
>
> Key: DIRMINA-678
> URL: https://issues.apache.org/jira/browse/DIRMINA-678
> Project: MINA
>  Issue Type: Bug
>  Components: Core
>Affects Versions: 2.0.0-M4
> Environment: CentOS 5.x, 32/64-bit, 32/64-bit Sun JDK 1.6.0_12, also 
> _11/_10/_09 and Sun JDK 1.7.0 b50, Kernel 2.6.18-92.1.22.el5 and also older 
> versions,
>Reporter: Serge Baranov
> Fix For: 2.0.0-RC2
>
> Attachments: snap973.png, snap974.png
>
>
> It's the same bug as described at http://jira.codehaus.org/browse/JETTY-937 , 
> but affecting MINA in the very similar way.
> NioProcessor threads start to eat 100% resources per CPU. After 10-30 minutes 
> of running depending on the load (sometimes after several hours) one of the 
> NioProcessor starts to consume all the available CPU resources probably 
> spinning in the epoll select loop. Later, more threads can be affected by the 
> same issue, thus 100% loading all the available CPU cores.
> Sample trace:
> NioProcessor-10 [RUNNABLE] CPU time: 5:15
> sun.nio.ch.EPollArrayWrapper.epollWait(long, int, long, int)
> sun.nio.ch.EPollArrayWrapper.poll(long)
> sun.nio.ch.EPollSelectorImpl.doSelect(long)
> sun.nio.ch.SelectorImpl.lockAndDoSelect(long)
> sun.nio.ch.SelectorImpl.select(long)
> org.apache.mina.transport.socket.nio.NioProcessor.select(long)
> org.apache.mina.core.polling.AbstractPollingIoProcessor$Processor.run()
> org.apache.mina.util.NamePreservingRunnable.run()
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor$Worker)
> java.util.concurrent.ThreadPoolExecutor$Worker.run()
> java.lang.Thread.run()
> It seems to affect any NIO based Java server applications running in the 
> specified environment.
> Some projects provide workarounds for similar JDK bugs, probably MINA can 
> also think about a workaround.
> As far as I know, there are at least 3 users who experience this issue with 
> Jetty and all of them are running CentOS (some distribution default setting 
> is a trigger?). As for MINA, I'm not aware of similar reports yet.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (DIRMINA-678) NioProcessor 100% CPU usage on Linux (epoll selector bug)

2010-02-09 Thread Serge Baranov (JIRA)

[ 
https://issues.apache.org/jira/browse/DIRMINA-678?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12831363#action_12831363
 ] 

Serge Baranov commented on DIRMINA-678:
---

The issue is that connections stall randomly: no data is sent or received 
when curl connects to an Apache server via a MINA-based proxy server.

OS: Windows Vista 64-bit
JDK: 1.6.0_18 (32-bit)

The proxy is similar to the one I've provided for another issue: DIRMINA-734 
(mina-flush-regression.zip)

You can try running it with the latest MINA revision and connecting to some 
server using curl via this proxy; make several concurrent connections, and curl 
will hang waiting for data from the server.
No time for an isolated test case, sorry.

I suggest making this workaround optional and enabling it only if the JDK 
version is < 1.6.0_18, or via some setting/property.
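
A rough sketch of that version gate (the parsing of java.version is deliberately 
simplistic, and the helper is hypothetical, not MINA code):

// Enable the workaround only on JDKs older than 1.6.0_18, where the epoll
// spin bug (Sun bug 6693490) was still present.
static boolean needsSelectorWorkaround() {
    String v = System.getProperty("java.version", "");   // e.g. "1.6.0_12"
    int idx = v.indexOf('_');
    if (!v.startsWith("1.6.0") || idx < 0) {
        return false;                 // unknown or newer version scheme: assume fixed
    }
    try {
        return Integer.parseInt(v.substring(idx + 1)) < 18;
    } catch (NumberFormatException e) {
        return false;
    }
}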

> NioProcessor 100% CPU usage on Linux (epoll selector bug)
> -
>
> Key: DIRMINA-678
> URL: https://issues.apache.org/jira/browse/DIRMINA-678
> Project: MINA
>  Issue Type: Bug
>  Components: Core
>Affects Versions: 2.0.0-M4
> Environment: CentOS 5.x, 32/64-bit, 32/64-bit Sun JDK 1.6.0_12, also 
> _11/_10/_09 and Sun JDK 1.7.0 b50, Kernel 2.6.18-92.1.22.el5 and also older 
> versions,
>Reporter: Serge Baranov
> Fix For: 2.0.0-RC2
>
> Attachments: snap973.png, snap974.png
>
>
> It's the same bug as described at http://jira.codehaus.org/browse/JETTY-937 , 
> but affecting MINA in the very similar way.
> NioProcessor threads start to eat 100% resources per CPU. After 10-30 minutes 
> of running depending on the load (sometimes after several hours) one of the 
> NioProcessor starts to consume all the available CPU resources probably 
> spinning in the epoll select loop. Later, more threads can be affected by the 
> same issue, thus 100% loading all the available CPU cores.
> Sample trace:
> NioProcessor-10 [RUNNABLE] CPU time: 5:15
> sun.nio.ch.EPollArrayWrapper.epollWait(long, int, long, int)
> sun.nio.ch.EPollArrayWrapper.poll(long)
> sun.nio.ch.EPollSelectorImpl.doSelect(long)
> sun.nio.ch.SelectorImpl.lockAndDoSelect(long)
> sun.nio.ch.SelectorImpl.select(long)
> org.apache.mina.transport.socket.nio.NioProcessor.select(long)
> org.apache.mina.core.polling.AbstractPollingIoProcessor$Processor.run()
> org.apache.mina.util.NamePreservingRunnable.run()
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor$Worker)
> java.util.concurrent.ThreadPoolExecutor$Worker.run()
> java.lang.Thread.run()
> It seems to affect any NIO based Java server applications running in the 
> specified environment.
> Some projects provide workarounds for similar JDK bugs, probably MINA can 
> also think about a workaround.
> As far as I know, there are at least 3 users who experience this issue with 
> Jetty and all of them are running CentOS (some distribution default setting 
> is a trigger?). As for MINA, I'm not aware of similar reports yet.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (DIRMINA-678) NioProcessor 100% CPU usage on Linux (epoll selector bug)

2010-02-08 Thread Serge Baranov (JIRA)

[ 
https://issues.apache.org/jira/browse/DIRMINA-678?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12831297#action_12831297
 ] 

Serge Baranov commented on DIRMINA-678:
---

I've updated to the latest revision of MINA (903897) and found that my 
integration tests fail on the Windows platform.
Reverting to older revisions and running the tests on each of them shows that 
the problem was introduced by this patch.

888197 - PASS
42 - FAIL

I'm afraid that the fix is not correct and should be reverted, especially 
considering that the offending NIO bug has been fixed in the recently released 
JDK 1.6.0_18.

> NioProcessor 100% CPU usage on Linux (epoll selector bug)
> -
>
> Key: DIRMINA-678
> URL: https://issues.apache.org/jira/browse/DIRMINA-678
> Project: MINA
>  Issue Type: Bug
>  Components: Core
>Affects Versions: 2.0.0-M4
> Environment: CentOS 5.x, 32/64-bit, 32/64-bit Sun JDK 1.6.0_12, also 
> _11/_10/_09 and Sun JDK 1.7.0 b50, Kernel 2.6.18-92.1.22.el5 and also older 
> versions,
>Reporter: Serge Baranov
> Fix For: 2.0.0-RC2
>
> Attachments: snap973.png, snap974.png
>
>
> It's the same bug as described at http://jira.codehaus.org/browse/JETTY-937 , 
> but affecting MINA in the very similar way.
> NioProcessor threads start to eat 100% resources per CPU. After 10-30 minutes 
> of running depending on the load (sometimes after several hours) one of the 
> NioProcessor starts to consume all the available CPU resources probably 
> spinning in the epoll select loop. Later, more threads can be affected by the 
> same issue, thus 100% loading all the available CPU cores.
> Sample trace:
> NioProcessor-10 [RUNNABLE] CPU time: 5:15
> sun.nio.ch.EPollArrayWrapper.epollWait(long, int, long, int)
> sun.nio.ch.EPollArrayWrapper.poll(long)
> sun.nio.ch.EPollSelectorImpl.doSelect(long)
> sun.nio.ch.SelectorImpl.lockAndDoSelect(long)
> sun.nio.ch.SelectorImpl.select(long)
> org.apache.mina.transport.socket.nio.NioProcessor.select(long)
> org.apache.mina.core.polling.AbstractPollingIoProcessor$Processor.run()
> org.apache.mina.util.NamePreservingRunnable.run()
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor$Worker)
> java.util.concurrent.ThreadPoolExecutor$Worker.run()
> java.lang.Thread.run()
> It seems to affect any NIO based Java server applications running in the 
> specified environment.
> Some projects provide workarounds for similar JDK bugs, probably MINA can 
> also think about a workaround.
> As far as I know, there are at least 3 users who experience this issue with 
> Jetty and all of them are running CentOS (some distribution default setting 
> is a trigger?). As for MINA, I'm not aware of similar reports yet.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (DIRMINA-734) Regression with flushing in MINA 2.0.0-M7 trunk

2009-08-06 Thread Serge Baranov (JIRA)

[ 
https://issues.apache.org/jira/browse/DIRMINA-734?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12740011#action_12740011
 ] 

Serge Baranov commented on DIRMINA-734:
---

Then you have a classpath issue. My example has exceptionCaught implemented and 
logging commented out. Also, I don't get the NPE you are seeing, as it would 
mean that the example provided with MINA is broken.

> Regression with flushing in MINA 2.0.0-M7 trunk
> ---
>
> Key: DIRMINA-734
> URL: https://issues.apache.org/jira/browse/DIRMINA-734
> Project: MINA
>  Issue Type: Bug
>  Components: Core
>Affects Versions: 2.0.0-M7
>Reporter: Serge Baranov
> Attachments: mina-flush-regression.zip
>
>
> It looks like the present trunk of MINA in the repository
> has a bug. Updating from M7 built on 06.06.2009 to the current trunk
> broke some of our tests.
> I've tried to build it today and ran our application tests, some of
> them failed.
> Reverting back to 06.06.2009 build has fixed the problem.
> At the first look it appears that session.close(false) behaves like
> session.close(true), as a result some messages are truncated (not
> being flushed on session.close(false)).
> If I comment out a call to session.close(false) (i.e. not closing the
> session at all when needed), the problem goes away.
> This behavior is inconsistent, test fails/succeeds randomly. On the
> first run it may pass, on the second run it may fail. Test fails about
> 10% of runs. The part which is not flushed is also random. Could be a
> race condition somewhere in MINA introduced in the last 2 months.
> I can't provide a test case yet, but our application is a kind of
> proxy, so it behaves like this:
> 1. client connects to the app and send a request
> 2. app connects to another server and sends a request
> 3. app gets a reply from the server and sends it back to the client
> 4. when another server closes the connection, app is closing the
>connection with the client using session.close(false)
> The app may be still flushing data to client when closing the
> connection. As I said it worked fine with previous MINA versions, but
> has broken only recently.
> We are using OrderedThreadPoolExecutor and CumulativeProtocolDecoder
> if that matters.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (DIRMINA-734) Regression with flushing in MINA 2.0.0-M7 trunk

2009-08-06 Thread Serge Baranov (JIRA)

[ 
https://issues.apache.org/jira/browse/DIRMINA-734?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12739992#action_12739992
 ] 

Serge Baranov commented on DIRMINA-734:
---

You are probably running a different sample (or the built-in example), or you 
have some classpath issue, since the sample I've attached uses a different 
package for the proxy example classes. You have

org.apache.mina.example.proxy.AbstractProxyIoHandler.messageReceived(AbstractProxyIoHandler.java:73)

while in my test it would be

foo.AbstractProxyIoHandler.messageReceived.

I've verified that the attached test case works fine.

> Regression with flushing in MINA 2.0.0-M7 trunk
> ---
>
> Key: DIRMINA-734
> URL: https://issues.apache.org/jira/browse/DIRMINA-734
> Project: MINA
>  Issue Type: Bug
>  Components: Core
>Affects Versions: 2.0.0-M7
>Reporter: Serge Baranov
> Attachments: mina-flush-regression.zip
>
>
> It looks like the present trunk of MINA in the repository
> has a bug. Updating from M7 built on 06.06.2009 to the current trunk
> broke some of our tests.
> I've tried to build it today and ran our application tests, some of
> them failed.
> Reverting back to 06.06.2009 build has fixed the problem.
> At the first look it appears that session.close(false) behaves like
> session.close(true), as a result some messages are truncated (not
> being flushed on session.close(false)).
> If I comment out a call to session.close(false) (i.e. not closing the
> session at all when needed), the problem goes away.
> This behavior is inconsistent, test fails/succeeds randomly. On the
> first run it may pass, on the second run it may fail. Test fails about
> 10% of runs. The part which is not flushed is also random. Could be a
> race condition somewhere in MINA introduced in the last 2 months.
> I can't provide a test case yet, but our application is a kind of
> proxy, so it behaves like this:
> 1. client connects to the app and send a request
> 2. app connects to another server and sends a request
> 3. app gets a reply from the server and sends it back to the client
> 4. when another server closes the connection, app is closing the
>connection with the client using session.close(false)
> The app may be still flushing data to client when closing the
> connection. As I said it worked fine with previous MINA versions, but
> has broken only recently.
> We are using OrderedThreadPoolExecutor and CumulativeProtocolDecoder
> if that matters.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (DIRMINA-734) Regression with flushing in MINA 2.0.0-M7 trunk

2009-08-05 Thread Serge Baranov (JIRA)

 [ 
https://issues.apache.org/jira/browse/DIRMINA-734?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Serge Baranov updated DIRMINA-734:
--

Attachment: mina-flush-regression.zip

I've prepared and attached a test for this issue. It's based on the MINA proxy 
example.
Please see the instructions inside README.txt.

This scenario requires an Apache server to reproduce; however, you can instead 
use some large HTML page hosted on any public server, just change the options 
used for running the proxy and the client.

> Regression with flushing in MINA 2.0.0-M7 trunk
> ---
>
> Key: DIRMINA-734
> URL: https://issues.apache.org/jira/browse/DIRMINA-734
> Project: MINA
>  Issue Type: Bug
>  Components: Core
>Affects Versions: 2.0.0-M7
>Reporter: Serge Baranov
> Attachments: mina-flush-regression.zip
>
>
> It looks like the present trunk of MINA in the repository
> has a bug. Updating from M7 built on 06.06.2009 to the current trunk
> broke some of our tests.
> I've tried to build it today and ran our application tests, some of
> them failed.
> Reverting back to 06.06.2009 build has fixed the problem.
> At the first look it appears that session.close(false) behaves like
> session.close(true), as a result some messages are truncated (not
> being flushed on session.close(false)).
> If I comment out a call to session.close(false) (i.e. not closing the
> session at all when needed), the problem goes away.
> This behavior is inconsistent, test fails/succeeds randomly. On the
> first run it may pass, on the second run it may fail. Test fails about
> 10% of runs. The part which is not flushed is also random. Could be a
> race condition somewhere in MINA introduced in the last 2 months.
> I can't provide a test case yet, but our application is a kind of
> proxy, so it behaves like this:
> 1. client connects to the app and send a request
> 2. app connects to another server and sends a request
> 3. app gets a reply from the server and sends it back to the client
> 4. when another server closes the connection, app is closing the
>connection with the client using session.close(false)
> The app may be still flushing data to client when closing the
> connection. As I said it worked fine with previous MINA versions, but
> has broken only recently.
> We are using OrderedThreadPoolExecutor and CumulativeProtocolDecoder
> if that matters.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Issue Comment Edited: (DIRMINA-734) Regression with flushing in MINA 2.0.0-M7 trunk

2009-08-05 Thread Serge Baranov (JIRA)

[ 
https://issues.apache.org/jira/browse/DIRMINA-734?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12739518#action_12739518
 ] 

Serge Baranov edited comment on DIRMINA-734 at 8/5/09 7:15 AM:
---

The application based on MINA is a proxy by itself, so it would be:

client <> our application based on mina <> some server

When the server closes the connection, the application closes the connection 
with the client using session.close(false), the same as in the example provided 
with MINA 
(mina\example\src\main\java\org\apache\mina\example\proxy\AbstractProxyIoHandler.java):

public void sessionClosed(IoSession session) throws Exception {
    if (session.getAttribute(OTHER_IO_SESSION) != null) {
        IoSession sess = (IoSession) session.getAttribute(OTHER_IO_SESSION);
        sess.setAttribute(OTHER_IO_SESSION, null);
        sess.close(false);
        session.setAttribute(OTHER_IO_SESSION, null);
    }
}


  was (Author: crazycoder):
The application based on MINA it a proxy by itself, so it would be:

client <> our application based on mina <> some server

When the server closes connection, application closes connection with the 
client using session.close(false), same as in the example provided with MINA:
mina\example\src\main\java\org\apache\mina\example\proxy\AbstractProxyIoHandler.java:

public void sessionClosed(IoSession session) throws Exception {
if (session.getAttribute( OTHER_IO_SESSION ) != null) {
IoSession sess = (IoSession) session.getAttribute(OTHER_IO_SESSION);
sess.setAttribute(OTHER_IO_SESSION, null);
sess.close(false);
session.setAttribute(OTHER_IO_SESSION, null);
}
}

  
> Regression with flushing in MINA 2.0.0-M7 trunk
> ---
>
> Key: DIRMINA-734
> URL: https://issues.apache.org/jira/browse/DIRMINA-734
> Project: MINA
>  Issue Type: Bug
>  Components: Core
>Affects Versions: 2.0.0-M7
>Reporter: Serge Baranov
>
> It looks like the present trunk of MINA in the repository
> has a bug. Updating from M7 built on 06.06.2009 to the current trunk
> broke some of our tests.
> I've tried to build it today and ran our application tests, some of
> them failed.
> Reverting back to 06.06.2009 build has fixed the problem.
> At the first look it appears that session.close(false) behaves like
> session.close(true), as a result some messages are truncated (not
> being flushed on session.close(false)).
> If I comment out a call to session.close(false) (i.e. not closing the
> session at all when needed), the problem goes away.
> This behavior is inconsistent, test fails/succeeds randomly. On the
> first run it may pass, on the second run it may fail. Test fails about
> 10% of runs. The part which is not flushed is also random. Could be a
> race condition somewhere in MINA introduced in the last 2 months.
> I can't provide a test case yet, but our application is a kind of
> proxy, so it behaves like this:
> 1. client connects to the app and send a request
> 2. app connects to another server and sends a request
> 3. app gets a reply from the server and sends it back to the client
> 4. when another server closes the connection, app is closing the
>connection with the client using session.close(false)
> The app may be still flushing data to client when closing the
> connection. As I said it worked fine with previous MINA versions, but
> has broken only recently.
> We are using OrderedThreadPoolExecutor and CumulativeProtocolDecoder
> if that matters.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (DIRMINA-734) Regression with flushing in MINA 2.0.0-M7 trunk

2009-08-05 Thread Serge Baranov (JIRA)

[ 
https://issues.apache.org/jira/browse/DIRMINA-734?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12739518#action_12739518
 ] 

Serge Baranov commented on DIRMINA-734:
---

The application based on MINA is a proxy by itself, so it would be:

client <> our application based on mina <> some server

When the server closes the connection, the application closes the connection 
with the client using session.close(false), the same as in the example provided 
with MINA 
(mina\example\src\main\java\org\apache\mina\example\proxy\AbstractProxyIoHandler.java):

public void sessionClosed(IoSession session) throws Exception {
    if (session.getAttribute(OTHER_IO_SESSION) != null) {
        IoSession sess = (IoSession) session.getAttribute(OTHER_IO_SESSION);
        sess.setAttribute(OTHER_IO_SESSION, null);
        sess.close(false);
        session.setAttribute(OTHER_IO_SESSION, null);
    }
}


> Regression with flushing in MINA 2.0.0-M7 trunk
> ---
>
> Key: DIRMINA-734
> URL: https://issues.apache.org/jira/browse/DIRMINA-734
> Project: MINA
>  Issue Type: Bug
>  Components: Core
>Affects Versions: 2.0.0-M7
>Reporter: Serge Baranov
>
> It looks like the present trunk of MINA in the repository
> has a bug. Updating from M7 built on 06.06.2009 to the current trunk
> broke some of our tests.
> I've tried to build it today and ran our application tests, some of
> them failed.
> Reverting back to 06.06.2009 build has fixed the problem.
> At the first look it appears that session.close(false) behaves like
> session.close(true), as a result some messages are truncated (not
> being flushed on session.close(false)).
> If I comment out a call to session.close(false) (i.e. not closing the
> session at all when needed), the problem goes away.
> This behavior is inconsistent, test fails/succeeds randomly. On the
> first run it may pass, on the second run it may fail. Test fails about
> 10% of runs. The part which is not flushed is also random. Could be a
> race condition somewhere in MINA introduced in the last 2 months.
> I can't provide a test case yet, but our application is a kind of
> proxy, so it behaves like this:
> 1. client connects to the app and send a request
> 2. app connects to another server and sends a request
> 3. app gets a reply from the server and sends it back to the client
> 4. when another server closes the connection, app is closing the
>connection with the client using session.close(false)
> The app may be still flushing data to client when closing the
> connection. As I said it worked fine with previous MINA versions, but
> has broken only recently.
> We are using OrderedThreadPoolExecutor and CumulativeProtocolDecoder
> if that matters.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Created: (DIRMINA-734) Regression with flushing in MINA 2.0.0-M7 trunk

2009-08-04 Thread Serge Baranov (JIRA)
Regression with flushing in MINA 2.0.0-M7 trunk
---

 Key: DIRMINA-734
 URL: https://issues.apache.org/jira/browse/DIRMINA-734
 Project: MINA
  Issue Type: Bug
  Components: Core
Affects Versions: 2.0.0-M7
Reporter: Serge Baranov


It looks like the present trunk of MINA in the repository
has a bug. Updating from M7 built on 06.06.2009 to the current trunk
broke some of our tests.

I tried to build it today and ran our application tests; some of
them failed.

Reverting to the 06.06.2009 build fixed the problem.

At first look it appears that session.close(false) behaves like
session.close(true); as a result, some messages are truncated (not
being flushed on session.close(false)).

If I comment out the call to session.close(false) (i.e. not closing the
session at all when needed), the problem goes away.

This behavior is inconsistent: the test fails or succeeds randomly. On the
first run it may pass, on the second it may fail; it fails in about
10% of runs. The part which is not flushed is also random. It could be a
race condition somewhere in MINA introduced in the last 2 months.

I can't provide a test case yet, but our application is a kind of
proxy, so it behaves like this:

1. client connects to the app and sends a request
2. app connects to another server and sends a request
3. app gets a reply from the server and sends it back to the client
4. when the other server closes the connection, the app closes the
   connection with the client using session.close(false)

The app may still be flushing data to the client when closing the
connection. As I said, it worked fine with previous MINA versions and
has broken only recently.

We are using OrderedThreadPoolExecutor and CumulativeProtocolDecoder
if that matters.
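
For context, the behavior the description depends on (close(false) should flush 
queued writes before closing, while close(true) closes immediately) can be shown 
with a bare handler; this is only an illustration, not the attached test case:

import org.apache.mina.core.service.IoHandlerAdapter;
import org.apache.mina.core.session.IoSession;

// Illustration only: write a final reply, then request a graceful close.
public class FlushingReplyHandler extends IoHandlerAdapter {
    @Override
    public void messageReceived(IoSession session, Object message) {
        session.write("last reply for: " + message);
        session.close(false);   // expected to close only after the write above is flushed
    }
}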

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (DIRMINA-687) 2.0.0-M5 REGRESSION: sessionClosed called before doDecode completes

2009-04-28 Thread Serge Baranov (JIRA)

[ 
https://issues.apache.org/jira/browse/DIRMINA-687?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12703524#action_12703524
 ] 

Serge Baranov commented on DIRMINA-687:
---

> Did it worked on 2.0.0-M4 ? 

Yes, you can run the attached test and see the difference yourself.

> 2.0.0-M5 REGRESSION: sessionClosed called before doDecode completes
> ---
>
> Key: DIRMINA-687
> URL: https://issues.apache.org/jira/browse/DIRMINA-687
> Project: MINA
>  Issue Type: Bug
>  Components: Core
>Affects Versions: 2.0.0-M5
>Reporter: Serge Baranov
>Assignee: Emmanuel Lecharny
>Priority: Critical
> Fix For: 2.0.0-RC1
>
> Attachments: regression-test.zip
>
>
> The problem is described in detail at 
> http://markmail.org/message/3iqmkdcukhygxxva .
> I did more tests by setting the CLOSED custom attribute inside sessionClosed 
> and by checking for this attribute in the middle of doDecode. 
> Under some networking conditions on the production server I can see the 
> following in the logs:
> [2009-04-16 22:30:17,329] Session closed in the middle of decoding!
> [2009-04-16 22:30:21,126] Session closed in the middle of decoding!
> [2009-04-16 22:30:22,345] Session closed in the middle of decoding!
> [2009-04-16 22:30:22,969] Session closed in the middle of decoding!
> [2009-04-16 22:30:25,217] Session closed in the middle of decoding!
> [2009-04-16 22:30:25,840] Session closed in the middle of decoding!
> [2009-04-16 22:30:59,958] Session closed in the middle of decoding!
> Unfortunately, I didn't manage to prepare an isolated test case for this 
> issue, must be some race condition which is tricky to catch.
> Most likely the problem is caused by this commit: 
> http://svn.apache.org/viewvc?view=rev&revision=762167 .
> As mentioned, this problem is new to 2.0.0-M5, worked fine for many months 
> with 2.0.0-M4 and previous builds. Rolling back to M4 builds fixes the 
> problem instantly.
> OrderedThreadPool must guarantee the order of events, sessionClosed must not 
> be called when doDecode is in progress.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (DIRMINA-687) 2.0.0-M5 REGRESSION: sessionClosed called before doDecode completes

2009-04-16 Thread Serge Baranov (JIRA)

 [ 
https://issues.apache.org/jira/browse/DIRMINA-687?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Serge Baranov updated DIRMINA-687:
--

Attachment: regression-test.zip

Once I understood better what was happening, I managed to write a proper test 
case for this bug. It appears that sessionClosed is indeed invoked before 
doDecode completes the packet decoding.

Output of the test with 2.0.0-M4:

[2009-04-17 03:01:19,042] START
[2009-04-17 03:01:21,182] 9> done decoding
[2009-04-17 03:01:21,182] 9> Session closed
[2009-04-17 03:01:21,183] 4> done decoding
[2009-04-17 03:01:21,183] 4> Session closed
[2009-04-17 03:01:21,183] 10> done decoding
[2009-04-17 03:01:21,183] 10> Session closed
[2009-04-17 03:01:21,183] 18> done decoding
[2009-04-17 03:01:21,184] 18> Session closed
[2009-04-17 03:01:21,682] 14> done decoding
[2009-04-17 03:01:21,682] 20> done decoding
[2009-04-17 03:01:21,682] 14> Session closed
[2009-04-17 03:01:21,682] 20> Session closed
[2009-04-17 03:01:22,177] 13> done decoding
[2009-04-17 03:01:22,177] 17> done decoding
[2009-04-17 03:01:22,177] 13> Session closed
[2009-04-17 03:01:22,177] 17> Session closed
[2009-04-17 03:01:22,184] 2> done decoding
[2009-04-17 03:01:22,184] 2> Session closed
[2009-04-17 03:01:22,185] 5> done decoding
[2009-04-17 03:01:22,185] 5> Session closed
[2009-04-17 03:01:22,185] Received: 0
[2009-04-17 03:01:22,185] Sent: 10
[2009-04-17 03:01:22,189] FINISH


Output with 2.0.0-M5:

[2009-04-17 03:00:21,577] START
[2009-04-17 03:00:22,221] 4> Session closed
[2009-04-17 03:00:22,221] 15> Session closed
[2009-04-17 03:00:22,719] 4> !session closed before decoding completes!
[2009-04-17 03:00:22,719] 15> !session closed before decoding completes!
[2009-04-17 03:00:22,719] 15> !decoding for closed session!
[2009-04-17 03:00:22,719] 4> !decoding for closed session!
[2009-04-17 03:00:22,721] 17> Session closed
[2009-04-17 03:00:22,722] 19> Session closed
[2009-04-17 03:00:22,723] 21> Session closed
[2009-04-17 03:00:23,218] 13> Session closed
[2009-04-17 03:00:23,221] 17> !session closed before decoding completes!
[2009-04-17 03:00:23,221] 17> !decoding for closed session!
[2009-04-17 03:00:23,222] 19> !session closed before decoding completes!
[2009-04-17 03:00:23,222] 19> !decoding for closed session!
[2009-04-17 03:00:23,223] 21> !session closed before decoding completes!
[2009-04-17 03:00:23,223] 21> !decoding for closed session!
[2009-04-17 03:00:23,717] 13> !session closed before decoding completes!
[2009-04-17 03:00:23,717] 13> !decoding for closed session!
[2009-04-17 03:00:23,719] 4> !session closed before decoding completes!
[2009-04-17 03:00:23,719] 15> !session closed before decoding completes!
[2009-04-17 03:00:23,719] 4> done decoding
[2009-04-17 03:00:23,719] 15> done decoding
[2009-04-17 03:00:23,721] 6> done decoding
[2009-04-17 03:00:23,721] 6> Session closed
[2009-04-17 03:00:23,722] 10> done decoding
[2009-04-17 03:00:23,722] 9> done decoding
[2009-04-17 03:00:23,722] 10> Session closed
[2009-04-17 03:00:23,722] 9> Session closed
[2009-04-17 03:00:24,221] 17> !session closed before decoding completes!
[2009-04-17 03:00:24,221] 17> done decoding
[2009-04-17 03:00:24,222] 19> !session closed before decoding completes!
[2009-04-17 03:00:24,222] 19> done decoding
[2009-04-17 03:00:24,223] 21> !session closed before decoding completes!
[2009-04-17 03:00:24,223] 21> done decoding
[2009-04-17 03:00:24,717] 2> done decoding
[2009-04-17 03:00:24,717] 13> !session closed before decoding completes!
[2009-04-17 03:00:24,717] 2> Session closed
[2009-04-17 03:00:24,717] 13> done decoding
[2009-04-17 03:00:24,717] Received: 0
[2009-04-17 03:00:24,717] Sent: 10
[2009-04-17 03:00:24,721] FINISH

The test is designed so that the server session closes itself from another 
thread while doDecode is in progress.
M4 continues decoding, while M5 calls sessionClosed even though decoding is 
still in progress, which removes the session attributes used in doDecode. The 
order of events is violated.

The test case is sensitive to the position of the executor in the chain: the 
executor must be located before the codec. If it's after the decoder, the 
output will be different (you will not see the "done decoding" lines at all and 
the bug will not be reproducible).

With M5 the test doesn't reproduce the bug in 100% of runs, so one may need to 
run it several times or increase the number of messages and/or the wait times.
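
To make the ordering remark concrete, here is a minimal chain laid out the way 
the test expects; the codec factory is a stand-in and this is not the attached 
test case:

import java.nio.charset.Charset;

import org.apache.mina.core.filterchain.DefaultIoFilterChainBuilder;
import org.apache.mina.filter.codec.ProtocolCodecFilter;
import org.apache.mina.filter.codec.textline.TextLineCodecFactory;
import org.apache.mina.filter.executor.ExecutorFilter;
import org.apache.mina.filter.executor.OrderedThreadPoolExecutor;
import org.apache.mina.transport.socket.nio.NioSocketAcceptor;

// Sketch of the chain layout the test relies on: the executor filter is added
// BEFORE the codec filter, so decoding runs on the executor's threads and can
// race with sessionClosed. TextLineCodecFactory is only a placeholder codec.
public class ChainOrderSketch {
    public static void main(String[] args) {
        NioSocketAcceptor acceptor = new NioSocketAcceptor();
        DefaultIoFilterChainBuilder chain = acceptor.getFilterChain();
        chain.addLast("executor", new ExecutorFilter(new OrderedThreadPoolExecutor()));
        chain.addLast("codec", new ProtocolCodecFilter(
                new TextLineCodecFactory(Charset.forName("UTF-8"))));
        // With "codec" added before "executor" instead, the "done decoding"
        // lines disappear and the regression does not reproduce.
    }
}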

> 2.0.0-M5 REGRESSION: sessionClosed called before doDecode completes
> ---
>
> Key: DIRMINA-687
> URL: https://issues.apache.org/jira/browse/DIRMINA-687
> Project: MINA
>  Issue Type: Bug
>  Components: Core
>Affects Versions: 2.0.0-M5
>Reporter: Serge Baranov
>Priority: Critical
> Attachments: regression-test.zip
>
>
> The problem is described in detail at 
> http://markmail.

[jira] Updated: (DIRMINA-687) 2.0.0-M5 REGRESSION: sessionClosed called before doDecode completes

2009-04-16 Thread Serge Baranov (JIRA)

 [ 
https://issues.apache.org/jira/browse/DIRMINA-687?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Serge Baranov updated DIRMINA-687:
--

Component/s: Core

> 2.0.0-M5 REGRESSION: sessionClosed called before doDecode completes
> ---
>
> Key: DIRMINA-687
> URL: https://issues.apache.org/jira/browse/DIRMINA-687
> Project: MINA
>  Issue Type: Bug
>  Components: Core
>Affects Versions: 2.0.0-M5
>Reporter: Serge Baranov
>Priority: Critical
>
> The problem is described in detail at 
> http://markmail.org/message/3iqmkdcukhygxxva .
> I did more tests by setting the CLOSED custom attribute inside sessionClosed 
> and by checking for this attribute in the middle of doDecode. 
> Under some networking conditions on the production server I can see the 
> following in the logs:
> [2009-04-16 22:30:17,329] Session closed in the middle of decoding!
> [2009-04-16 22:30:21,126] Session closed in the middle of decoding!
> [2009-04-16 22:30:22,345] Session closed in the middle of decoding!
> [2009-04-16 22:30:22,969] Session closed in the middle of decoding!
> [2009-04-16 22:30:25,217] Session closed in the middle of decoding!
> [2009-04-16 22:30:25,840] Session closed in the middle of decoding!
> [2009-04-16 22:30:59,958] Session closed in the middle of decoding!
> Unfortunately, I didn't manage to prepare an isolated test case for this 
> issue, must be some race condition which is tricky to catch.
> Most likely the problem is caused by this commit: 
> http://svn.apache.org/viewvc?view=rev&revision=762167 .
> As mentioned, this problem is new to 2.0.0-M5, worked fine for many months 
> with 2.0.0-M4 and previous builds. Rolling back to M4 builds fixes the 
> problem instantly.
> OrderedThreadPool must guarantee the order of events, sessionClosed must not 
> be called when doDecode is in progress.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Created: (DIRMINA-687) 2.0.0-M5 REGRESSION: sessionClosed called before doDecode completes

2009-04-16 Thread Serge Baranov (JIRA)
2.0.0-M5 REGRESSION: sessionClosed called before doDecode completes
---

 Key: DIRMINA-687
 URL: https://issues.apache.org/jira/browse/DIRMINA-687
 Project: MINA
  Issue Type: Bug
Affects Versions: 2.0.0-M5
Reporter: Serge Baranov
Priority: Critical


The problem is described in detail at 
http://markmail.org/message/3iqmkdcukhygxxva .

I did more tests by setting the CLOSED custom attribute inside sessionClosed 
and by checking for this attribute in the middle of doDecode. 

Under some networking conditions on the production server I can see the 
following in the logs:

[2009-04-16 22:30:17,329] Session closed in the middle of decoding!
[2009-04-16 22:30:21,126] Session closed in the middle of decoding!
[2009-04-16 22:30:22,345] Session closed in the middle of decoding!
[2009-04-16 22:30:22,969] Session closed in the middle of decoding!
[2009-04-16 22:30:25,217] Session closed in the middle of decoding!
[2009-04-16 22:30:25,840] Session closed in the middle of decoding!
[2009-04-16 22:30:59,958] Session closed in the middle of decoding!

Unfortunately, I didn't manage to prepare an isolated test case for this issue; 
it must be some race condition which is tricky to catch.

Most likely the problem is caused by this commit: 
http://svn.apache.org/viewvc?view=rev&revision=762167 .

As mentioned, this problem is new to 2.0.0-M5; things worked fine for many months 
with 2.0.0-M4 and previous builds. Rolling back to M4 builds fixes the problem 
instantly.
OrderedThreadPool must guarantee the order of events; sessionClosed must not be 
called while doDecode is in progress.
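
The diagnostic described above (flag the session in sessionClosed, check the 
flag inside doDecode) can be sketched roughly as follows; the attribute key and 
the decoder body are placeholders, not the production code:

import org.apache.mina.core.buffer.IoBuffer;
import org.apache.mina.core.service.IoHandlerAdapter;
import org.apache.mina.core.session.IoSession;
import org.apache.mina.filter.codec.CumulativeProtocolDecoder;
import org.apache.mina.filter.codec.ProtocolDecoderOutput;

// Rough sketch of the CLOSED-attribute diagnostic mentioned above.
class ClosedAwareHandler extends IoHandlerAdapter {
    static final String CLOSED = "CLOSED";

    @Override
    public void sessionClosed(IoSession session) {
        session.setAttribute(CLOSED, Boolean.TRUE);   // mark the session as closed
    }
}

class ClosedAwareDecoder extends CumulativeProtocolDecoder {
    @Override
    protected boolean doDecode(IoSession session, IoBuffer in, ProtocolDecoderOutput out) {
        // ... decode part of the packet here ...
        if (session.containsAttribute(ClosedAwareHandler.CLOSED)) {
            System.err.println("Session closed in the middle of decoding!");
        }
        // ... finish decoding here ...
        return false;   // placeholder: a real decoder returns true once a message is decoded
    }
}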

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (DIRMINA-678) NioProcessor 100% CPU usage on Linux (epoll selector bug)

2009-04-14 Thread Serge Baranov (JIRA)

[ 
https://issues.apache.org/jira/browse/DIRMINA-678?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12698900#action_12698900
 ] 

Serge Baranov commented on DIRMINA-678:
---

You will see that one of the NioProcessor threads starts to use 100% CPU, then 
another, then yet another. There is no definitive test I'm aware of. The 
problem occurs when about 1000-2000 concurrent connections are made to the 
server for some time.

> NioProcessor 100% CPU usage on Linux (epoll selector bug)
> -
>
> Key: DIRMINA-678
> URL: https://issues.apache.org/jira/browse/DIRMINA-678
> Project: MINA
>  Issue Type: Bug
>  Components: Core
>Affects Versions: 2.0.0-M4
> Environment: CentOS 5.x, 32/64-bit, 32/64-bit Sun JDK 1.6.0_12, also 
> _11/_10/_09 and Sun JDK 1.7.0 b50, Kernel 2.6.18-92.1.22.el5 and also older 
> versions,
>Reporter: Serge Baranov
> Fix For: 2.0.0
>
> Attachments: snap973.png, snap974.png
>
>
> It's the same bug as described at http://jira.codehaus.org/browse/JETTY-937 , 
> but affecting MINA in the very similar way.
> NioProcessor threads start to eat 100% resources per CPU. After 10-30 minutes 
> of running depending on the load (sometimes after several hours) one of the 
> NioProcessor starts to consume all the available CPU resources probably 
> spinning in the epoll select loop. Later, more threads can be affected by the 
> same issue, thus 100% loading all the available CPU cores.
> Sample trace:
> NioProcessor-10 [RUNNABLE] CPU time: 5:15
> sun.nio.ch.EPollArrayWrapper.epollWait(long, int, long, int)
> sun.nio.ch.EPollArrayWrapper.poll(long)
> sun.nio.ch.EPollSelectorImpl.doSelect(long)
> sun.nio.ch.SelectorImpl.lockAndDoSelect(long)
> sun.nio.ch.SelectorImpl.select(long)
> org.apache.mina.transport.socket.nio.NioProcessor.select(long)
> org.apache.mina.core.polling.AbstractPollingIoProcessor$Processor.run()
> org.apache.mina.util.NamePreservingRunnable.run()
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor$Worker)
> java.util.concurrent.ThreadPoolExecutor$Worker.run()
> java.lang.Thread.run()
> It seems to affect any NIO based Java server applications running in the 
> specified environment.
> Some projects provide workarounds for similar JDK bugs, probably MINA can 
> also think about a workaround.
> As far as I know, there are at least 3 users who experience this issue with 
> Jetty and all of them are running CentOS (some distribution default setting 
> is a trigger?). As for MINA, I'm not aware of similar reports yet.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (DIRMINA-678) NioProcessor 100% CPU usage on Linux (epoll selector bug)

2009-04-14 Thread Serge Baranov (JIRA)

[ 
https://issues.apache.org/jira/browse/DIRMINA-678?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12698885#action_12698885
 ] 

Serge Baranov commented on DIRMINA-678:
---

Yes, it was epollWait() in most of the traces.

> NioProcessor 100% CPU usage on Linux (epoll selector bug)
> -
>
> Key: DIRMINA-678
> URL: https://issues.apache.org/jira/browse/DIRMINA-678
> Project: MINA
>  Issue Type: Bug
>  Components: Core
>Affects Versions: 2.0.0-M4
> Environment: CentOS 5.x, 32/64-bit, 32/64-bit Sun JDK 1.6.0_12, also 
> _11/_10/_09 and Sun JDK 1.7.0 b50, Kernel 2.6.18-92.1.22.el5 and also older 
> versions,
>Reporter: Serge Baranov
> Fix For: 2.0.0
>
> Attachments: snap973.png, snap974.png
>
>
> It's the same bug as described at http://jira.codehaus.org/browse/JETTY-937 , 
> but affecting MINA in the very similar way.
> NioProcessor threads start to eat 100% resources per CPU. After 10-30 minutes 
> of running depending on the load (sometimes after several hours) one of the 
> NioProcessor starts to consume all the available CPU resources probably 
> spinning in the epoll select loop. Later, more threads can be affected by the 
> same issue, thus 100% loading all the available CPU cores.
> Sample trace:
> NioProcessor-10 [RUNNABLE] CPU time: 5:15
> sun.nio.ch.EPollArrayWrapper.epollWait(long, int, long, int)
> sun.nio.ch.EPollArrayWrapper.poll(long)
> sun.nio.ch.EPollSelectorImpl.doSelect(long)
> sun.nio.ch.SelectorImpl.lockAndDoSelect(long)
> sun.nio.ch.SelectorImpl.select(long)
> org.apache.mina.transport.socket.nio.NioProcessor.select(long)
> org.apache.mina.core.polling.AbstractPollingIoProcessor$Processor.run()
> org.apache.mina.util.NamePreservingRunnable.run()
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor$Worker)
> java.util.concurrent.ThreadPoolExecutor$Worker.run()
> java.lang.Thread.run()
> It seems to affect any NIO based Java server applications running in the 
> specified environment.
> Some projects provide workarounds for similar JDK bugs, probably MINA can 
> also think about a workaround.
> As far as I know, there are at least 3 users who experience this issue with 
> Jetty and all of them are running CentOS (some distribution default setting 
> is a trigger?). As for MINA, I'm not aware of similar reports yet.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (DIRMINA-678) NioProcessor 100% CPU usage on Linux (epoll selector bug)

2009-04-07 Thread Serge Baranov (JIRA)

[ 
https://issues.apache.org/jira/browse/DIRMINA-678?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12696745#action_12696745
 ] 

Serge Baranov commented on DIRMINA-678:
---

GC was not a problem in my case, and it didn't occur much more frequently in 
the case of the bug described in this issue (according to the profiler snapshots).

> NioProcessor 100% CPU usage on Linux (epoll selector bug)
> -
>
> Key: DIRMINA-678
> URL: https://issues.apache.org/jira/browse/DIRMINA-678
> Project: MINA
>  Issue Type: Bug
>  Components: Core
>Affects Versions: 2.0.0-M4
> Environment: CentOS 5.x, 32/64-bit, 32/64-bit Sun JDK 1.6.0_12, also 
> _11/_10/_09 and Sun JDK 1.7.0 b50, Kernel 2.6.18-92.1.22.el5 and also older 
> versions,
>Reporter: Serge Baranov
> Fix For: 2.0.0
>
> Attachments: snap973.png, snap974.png
>
>
> It's the same bug as described at http://jira.codehaus.org/browse/JETTY-937 , 
> but affecting MINA in the very similar way.
> NioProcessor threads start to eat 100% resources per CPU. After 10-30 minutes 
> of running depending on the load (sometimes after several hours) one of the 
> NioProcessor starts to consume all the available CPU resources probably 
> spinning in the epoll select loop. Later, more threads can be affected by the 
> same issue, thus 100% loading all the available CPU cores.
> Sample trace:
> NioProcessor-10 [RUNNABLE] CPU time: 5:15
> sun.nio.ch.EPollArrayWrapper.epollWait(long, int, long, int)
> sun.nio.ch.EPollArrayWrapper.poll(long)
> sun.nio.ch.EPollSelectorImpl.doSelect(long)
> sun.nio.ch.SelectorImpl.lockAndDoSelect(long)
> sun.nio.ch.SelectorImpl.select(long)
> org.apache.mina.transport.socket.nio.NioProcessor.select(long)
> org.apache.mina.core.polling.AbstractPollingIoProcessor$Processor.run()
> org.apache.mina.util.NamePreservingRunnable.run()
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor$Worker)
> java.util.concurrent.ThreadPoolExecutor$Worker.run()
> java.lang.Thread.run()
> It seems to affect any NIO based Java server applications running in the 
> specified environment.
> Some projects provide workarounds for similar JDK bugs, probably MINA can 
> also think about a workaround.
> As far as I know, there are at least 3 users who experience this issue with 
> Jetty and all of them are running CentOS (some distribution default setting 
> is a trigger?). As for MINA, I'm not aware of similar reports yet.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (DIRMINA-678) NioProcessor 100% CPU usage on Linux (epoll selector bug)

2009-03-19 Thread Serge Baranov (JIRA)

[ 
https://issues.apache.org/jira/browse/DIRMINA-678?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12683685#action_12683685
 ] 

Serge Baranov commented on DIRMINA-678:
---

I've got patches from Sun fixing this issue for JDK 1.6.0_12 and JDK 1.7.0 b50. 
According to them, the fix will go into 6u15 or 6u16. Servers have been running 
for more than 10 hours with no problems after applying the patch.

Unfortunately, it's for testers only and I'm not allowed to share it, so please 
don't ask.

The root cause of the problem is related to this bug: 
http://bugs.sun.com/view_bug.do?bug_id=6693490 .

Quoting JDK developer:

"In summary, the bug we have in the epoll 
Selector relates to a timing issue between select and close that results 
in the preclose file descriptor getting registered with the epoll 
facility. The preclose file descriptor is connected to a socketpair 
where the other end is closed so the end that is registered in epoll is 
always pollable (and so causes the spin)."

> NioProcessor 100% CPU usage on Linux (epoll selector bug)
> -
>
> Key: DIRMINA-678
> URL: https://issues.apache.org/jira/browse/DIRMINA-678
> Project: MINA
>  Issue Type: Bug
>  Components: Core
>Affects Versions: 2.0.0-M4
> Environment: CentOS 5.x, 32/64-bit, 32/64-bit Sun JDK 1.6.0_12, also 
> _11/_10/_09 and Sun JDK 1.7.0 b50, Kernel 2.6.18-92.1.22.el5 and also older 
> versions,
>Reporter: Serge Baranov
> Fix For: 2.0.0
>
> Attachments: snap973.png, snap974.png
>
>
> It's the same bug as described at http://jira.codehaus.org/browse/JETTY-937 , 
> but affecting MINA in the very similar way.
> NioProcessor threads start to eat 100% resources per CPU. After 10-30 minutes 
> of running depending on the load (sometimes after several hours) one of the 
> NioProcessor starts to consume all the available CPU resources probably 
> spinning in the epoll select loop. Later, more threads can be affected by the 
> same issue, thus 100% loading all the available CPU cores.
> Sample trace:
> NioProcessor-10 [RUNNABLE] CPU time: 5:15
> sun.nio.ch.EPollArrayWrapper.epollWait(long, int, long, int)
> sun.nio.ch.EPollArrayWrapper.poll(long)
> sun.nio.ch.EPollSelectorImpl.doSelect(long)
> sun.nio.ch.SelectorImpl.lockAndDoSelect(long)
> sun.nio.ch.SelectorImpl.select(long)
> org.apache.mina.transport.socket.nio.NioProcessor.select(long)
> org.apache.mina.core.polling.AbstractPollingIoProcessor$Processor.run()
> org.apache.mina.util.NamePreservingRunnable.run()
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor$Worker)
> java.util.concurrent.ThreadPoolExecutor$Worker.run()
> java.lang.Thread.run()
> It seems to affect any NIO based Java server applications running in the 
> specified environment.
> Some projects provide workarounds for similar JDK bugs, probably MINA can 
> also think about a workaround.
> As far as I know, there are at least 3 users who experience this issue with 
> Jetty and all of them are running CentOS (some distribution default setting 
> is a trigger?). As for MINA, I'm not aware of similar reports yet.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (DIRMINA-678) NioProcessor 100% CPU usage on Linux (epoll selector bug)

2009-03-19 Thread Serge Baranov (JIRA)

[ 
https://issues.apache.org/jira/browse/DIRMINA-678?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12683407#action_12683407
 ] 

Serge Baranov commented on DIRMINA-678:
---

Following the suggestion from the MINA maintainers, I tried the IBM JDK and 
also reproduced the problem with the following version:

Java(TM) SE Runtime Environment (build pxi3260sr4-20090219_01(SR4))
IBM J9 VM (build 2.4, J2RE 1.6.0 IBM J9 2.4 Linux x86-32 
jvmxi3260-20090215_29883 (JIT enabled, AOT enabled)

> NioProcessor 100% CPU usage on Linux (epoll selector bug)
> -
>
> Key: DIRMINA-678
> URL: https://issues.apache.org/jira/browse/DIRMINA-678
> Project: MINA
>  Issue Type: Bug
>  Components: Core
>Affects Versions: 2.0.0-M4
> Environment: CentOS 5.x, 32/64-bit, 32/64-bit Sun JDK 1.6.0_12, also 
> _11/_10/_09 and Sun JDK 1.7.0 b50, Kernel 2.6.18-92.1.22.el5 and also older 
> versions,
>Reporter: Serge Baranov
> Attachments: snap973.png, snap974.png
>
>
> It's the same bug as described at http://jira.codehaus.org/browse/JETTY-937 , 
> but affecting MINA in the very similar way.
> NioProcessor threads start to eat 100% resources per CPU. After 10-30 minutes 
> of running depending on the load (sometimes after several hours) one of the 
> NioProcessor starts to consume all the available CPU resources probably 
> spinning in the epoll select loop. Later, more threads can be affected by the 
> same issue, thus 100% loading all the available CPU cores.
> Sample trace:
> NioProcessor-10 [RUNNABLE] CPU time: 5:15
> sun.nio.ch.EPollArrayWrapper.epollWait(long, int, long, int)
> sun.nio.ch.EPollArrayWrapper.poll(long)
> sun.nio.ch.EPollSelectorImpl.doSelect(long)
> sun.nio.ch.SelectorImpl.lockAndDoSelect(long)
> sun.nio.ch.SelectorImpl.select(long)
> org.apache.mina.transport.socket.nio.NioProcessor.select(long)
> org.apache.mina.core.polling.AbstractPollingIoProcessor$Processor.run()
> org.apache.mina.util.NamePreservingRunnable.run()
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor$Worker)
> java.util.concurrent.ThreadPoolExecutor$Worker.run()
> java.lang.Thread.run()
> It seems to affect any NIO based Java server applications running in the 
> specified environment.
> Some projects provide workarounds for similar JDK bugs, probably MINA can 
> also think about a workaround.
> As far as I know, there are at least 3 users who experience this issue with 
> Jetty and all of them are running CentOS (some distribution default setting 
> is a trigger?). As for MINA, I'm not aware of similar reports yet.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (DIRMINA-678) NioProcessor 100% CPU usage on Linux (epoll selector bug)

2009-03-18 Thread Serge Baranov (JIRA)

 [ 
https://issues.apache.org/jira/browse/DIRMINA-678?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Serge Baranov updated DIRMINA-678:
--

Attachment: snap974.png
snap973.png

> NioProcessor 100% CPU usage on Linux (epoll selector bug)
> -
>
> Key: DIRMINA-678
> URL: https://issues.apache.org/jira/browse/DIRMINA-678
> Project: MINA
>  Issue Type: Bug
>  Components: Core
>Affects Versions: 2.0.0-M4
> Environment: CentOS 5.x, 32/64-bit, 32/64-bit Sun JDK 1.6.0_12, also 
> _11/_10/_09 and Sun JDK 1.7.0 b50, Kernel 2.6.18-92.1.22.el5 and also older 
> versions,
>Reporter: Serge Baranov
> Attachments: snap973.png, snap974.png
>
>
> It's the same bug as described at http://jira.codehaus.org/browse/JETTY-937 , 
> but affecting MINA in the very similar way.
> NioProcessor threads start to eat 100% resources per CPU. After 10-30 minutes 
> of running depending on the load (sometimes after several hours) one of the 
> NioProcessor starts to consume all the available CPU resources probably 
> spinning in the epoll select loop. Later, more threads can be affected by the 
> same issue, thus 100% loading all the available CPU cores.
> Sample trace:
> NioProcessor-10 [RUNNABLE] CPU time: 5:15
> sun.nio.ch.EPollArrayWrapper.epollWait(long, int, long, int)
> sun.nio.ch.EPollArrayWrapper.poll(long)
> sun.nio.ch.EPollSelectorImpl.doSelect(long)
> sun.nio.ch.SelectorImpl.lockAndDoSelect(long)
> sun.nio.ch.SelectorImpl.select(long)
> org.apache.mina.transport.socket.nio.NioProcessor.select(long)
> org.apache.mina.core.polling.AbstractPollingIoProcessor$Processor.run()
> org.apache.mina.util.NamePreservingRunnable.run()
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor$Worker)
> java.util.concurrent.ThreadPoolExecutor$Worker.run()
> java.lang.Thread.run()
> It seems to affect any NIO based Java server applications running in the 
> specified environment.
> Some projects provide workarounds for similar JDK bugs, probably MINA can 
> also think about a workaround.
> As far as I know, there are at least 3 users who experience this issue with 
> Jetty and all of them are running CentOS (some distribution default setting 
> is a trigger?). As for MINA, I'm not aware of similar reports yet.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Created: (DIRMINA-678) NioProcessor 100% CPU usage on Linux (epoll selector bug)

2009-03-18 Thread Serge Baranov (JIRA)
NioProcessor 100% CPU usage on Linux (epoll selector bug)
-

 Key: DIRMINA-678
 URL: https://issues.apache.org/jira/browse/DIRMINA-678
 Project: MINA
  Issue Type: Bug
  Components: Core
Affects Versions: 2.0.0-M4
 Environment: CentOS 5.x, 32/64-bit, 32/64-bit Sun JDK 1.6.0_12, also 
_11/_10/_09 and Sun JDK 1.7.0 b50, Kernel 2.6.18-92.1.22.el5 and also older 
versions,
Reporter: Serge Baranov


It's the same bug as described at http://jira.codehaus.org/browse/JETTY-937 , 
but affecting MINA in a very similar way.

NioProcessor threads start to eat 100% of a CPU each. After 10-30 minutes of 
running, depending on the load (sometimes after several hours), one of the 
NioProcessor threads starts to consume all the available CPU resources, probably 
spinning in the epoll select loop. Later, more threads can be affected by the 
same issue, eventually loading all the available CPU cores at 100%.

Sample trace:

NioProcessor-10 [RUNNABLE] CPU time: 5:15
sun.nio.ch.EPollArrayWrapper.epollWait(long, int, long, int)
sun.nio.ch.EPollArrayWrapper.poll(long)
sun.nio.ch.EPollSelectorImpl.doSelect(long)
sun.nio.ch.SelectorImpl.lockAndDoSelect(long)
sun.nio.ch.SelectorImpl.select(long)
org.apache.mina.transport.socket.nio.NioProcessor.select(long)
org.apache.mina.core.polling.AbstractPollingIoProcessor$Processor.run()
org.apache.mina.util.NamePreservingRunnable.run()
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor$Worker)
java.util.concurrent.ThreadPoolExecutor$Worker.run()
java.lang.Thread.run()

It seems to affect any NIO-based Java server application running in the 
specified environment.
Some projects provide workarounds for similar JDK bugs; perhaps MINA could also 
consider a workaround.

As far as I know, there are at least 3 users who experience this issue with 
Jetty and all of them are running CentOS (some distribution default setting is 
a trigger?). As for MINA, I'm not aware of similar reports yet.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (DIRMINA-665) AbstractIoSession#getId() can cause collisions which lead to sessionClosed calls without sessionCreated

2009-02-28 Thread Serge Baranov (JIRA)

 [ 
https://issues.apache.org/jira/browse/DIRMINA-665?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Serge Baranov updated DIRMINA-665:
--

Attachment: uniqueid.patch

Patch with a temporary fix to this problem is attached.

> AbstractIoSession#getId() can cause collisions which lead to sessionClosed 
> calls without sessionCreated
> ---
>
> Key: DIRMINA-665
> URL: https://issues.apache.org/jira/browse/DIRMINA-665
> Project: MINA
>  Issue Type: Bug
>  Components: Core
>Affects Versions: 2.0.0-M4
>Reporter: Serge Baranov
> Fix For: 2.0.0-RC1
>
> Attachments: DuplicateSessionTest.java, uniqueid.patch
>
>
> While investigating the root cause of the problem described at 
> http://www.nabble.com/Counting-connections-in-MINA-tt22200162.html I found 
> that it's most likely caused by the collisions of the session ids.
> See org.apache.mina.core.service.IoServiceListenerSupport#fireSessionCreated:
> // If already registered, ignore.
> if (managedSessions.putIfAbsent(Long.valueOf(session.getId()), session) != 
> null) {
> return;
> }
> If the newly created session has the same id as the one already managed by the 
> service, this method returns and sessionCreated/sessionOpened are not invoked 
> in the IoHandler. It's not surprising as the JavaDoc for getId() says:
> TODO this method implementation is totally wrong. It has to be rewritten.
> This problem is pretty serious as under heavy load you will get several 
> collisions per hour. Basically, you will get sessionClosed with empty 
> attributes map which will not correspond to any sessionOpened/sessionCreated. 
> Which leads to inability to track anything via session attributes (consider a 
> simple counter for connected IP addresses which gives you a number of users 
> connected per IP).
> There is probably also some race condition as in some cases duplicate id 
> doesn't cause such problem. It must be investigated further.
> I've rewritten the getId() method to use AtomicLong incremented in the 
> constructor of the class and it has fixed the problem for me.
> I'm attaching the test case which reproduces the problem on my machine. You 
> may need to run it several times to get the expected result or adjust the 
> number of created sessions and the packet size.
> Sample output with current getId() implementation:
> [2009-03-01 01:06:43,070] START
> [2009-03-01 01:06:44,503] DUPLICATE SESSION CLOSED WITHOUT CREATED/OPEN: 
> (0x01028859: nio socket, server, null => null) / DUPLICATE ID: true
> [2009-03-01 01:06:47,172] DUPLICATE SESSION CLOSED WITHOUT CREATED/OPEN: 
> (0x012BDC2C: nio socket, server, null => null) / DUPLICATE ID: true
> [2009-03-01 01:06:51,187] DUPLICATE SESSION CLOSED WITHOUT CREATED/OPEN: 
> (0x012C0881: nio socket, server, null => null) / DUPLICATE ID: true
> [2009-03-01 01:06:55,398] FINISH
> Sample output with getId() implementation using AtomicLong():
> [2009-03-01 01:08:00,728] START
> [2009-03-01 01:08:12,653] FINISH
> I have no time for additional investigations for now, hope it helps.
> The proposed solution is to either rewrite id generation or make the behavior 
> consistent in case duplicate id is generated. Duplicate ids should not stop 
> the application from normal operation.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (DIRMINA-665) AbstractIoSession#getId() can cause collisions which lead to sessionClosed calls without sessionCreated

2009-02-28 Thread Serge Baranov (JIRA)

[ 
https://issues.apache.org/jira/browse/DIRMINA-665?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12677747#action_12677747
 ] 

Serge Baranov commented on DIRMINA-665:
---

Yes, I did. It's mentioned in the original report. My implementation using 
AtomicLong() works fine. See the outputs provided.

> AbstractIoSession#getId() can cause collisions which lead to sessionClosed 
> calls without sessionCreated
> ---
>
> Key: DIRMINA-665
> URL: https://issues.apache.org/jira/browse/DIRMINA-665
> Project: MINA
>  Issue Type: Bug
>  Components: Core
>Affects Versions: 2.0.0-M4
>Reporter: Serge Baranov
> Fix For: 2.0.0-RC1
>
> Attachments: DuplicateSessionTest.java
>
>
> While investigating the root cause of the problem described at 
> http://www.nabble.com/Counting-connections-in-MINA-tt22200162.html I found 
> that it's most likely caused by the collisions of the session ids.
> See org.apache.mina.core.service.IoServiceListenerSupport#fireSessionCreated:
> // If already registered, ignore.
> if (managedSessions.putIfAbsent(Long.valueOf(session.getId()), session) != 
> null) {
> return;
> }
> If the newly created session has the same id as the one already managed by the 
> service, this method returns and sessionCreated/sessionOpened are not invoked 
> in the IoHandler. It's not surprising as the JavaDoc for getId() says:
> TODO this method implementation is totally wrong. It has to be rewritten.
> This problem is pretty serious as under heavy load you will get several 
> collisions per hour. Basically, you will get sessionClosed with empty 
> attributes map which will not correspond to any sessionOpened/sessionCreated. 
> Which leads to inability to track anything via session attributes (consider a 
> simple counter for connected IP addresses which gives you a number of users 
> connected per IP).
> There is probably also some race condition as in some cases duplicate id 
> doesn't cause such problem. It must be investigated further.
> I've rewritten the getId() method to use AtomicLong incremented in the 
> constructor of the class and it has fixed the problem for me.
> I'm attaching the test case which reproduces the problem on my machine. You 
> may need to run it several times to get the expected result or adjust the 
> number of created sessions and the packet size.
> Sample output with current getId() implementation:
> [2009-03-01 01:06:43,070] START
> [2009-03-01 01:06:44,503] DUPLICATE SESSION CLOSED WITHOUT CREATED/OPEN: 
> (0x01028859: nio socket, server, null => null) / DUPLICATE ID: true
> [2009-03-01 01:06:47,172] DUPLICATE SESSION CLOSED WITHOUT CREATED/OPEN: 
> (0x012BDC2C: nio socket, server, null => null) / DUPLICATE ID: true
> [2009-03-01 01:06:51,187] DUPLICATE SESSION CLOSED WITHOUT CREATED/OPEN: 
> (0x012C0881: nio socket, server, null => null) / DUPLICATE ID: true
> [2009-03-01 01:06:55,398] FINISH
> Sample output with getId() implementation using AtomicLong():
> [2009-03-01 01:08:00,728] START
> [2009-03-01 01:08:12,653] FINISH
> I have no time for additional investigations for now, hope it helps.
> The proposed solution is to either rewrite id generation or make the behavior 
> consistent in case duplicate id is generated. Duplicate ids should not stop 
> the application from normal operation.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (DIRMINA-665) AbstractIoSession#getId() can cause collisions which lead to sessionClosed calls without sessionCreated

2009-02-28 Thread Serge Baranov (JIRA)

 [ 
https://issues.apache.org/jira/browse/DIRMINA-665?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Serge Baranov updated DIRMINA-665:
--

Attachment: DuplicateSessionTest.java

A test case illustrating the collision problem and a sessionClosed call without 
a sessionCreated call.

> AbstractIoSession#getId() can cause collisions which lead to sessionClosed 
> calls without sessionCreated
> ---
>
> Key: DIRMINA-665
> URL: https://issues.apache.org/jira/browse/DIRMINA-665
> Project: MINA
>  Issue Type: Bug
>  Components: Core
>Affects Versions: 2.0.0-M4
>Reporter: Serge Baranov
> Fix For: 2.0.0-RC1
>
> Attachments: DuplicateSessionTest.java
>
>
> While investigating the root cause of the problem described at 
> http://www.nabble.com/Counting-connections-in-MINA-tt22200162.html I found 
> that it's most likely caused by the collisions of the session ids.
> See org.apache.mina.core.service.IoServiceListenerSupport#fireSessionCreated:
> // If already registered, ignore.
> if (managedSessions.putIfAbsent(Long.valueOf(session.getId()), session) != 
> null) {
> return;
> }
> If the newly created session has the same id as the one already managed by the 
> service, this method returns and sessionCreated/sessionOpened are not invoked 
> in the IoHandler. It's not surprising as the JavaDoc for getId() says:
> TODO this method implementation is totally wrong. It has to be rewritten.
> This problem is pretty serious as under heavy load you will get several 
> collisions per hour. Basically, you will get sessionClosed with empty 
> attributes map which will not correspond to any sessionOpened/sessionCreated. 
> Which leads to inability to track anything via session attributes (consider a 
> simple counter for connected IP addresses which gives you a number of users 
> connected per IP).
> There is probably also some race condition as in some cases duplicate id 
> doesn't cause such problem. It must be investigated further.
> I've rewritten the getId() method to use AtomicLong incremented in the 
> constructor of the class and it has fixed the problem for me.
> I'm attaching the test case which reproduces the problem on my machine. You 
> may need to run it several times to get the expected result or adjust the 
> number of created sessions and the packet size.
> Sample output with current getId() implementation:
> [2009-03-01 01:06:43,070] START
> [2009-03-01 01:06:44,503] DUPLICATE SESSION CLOSED WITHOUT CREATED/OPEN: 
> (0x01028859: nio socket, server, null => null) / DUPLICATE ID: true
> [2009-03-01 01:06:47,172] DUPLICATE SESSION CLOSED WITHOUT CREATED/OPEN: 
> (0x012BDC2C: nio socket, server, null => null) / DUPLICATE ID: true
> [2009-03-01 01:06:51,187] DUPLICATE SESSION CLOSED WITHOUT CREATED/OPEN: 
> (0x012C0881: nio socket, server, null => null) / DUPLICATE ID: true
> [2009-03-01 01:06:55,398] FINISH
> Sample output with getId() implementation using AtomicLong():
> [2009-03-01 01:08:00,728] START
> [2009-03-01 01:08:12,653] FINISH
> I have no time for additional investigations for now, hope it helps.
> The proposed solution is to either rewrite id generation or make the behavior 
> consistent in case duplicate id is generated. Duplicate ids should not stop 
> the application from normal operation.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Created: (DIRMINA-665) AbstractIoSession#getId() can cause collisions which lead to sessionClosed calls without sessionCreated

2009-02-28 Thread Serge Baranov (JIRA)
AbstractIoSession#getId() can cause collisions which lead to sessionClosed 
calls without sessionCreated
---

 Key: DIRMINA-665
 URL: https://issues.apache.org/jira/browse/DIRMINA-665
 Project: MINA
  Issue Type: Bug
  Components: Core
Affects Versions: 2.0.0-M4
Reporter: Serge Baranov
 Fix For: 2.0.0-RC1


While investigating the root cause of the problem described at 
http://www.nabble.com/Counting-connections-in-MINA-tt22200162.html I found that 
it's most likely caused by the collisions of the session ids.

See org.apache.mina.core.service.IoServiceListenerSupport#fireSessionCreated:

// If already registered, ignore.
if (managedSessions.putIfAbsent(Long.valueOf(session.getId()), session) != null) {
    return;
}

If the newly created session has the same id as the one already managed by the 
service, this method returns and sessionCreated/sessionOpened are not invoked 
in the IoHandler. It's not surprising as the JavaDoc for getId() says:

TODO this method implementation is totally wrong. It has to be rewritten.

This problem is pretty serious: under heavy load you will get several 
collisions per hour. Basically, you will get a sessionClosed with an empty 
attributes map which does not correspond to any sessionOpened/sessionCreated, 
which makes it impossible to track anything via session attributes (consider a 
simple counter of connected IP addresses which gives you the number of users 
connected per IP).

There is probably also some race condition, as in some cases a duplicate id 
doesn't cause such a problem. It must be investigated further.

I've rewritten the getId() method to use an AtomicLong incremented in the 
constructor of the class, and it has fixed the problem for me.
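
For illustration, a minimal sketch of the proposed AtomicLong-based approach follows (the class name is hypothetical; MINA's real AbstractIoSession obviously carries much more state):

import java.util.concurrent.atomic.AtomicLong;

// Sketch of the proposed fix: assign each session a monotonically increasing id
// from a shared counter in the constructor, so two live sessions can never
// collide. Illustrative only, not MINA's actual implementation.
public abstract class UniqueIdSession {

    private static final AtomicLong ID_GENERATOR = new AtomicLong();

    private final long id;

    protected UniqueIdSession() {
        this.id = ID_GENERATOR.incrementAndGet();
    }

    // Unlike a hash-derived id, this value is unique for the lifetime of the JVM.
    public final long getId() {
        return id;
    }
}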

I'm attaching the test case which reproduces the problem on my machine. You may 
need to run it several times to get the expected result or adjust the number of 
created sessions and the packet size.

Sample output with current getId() implementation:

[2009-03-01 01:06:43,070] START
[2009-03-01 01:06:44,503] DUPLICATE SESSION CLOSED WITHOUT CREATED/OPEN: 
(0x01028859: nio socket, server, null => null) / DUPLICATE ID: true
[2009-03-01 01:06:47,172] DUPLICATE SESSION CLOSED WITHOUT CREATED/OPEN: 
(0x012BDC2C: nio socket, server, null => null) / DUPLICATE ID: true
[2009-03-01 01:06:51,187] DUPLICATE SESSION CLOSED WITHOUT CREATED/OPEN: 
(0x012C0881: nio socket, server, null => null) / DUPLICATE ID: true
[2009-03-01 01:06:55,398] FINISH

Sample output with getId() implementation using AtomicLong():

[2009-03-01 01:08:00,728] START
[2009-03-01 01:08:12,653] FINISH

I have no time for additional investigations for now, hope it helps.

The proposed solution is to either rewrite the id generation or make the behavior 
consistent in case a duplicate id is generated. Duplicate ids should not stop the 
application from operating normally.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (DIRMINA-663) CumulativeProtocolDecoder doDecode performance problem

2009-02-17 Thread Serge Baranov (JIRA)

[ 
https://issues.apache.org/jira/browse/DIRMINA-663?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12674171#action_12674171
 ] 

Serge Baranov commented on DIRMINA-663:
---

I found the root cause of the problem. Between M3 and M4 the static initializer 
was removed from DefaultSocketSessionConfig.java.
This initializer was setting the default session config parameters from the 
platform defaults.

Since M4 the hardcoded defaults are used instead of the platform defaults.

In my particular case the hardcoded receive buffer size = 1024 was causing this 
performance problem. The system default taken from the socket is 8192.

I can now easily fix the problem by adding this code to the test case:
acceptor.getSessionConfig().setReceiveBufferSize(8192);

It actually works fine with values larger than 1024, like 2048 or 4096.
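
For illustration, a minimal (hypothetical) server setup showing where the workaround goes; the port and the empty handler are placeholders, and only the setReceiveBufferSize(8192) call is the actual fix:

import java.net.InetSocketAddress;

import org.apache.mina.core.service.IoHandlerAdapter;
import org.apache.mina.transport.socket.nio.NioSocketAcceptor;

// Sketch: restore a larger receive buffer on the acceptor's session config
// before binding, instead of relying on the 1024-byte default used since M4.
public class LargerBufferServer {

    public static void main(String[] args) throws Exception {
        NioSocketAcceptor acceptor = new NioSocketAcceptor();

        // 8192 matches the platform default observed here; 2048 or 4096 also
        // avoided the stall, only the hardcoded 1024 triggered it.
        acceptor.getSessionConfig().setReceiveBufferSize(8192);

        acceptor.setHandler(new IoHandlerAdapter());
        acceptor.bind(new InetSocketAddress(12345));
    }
}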

In the commit comments I see that this code was removed because it caused 
problems with Applets or on Vista: DIRMINA-638.
There is already a comment there describing exactly the same problem as I've 
submitted here:
https://issues.apache.org/jira/browse/DIRMINA-638?focusedCommentId=12667699#action_12667699

This issue can be closed as a duplicate of DIRMINA-638.

> CumulativeProtocolDecoder doDecode performance problem
> --
>
> Key: DIRMINA-663
> URL: https://issues.apache.org/jira/browse/DIRMINA-663
> Project: MINA
>  Issue Type: Bug
>  Components: Filter
>Affects Versions: 2.0.0-RC1
> Environment: JDK 1.6.0_12 32-bit, Windows Vista 64-bit, MINA 
> 2.0.0-RC1 (up to current date trunk).
>Reporter: Serge Baranov
>Priority: Critical
> Attachments: tdump.txt
>
>
> To reproduce get the mina_test_2.0.zip from DIRMINA-609.
> It seems to be working even worse with the current trunk version, it takes ~5 
> seconds between messages and then times out after the first message, if wait 
> is increased to 10 seconds, it times out after the second message:
> time   = 03:49:36.320
> remaining  = 1024
> limit  = 1024
> capacity   = 2048
> time   = 03:49:41.619
> remaining  = 2048
> limit  = 2048
> capacity   = 2097156
> java.lang.AssertionError: No message received
> The issue differs from DIRMINA-609 as it affects not only large messages, but 
> any message that doesn't fit into buffer (when doDecode returns false) and 
> also all the subsequent messages (when doDecode returns true).
> It's a serious issue making CumulativeProtocolDecoder completely useless.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (DIRMINA-663) CumulativeProtocolDecoder doDecode performance problem

2009-02-16 Thread Serge Baranov (JIRA)

[ 
https://issues.apache.org/jira/browse/DIRMINA-663?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12673872#action_12673872
 ] 

Serge Baranov commented on DIRMINA-663:
---

Done more testing:

CentOS 5.2 -- works fine with M4/RC1
Windows 2003 Server x64 -- works fine with M4/RC1
Windows Vista 32-bit laptop -- works fine with M4/RC1
Windows Vista 64-bit clean VMware image -- works fine with M4/RC1
Windows Vista 64-bit real system -- fails with M4/RC1, works fine with M3 (under any JVM)

So, it looks like the problem is specific to both the Windows 
environment/hardware and the MINA version. Really puzzled. Any ideas?

> CumulativeProtocolDecoder doDecode performance problem
> --
>
> Key: DIRMINA-663
> URL: https://issues.apache.org/jira/browse/DIRMINA-663
> Project: MINA
>  Issue Type: Bug
>  Components: Filter
>Affects Versions: 2.0.0-RC1
> Environment: JDK 1.6.0_12 32-bit, Windows Vista 64-bit, MINA 
> 2.0.0-RC1 (up to current date trunk).
>Reporter: Serge Baranov
>Priority: Critical
> Attachments: tdump.txt
>
>
> To reproduce get the mina_test_2.0.zip from DIRMINA-609.
> It seems to be working even worse with the current trunk version, it takes ~5 
> seconds between messages and then times out after the first message, if wait 
> is increased to 10 seconds, it times out after the second message:
> time   = 03:49:36.320
> remaining  = 1024
> limit  = 1024
> capacity   = 2048
> time   = 03:49:41.619
> remaining  = 2048
> limit  = 2048
> capacity   = 2097156
> java.lang.AssertionError: No message received
> The issue differs from DIRMINA-609 as it affects not only large messages, but 
> any message that doesn't fit into buffer (when doDecode returns false) and 
> also all the subsequent messages (when doDecode returns true).
> It's a serious issue making CumulativeProtocolDecoder completely useless.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (DIRMINA-663) CumulativeProtocolDecoder doDecode performance problem

2009-02-16 Thread Serge Baranov (JIRA)

 [ 
https://issues.apache.org/jira/browse/DIRMINA-663?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Serge Baranov updated DIRMINA-663:
--

Attachment: tdump.txt

Thread dump taken during the 5-second wait pause is attached.

> CumulativeProtocolDecoder doDecode performance problem
> --
>
> Key: DIRMINA-663
> URL: https://issues.apache.org/jira/browse/DIRMINA-663
> Project: MINA
>  Issue Type: Bug
>  Components: Filter
>Affects Versions: 2.0.0-RC1
> Environment: JDK 1.6.0_12 32-bit, Windows Vista 64-bit, MINA 
> 2.0.0-RC1 (up to current date trunk).
>Reporter: Serge Baranov
>Priority: Critical
> Attachments: tdump.txt
>
>
> To reproduce get the mina_test_2.0.zip from DIRMINA-609.
> It seems to be working even worse with the current trunk version, it takes ~5 
> seconds between messages and then times out after the first message, if wait 
> is increased to 10 seconds, it times out after the second message:
> time   = 03:49:36.320
> remaining  = 1024
> limit  = 1024
> capacity   = 2048
> time   = 03:49:41.619
> remaining  = 2048
> limit  = 2048
> capacity   = 2097156
> java.lang.AssertionError: No message received
> The issue differs from DIRMINA-609 as it affects not only large messages, but 
> any message that doesn't fit into buffer (when doDecode returns false) and 
> also all the subsequent messages (when doDecode returns true).
> It's a serious issue making CumulativeProtocolDecoder completely useless.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (DIRMINA-663) CumulativeProtocolDecoder doDecode performance problem

2009-02-16 Thread Serge Baranov (JIRA)

[ 
https://issues.apache.org/jira/browse/DIRMINA-663?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12673845#action_12673845
 ] 

Serge Baranov commented on DIRMINA-663:
---

Here it is (same with 1.6.0_11 and 1.6.0_12). Line numbers may not match 
because of a code reformat in the IDE; I'm running a main() stub instead of the 
JUnit runner:

time   = 13:53:26.167
remaining  = 1024
limit  = 1024
capacity   = 2048
Exception in thread "main" java.lang.AssertionError: No message received
at org.junit.Assert.fail(Assert.java:91)
at mina_test.TestingIoHandler.waitForMessage(TestingIoHandler.java:33)
at mina_test.JUTestBigFrame.bigFrame(JUTestBigFrame.java:25)
at mina_test.JUTestBigFrame.main(JUTestBigFrame.java:33)
time   = 13:53:31.465
remaining  = 2048
limit  = 2048
capacity   = 2097156
time   = 13:53:36.465
remaining  = 3072
limit  = 3072
capacity   = 2097156
time   = 13:53:41.465
remaining  = 4096
limit  = 4096
capacity   = 2097156
time   = 13:53:46.465
remaining  = 5120
limit  = 5120
capacity   = 2097156



> CumulativeProtocolDecoder doDecode performance problem
> --
>
> Key: DIRMINA-663
> URL: https://issues.apache.org/jira/browse/DIRMINA-663
> Project: MINA
>  Issue Type: Bug
>  Components: Filter
>Affects Versions: 2.0.0-RC1
> Environment: JDK 1.6.0_12 32-bit, Windows Vista 64-bit, MINA 
> 2.0.0-RC1 (up to current date trunk).
>Reporter: Serge Baranov
>Priority: Critical
>
> To reproduce get the mina_test_2.0.zip from DIRMINA-609.
> It seems to be working even worse with the current trunk version, it takes ~5 
> seconds between messages and then times out after the first message, if wait 
> is increased to 10 seconds, it times out after the second message:
> time   = 03:49:36.320
> remaining  = 1024
> limit  = 1024
> capacity   = 2048
> time   = 03:49:41.619
> remaining  = 2048
> limit  = 2048
> capacity   = 2097156
> java.lang.AssertionError: No message received
> The issue differs from DIRMINA-609 as it affects not only large messages, but 
> any message that doesn't fit into buffer (when doDecode returns false) and 
> also all the subsequent messages (when doDecode returns true).
> It's a serious issue making CumulativeProtocolDecoder completely useless.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (DIRMINA-663) CumulativeProtocolDecoder doDecode performance problem

2009-02-15 Thread Serge Baranov (JIRA)

[ 
https://issues.apache.org/jira/browse/DIRMINA-663?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12673757#action_12673757
 ] 

Serge Baranov commented on DIRMINA-663:
---

As stated in the Environment field, I am already using a 32-bit JDK under a 64-bit 
system. As per your suggestion I've downloaded and tried jrmc-3.0.3-1.6.0 
(32-bit); it has the same problem. 64-bit JVMs also have the same issue with 
this test. 2.0.0-M3 works fine; starting from 2.0.0-M4 it's broken again.

> CumulativeProtocolDecoder doDecode performance problem
> --
>
> Key: DIRMINA-663
> URL: https://issues.apache.org/jira/browse/DIRMINA-663
> Project: MINA
>  Issue Type: Bug
>  Components: Filter
>Affects Versions: 2.0.0-RC1
> Environment: JDK 1.6.0_12 32-bit, Windows Vista 64-bit, MINA 
> 2.0.0-RC1 (up to current date trunk).
>Reporter: Serge Baranov
>Priority: Critical
>
> To reproduce get the mina_test_2.0.zip from DIRMINA-609.
> It seems to be working even worse with the current trunk version, it takes ~5 
> seconds between messages and then times out after the first message, if wait 
> is increased to 10 seconds, it times out after the second message:
> time   = 03:49:36.320
> remaining  = 1024
> limit  = 1024
> capacity   = 2048
> time   = 03:49:41.619
> remaining  = 2048
> limit  = 2048
> capacity   = 2097156
> java.lang.AssertionError: No message received
> The issue differs from DIRMINA-609 as it affects not only large messages, but 
> any message that doesn't fit into buffer (when doDecode returns false) and 
> also all the subsequent messages (when doDecode returns true).
> It's a serious issue making CumulativeProtocolDecoder completely useless.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (DIRMINA-663) CumulativeProtocolDecoder doDecode performance problem

2009-02-14 Thread Serge Baranov (JIRA)

[ 
https://issues.apache.org/jira/browse/DIRMINA-663?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12673567#action_12673567
 ] 

Serge Baranov commented on DIRMINA-663:
---

2.0.0-M4 is also affected by this bug.

> CumulativeProtocolDecoder doDecode performance problem
> --
>
> Key: DIRMINA-663
> URL: https://issues.apache.org/jira/browse/DIRMINA-663
> Project: MINA
>  Issue Type: Bug
>  Components: Filter
>Affects Versions: 2.0.0-RC1
> Environment: JDK 1.6.0_12 32-bit, Windows Vista 64-bit, MINA 
> 2.0.0-RC1 (up to current date trunk).
>Reporter: Serge Baranov
>Priority: Critical
>
> To reproduce get the mina_test_2.0.zip from DIRMINA-609.
> It seems to be working even worse with the current trunk version, it takes ~5 
> seconds between messages and then times out after the first message, if wait 
> is increased to 10 seconds, it times out after the second message:
> time   = 03:49:36.320
> remaining  = 1024
> limit  = 1024
> capacity   = 2048
> time   = 03:49:41.619
> remaining  = 2048
> limit  = 2048
> capacity   = 2097156
> java.lang.AssertionError: No message received
> The issue differs from DIRMINA-609 as it affects not only large messages, but 
> any message that doesn't fit into buffer (when doDecode returns false) and 
> also all the subsequent messages (when doDecode returns true).
> It's a serious issue making CumulativeProtocolDecoder completely useless.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Issue Comment Edited: (DIRMINA-609) Big performance drop in 2.0M2 CumulativeProtocolDecoder for large messages

2009-02-14 Thread Serge Baranov (JIRA)

[ 
https://issues.apache.org/jira/browse/DIRMINA-609?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12673566#action_12673566
 ] 

crazycoder edited comment on DIRMINA-609 at 2/14/09 8:14 PM:


RC1 trunk has a regression which breaks this test again: DIRMINA-663 . I've 
verified that it works fine with M3, but not with RC1 or M4.

  was (Author: crazycoder):
RC1 trunk has a regression which breaks this test again: DIRMINA-663 . I've 
verified that it works fine with M3, but not with RC1.
  
> Big performance drop in 2.0M2 CumulativeProtocolDecoder for large messages
> --
>
> Key: DIRMINA-609
> URL: https://issues.apache.org/jira/browse/DIRMINA-609
> Project: MINA
>  Issue Type: Bug
>  Components: Filter
>Affects Versions: 2.0.0-M2
> Environment: Mac OS X 10.5 (Intel), Java 1.5
>Reporter: Matthew Phillips
>Assignee: Julien Vermillard
> Fix For: 2.0.0-M3
>
> Attachments: mina_test_1.1.zip, mina_test_2.0.zip
>
>
> Porting from MINA 1.1 to 2.0M2, one of my tests pinpointed a huge drop in the 
> performance of a CumulativeProtocolDecoder-based codec when transmitting 
> large (>2MB) messages. Under MINA 1.1.x, the transfer was effectively 
> immediate, under 2.0M2 the system seems to get bogged down and calls the 
> doDecode() method at 1-second intervals, with slowly increasing amounts of 
> data. The system is a fast quad-core system, lightly loaded. Almost no CPU 
> usage is recorded when running either test.
> The two attached ZIP files have test cases against 1.1 and 2.0M2 that 
> reproduce this. Their trace outputs follow: the MINA 1.1 case terminates in 
> under a second, the 2.0 case times out and fails after 5 seconds.
> MINA 1.1
> --
> time   = 14:04:00.934
> remaining  = 1024
> limit  = 1024
> capacity   = 1024
> time   = 14:04:00.946
> remaining  = 3072
> limit  = 3072
> capacity   = 4194304
> time   = 14:04:00.947
> remaining  = 7168
> limit  = 7168
> capacity   = 4194304
> time   = 14:04:00.947
> remaining  = 15360
> limit  = 15360
> capacity   = 4194304
> time   = 14:04:00.947
> remaining  = 31744
> limit  = 31744
> capacity   = 4194304
> time   = 14:04:00.947
> remaining  = 64512
> limit  = 64512
> capacity   = 4194304
> time   = 14:04:00.948
> remaining  = 130048
> limit  = 130048
> capacity   = 4194304
> time   = 14:04:00.948
> remaining  = 261120
> limit  = 261120
> capacity   = 4194304
> time   = 14:04:00.949
> remaining  = 523264
> limit  = 523264
> capacity   = 4194304
> time   = 14:04:00.951
> remaining  = 1047552
> limit  = 1047552
> capacity   = 4194304
> time   = 14:04:00.953
> remaining  = 1571840
> limit  = 1571840
> capacity   = 4194304
> time   = 14:04:00.954
> remaining  = 2096128
> limit  = 2096128
> capacity   = 4194304
> time   = 14:04:00.955
> remaining  = 2097156
> limit  = 2097156
> capacity   = 4194304
> frame size = 2097152
> 2.0
> --
> time   = 14:01:59.154
> remaining  = 2048
> limit  = 2048
> capacity   = 2048
> time   = 14:01:59.175
> remaining  = 6144
> limit  = 6144
> capacity   = 2097156
> time   = 14:01:59.176
> remaining  = 14336
> limit  = 14336
> capacity   = 2097156
> time   = 14:01:59.176
> remaining  = 30720
> limit  = 30720
> capacity   = 2097156
> time   = 14:01:59.176
> remaining  = 63488
> limit  = 63488
> capacity   = 2097156
> time   = 14:01:59.177
> remaining  = 129024
> limit  = 129024
> capacity   = 2097156
> time   = 14:01:59.177
> remaining  = 194560
> limit  = 194560
> capacity   = 2097156
> time   = 14:01:59.177
> remaining  = 196608
> limit  = 196608
> capacity   = 2097156
> time   = 14:02:00.153
> remaining  = 262144
> limit  = 262144
> capacity   = 2097156
> time   = 14:02:00.153
> remaining  = 294912
> limit  = 294912
> capacity   = 2097156
> time   = 14:02:01.153
> remaining  = 360448
> limit  = 360448
> capacity   = 2097156
> time   = 14:02:01.154
> remaining  = 393216
> limit  = 393216
> capacity   = 2097156
> time   = 14:02:02.154
> remaining  = 458752
> limit  = 458752
> capacity   = 2097156
> time   = 14:02:02.154
> remaining  = 491520
> limit  = 491520
> capacity   = 2097156
> time   = 14:02:03.154
> remaining  = 557056
> limit  = 557056
> capacity   = 2097156
> time   = 14:02:03.155
> remaining  = 589824
> limit  = 589824
> capacity   = 2097156
> time   = 14:02:04.154
> remaining  = 655360
> limit  = 655360
> capacity   = 2097156
> time   = 14:02:04.155
> remaining  = 688128
> limit  = 688128
> capacity   = 2097156

[jira] Commented: (DIRMINA-609) Big performance drop in 2.0M2 CumulativeProtocolDecoder for large messages

2009-02-14 Thread Serge Baranov (JIRA)

[ 
https://issues.apache.org/jira/browse/DIRMINA-609?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12673566#action_12673566
 ] 

Serge Baranov commented on DIRMINA-609:
---

RC1 trunk has a regression which breaks this test again: DIRMINA-663 . I've 
verified that it works fine with M3, but not with RC1.

> Big performance drop in 2.0M2 CumulativeProtocolDecoder for large messages
> --
>
> Key: DIRMINA-609
> URL: https://issues.apache.org/jira/browse/DIRMINA-609
> Project: MINA
>  Issue Type: Bug
>  Components: Filter
>Affects Versions: 2.0.0-M2
> Environment: Mac OS X 10.5 (Intel), Java 1.5
>Reporter: Matthew Phillips
>Assignee: Julien Vermillard
> Fix For: 2.0.0-M3
>
> Attachments: mina_test_1.1.zip, mina_test_2.0.zip
>
>
> Porting from MINA 1.1 to 2.0M2, one of my tests pinpointed a huge drop in the 
> performance of a CumulativeProtocolDecoder-based codec when transmitting 
> large (>2MB) messages. Under MINA 1.1.x, the transfer was effectively 
> immediate, under 2.0M2 the system seems to get bogged down and calls the 
> doDecode() method at 1-second intervals, with slowly increasing amounts of 
> data. The system is a fast quad-core system, lightly loaded. Almost no CPU 
> usage is recorded when running either test.
> The two attached ZIP files have test cases against 1.1 and 2.0M2 that 
> reproduce this. Their trace outputs follow: the MINA 1.1 case terminates in 
> under a second, the 2.0 case times out and fails after 5 seconds.
> MINA 1.1
> --
> time   = 14:04:00.934
> remaining  = 1024
> limit  = 1024
> capacity   = 1024
> time   = 14:04:00.946
> remaining  = 3072
> limit  = 3072
> capacity   = 4194304
> time   = 14:04:00.947
> remaining  = 7168
> limit  = 7168
> capacity   = 4194304
> time   = 14:04:00.947
> remaining  = 15360
> limit  = 15360
> capacity   = 4194304
> time   = 14:04:00.947
> remaining  = 31744
> limit  = 31744
> capacity   = 4194304
> time   = 14:04:00.947
> remaining  = 64512
> limit  = 64512
> capacity   = 4194304
> time   = 14:04:00.948
> remaining  = 130048
> limit  = 130048
> capacity   = 4194304
> time   = 14:04:00.948
> remaining  = 261120
> limit  = 261120
> capacity   = 4194304
> time   = 14:04:00.949
> remaining  = 523264
> limit  = 523264
> capacity   = 4194304
> time   = 14:04:00.951
> remaining  = 1047552
> limit  = 1047552
> capacity   = 4194304
> time   = 14:04:00.953
> remaining  = 1571840
> limit  = 1571840
> capacity   = 4194304
> time   = 14:04:00.954
> remaining  = 2096128
> limit  = 2096128
> capacity   = 4194304
> time   = 14:04:00.955
> remaining  = 2097156
> limit  = 2097156
> capacity   = 4194304
> frame size = 2097152
> 2.0
> --
> time   = 14:01:59.154
> remaining  = 2048
> limit  = 2048
> capacity   = 2048
> time   = 14:01:59.175
> remaining  = 6144
> limit  = 6144
> capacity   = 2097156
> time   = 14:01:59.176
> remaining  = 14336
> limit  = 14336
> capacity   = 2097156
> time   = 14:01:59.176
> remaining  = 30720
> limit  = 30720
> capacity   = 2097156
> time   = 14:01:59.176
> remaining  = 63488
> limit  = 63488
> capacity   = 2097156
> time   = 14:01:59.177
> remaining  = 129024
> limit  = 129024
> capacity   = 2097156
> time   = 14:01:59.177
> remaining  = 194560
> limit  = 194560
> capacity   = 2097156
> time   = 14:01:59.177
> remaining  = 196608
> limit  = 196608
> capacity   = 2097156
> time   = 14:02:00.153
> remaining  = 262144
> limit  = 262144
> capacity   = 2097156
> time   = 14:02:00.153
> remaining  = 294912
> limit  = 294912
> capacity   = 2097156
> time   = 14:02:01.153
> remaining  = 360448
> limit  = 360448
> capacity   = 2097156
> time   = 14:02:01.154
> remaining  = 393216
> limit  = 393216
> capacity   = 2097156
> time   = 14:02:02.154
> remaining  = 458752
> limit  = 458752
> capacity   = 2097156
> time   = 14:02:02.154
> remaining  = 491520
> limit  = 491520
> capacity   = 2097156
> time   = 14:02:03.154
> remaining  = 557056
> limit  = 557056
> capacity   = 2097156
> time   = 14:02:03.155
> remaining  = 589824
> limit  = 589824
> capacity   = 2097156
> time   = 14:02:04.154
> remaining  = 655360
> limit  = 655360
> capacity   = 2097156
> time   = 14:02:04.155
> remaining  = 688128
> limit  = 688128
> capacity   = 2097156

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (DIRMINA-663) CumulativeProtocolDecoder doDecode performance problem

2009-02-14 Thread Serge Baranov (JIRA)

 [ 
https://issues.apache.org/jira/browse/DIRMINA-663?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Serge Baranov updated DIRMINA-663:
--

Description: 
To reproduce get the mina_test_2.0.zip from DIRMINA-609.

It seems to be working even worse with the current trunk version, it takes ~5 
seconds between messages and then times out after the first message, if wait is 
increased to 10 seconds, it times out after the second message:

time   = 03:49:36.320
remaining  = 1024
limit  = 1024
capacity   = 2048
time   = 03:49:41.619
remaining  = 2048
limit  = 2048
capacity   = 2097156

java.lang.AssertionError: No message received

The issue differs from DIRMINA-609 as it affects not only large messages, but 
any message that doesn't fit into buffer (when doDecode returns false) and also 
all the subsequent messages (when doDecode returns true).
It's a serious issue making CumulativeProtocolDecoder completely useless.

  was:
To reproduce get the mina_test_2.0.zip from DIRMINA-609.

It seems to be working even worse with the current trunk version, it takes ~5 
seconds between messages and then times out after the first message, if wait is 
increased to 10 seconds, it times out after the second message:

time   = 03:49:36.320
remaining  = 1024
limit  = 1024
capacity   = 2048
time   = 03:49:41.619
remaining  = 2048
limit  = 2048
capacity   = 2097156

java.lang.AssertionError: No message received

The issue differs from DIRMINA-609 as it affects not only large messages, but 
any message that doesn't fit into buffer (when doDecode return false) and also 
all the subsequent messages (when doDecode returns true).
It's a serious issue making CumulativeProtocolDecoder completely useless.


> CumulativeProtocolDecoder doDecode performance problem
> --
>
> Key: DIRMINA-663
> URL: https://issues.apache.org/jira/browse/DIRMINA-663
> Project: MINA
>  Issue Type: Bug
>  Components: Filter
>Affects Versions: 2.0.0-RC1
> Environment: JDK 1.6.0_12 32-bit, Windows Vista 64-bit, MINA 
> 2.0.0-RC1 (up to current date trunk).
>Reporter: Serge Baranov
>Priority: Critical
>
> To reproduce get the mina_test_2.0.zip from DIRMINA-609.
> It seems to be working even worse with the current trunk version, it takes ~5 
> seconds between messages and then times out after the first message, if wait 
> is increased to 10 seconds, it times out after the second message:
> time   = 03:49:36.320
> remaining  = 1024
> limit  = 1024
> capacity   = 2048
> time   = 03:49:41.619
> remaining  = 2048
> limit  = 2048
> capacity   = 2097156
> java.lang.AssertionError: No message received
> The issue differs from DIRMINA-609 as it affects not only large messages, but 
> any message that doesn't fit into buffer (when doDecode returns false) and 
> also all the subsequent messages (when doDecode returns true).
> It's a serious issue making CumulativeProtocolDecoder completely useless.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Created: (DIRMINA-663) CumulativeProtocolDecoder doDecode performance problem

2009-02-14 Thread Serge Baranov (JIRA)
CumulativeProtocolDecoder doDecode performance problem
--

 Key: DIRMINA-663
 URL: https://issues.apache.org/jira/browse/DIRMINA-663
 Project: MINA
  Issue Type: Bug
  Components: Filter
Affects Versions: 2.0.0-RC1
 Environment: JDK 1.6.0_12 32-bit, Windows Vista 64-bit, MINA 2.0.0-RC1 
(up to current date trunk).
Reporter: Serge Baranov
Priority: Critical


To reproduce get the mina_test_2.0.zip from DIRMINA-609.

It seems to be working even worse with the current trunk version, it takes ~5 
seconds between messages and then times out after the first message, if wait is 
increased to 10 seconds, it times out after the second message:

time   = 03:49:36.320
remaining  = 1024
limit  = 1024
capacity   = 2048
time   = 03:49:41.619
remaining  = 2048
limit  = 2048
capacity   = 2097156

java.lang.AssertionError: No message received

The issue differs from DIRMINA-609 as it affects not only large messages, but 
any message that doesn't fit into buffer (when doDecode return false) and also 
all the subsequent messages (when doDecode returns true).
It's a serious issue making CumulativeProtocolDecoder completely useless.
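
For readers unfamiliar with the decoder contract being discussed, here is a hedged sketch of a simple length-prefixed CumulativeProtocolDecoder; the 4-byte length prefix is an assumed frame format for illustration only, not the codec used in the attached test:

import org.apache.mina.core.buffer.IoBuffer;
import org.apache.mina.core.session.IoSession;
import org.apache.mina.filter.codec.CumulativeProtocolDecoder;
import org.apache.mina.filter.codec.ProtocolDecoderOutput;

// Sketch of a length-prefixed decoder illustrating the doDecode() contract:
// return false while the cumulated buffer does not yet hold a whole message
// (MINA keeps accumulating), return true once a message was consumed
// (MINA calls doDecode again on any leftover bytes).
public class LengthPrefixedDecoder extends CumulativeProtocolDecoder {

    @Override
    protected boolean doDecode(IoSession session, IoBuffer in, ProtocolDecoderOutput out) {
        if (in.remaining() < 4) {
            return false; // not even the length prefix yet
        }

        in.mark();
        int length = in.getInt();
        if (in.remaining() < length) {
            in.reset();   // rewind; wait for the rest of the frame
            return false;
        }

        byte[] payload = new byte[length];
        in.get(payload);
        out.write(payload);
        return true;      // one message consumed; try again on remaining data
    }
}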

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.