[ 
https://issues.apache.org/jira/browse/HTTPCORE-155?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12653688#action_12653688
 ] 

Marc Beyerle commented on HTTPCORE-155:
---------------------------------------

Hi, folks!

My name is Marc Beyerle and I am from IBM. I'm not from the JVM development, so 
please don't blame me :-) I am from an IBM System z  (aka mainframe) customer 
center, and one of our customers is currently running into the above described 
problem. They also use Apache Synapse with the IBM JVM, like Asankha. I dug 
into this and found that the root cause is a "specification hole". If you read 
the Sun API here: 
http://java.sun.com/j2se/1.5.0/docs/api/java/nio/channels/SelectionKey.html you 
will find this: "... In a naive implementation, reading or writing the interest 
set may block indefinitely if a selection operation is already in progress". 
And exactly this is happening here. While 
org.apache.http.impl.nio.reactor.AbstractIOReactor does a select(), the 
IOSession cannot perform an interestOps(ops) operation.

My suggestion is to synchronize around this, such that a deadlock does not 
occur. I wrote a fix for this against 4.0-beta1 and I will attach the files to 
this issue. I tested it with both the IBM JVM and the Sun JVM and it works just 
fine. You will notice that I had to reduce the timeout for the select() 
operation down to the minimum (1ms), in order to make things reactive. Also, 
before each select() operation, I check whether one of the IOSessions is 
waiting to perform an interestOps(ops) operation. In the loop around select(), 
I start looking for waiting IOSessions after the last waiting one, in order not 
to privilege the IOSessions at the beginning of the Iterator.

I really hope you find this patch useful and you cannot imagine how much work 
it took me to get all the IBM approvals to post the patch to this issue here.

Cheers,
Marc

> Performance issues with IBM JRE 6.0
> -----------------------------------
>
>                 Key: HTTPCORE-155
>                 URL: https://issues.apache.org/jira/browse/HTTPCORE-155
>             Project: HttpComponents HttpCore
>          Issue Type: Bug
>          Components: HttpCore NIO
>    Affects Versions: 4.0-beta1
>         Environment: Windows 2003 SP2 - IBM J2RE 1.6.0 build 2.4 - HTTPCore 
> Beta1 - Dual Core CPU 3.0Ghz - 1Gbps networking
>            Reporter: Tom McSorley
>             Fix For: 4.1
>
>         Attachments: javacore.20081203.153723.32300.0001.txt
>
>
> I'm issuing a second HTTP Request on a connection that has very recently 
> returned a null for the submitRequest() call...  this 2nd request is being 
> issued approximately 500ms after the submitRequest() null is returned... so 
> the connection has just been established, an HTTP Request/Response-200 cycle 
> has completed just prior to this 2nd request being issued.  I'm seeing 
> unusually long delays in the requestOutput() call (verified by surrounding 
> timing prints)... that can range anywhere from a few milliseconds on up to 60 
> seconds...  It eventually unwinds, and then the submitRequest() is called... 
> this 2nd request is dispatched and works fine... but, it is delayed 
> considerably...  Is this a known issue and is there a possible work-around?
> Here's the JVM related thread information:
> The thread being delayed and stuck in the requestOutput() call for a long 
> time (mostly longer than 5 seconds):
> 3XMTHREADINFO      "pool-2-thread-5" TID:0x2AEECE00, j9thread_t:0x2A7189A8, 
> state:B, prio=5
> 3XMTHREADINFO1            (native thread ID:0x1B44, native priority:0x5, 
> native policy:UNKNOWN)
> 4XESTACKTRACE          at 
> sun/nio/ch/SelectionKeyImpl.interestOps(SelectionKeyImpl.java:60)
> 4XESTACKTRACE          at 
> org/apache/http/impl/nio/reactor/IOSessionImpl.setEvent(IOSessionImpl.java:113)
> 4XESTACKTRACE          at 
> org/apache/http/impl/nio/NHttpConnectionBase.requestOutput(NHttpConnectionBase.java:158)
> .... (non important stack information removed)
> 4XESTACKTRACE          at 
> java/util/concurrent/ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:919)
> 4XESTACKTRACE          at java/lang/Thread.run(Thread.java:735)
> Here's the monitor that this thread is blocked and waiting on:
> 2LKMONINUSE      sys_mon_t:0x2A708AF8 infl_mon_t: 0x2A708B30:
> 3LKMONOBJECT       sun/nio/ch/[EMAIL PROTECTED]/00B09214: Flat locked by "I/O 
> dispatcher 7" (0x2A208E00), entry count 1
> 3LKWAITERQ            Waiting to enter:
> 3LKWAITER                "pool-2-thread-5" (0x2AEECE00)
> And here's the thread that currently has this monitor locked:
> 3XMTHREADINFO      "I/O dispatcher 7" TID:0x2A208E00, j9thread_t:0x2A6EC73C, 
> state:R, prio=5
> 3XMTHREADINFO1            (native thread ID:0x830, native priority:0x5, 
> native policy:UNKNOWN)
> 4XESTACKTRACE          at 
> sun/nio/ch/WindowsSelectorImpl$SubSelector.poll0(Native Method)
> 4XESTACKTRACE          at 
> sun/nio/ch/WindowsSelectorImpl$SubSelector.poll(WindowsSelectorImpl.java:308(Compiled
>  Code))
> 4XESTACKTRACE          at 
> sun/nio/ch/WindowsSelectorImpl$SubSelector.access$500(WindowsSelectorImpl.java(Compiled
>  Code))
> 4XESTACKTRACE          at 
> sun/nio/ch/WindowsSelectorImpl.doSelect(WindowsSelectorImpl.java:162(Compiled 
> Code))
> 4XESTACKTRACE          at 
> sun/nio/ch/SelectorImpl.lockAndDoSelect(SelectorImpl.java:69(Compiled Code))
> 4XESTACKTRACE          at 
> sun/nio/ch/SelectorImpl.select(SelectorImpl.java:80(Compiled Code))
> 4XESTACKTRACE          at 
> org/apache/http/impl/nio/reactor/AbstractIOReactor.execute(AbstractIOReactor.java:121)
> 4XESTACKTRACE          at 
> org/apache/http/impl/nio/reactor/BaseIOReactor.execute(BaseIOReactor.java:70)
> 4XESTACKTRACE          at 
> org/apache/http/impl/nio/reactor/AbstractMultiworkerIOReactor$Worker.run(AbstractMultiworkerIOReactor.java:318)
> 4XESTACKTRACE          at java/lang/Thread.run(Thread.java:735)
> I should also note that we're attempting to use 1000 client instances on this 
> single system... each with potentially 2 active connections simultaneously... 
> there is also virtually no CPU load (i.e. less then 5%) on this system...

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]

Reply via email to