[
https://issues.apache.org/jira/browse/HTTPCORE-155?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12659172#action_12659172
]
mclark edited comment on HTTPCORE-155 at 12/24/08 4:05 PM:
------------------------------------------------------------------
Sam,
Thanks for that explanation, it helped me a lot. I agree that a change in the
VM implementation would be the cleanest resolution to this particular problem,
combined with a tightening of the SelectionKey contract to remove some of the
"implementation dependent" ambiguity that the contract allows for.
Unfortunately, we are probably stuck with that contract for a while, as it has
been there in Java 1.4, 5 and now 6.
A couple of things that your message clarified for me:
1) I had not seen the doc comment about naive vs. high-performance
implementations.
1.1) And so I did not realize that the length of time that #interestOps(int)
blocks can be reduced by ensuring that #interestOps(int) is invoked only when
no thread is in #select(). Well, at least the docs hint fairly strongly at
this.
2) It explains to me why Marc's patch relieves the symptoms of this particular
problem: by making sure that #interestOps(int) is invoked only by the thread
that is #select()ing, he can serialize the calls to #interestOps(int) and
#select(), which ensures that their invocation is mutually exclusive.
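To make point 2 concrete, here is a minimal sketch of the idea as I understand
it. It is not Marc's actual patch: the class and method names
(SerializedInterestOps, InterestOpsEntry, requestInterestOps, processPending,
runOnce) are hypothetical and chosen only for illustration. Other threads
record the requested interest set in a queue, and only the selecting thread
ever calls #interestOps(int), between two #select() calls.

```java
import java.io.IOException;
import java.nio.channels.Pipe;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;

public class SerializedInterestOps {

    /** A pending request to change a key's interest set (hypothetical helper type). */
    static final class InterestOpsEntry {
        final SelectionKey key;
        final int ops;
        InterestOpsEntry(SelectionKey key, int ops) {
            this.key = key;
            this.ops = ops;
        }
    }

    private final Selector selector;
    private final Queue<InterestOpsEntry> pending = new ConcurrentLinkedQueue<>();

    SerializedInterestOps(Selector selector) {
        this.selector = selector;
    }

    /** Safe to call from any thread: only records the request, never touches the key. */
    void requestInterestOps(SelectionKey key, int ops) {
        pending.add(new InterestOpsEntry(key, ops));
    }

    /** Must be called only by the selecting thread, outside of select(). */
    int processPending() {
        int applied = 0;
        InterestOpsEntry entry;
        while ((entry = pending.poll()) != null) {
            if (entry.key.isValid()) {
                entry.key.interestOps(entry.ops);
                applied++;
            }
        }
        return applied;
    }

    /** One iteration of a reactor-style loop: select, then apply queued changes. */
    int runOnce(long timeoutMillis) throws IOException {
        selector.select(timeoutMillis);
        selector.selectedKeys().clear(); // real I/O event dispatch is omitted in this sketch
        return processPending();
    }

    public static void main(String[] args) throws IOException {
        Selector selector = Selector.open();
        Pipe pipe = Pipe.open();
        pipe.source().configureBlocking(false);
        SelectionKey key = pipe.source().register(selector, 0);

        SerializedInterestOps reactor = new SerializedInterestOps(selector);
        reactor.requestInterestOps(key, SelectionKey.OP_READ);
        int applied = reactor.runOnce(10);
        System.out.println("applied=" + applied + ", interestOps=" + key.interestOps());

        pipe.sink().close();
        pipe.source().close();
        selector.close();
    }
}
```

Note that in this form a queued request is applied only after the current
#select(long) returns, so its latency is bounded by the select timeout.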
And so I think Marc's patch is getting close to the workaround you outlined,
Sam? Now that I understand the problem more thoroughly (hopefully), I can
perhaps offer some further review on his changes:
As I mentioned before, Marc's patch ignores the selectTimeout member variable
of AbstractIOReactor; this will need to be addressed somehow (probably by
reintroducing its use). But why did he make that change? The only reason I
can see for a select timeout as aggressive as 1ms is the need to have the
reactor wake up, retrieve, and process pending interestOps queue items, even
when the reactor has no other reason to wake up out of #select.
Under Marc's changes, requests to change interestOps will not actually take
effect until the reactor wakes up for some reason and processes the interestOps
queue. Since enqueueing an interestOps request does not wake the reactor, it
is possible for very many interestOps requests to pile up and not be processed
in a timely fashion -- that is, if the reactor is asleep in #select(long), with
no reason to wake up. So, Marc has set the select timeout very low, which
guarantees that the reactor will give the interestOps queue a great deal of
attention, which helps get those interestOps requests processed in a timely
manner. However, all this spinning in the reactor execution loop is
inefficient. Instead of spinning/polling in the reactor, would it work to
invoke Selector#wakeup immediately after enqueueing an interestOps queue item?
Would this let the reactor wake up in a timely fashion whenever there are
interestOps items in the queue, but stay asleep if there are none?
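Here is a rough sketch of what I mean, under the assumption that the queue is
drained by the selecting thread right after #select(long) returns. The names
(WakeupOnEnqueue, enqueue, selectAndProcess) are hypothetical and are not taken
from HttpCore or from Marc's patch; only Selector#wakeup, Selector#select(long)
and SelectionKey#interestOps(int) are real API calls.

```java
import java.io.IOException;
import java.nio.channels.Pipe;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;

public class WakeupOnEnqueue {

    private final Selector selector;
    private final Queue<Runnable> interestOpsQueue = new ConcurrentLinkedQueue<>();

    WakeupOnEnqueue(Selector selector) {
        this.selector = selector;
    }

    /** Called from any thread: queue the change, then wake the selecting thread. */
    void enqueue(SelectionKey key, int ops) {
        interestOpsQueue.add(() -> {
            if (key.isValid()) {
                key.interestOps(ops);
            }
        });
        // If no select is in progress, the next select call returns immediately,
        // so a wakeup issued just before select() is entered is not lost.
        selector.wakeup();
    }

    /** Called by the selecting thread; returns the elapsed time in milliseconds. */
    long selectAndProcess(long selectTimeoutMillis) throws IOException {
        long start = System.nanoTime();
        selector.select(selectTimeoutMillis);
        selector.selectedKeys().clear(); // real I/O event dispatch is omitted in this sketch
        Runnable change;
        while ((change = interestOpsQueue.poll()) != null) {
            change.run(); // interestOps(int) runs only on the selecting thread
        }
        return (System.nanoTime() - start) / 1_000_000L;
    }

    public static void main(String[] args) throws Exception {
        Selector selector = Selector.open();
        Pipe pipe = Pipe.open();
        pipe.source().configureBlocking(false);
        SelectionKey key = pipe.source().register(selector, 0);
        WakeupOnEnqueue reactor = new WakeupOnEnqueue(selector);

        // A worker thread requests OP_READ while the reactor sleeps in select.
        Thread worker = new Thread(() -> {
            try {
                Thread.sleep(100);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return;
            }
            reactor.enqueue(key, SelectionKey.OP_READ);
        });
        worker.start();

        // Long timeout: without the wakeup this call would sleep for the full 10 seconds.
        long elapsed = reactor.selectAndProcess(10_000);
        worker.join();
        System.out.println("select returned after about " + elapsed
                + " ms, interestOps=" + key.interestOps());

        pipe.sink().close();
        pipe.source().close();
        selector.close();
    }
}
```

With something along these lines the selectTimeout member could keep its
configured value, since the reactor would no longer need to poll the queue.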
regards,
Mike
> Performance issues with IBM JRE 6.0
> -----------------------------------
>
> Key: HTTPCORE-155
> URL: https://issues.apache.org/jira/browse/HTTPCORE-155
> Project: HttpComponents HttpCore
> Issue Type: Bug
> Components: HttpCore NIO
> Affects Versions: 4.0-beta1
> Environment: Windows 2003 SP2 - IBM J2RE 1.6.0 build 2.4 - HTTPCore
> Beta1 - Dual Core CPU 3.0Ghz - 1Gbps networking
> Reporter: Tom McSorley
> Fix For: 4.1
>
> Attachments: AbstractIOReactor.diff, AbstractIOReactor.java,
> IOSessionImpl.diff, IOSessionImpl.java,
> javacore.20081203.153723.32300.0001.txt, patch.08-12-17.tar.gz,
> patch.08-12-18.tar.gz, patch.08-12-22.tar.gz
>
>
> I'm issuing a second HTTP request on a connection that has very recently
> returned null from the submitRequest() call. This second request is issued
> approximately 500ms after submitRequest() returns null, so the connection has
> just been established and an HTTP request/response-200 cycle has completed
> just before the second request is issued. I'm seeing unusually long delays in
> the requestOutput() call (verified by surrounding timing prints), ranging
> anywhere from a few milliseconds up to 60 seconds. It eventually unwinds, and
> then submitRequest() is called; the second request is dispatched and works
> fine, but it is delayed considerably. Is this a known issue, and is there a
> possible work-around?
> Here's the JVM related thread information:
> The thread being delayed and stuck in the requestOutput() call for a long
> time (mostly longer than 5 seconds):
> 3XMTHREADINFO "pool-2-thread-5" TID:0x2AEECE00, j9thread_t:0x2A7189A8,
> state:B, prio=5
> 3XMTHREADINFO1 (native thread ID:0x1B44, native priority:0x5,
> native policy:UNKNOWN)
> 4XESTACKTRACE at
> sun/nio/ch/SelectionKeyImpl.interestOps(SelectionKeyImpl.java:60)
> 4XESTACKTRACE at
> org/apache/http/impl/nio/reactor/IOSessionImpl.setEvent(IOSessionImpl.java:113)
> 4XESTACKTRACE at
> org/apache/http/impl/nio/NHttpConnectionBase.requestOutput(NHttpConnectionBase.java:158)
> .... (non important stack information removed)
> 4XESTACKTRACE at
> java/util/concurrent/ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:919)
> 4XESTACKTRACE at java/lang/Thread.run(Thread.java:735)
> Here's the monitor that this thread is blocked and waiting on:
> 2LKMONINUSE sys_mon_t:0x2A708AF8 infl_mon_t: 0x2A708B30:
> 3LKMONOBJECT sun/nio/ch/uti...@00b09208/00B09214: Flat locked by "I/O
> dispatcher 7" (0x2A208E00), entry count 1
> 3LKWAITERQ Waiting to enter:
> 3LKWAITER "pool-2-thread-5" (0x2AEECE00)
> And here's the thread that currently has this monitor locked:
> 3XMTHREADINFO "I/O dispatcher 7" TID:0x2A208E00, j9thread_t:0x2A6EC73C,
> state:R, prio=5
> 3XMTHREADINFO1 (native thread ID:0x830, native priority:0x5,
> native policy:UNKNOWN)
> 4XESTACKTRACE at
> sun/nio/ch/WindowsSelectorImpl$SubSelector.poll0(Native Method)
> 4XESTACKTRACE at
> sun/nio/ch/WindowsSelectorImpl$SubSelector.poll(WindowsSelectorImpl.java:308(Compiled
> Code))
> 4XESTACKTRACE at
> sun/nio/ch/WindowsSelectorImpl$SubSelector.access$500(WindowsSelectorImpl.java(Compiled
> Code))
> 4XESTACKTRACE at
> sun/nio/ch/WindowsSelectorImpl.doSelect(WindowsSelectorImpl.java:162(Compiled
> Code))
> 4XESTACKTRACE at
> sun/nio/ch/SelectorImpl.lockAndDoSelect(SelectorImpl.java:69(Compiled Code))
> 4XESTACKTRACE at
> sun/nio/ch/SelectorImpl.select(SelectorImpl.java:80(Compiled Code))
> 4XESTACKTRACE at
> org/apache/http/impl/nio/reactor/AbstractIOReactor.execute(AbstractIOReactor.java:121)
> 4XESTACKTRACE at
> org/apache/http/impl/nio/reactor/BaseIOReactor.execute(BaseIOReactor.java:70)
> 4XESTACKTRACE at
> org/apache/http/impl/nio/reactor/AbstractMultiworkerIOReactor$Worker.run(AbstractMultiworkerIOReactor.java:318)
> 4XESTACKTRACE at java/lang/Thread.run(Thread.java:735)
> I should also note that we're attempting to use 1000 client instances on this
> single system, each with potentially 2 active connections simultaneously.
> There is also virtually no CPU load (i.e. less than 5%) on this system.