HttpCore NIO hurt by JDK bug?

Harold Lee Tue, 13 Jul 2010 13:36:50 -0700

Regarding this JDK bug:
http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=6403933

I think we are experiencing this using HttpCore on Linux with Java
1.6. We wind up leaking socket descriptors until the JVM process runs
out. We also wind up having to start a new reactor thread, which
creates a new Selector. The old reactor thread keeps running and the
thread dump shows it in sun.nio.ch.EPollArrayWrapper.epollWait as
reported by others in the bug report above.

Here's the change that the Glassfish team made to work around this JDK bug:

http://fisheye5.cenqua.com/browse/glassfish/appserv-http-engine/src/java/com/sun/enterprise/web/connector/grizzly/ByteBufferInputStream.java?r1=1.8&r2=1.9

From my reading, the Glassfish code is much simpler than the HttpCore
NIO code: they're registering interest for just 1 socket and using
Selector.select() to wait for data from that socket. For HttpCore NIO,
it isn't yet clear to me how we can detect which selector is "trashed"
in order to cancel it and recreate it.

I'm working on a workaround in AbstractMultiworkerIOReactor.java. If
selector.select() returns 0 (setting readyCount to 0), we don't know
whether this bug hit us or we just had a timeout. To be safe, I think
we need to cancel every registered SelectionKey and then call
selector.selectNow() to flush them. Then we can reregister each
channel, which gives it a new SelectionKey. The only way I see to make
these needless rebuilds less common is to use a long selectTimeout
value so that the odds of a plain timeout are low. Ugly, but I hope it
will work.
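
Here is a minimal sketch of the rebuild step, to make the idea concrete.
It is not HttpCore code: the class SelectorRebuild and both of its
methods are hypothetical names. It assumes the registrations are moved
to a brand-new Selector rather than back onto the old one, and that a
select() call returning 0 well before its timeout elapsed is a usable
hint that the bug was hit. That hint is only a heuristic, since
wakeup() or a thread interrupt can also end select() early.

```java
import java.io.IOException;
import java.nio.channels.SelectableChannel;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.TimeUnit;

// Hypothetical helper for illustration only; not part of HttpCore.
final class SelectorRebuild {

    private SelectorRebuild() {
    }

    /**
     * Heuristic (an assumption, not a documented JDK contract): select()
     * returned no ready keys although the timeout had not yet elapsed.
     * wakeup() and interrupts also produce this, so callers would likely
     * want to see it several times in a row before acting on it.
     */
    static boolean returnedEarly(int readyCount, long elapsedNanos, long timeoutMillis) {
        return readyCount == 0
                && timeoutMillis > 0
                && elapsedNanos < TimeUnit.MILLISECONDS.toNanos(timeoutMillis);
    }

    /**
     * Moves every valid registration from oldSelector to a new Selector,
     * keeping interest ops and attachments, then closes the old selector.
     * Must be called from the thread that owns oldSelector.
     */
    static Selector rebuild(Selector oldSelector) throws IOException {
        Selector newSelector = Selector.open();
        // Work on a copy so the loop does not depend on the live key set.
        List<SelectionKey> oldKeys = new ArrayList<SelectionKey>(oldSelector.keys());
        for (SelectionKey oldKey : oldKeys) {
            if (!oldKey.isValid()) {
                continue; // already cancelled or its channel is closed
            }
            SelectableChannel channel = oldKey.channel();
            int ops = oldKey.interestOps();
            Object attachment = oldKey.attachment();
            oldKey.cancel();
            // A channel may be registered with more than one selector,
            // so this works even before the old key has been flushed.
            channel.register(newSelector, ops, attachment);
        }
        // Closing the old selector deregisters any keys that remain.
        oldSelector.close();
        return newSelector;
    }
}
```

In a reactor loop one could take System.nanoTime() before and after
selector.select(selectTimeout), pass the difference to returnedEarly(),
and call rebuild() only after a few consecutive early returns, so that
ordinary timeouts never trigger it.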

Harold

PS I enjoy using HttpCore NIO, nice work folks, thanks for this project!
