Regarding this JDK bug: http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=6403933
I think we are hitting this bug using HttpCore on Linux with Java 1.6. We wind up leaking socket descriptors until the JVM process runs out of them. We also wind up having to start a new reactor thread, which creates a new Selector. The old reactor thread keeps running, and the thread dump shows it in sun.nio.ch.EPollArrayWrapper.epollWait, as reported by others in the bug report above.

Here's the change that the Glassfish team made to work around this JDK bug:
http://fisheye5.cenqua.com/browse/glassfish/appserv-http-engine/src/java/com/sun/enterprise/web/connector/grizzly/ByteBufferInputStream.java?r1=1.8&r2=1.9

From my reading, the Glassfish code is much simpler than the HttpCore NIO code: they register interest for just one socket and use Selector.select() to wait for data from that socket. For HttpCore NIO, it isn't yet clear to me how we can detect which selector is "trashed" in order to cancel it and recreate it.

I'm working on a workaround in AbstractMultiworkerIOReactor.java. If selector.select() returns 0 (setting readyCount to 0), we don't know whether this bug hit us or we just had a timeout. To be safe, I think we need to cancel every registered SelectionKey and then call selector.selectNow() to flush the cancelled keys. Then we can re-register each channel to get a new SelectionKey. The only way to make this less common, I think, is to use a long selectTimeout value so that the odds of a plain timeout are low.

Ugly, but I hope it will work.

Harold

PS I enjoy using HttpCore NIO, nice work folks, thanks for this project!
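For reference, here is a rough sketch of the cancel / selectNow / re-register step described above. It is not the actual HttpCore change: the class name SelectorRekey and the method reregisterAll are made up for illustration, it uses only the standard java.nio.channels API, and it assumes it is called from the thread that owns the selector, right after select() has returned 0.

```java
import java.io.IOException;
import java.nio.channels.SelectableChannel;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of HttpCore.
final class SelectorRekey {

    // What we need to remember about a key in order to register its channel again.
    private static final class Saved {
        final SelectableChannel channel;
        final int interestOps;
        final Object attachment;

        Saved(SelectableChannel channel, int interestOps, Object attachment) {
            this.channel = channel;
            this.interestOps = interestOps;
            this.attachment = attachment;
        }
    }

    // Cancels every valid key on the selector, flushes the cancellations with
    // selectNow(), then registers each channel again so it gets a new SelectionKey.
    // Returns the number of channels that were re-registered.
    static int reregisterAll(Selector selector) throws IOException {
        List<Saved> saved = new ArrayList<Saved>();
        for (SelectionKey key : selector.keys()) {
            if (key.isValid()) {
                // Read interestOps before cancel(); it throws on a cancelled key.
                saved.add(new Saved(key.channel(), key.interestOps(), key.attachment()));
                key.cancel();
            }
        }
        // Cancelled keys are only removed during a selection operation. Without
        // this call, register() below would throw CancelledKeyException.
        selector.selectNow();
        for (Saved s : saved) {
            if (s.channel.isOpen()) {
                s.channel.register(selector, s.interestOps, s.attachment);
            }
        }
        return saved.size();
    }
}
```

Since a return value of 0 from select() cannot tell the bug apart from an ordinary timeout, this would run on every timeout too, which is why a long selectTimeout matters. The same loop could register the channels on a fresh Selector.open() instead of the old selector if the old one turns out to be unusable.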
