Re: HttpCore NIO hurt by JDK bug?

Supun Kamburugamuva Mon, 02 Aug 2010 22:00:55 -0700

Hi Harold,

We are having the same problem and would like to know how you obtained
this exact Java version. Is there a download link for this particular build?


Thanks,
Supun..

On Tue, Aug 3, 2010 at 1:57 AM, Harold Lee <[email protected]> wrote:

> This seems to be fixed by a newer version of the JRE, i.e.
>
> java version "1.6.0"
> Java(TM) SE Runtime Environment (build 1.6.0-b105)
> Java HotSpot(TM) 64-Bit Server VM (build 1.6.0-b105, mixed mode)
>
> So I think that you can avoid the tricky workaround. Thank you for
> your time and attention.
>
> Harold
>
> On Sat, Jul 31, 2010 at 7:11 AM, Oleg Kalnichevski <[email protected]>
> wrote:
> > On Thu, 2010-07-15 at 12:50 -0700, Harold Lee wrote:
> >> I've put together a simple HTTP server that resets the connection
> >> after sending part of the response back to the client. I'm going to
> >> try to recreate the bug (leaking sockets) by making many requests
> >> against that server from a Linux box. I'll let you know what I find.
> >>
> >> Harold
> >>
> >
> >
> >
> >> On Wed, Jul 14, 2010 at 1:44 AM, Oleg Kalnichevski <[email protected]> wrote:
> >> > On Tue, 2010-07-13 at 13:32 -0700, Harold Lee wrote:
> >> >> Regarding this JDK bug:
> >> >> http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=6403933
> >> >>
> >> >> I think we are experiencing this using HttpCore on Linux with Java
> >> >> 1.6. We wind up leaking socket descriptors until the JVM process runs
> >> >> out. We also wind up having to start a new reactor thread, which
> >> >> creates a new Selector. The old reactor thread keeps running and the
> >> >> thread dump shows it in sun.nio.ch.EPollArrayWrapper.epollWait as
> >> >> reported by others in the bug report above.
> >> >>
> >> >
> >
> >
> > Hi Harold
> >
> > Did you have any luck reproducing the problem?
> >
> > I put together a work-around for the bug that causes the epoll spin
> > problem [1]. If you are interested in trying it out I will happily share
> > it with you. The work-around is pretty ugly, so I want to be sure there
> > is no other way of solving the issue.
> >
> > cheers
> >
> > Oleg
> >
> > [1] http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=6403933
> >
> >> > Folks
> >> >
> >> > Anyone experienced anything like that? The bug looks pretty old, but there
> >> > have been no reports of similar problems with HttpCore NIO. I am using
> >> > Linux / JDK 1.6 on a daily basis when hacking on HttpCore, but I have not
> >> > encountered such a problem yet.
> >> >
> >> >
> >> >> Here's the change that the Glassfish team made to work around this JDK bug:
> >> >>
> >> >>
> http://fisheye5.cenqua.com/browse/glassfish/appserv-http-engine/src/java/com/sun/enterprise/web/connector/grizzly/ByteBufferInputStream.java?r1=1.8&r2=1.9
> >> >>
> >> >> From my reading, the Glassfish code is much simpler than the HttpCore
> >> >> NIO code: they're registering interest for just 1 socket and using
> >> >> Selector.select() to wait for data from that socket. For HttpCore NIO,
> >> >> it isn't yet clear to me how we can detect which selector is "trashed"
> >> >> in order to cancel it and recreate it.
> >> >>
> >> >> I'm working on a workaround in AbstractMultiworkerIOReactor.java. If
> >> >> selector.select returns 0 (setting readyCount to 0) then we don't know
> >> >> whether this bug hit us or we just had a timeout.
> >> >
> >> > The problem is that it is perfectly valid for a selector to return 0
> >> > ready count. This condition alone is not sufficient to assume the
> >> > selector is trashed.
> >> >
> >> >
> >> >> To be safe, I think
> >> >> we need to close every registered SelectionKey and then call
> >> >> selector.selectNow() to flush them. Then we can create a new
> >> >> SelectionKey for each and reregister them. The only way to make it less
> >> >> common, I think, is to use a long selectTimeout value so that the odds
> >> >> of a timeout are low. Ugly, but I hope it will work.
> >> >>
> >> >
> >> > This will unfortunately screw up handling of new / closed channels as
> >> > well as timeout logic.
> >> >
> >> > The work-around looks butt ugly and would require tons of fairly complex
> >> > code. Is there a way to reproduce the issue with a test scenario, so we
> >> > could look for alternative approaches?
> >> >
> >> > Cheers
> >> >
> >> > Oleg
> >> >
> >> >
> >>
> >> ---------------------------------------------------------------------
> >> To unsubscribe, e-mail: [email protected]
> >> For additional commands, e-mail: [email protected]
> >>
> >>
> >
> >
> >
>


-- 
Tech Lead, WSO2 Inc
http://wso2.org
supunk.blogspot.com
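
For reference, the selector-rebuild idea debated in the quoted messages
(treat repeated zero-key returns that come back before the timeout as
suspicious, then move every registration to a fresh Selector) can be
sketched roughly as follows. This is a minimal Java sketch under stated
assumptions, not the HttpCore work-around Oleg mentions and not the
Glassfish change: the class name SelectorRebuildSketch, the methods
looksTrashed and rebuild, and the SPIN_THRESHOLD value are all hypothetical.
As Oleg points out, a zero ready count alone proves nothing, so the sketch
only counts zero returns that arrive before the timeout elapsed, and even
those can be caused by legitimate wakeup() calls. It needs Java 7 or later.

```java
import java.io.IOException;
import java.nio.channels.Pipe;
import java.nio.channels.SelectableChannel;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.util.ArrayList;
import java.util.List;

final class SelectorRebuildSketch {

    // Hypothetical threshold: consecutive premature zero-key returns
    // tolerated before the selector is treated as "trashed".
    static final int SPIN_THRESHOLD = 512;

    private int prematureWakeups = 0;

    // Records the outcome of one select(timeoutMillis) call. Returns true
    // once SPIN_THRESHOLD consecutive calls returned 0 keys before the
    // timeout elapsed. A normal timeout or any ready key resets the count.
    boolean looksTrashed(int readyCount, long elapsedMillis, long timeoutMillis) {
        if (readyCount == 0 && elapsedMillis < timeoutMillis) {
            prematureWakeups++;
        } else {
            prematureWakeups = 0;
        }
        return prematureWakeups >= SPIN_THRESHOLD;
    }

    // Moves every valid registration from the old selector to a new one,
    // preserving interest ops and attachments, then closes the old selector.
    static Selector rebuild(Selector old) throws IOException {
        Selector fresh = Selector.open();
        // Snapshot the key set so we do not iterate the live view.
        List<SelectionKey> keys = new ArrayList<>(old.keys());
        for (SelectionKey key : keys) {
            if (!key.isValid()) {
                continue; // already cancelled or channel closed
            }
            SelectableChannel channel = key.channel();
            int ops = key.interestOps();
            Object attachment = key.attachment();
            key.cancel();
            channel.register(fresh, ops, attachment);
        }
        old.close();
        return fresh;
    }

    public static void main(String[] args) throws IOException {
        // A Pipe gives us a selectable channel without opening a socket.
        Pipe pipe = Pipe.open();
        pipe.source().configureBlocking(false);
        Selector selector = Selector.open();
        pipe.source().register(selector, SelectionKey.OP_READ, "session");

        Selector fresh = rebuild(selector);
        System.out.println("old selector open: " + selector.isOpen());
        System.out.println("migrated keys: " + fresh.keys().size());

        fresh.close();
        pipe.source().close();
        pipe.sink().close();
    }
}
```

A reactor loop would time each select(timeout) call, pass the result to
looksTrashed, and swap in the Selector returned by rebuild when it reports
true. The sketch does not address the handling of new / closed channels or
the timeout logic that Oleg warns about.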
