Re: HttpCore NIO hurt by JDK bug?

swatkatz Mon, 01 Nov 2010 13:13:26 -0700

It looks like we can reproduce this consistently whenever a connection fails.
There seems to be a file descriptor leak every time a connection attempt
fails.


We registered a SessionRequestCallback, and each time its
failed(final SessionRequest request) method is called we seem to leak a file
descriptor.

Perhaps there is a bug where the descriptor is not cleaned up properly when a
connection attempt fails?
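
To make the suspected cleanup step concrete, here is a minimal sketch in
plain java.nio, not HttpCore code. It assumes a selector dedicated to this one
connect attempt; the class name FailedConnectCleanup, the method
connectOrClose and the timeout parameter are made up for illustration. The
point it tries to show is that a failed or timed-out non-blocking connect has
to end with the channel being closed, and that a cancelled key is only
deregistered during the next selection operation.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.ServerSocket;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.channels.SocketChannel;

// Hypothetical helper for illustration only; not part of HttpCore.
final class FailedConnectCleanup {

    /**
     * Tries a non-blocking connect. Returns the connected channel, or null
     * if the attempt failed or timed out. In the failure case the channel is
     * closed and its cancelled key is flushed from the selector.
     * Assumes the selector is used for this connect attempt only.
     */
    static SocketChannel connectOrClose(Selector selector, InetSocketAddress target,
                                        long timeoutMillis) throws IOException {
        SocketChannel channel = SocketChannel.open();
        try {
            channel.configureBlocking(false);
            if (!channel.connect(target)) {
                // Connect is still in progress: wait for OP_CONNECT or the timeout.
                SelectionKey key = channel.register(selector, SelectionKey.OP_CONNECT);
                selector.select(timeoutMillis);
                selector.selectedKeys().clear();
                if (!channel.finishConnect()) {
                    throw new IOException("connect timed out: " + target);
                }
                // Connected: clear OP_CONNECT; the caller sets its own interest ops.
                key.interestOps(0);
            }
            return channel;
        } catch (IOException ex) {
            // close() releases the descriptor and cancels any selection key.
            try {
                channel.close();
            } catch (IOException ignore) {
                // nothing more can be done for this channel
            }
            // Cancelled keys are deregistered only during a selection operation.
            selector.selectNow();
            return null;
        }
    }

    public static void main(String[] args) throws IOException {
        // Find a local port with no listener: bind, note the port, close.
        int port;
        try (ServerSocket probe = new ServerSocket(0)) {
            port = probe.getLocalPort();
        }
        try (Selector selector = Selector.open()) {
            SocketChannel ch = connectOrClose(selector,
                    new InetSocketAddress("127.0.0.1", port), 2000L);
            System.out.println("connected: " + (ch != null)
                    + ", keys still registered: " + selector.keys().size());
            if (ch != null) {
                ch.close();
            }
        }
    }
}
```

If a failure path skips the close() call, the descriptor stays open until the
channel object is garbage collected, which would look like the leak described
above. Whether that is what actually happens inside the reactor is only a
guess at this point.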



Hiranya Jayathilaka-3 wrote:
> 
> On Tue, Nov 2, 2010 at 12:48 AM, swatkatz <[email protected]> wrote:
>>
>> Hello,
>>
>> We seem to be experiencing this as well when using NIO. We are using JDK
>> 1.6 Update 21.
> 
> This bug should be fixed in JDK 1.6 build 21. At least that's what all
> the evidence suggests. We have never been able to reproduce the issue on
> this particular JDK version.
> 
> Thanks,
> Hiranya
> 
>> Any ideas what the workaround/fix is?
>>
>> Regards,
>> Mohan
>>
>>
>>
>> olegk wrote:
>>>
>>> On Thu, 2010-07-15 at 12:50 -0700, Harold Lee wrote:
>>>> I've put together a simple HTTP server that resets the connection
>>>> after sending part of the response back to the client. I'm going to
>>>> try to recreate the bug (leaking sockets) by making many requests
>>>> against that server from a Linux box. I'll let you know what I find.
>>>>
>>>> Harold
>>>>
>>>
>>>
>>>
>>>> On Wed, Jul 14, 2010 at 1:44 AM, Oleg Kalnichevski <[email protected]>
>>>> wrote:
>>>> > On Tue, 2010-07-13 at 13:32 -0700, Harold Lee wrote:
>>>> >> Regarding this JDK bug:
>>>> >> http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=6403933
>>>> >>
>>>> >> I think we are experiencing this using HttpCore on Linux with Java
>>>> >> 1.6. We wind up leaking socket descriptors until the JVM process runs
>>>> >> out. We also wind up having to start a new reactor thread, which
>>>> >> creates a new Selector. The old reactor thread keeps running and the
>>>> >> thread dump shows it in sun.nio.ch.EPollArrayWrapper.epollWait as
>>>> >> reported by others in the bug report above.
>>>> >>
>>>> >
>>>
>>>
>>> Hi Harold
>>>
>>> Did you have any luck reproducing the problem?
>>>
>>> I put together a work-around for the bug that causes the epoll spin
>>> problem [1]. If you are interested in trying it out I will happily share
>>> it with you. The work-around is pretty ugly, so I want to be sure there
>>> is no other way of solving the issue.
>>>
>>> cheers
>>>
>>> Oleg
>>>
>>> [1] http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=6403933
>>>
>>>> > Folks
>>>> >
>>>> > Has anyone experienced anything like that? The bug looks pretty old,
>>>> > but there have been no reports of similar problems with HttpCore NIO.
>>>> > I am using Linux / JDK 1.6 on a daily basis when hacking on HttpCore
>>>> > but I have not encountered such a problem yet.
>>>> >
>>>> >
>>>> >> Here's the change that the Glassfish team made to work around this
>>>> >> JDK bug:
>>>> >>
>>>> >> http://fisheye5.cenqua.com/browse/glassfish/appserv-http-engine/src/java/com/sun/enterprise/web/connector/grizzly/ByteBufferInputStream.java?r1=1.8&r2=1.9
>>>> >>
>>>> >> From my reading, the Glassfish code is much simpler than the HttpCore
>>>> >> NIO code: they're registering interest for just 1 socket and using
>>>> >> Selector.select() to wait for data from that socket. For HttpCore NIO,
>>>> >> it isn't yet clear to me how we can detect which selector is "trashed"
>>>> >> in order to cancel it and recreate it.
>>>> >>
>>>> >> I'm working on a workaround in AbstractMultiworkerIOReactor.java. If
>>>> >> selector.select returns 0 (setting readyCount to 0) then we don't know
>>>> >> whether this bug hit us or we just had a timeout.
>>>> >
>>>> > The problem is that it is perfectly valid for a selector to return a
>>>> > ready count of 0. This condition alone is not sufficient to assume the
>>>> > selector is trashed.
>>>> >
>>>> >
>>>> >> To be safe, I think we need to cancel every registered SelectionKey
>>>> >> and then call selector.selectNow() to flush them. Then we can create
>>>> >> a new SelectionKey for each channel and reregister them. The only way
>>>> >> to make it less common, I think, is to use a long selectTimeout value
>>>> >> so that the odds of a timeout are low. Ugly, but I hope it will work.
>>>> >>
>>>> >
>>>> > This will unfortunately screw up handling of new / closed channels as
>>>> > well as the timeout logic.
>>>> >
>>>> > The work-around looks butt ugly and would require tons of fairly
>>>> > complex code. Is there a way to reproduce the issue with a test
>>>> > scenario, so we could look for alternative approaches?
>>>> >
>>>> > Cheers
>>>> >
>>>> > Oleg
>>>> >
>>>> >
>>>>
>>>> ---------------------------------------------------------------------
>>>> To unsubscribe, e-mail: [email protected]
>>>> For additional commands, e-mail: [email protected]
>>>>
>>>>
>>
>> --
>> View this message in context:
>> http://old.nabble.com/HttpCore-NIO-hurt-by-JDK-bug--tp29155405p30107703.html
>> Sent from the HttpComponents-Dev mailing list archive at Nabble.com.
>>
> 
> 
> 
> -- 
> Hiranya Jayathilaka
> Senior Software Engineer;
> WSO2 Inc.;  http://wso2.org
> E-mail: [email protected];  Mobile: +94 77 633 3491
> Blog: http://techfeast-hiranya.blogspot.com
> 
> 
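
For reference, the two ideas discussed in the quoted thread above (telling a
spinning selector apart from an ordinary timeout, and recreating the selector)
can be sketched roughly as follows. This is a heuristic in plain java.nio, not
the work-around Oleg mentions and not HttpCore code. The class SpinGuard, its
threshold and the "returned in less than half the timeout" cutoff are all
assumptions made up for illustration. A single zero ready count proves
nothing, as Oleg points out, so the sketch only reacts to a long run of
zero-count returns that came back well before the timeout.

```java
import java.io.IOException;
import java.nio.channels.SelectableChannel;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.util.concurrent.TimeUnit;

// Hypothetical helper for illustration only; not part of HttpCore.
final class SpinGuard {

    private final int threshold;   // assumed tuning knob: consecutive premature returns
    private int prematureReturns;

    SpinGuard(int threshold) {
        this.threshold = threshold;
    }

    /**
     * Records the outcome of one select(timeoutMillis) call. Returns true
     * once 'threshold' consecutive calls returned 0 keys in less than half
     * the timeout. Note that wakeup() and thread interrupts also cause early
     * zero returns, so this is a hint and not proof. With timeoutMillis == 0
     * (block indefinitely) nothing is ever counted as premature.
     */
    boolean record(int readyCount, long elapsedNanos, long timeoutMillis) {
        boolean premature = readyCount == 0
                && elapsedNanos < TimeUnit.MILLISECONDS.toNanos(timeoutMillis) / 2;
        prematureReturns = premature ? prematureReturns + 1 : 0;
        if (prematureReturns >= threshold) {
            prematureReturns = 0;
            return true;
        }
        return false;
    }

    /**
     * Moves every valid registration from the old selector to a fresh one,
     * keeping interest ops and attachments, then closes the old selector.
     */
    static Selector rebuild(Selector old) throws IOException {
        Selector fresh = Selector.open();
        for (SelectionKey key : old.keys()) {
            if (!key.isValid()) {
                continue;
            }
            SelectableChannel channel = key.channel();
            int ops = key.interestOps();
            Object attachment = key.attachment();
            key.cancel();
            channel.register(fresh, ops, attachment);
        }
        // Closing the old selector deregisters whatever is left on it.
        old.close();
        return fresh;
    }
}
```

A reactor loop would time each select(timeout) call with System.nanoTime(),
pass the result to record(), and swap in the selector returned by rebuild()
when record() returns true. This does not address Oleg's concern about the
handling of new / closed channels and the timeout logic; any per-session state
that refers to the old SelectionKey objects would have to be updated as well.
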

-- 
View this message in context: 
http://old.nabble.com/HttpCore-NIO-hurt-by-JDK-bug--tp29155405p30108164.html
Sent from the HttpComponents-Dev mailing list archive at Nabble.com.

