[
https://issues.apache.org/jira/browse/HTTPCORE-753?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17737406#comment-17737406
]
Malay Shah commented on HTTPCORE-753:
-------------------------------------
[~olegk] I think I have found the race condition. It is two threads calling
into IOSessionImpl's SetEvent and Close concurrently. Here is the code for the
two methods side-by-side to show the sequence that leads to the problem:
{{Consumer Thread IO
Reactor Thread}}
{{----------------------------------------------------------
---------------------------------------------------------- }}
{{@Override}}
{{public void setEvent(final int op) {}}
{{ lock.lock();}}
{{ try {}}
{{ if (isStatusClosed()) {}}
{{ return;}}
{{ }}}
{{
@Override
}}
{{
public void close(final CloseMode closeMode) {}}
{{
if (this.status.compareAndSet(Status.ACTIVE, Status.CLOSED)) {}}
{{
if (closeMode == CloseMode.IMMEDIATE) {}}
{{
try {}}
{{
this.channel.socket().setSoLinger(true, 0);}}
{{
} catch (final SocketException e) {}}
{{
// Quietly ignore}}
{{
}}}
{{
}}}
{{
this.key.cancel();}}
{{ this.key.interestOps(this.key.interestOps() | op);}}
{{ } finally {}}
{{ lock.unlock();}}
{{ }}}
{{ this.key.selector().wakeup();}}
{{}}}
{{
this.key.attach(null);}}
{{
Closer.closeQuietly(this.key.channel());}}
{{
if (this.key.selector().isOpen()) {}}
{{
this.key.selector().wakeup();}}
{{
}}}
{{
}}}
{{ }}}
Basically, the IO Reactor thread cancels the Key, and then the consumer thread,
in the process of doing a CapacityChannel.update, calls setEvent.
One solution may be to lock in the close method.
> race condition with CapacityWindow resulting in CancelKeyException
> ------------------------------------------------------------------
>
> Key: HTTPCORE-753
> URL: https://issues.apache.org/jira/browse/HTTPCORE-753
> Project: HttpComponents HttpCore
> Issue Type: Bug
> Components: HttpCore
> Affects Versions: 5.1.3
> Reporter: Malay Shah
> Priority: Major
>
> We have found a race condition AbstractHttp1StreamDuplexer where the
> IOSessionImpl is closed off (cancelling it's SelectionKey) first before
> CapacityWindow (which also has a reference to the IOSession) is closed. This
> creates a window of opportunity where if a ResponseConsumer (something like
> AbstractClassicEntityConsumer) holds a reference to the CapacityWindow and
> tries to call update(), it will result in a CancelledKeyException being
> thrown. We are running into this issue fairly regularly.
> Here is what I believe the fix is: in AbstractHttp1StreamDuplexer.onInput the
> following two lines should be swapped:
> {{dataEnd(contentDecoder.getTrailers());}}
> {{capacityWindow.close();}}
>
> Close the capacityWindow first before calling dataEnd which closes off the
> connection, thus the capacityWindow will always be safe to use.
>
> In case this is helpful, here is the callstack for when the connection is
> closed and the SelectionKey becomes invalid:
> {{close:266, IOSessionImpl (org.apache.hc.core5.reactor)}}
> {{close:266, InternalDataChannel (org.apache.hc.core5.reactor)}}
> {{close:254, InternalDataChannel (org.apache.hc.core5.reactor)}}
> {{shutdownSession:157, AbstractHttp1StreamDuplexer
> (org.apache.hc.core5.http.impl.nio)}}
> {{close:116, ClientHttp1StreamDuplexer$1 (org.apache.hc.core5.http.impl.nio)}}
> {{dataEnd:277, ClientHttp1StreamHandler (org.apache.hc.core5.http.impl.nio)}}
> {{dataEnd:366, ClientHttp1StreamDuplexer (org.apache.hc.core5.http.impl.nio)}}
> {{onInput:333, AbstractHttp1StreamDuplexer
> (org.apache.hc.core5.http.impl.nio)}}
> {{inputReady:64, AbstractHttp1IOEventHandler
> (org.apache.hc.core5.http.impl.nio)}}
> {{inputReady:39, ClientHttp1IOEventHandler
> (org.apache.hc.core5.http.impl.nio)}}
> {{onIOEvent:131, InternalDataChannel (org.apache.hc.core5.reactor)}}
> {{handleIOEvent:51, InternalChannel (org.apache.hc.core5.reactor)}}
> {{processEvents:178, SingleCoreIOReactor (org.apache.hc.core5.reactor)}}
> {{doExecute:127, SingleCoreIOReactor (org.apache.hc.core5.reactor)}}
> {{execute:85, AbstractSingleCoreIOReactor (org.apache.hc.core5.reactor)}}
> {{run:44, IOReactorWorker (org.apache.hc.core5.reactor)}}
> {{run:829, Thread (java.lang)}}
>
> In the interim, is it safe to catch and discard the CancelledKeyException
> when calling update on the CapacityChannel?
>
--
This message was sent by Atlassian Jira
(v8.20.10#820010)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]