[ 
https://issues.apache.org/jira/browse/HTTPCORE-753?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17737406#comment-17737406
 ] 

Malay Shah commented on HTTPCORE-753:
-------------------------------------

[~olegk] I think I have found the race condition. It is two threads calling 
into IOSessionImpl's SetEvent and Close concurrently. Here is the code for the 
two methods side-by-side to show the sequence that leads to the problem:

 

{{Consumer Thread                                                         IO 
Reactor Thread}}
{{----------------------------------------------------------              
----------------------------------------------------------    }}
{{@Override}}
{{public void setEvent(final int op) {}}
{{    lock.lock();}}
{{    try {}}
{{        if (isStatusClosed()) {}}
{{            return;}}
{{        }}}
{{                                                                        
@Override                                                                       
}}
{{                                                                        
public void close(final CloseMode closeMode) {}}
{{                                                                            
if (this.status.compareAndSet(Status.ACTIVE, Status.CLOSED)) {}}
{{                                                                              
  if (closeMode == CloseMode.IMMEDIATE) {}}
{{                                                                              
      try {}}
{{                                                                              
          this.channel.socket().setSoLinger(true, 0);}}
{{                                                                              
      } catch (final SocketException e) {}}
{{                                                                              
          // Quietly ignore}}
{{                                                                              
      }}}
{{                                                                              
  }}}
{{                                                                              
  this.key.cancel();}}
{{        this.key.interestOps(this.key.interestOps() | op);}}
{{    } finally {}}
{{        lock.unlock();}}
{{    }}}
{{    this.key.selector().wakeup();}}
{{}}}
{{                                                                              
  this.key.attach(null);}}
{{                                                                              
  Closer.closeQuietly(this.key.channel());}}
{{                                                                              
  if (this.key.selector().isOpen()) {}}
{{                                                                              
      this.key.selector().wakeup();}}
{{                                                                              
  }}}
{{                                                                            
}}}
{{                                                                        }}}

 

Basically, the IO Reactor thread cancels the Key, and then the consumer thread, 
in the process of doing a CapacityChannel.update, calls setEvent.

One solution may be to lock in the close method.

> race condition with CapacityWindow resulting in CancelKeyException
> ------------------------------------------------------------------
>
>                 Key: HTTPCORE-753
>                 URL: https://issues.apache.org/jira/browse/HTTPCORE-753
>             Project: HttpComponents HttpCore
>          Issue Type: Bug
>          Components: HttpCore
>    Affects Versions: 5.1.3
>            Reporter: Malay Shah
>            Priority: Major
>
> We have found a race condition AbstractHttp1StreamDuplexer where the 
> IOSessionImpl is closed off (cancelling it's SelectionKey) first before 
> CapacityWindow (which also has a reference to the IOSession) is closed. This 
> creates a window of opportunity where if a ResponseConsumer (something like 
> AbstractClassicEntityConsumer) holds a reference to the CapacityWindow and 
> tries to call update(), it will result in a CancelledKeyException being 
> thrown. We are running into this issue fairly regularly.
> Here is what I believe the fix is: in AbstractHttp1StreamDuplexer.onInput the 
> following two lines should be swapped:
> {{dataEnd(contentDecoder.getTrailers());}}
> {{capacityWindow.close();}}
>  
> Close the capacityWindow first before calling dataEnd which closes off the 
> connection, thus the capacityWindow will always be safe to use.
>  
> In case this is helpful, here is the callstack for when the connection is 
> closed and the SelectionKey becomes invalid:
> {{close:266, IOSessionImpl (org.apache.hc.core5.reactor)}}
> {{close:266, InternalDataChannel (org.apache.hc.core5.reactor)}}
> {{close:254, InternalDataChannel (org.apache.hc.core5.reactor)}}
> {{shutdownSession:157, AbstractHttp1StreamDuplexer 
> (org.apache.hc.core5.http.impl.nio)}}
> {{close:116, ClientHttp1StreamDuplexer$1 (org.apache.hc.core5.http.impl.nio)}}
> {{dataEnd:277, ClientHttp1StreamHandler (org.apache.hc.core5.http.impl.nio)}}
> {{dataEnd:366, ClientHttp1StreamDuplexer (org.apache.hc.core5.http.impl.nio)}}
> {{onInput:333, AbstractHttp1StreamDuplexer 
> (org.apache.hc.core5.http.impl.nio)}}
> {{inputReady:64, AbstractHttp1IOEventHandler 
> (org.apache.hc.core5.http.impl.nio)}}
> {{inputReady:39, ClientHttp1IOEventHandler 
> (org.apache.hc.core5.http.impl.nio)}}
> {{onIOEvent:131, InternalDataChannel (org.apache.hc.core5.reactor)}}
> {{handleIOEvent:51, InternalChannel (org.apache.hc.core5.reactor)}}
> {{processEvents:178, SingleCoreIOReactor (org.apache.hc.core5.reactor)}}
> {{doExecute:127, SingleCoreIOReactor (org.apache.hc.core5.reactor)}}
> {{execute:85, AbstractSingleCoreIOReactor (org.apache.hc.core5.reactor)}}
> {{run:44, IOReactorWorker (org.apache.hc.core5.reactor)}}
> {{run:829, Thread (java.lang)}}
>  
> In the interim, is it safe to catch and discard the CancelledKeyException 
> when calling update on the CapacityChannel?
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to