https://bz.apache.org/bugzilla/show_bug.cgi?id=66841

            Bug ID: 66841
           Summary: Memory leak from cancelled async http/2 streams
           Product: Tomcat 9
           Version: 9.0.76
          Hardware: PC
                OS: Linux
            Status: NEW
          Severity: critical
          Priority: P2
         Component: Connectors
          Assignee: dev@tomcat.apache.org
          Reporter: ajmurphy...@gmail.com
  Target Milestone: -----

Regression caused by fixed for
https://bz.apache.org/bugzilla/show_bug.cgi?id=63816

Have application using Spring server sent events for notifying logged in users
of application events. SSE emitters have timeout of 1 day and so are active for
the whole user session. Application tests registered SSE emitters at regular
intervals to confirm the emitter is still alive and perform application cleanup
if the emitter connection has been closed for any reason. Test is done by
sending heartbeat message through each SSE emitter. 

If the SSE connection is cancelled from the remote client, the stream for the
SSE emitter is set to a state of CLOSED_RST_RX which causes the heartbeat test
message to receive a CloseNowException, this is expected based on our
understanding of http/2 and tomcat. However the stream does not actually get
closed because the asyncStateMachine is not updated and remains in a state of
STARTED. This keeps the stream, stream processor, and associated resources from
being closed and garbage collected. Streams in this state are only cleaned up
when their asyncTimeout expires and the AbstractProtocol's timeoutFuture calls
the processor's timeoutAsync method.

The memory used by the closed streams is leaked until they timeout.

The streams appear to be stuck in this state because when the application
attempts to send the heartbeat message the Http2UpgradeHandler calls
doStreamCancel when reserving the window size because the stream state does not
allow writing. doStreamCancel sets the error flag on the stream's
coyoteResponse (Stream.java line 275) and returns the CloseNowException. After
the CloseNowException is returned the AbstractProcessor attempts to set the
error state to CLOSE_NOW (AbstractProcessor line 241) but this fails to update
the asyncStateMachine to an ERROR state because the if condition depends on the
setError flag being true. But the setError flag can never be true in this
scenario because the response error flag had already been set in
doStreamCancel. This means the asyncStateMachine is never set to an error state
or told to process an ERROR SocketEvent.

boolean setError = response.setError();
        boolean blockIo = this.errorState.isIoAllowed() &&
!errorState.isIoAllowed();
        this.errorState = this.errorState.getMostSevere(errorState);
        // Don't change the status code for IOException since that is almost
        // certainly a client disconnect in which case it is preferable to keep
        // the original status code
http://markmail.org/message/4cxpwmxhtgnrwh7n
        if (response.getStatus() < 400 && !(t instanceof IOException)) {
            response.setStatus(500);
        }
        if (t != null) {
            request.setAttribute(RequestDispatcher.ERROR_EXCEPTION, t);
        }
        if (blockIo && isAsync() && setError) { <- setError is always false
            if (asyncStateMachine.asyncError()) {
                processSocketEvent(SocketEvent.ERROR, true);
            }
        }

We deployed a patched version of AbstractProcessor with the if condition like
this:

if (blockIo && isAsync() && !asyncStateMachine.isAsyncError()) {

Our assumption is that whether the asyncStateMachine should process an ERROR
event should depend on if the asyncStateMachine is already in an error state
not on if the coyote Response is already in an error state.

Connector settings used:
<Connector port="8443" protocol="org.apache.coyote.http11.Http11Nio2Protocol"
               connectionTimeout="20000" maxThreads="150"
               SSLEnabled="true" scheme="https" secure="true"
useAsyncIO="false" >
        <UpgradeProtocol className="org.apache.coyote.http2.Http2Protocol" />
        <SSLHostConfig>
                        <Certificate
certificateKeystoreFile="conf/certs/devkeystore"
                                                 type="RSA"/>
        </SSLHostConfig>
    </Connector>

Note: issue occurred with NIO and APR protocol as well. useAsyncIO had to be
set to false because we encountered a different bug when it was set to true.

-- 
You are receiving this mail because:
You are the assignee for the bug.
---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@tomcat.apache.org
For additional commands, e-mail: dev-h...@tomcat.apache.org

Reply via email to