Hello Experts

Several of my production servers were recently upgraded from Tomcat 9.0.14 to 
9.0.21; immediately after the upgrade the servers started to accumulate memory 
and open files (on Linux) at a steady rate, a trend that was not observed before.
After a couple of days (without reaching the memory or open-files limit and 
without throwing "OutOfMemoryError: Java heap space" or "IOException: Too many 
open files") the servers became unresponsive: every HTTPS request timed out, while 
HTTP requests continued to work correctly.
Restarting the servers resolved the symptoms but the behavior persists and a 
restart is necessary every couple of days.
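
For anyone who wants to watch the same open-files trend from inside the JVM, the following is a minimal sketch and not anything taken from Tomcat. The class name FdUsage is hypothetical. It assumes a HotSpot-based JDK on a Unix-like system, where the platform bean is a com.sun.management.UnixOperatingSystemMXBean, and it reports only on the JVM it runs in, so it would have to be called from within the Tomcat process (or the same bean read remotely over JMX).

```java
import java.lang.management.ManagementFactory;
import java.lang.management.OperatingSystemMXBean;

// Hypothetical helper (not part of Tomcat): reports descriptor usage of the current JVM.
final class FdUsage {

    /**
     * Returns {open, max} file descriptor counts for this JVM,
     * or null when the platform bean is not the Unix variant.
     */
    static long[] snapshot() {
        OperatingSystemMXBean os = ManagementFactory.getOperatingSystemMXBean();
        if (os instanceof com.sun.management.UnixOperatingSystemMXBean) {
            com.sun.management.UnixOperatingSystemMXBean unix =
                    (com.sun.management.UnixOperatingSystemMXBean) os;
            return new long[] {
                unix.getOpenFileDescriptorCount(),
                unix.getMaxFileDescriptorCount()
            };
        }
        return null;
    }

    public static void main(String[] args) {
        long[] s = snapshot();
        System.out.println(s == null
                ? "descriptor counts unavailable on this platform"
                : "open=" + s[0] + " max=" + s[1]);
    }
}
```

Logging this value periodically should show whether the count keeps climbing between restarts.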
I loaded a heap dump from an unresponsive server into MAT and received the 
following Leak Suspect:

105,871 instances of "org.apache.coyote.http2.Stream", loaded by 
"java.net.URLClassLoader..." occupy 7,581,549,904 (80.68%) bytes.
These instances are referenced from one instance of 
"java.util.concurrent.ConcurrentHashMap$Node[]", loaded by "<system class 
loader>"

The ConcurrentHashMap referenced in the report is "connections" inside ConnectionHandler.
I suspect that these objects accumulate because some clients do not close their 
connections correctly; regardless, I'd expect Tomcat to close the connections 
once the timeout expires.
With keepAliveTimeout="20000" defined on UpgradeProtocol, I tested how long a 
single, simple HTTP/2 connection persists, using Chrome's net-internals.
With 9.0.14 I can see the following at 20 seconds (as expected):
...
t=7065701 [st=   64]    HTTP2_SESSION_UPDATE_RECV_WINDOW
                        --> delta = 6894
                        --> window_size = 15728640
t=7085708 [st=20071]    HTTP2_SESSION_PING
                        --> is_ack = false
                        --> type = "received"
                        --> unique_id = 2
t=7085708 [st=20071]    HTTP2_SESSION_PING
                        --> is_ack = true
                        --> type = "sent"
                        --> unique_id = 2
t=7085708 [st=20071]    HTTP2_SESSION_CLOSE
                        --> description = "Connection closed"
                        --> net_error = -100 (ERR_CONNECTION_CLOSED)
t=7085708 [st=20071]    HTTP2_SESSION_POOL_REMOVE_SESSION
t=7085708 [st=20071] -HTTP2_SESSION

With 9.0.21 the connection does not close, even after several minutes.
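
To make this comparison repeatable rather than reading the trace by eye, something like the sketch below could scan an exported net-internals trace for the HTTP2_SESSION_CLOSE event and report its "st" value. The class and method names are hypothetical, and the sketch assumes that "st" is the number of milliseconds elapsed since the session started and that event lines keep the layout shown in the 9.0.14 trace.

```java
import java.util.List;
import java.util.OptionalLong;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: finds when an HTTP/2 session was closed in a net-internals trace.
final class SessionCloseCheck {

    // Matches event lines such as: t=7085708 [st=20071]    HTTP2_SESSION_CLOSE
    // Continuation lines ("--> key = value") do not match and are skipped.
    private static final Pattern EVENT =
            Pattern.compile("^t=\\s*\\d+\\s+\\[st=\\s*(\\d+)\\]\\s+[-+]?(\\w+)");

    /** Returns the "st" offset of the first HTTP2_SESSION_CLOSE event, if any. */
    static OptionalLong closeOffsetMillis(List<String> lines) {
        for (String line : lines) {
            Matcher m = EVENT.matcher(line.trim());
            if (m.find() && m.group(2).equals("HTTP2_SESSION_CLOSE")) {
                return OptionalLong.of(Long.parseLong(m.group(1)));
            }
        }
        return OptionalLong.empty();
    }

    public static void main(String[] args) {
        // Lines copied from the 9.0.14 trace.
        List<String> trace = List.of(
                "t=7065701 [st=   64]    HTTP2_SESSION_UPDATE_RECV_WINDOW",
                "                        --> delta = 6894",
                "t=7085708 [st=20071]    HTTP2_SESSION_PING",
                "t=7085708 [st=20071]    HTTP2_SESSION_CLOSE",
                "t=7085708 [st=20071] -HTTP2_SESSION");
        OptionalLong closedAt = closeOffsetMillis(trace);
        System.out.println(closedAt.isPresent()
                ? "closed at st=" + closedAt.getAsLong() + " ms"
                : "session never closed in this trace");
    }
}
```

An empty result for a trace captured well beyond the 20-second keepAliveTimeout would correspond to the 9.0.21 behavior described here.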
I believe the change in behavior stems from the following commit: 
https://github.com/apache/tomcat/commit/c16d9d810a1f64cd768ff33058936cf8907e3117
 and so I may be doing something wrong.

Please let me know whether I have misconfigured, misunderstood, misdiagnosed, 
misbehaved or mis-something-else, and whether I should provide additional 
information.

Current setup of the production servers:
AdoptOpenJDK (build 11.0.3+7) 
Amazon Linux 2

<Connector port="443" protocol="org.apache.coyote.http11.Http11NioProtocol"
                   maxHttpHeaderSize="16384"
                   maxThreads="500" minSpareThreads="25"
                   enableLookups="false" disableUploadTimeout="true"
                   connectionTimeout="10000"
                   compression="on"
                   SSLEnabled="true" scheme="https" secure="true">
            <UpgradeProtocol className="org.apache.coyote.http2.Http2Protocol"
                             keepAliveTimeout="20000"/>
            <SSLHostConfig protocols="+TLSv1.2+TLSv1.3">
                <Certificate certificateKeystoreFile="tomcat.keystore"
                             certificateKeyAlias="tomcat"
                             certificateKeystorePassword=""
                             certificateKeystoreType="PKCS12"/>
            </SSLHostConfig>
</Connector>

Thanks
Chen
