Re: Tomcat 9.0.24/9.0.26 suspected memory leak

2019-10-02 Thread Mark Thomas
On 02/10/2019 01:28, Chen Levy wrote:
>> -Original Message-
>> From: Mark Thomas 
>> Sent: Tuesday, October 1, 2019 17:43
>> To: users@tomcat.apache.org
>> Subject: Re: Tomcat 9.0.24/9.0.26 suspected memory leak
>>
>> Found it.
>>
>> HTTP/2 on NIO is affected.
>> HTTP/2 on APR/native is not affected.
>>
>> Need to check on NIO2 but I suspect it is affected.
>>
>> Patch to follow shortly.
>>
>> Mark
> 
> 
> Good, here's some more corroborating info:
> Mark, I followed your suggestion to test without HTTP/2, and one of my servers 
> (v9.0.26) has been running without it for a day now, showing no memory 
> accumulation.
> I do not use APR/Native.

This has been fixed and the fix will be included in 9.0.27 onwards.

8.5.x was not affected.

NIO2 was affected.

You should also be able to avoid the memory leak with NIO by setting
useAsyncIO="false" on the Connector.
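
Something like this on the Connector (a minimal sketch only; the port and
keystore values below are placeholders, not taken from the original report):

    <Connector port="8443" protocol="org.apache.coyote.http11.Http11NioProtocol"
               useAsyncIO="false"
               SSLEnabled="true" scheme="https" secure="true">
        <!-- placeholder port/keystore values; HTTP/2 stays enabled -->
        <UpgradeProtocol className="org.apache.coyote.http2.Http2Protocol"/>
        <SSLHostConfig>
            <Certificate certificateKeystoreFile="conf/keystore.p12"
                         certificateKeystoreType="PKCS12"/>
        </SSLHostConfig>
    </Connector>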

There isn't an easy way to avoid it with NIO2. For those users using
NIO2, I'd recommend switching to NIO as a workaround until 9.0.27 is
released.
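
In other words, the same sort of Connector as sketched above, but with the
protocol attribute naming the NIO implementation rather than NIO2 (again just
a sketch with placeholder values):

    <!-- affected:   protocol="org.apache.coyote.http11.Http11Nio2Protocol" -->
    <!-- workaround: protocol="org.apache.coyote.http11.Http11NioProtocol"
         combined with useAsyncIO="false" as above -->
    <Connector port="8443" protocol="org.apache.coyote.http11.Http11NioProtocol"
               useAsyncIO="false"
               SSLEnabled="true" scheme="https" secure="true">
        ...
    </Connector>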

Mark




RE: Tomcat 9.0.24/9.0.26 suspected memory leak

2019-10-01 Thread Chen Levy
> -Original Message-
> From: Mark Thomas 
> Sent: Tuesday, October 1, 2019 17:43
> To: users@tomcat.apache.org
> Subject: Re: Tomcat 9.0.24/9.0.26 suspected memory leak
> 
> Found it.
> 
> HTTP/2 on NIO is affected.
> HTTP/2 on APR/native is not affected.
> 
> Need to check on NIO2 but I suspect it is affected.
> 
> Patch to follow shortly.
> 
> Mark


Good, here's some more corroborating info:
Mark, I followed your suggestion to test without HTTP/2, and one of my servers 
(v9.0.26) has been running without it for a day now, showing no memory 
accumulation.
I do not use APR/Native.

Chen




Re: Tomcat 9.0.24/9.0.26 suspected memory leak

2019-10-01 Thread Mark Thomas
Found it.

HTTP/2 on NIO is affected.
HTTP/2 on APR/native is not affected.

Need to check on NIO2 but I suspect it is affected.

Patch to follow shortly.

Mark




Re: Tomcat 9.0.24/9.0.26 suspected memory leak

2019-10-01 Thread Mark Thomas
On 30/09/2019 14:12, Rémy Maucherat wrote:



> I added debug code in
> AbstractProtocol.ConnectionHandler.release(SocketWrapperBase) to check
> if the processor considered was present in the waitingProcessors map. The
> result is the following:
> TEST-javax.servlet.http.TestHttpServletResponseSendError.NIO.txt:CHECK PROCESSOR FAILED org.apache.coyote.http11.Http11Processor@77b16580
> TEST-javax.servlet.http.TestHttpServletResponseSendError.NIO.txt:CHECK PROCESSOR FAILED org.apache.coyote.http11.Http11Processor@1d902704
> TEST-javax.servlet.http.TestHttpServletResponseSendError.NIO.txt:CHECK PROCESSOR FAILED org.apache.coyote.http11.Http11Processor@610c4fc8
> TEST-javax.servlet.http.TestHttpServletResponseSendError.NIO.txt:CHECK PROCESSOR FAILED org.apache.coyote.http11.Http11Processor@1a3a3cb6
> TEST-javax.servlet.http.TestHttpServletResponseSendError.NIO.txt:CHECK PROCESSOR FAILED org.apache.coyote.http11.Http11Processor@336f552d
> TEST-javax.servlet.http.TestHttpServletResponseSendError.NIO.txt:CHECK PROCESSOR FAILED org.apache.coyote.http11.Http11Processor@3cd94f25
> TEST-javax.servlet.http.TestHttpServletResponseSendError.NIO.txt:CHECK PROCESSOR FAILED org.apache.coyote.http11.Http11Processor@66e24762
> TEST-javax.servlet.http.TestHttpServletResponseSendError.NIO.txt:CHECK PROCESSOR FAILED org.apache.coyote.http11.Http11Processor@7c7a1c3c
> TEST-org.apache.coyote.http11.TestHttp11Processor.NIO.txt:CHECK PROCESSOR FAILED org.apache.coyote.http11.Http11Processor@55a44822
> TEST-org.apache.coyote.http11.upgrade.TestUpgradeInternalHandler.NIO.txt:CHECK PROCESSOR FAILED org.apache.coyote.http11.upgrade.UpgradeProcessorInternal@6e55ff60
> TEST-org.apache.coyote.http11.upgrade.TestUpgrade.NIO.txt:CHECK PROCESSOR FAILED org.apache.coyote.http11.upgrade.UpgradeProcessorExternal@37d98b7f
> TEST-org.apache.tomcat.websocket.server.TestShutdown.NIO.txt:CHECK PROCESSOR FAILED org.apache.coyote.http11.upgrade.UpgradeProcessorInternal@6be9bd85
> TEST-org.apache.tomcat.websocket.TestWsRemoteEndpoint.NIO.txt:CHECK PROCESSOR FAILED org.apache.coyote.http11.upgrade.UpgradeProcessorInternal@3bd4e02f
> TEST-org.apache.tomcat.websocket.TestWsRemoteEndpoint.NIO.txt:CHECK PROCESSOR FAILED org.apache.coyote.http11.upgrade.UpgradeProcessorInternal@4bb23a77
> TEST-org.apache.tomcat.websocket.TestWsRemoteEndpoint.NIO.txt:CHECK PROCESSOR FAILED org.apache.coyote.http11.upgrade.UpgradeProcessorInternal@32e20d65
> TEST-org.apache.tomcat.websocket.TestWsRemoteEndpoint.NIO.txt:CHECK PROCESSOR FAILED org.apache.coyote.http11.upgrade.UpgradeProcessorInternal@16abf52f
> 
> All instances of processors that were not removed come either from async or
> upgraded processors (the internal kind), as expected. I have verified the
> processor instances above are never removed, so it might be more robust to
> simply call proto.removeWaitingProcessor(processor); in
> AbstractProtocol.ConnectionHandler.release(SocketWrapperBase) (after all,
> the socket is closed and done with after that point). There could be a more
> fine-grained solution of course.
> 
> However, this does not match the leak scenario described by the user, as it
> doesn't happen without async or WebSockets being used.

I'm not sure those are leaks. I've started to check them and it looks
like Tomcat is shutting down while an async request is still waiting to
time out. In those circumstances you would expect to see a Processor in
the waitingProcessors map.

A separate question is what is the correct error handling for async
requests. There was some discussion on that topic on the Jakarta Servlet
list but it didn't reach any definitive conclusions. I have some patches
I need to get back to that should help but they are still a work in
progress.

I'll keep checking but my sense is that we haven't found the root cause
of this leak yet.

Mark




Re: Tomcat 9.0.24/9.0.26 suspected memory leak

2019-09-30 Thread Rémy Maucherat
On Sat, Sep 28, 2019 at 9:05 PM Mark Thomas  wrote:

> On 27/09/2019 22:39, Chen Levy wrote:
> > -Original Message-
> > From: Mark Thomas 
> > Sent: Friday, September 27, 2019 15:34
> > To: users@tomcat.apache.org
> > Subject: Re: Tomcat 9.0.24/9.0.26 suspected memory leak
> >
> > On 27/09/2019 16:34, Chen Levy wrote:
> >> On 26/09/2019 18:22, Chen Levy wrote:
> >
> > 
> >
> >>> The HashMap referenced in the report appears to be "waitingProcessors"
> inside AbstractProtocol which contain 262K entries.
> >>
> >> OK. Those are asynchronous Servlets that are still in async mode.
> >
> > 
> >
> >> * I do not employ async servlets in my application
> >
> > OK. Do you use WebSocket? There is a code path to add Processors to the
> waitingProcessors Map for WebSocket as well.
> >
> > Mark
> >
> >
> > No, no WebSocket either; just plain old Servlets, Filters and the
> occasional JSP
>
> OK. That narrows down where/how this might be happening.
>
> What if you disable HTTP/2? Do you still see the issue then?
>

I added debug code in
AbstractProtocol.ConnectionHandler.release(SocketWrapperBase) to check
if the processor considered was present in the waitingProcessors map. The
result is the following:
TEST-javax.servlet.http.TestHttpServletResponseSendError.NIO.txt:CHECK PROCESSOR FAILED org.apache.coyote.http11.Http11Processor@77b16580
TEST-javax.servlet.http.TestHttpServletResponseSendError.NIO.txt:CHECK PROCESSOR FAILED org.apache.coyote.http11.Http11Processor@1d902704
TEST-javax.servlet.http.TestHttpServletResponseSendError.NIO.txt:CHECK PROCESSOR FAILED org.apache.coyote.http11.Http11Processor@610c4fc8
TEST-javax.servlet.http.TestHttpServletResponseSendError.NIO.txt:CHECK PROCESSOR FAILED org.apache.coyote.http11.Http11Processor@1a3a3cb6
TEST-javax.servlet.http.TestHttpServletResponseSendError.NIO.txt:CHECK PROCESSOR FAILED org.apache.coyote.http11.Http11Processor@336f552d
TEST-javax.servlet.http.TestHttpServletResponseSendError.NIO.txt:CHECK PROCESSOR FAILED org.apache.coyote.http11.Http11Processor@3cd94f25
TEST-javax.servlet.http.TestHttpServletResponseSendError.NIO.txt:CHECK PROCESSOR FAILED org.apache.coyote.http11.Http11Processor@66e24762
TEST-javax.servlet.http.TestHttpServletResponseSendError.NIO.txt:CHECK PROCESSOR FAILED org.apache.coyote.http11.Http11Processor@7c7a1c3c
TEST-org.apache.coyote.http11.TestHttp11Processor.NIO.txt:CHECK PROCESSOR FAILED org.apache.coyote.http11.Http11Processor@55a44822
TEST-org.apache.coyote.http11.upgrade.TestUpgradeInternalHandler.NIO.txt:CHECK PROCESSOR FAILED org.apache.coyote.http11.upgrade.UpgradeProcessorInternal@6e55ff60
TEST-org.apache.coyote.http11.upgrade.TestUpgrade.NIO.txt:CHECK PROCESSOR FAILED org.apache.coyote.http11.upgrade.UpgradeProcessorExternal@37d98b7f
TEST-org.apache.tomcat.websocket.server.TestShutdown.NIO.txt:CHECK PROCESSOR FAILED org.apache.coyote.http11.upgrade.UpgradeProcessorInternal@6be9bd85
TEST-org.apache.tomcat.websocket.TestWsRemoteEndpoint.NIO.txt:CHECK PROCESSOR FAILED org.apache.coyote.http11.upgrade.UpgradeProcessorInternal@3bd4e02f
TEST-org.apache.tomcat.websocket.TestWsRemoteEndpoint.NIO.txt:CHECK PROCESSOR FAILED org.apache.coyote.http11.upgrade.UpgradeProcessorInternal@4bb23a77
TEST-org.apache.tomcat.websocket.TestWsRemoteEndpoint.NIO.txt:CHECK PROCESSOR FAILED org.apache.coyote.http11.upgrade.UpgradeProcessorInternal@32e20d65
TEST-org.apache.tomcat.websocket.TestWsRemoteEndpoint.NIO.txt:CHECK PROCESSOR FAILED org.apache.coyote.http11.upgrade.UpgradeProcessorInternal@16abf52f

All instances of processors that were not removed come either from async or
upgraded processors (the internal kind), as expected. I have verified the
processor instances above are never removed, so it might be more robust to
simply call proto.removeWaitingProcessor(processor); in
AbstractProtocol.ConnectionHandler.release(SocketWrapperBase) (after all,
the socket is closed and done with after that point). There could be a more
fine-grained solution of course.

However, this does not match the leak scenario described by the user, as it
doesn't happen without async or WebSockets being used.

Rémy


Re: Tomcat 9.0.24/9.0.26 suspected memory leak

2019-09-28 Thread Mark Thomas
On 27/09/2019 22:39, Chen Levy wrote:
> -Original Message-
> From: Mark Thomas  
> Sent: Friday, September 27, 2019 15:34
> To: users@tomcat.apache.org
> Subject: Re: Tomcat 9.0.24/9.0.26 suspected memory leak
> 
> On 27/09/2019 16:34, Chen Levy wrote:
>> On 26/09/2019 18:22, Chen Levy wrote:
> 
> 
> 
>>> The HashMap referenced in the report appears to be "waitingProcessors" 
>>> inside AbstractProtocol which contain 262K entries.
>>
>> OK. Those are asynchronous Servlets that are still in async mode.
> 
> 
> 
>> * I do not employ async servlets in my application
> 
> OK. Do you use WebSocket? There is a code path to add Processors to the 
> waitingProcessors Map for WebSocket as well.
> 
> Mark
> 
> 
> No, no WebSocket either; just plain old Servlets, Filters and the occasional 
> JSP

OK. That narrows down where/how this might be happening.

What if you disable HTTP/2? Do you still see the issue then?
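
Disabling HTTP/2 here just means removing or commenting out the
UpgradeProtocol element on the Connector and leaving everything else in
place, along these lines (a sketch with placeholder port and protocol values):

    <Connector port="8443" protocol="org.apache.coyote.http11.Http11NioProtocol"
               SSLEnabled="true" scheme="https" secure="true">
        <!-- temporarily removed for the test:
             <UpgradeProtocol className="org.apache.coyote.http2.Http2Protocol"/>
        -->
        ...
    </Connector>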

Thanks,

Mark




RE: Tomcat 9.0.24/9.0.26 suspected memory leak

2019-09-27 Thread Chen Levy
-Original Message-
From: Mark Thomas  
Sent: Friday, September 27, 2019 15:34
To: users@tomcat.apache.org
Subject: Re: Tomcat 9.0.24/9.0.26 suspected memory leak

On 27/09/2019 16:34, Chen Levy wrote:
> On 26/09/2019 18:22, Chen Levy wrote:



>> The HashMap referenced in the report appears to be "waitingProcessors" 
>> inside AbstractProtocol which contain 262K entries.
> 
> OK. Those are asynchronous Servlets that are still in async mode.



> * I do not employ async servlets in my application

OK. Do you use WebSocket? There is a code path to add Processors to the 
waitingProcessors Map for WebSocket as well.

Mark


No, no WebSocket either; just plain old Servlets, Filters and the occasional JSP

Chen



Re: Tomcat 9.0.24/9.0.26 suspected memory leak

2019-09-27 Thread Mark Thomas
On 27/09/2019 16:34, Chen Levy wrote:
> On 26/09/2019 18:22, Chen Levy wrote:



>> The HashMap referenced in the report appears to be "waitingProcessors" 
>> inside AbstractProtocol which contain 262K entries.
> 
> OK. Those are asynchronous Servlets that are still in async mode.



> * I do not employ async servlets in my application

OK. Do you use WebSocket? There is a code path to add Processors to the
waitingProcessors Map for WebSocket as well.

Mark




RE: Tomcat 9.0.24/9.0.26 suspected memory leak

2019-09-27 Thread Chen Levy


-Original Message-
From: Mark Thomas  
Sent: Thursday, September 26, 2019 15:50
To: users@tomcat.apache.org
Subject: Re: Tomcat 9.0.24/9.0.26 suspected memory leak

On 26/09/2019 18:22, Chen Levy wrote:
> Hello Experts
> 
> Several of my production servers were recently upgraded from Tomcat 9.0.14 to 
> 9.0.24; immediately after the upgrade the servers started accumulating memory 
> in a steady trend that was not observed before. In addition, CPU utilization 
> that used to hover around 2% now sits at 8%.
> For now the servers are still serving but I suspect they'll become 
> unresponsive in a few hours.
> I loaded a heap dump from one of the servers into MAT and received the 
> following Leak Suspect:
> 
> One instance of "org.apache.coyote.http11.Http11NioProtocol" loaded by 
> "java.net.URLClassLoader @ 0x503f02c40" occupies 9,282,972,608 (96.88%) 
> bytes. The memory is accumulated in one instance of 
> "java.util.concurrent.ConcurrentHashMap$Node[]" loaded by " loader>".
> 
> The HashMap referenced in the report appears to be "waitingProcessors" inside 
> AbstractProtocol which contain 262K entries.

OK. Those are asynchronous Servlets that are still in async mode.

While it is possible for an application to deliberately get itself into a state 
like this (infinite async timeouts and never completing/dispatching the async 
requests), given that it doesn't happen with 9.0.14 but does with 9.0.24 (and 
.26), that suggests a Tomcat bug.

> The same issue was reproduced using v9.0.26 as well
> 
> Please let me know whether I should provide additional information

Can you do a binary search to determine which Tomcat 9.0.x release this problem 
was introduced in?

How easily can you reproduce this? Do you have something approaching a test 
case we could use to repeat the issue?

Meanwhile, I'll take a look at the changelog and see if anything jumps out as a 
possible cause.

Thanks,

Mark


> 
> Current setup of the production servers:
> AdoptOpenJDK (build 11.0.3+7)
> Amazon Linux 2
> 
> <Connector ... maxHttpHeaderSize="16384"
>            maxThreads="500" minSpareThreads="25"
>            enableLookups="false" disableUploadTimeout="true"
>            connectionTimeout="1"
>            compression="on"
>            SSLEnabled="true" scheme="https" secure="true">
>     <UpgradeProtocol className="org.apache.coyote.http2.Http2Protocol"
>                      keepAliveTimeout="2"
>                      overheadDataThreadhold="0"/>
>     <SSLHostConfig>
>         <Certificate ... certificateKeyAlias="tomcat"
>                      certificateKeystorePassword=""
>                      certificateKeystoreType="PKCS12"/>
>     </SSLHostConfig>
> </Connector>
> 
> 
> 
> Thanks
> Chen
> 


Thanks for the attention Mark; here is some additional information and some answers:
* Once the memory was completely consumed, the servers stopped responding with 
CPU stuck at 100%
* I do not employ async servlets in my application
* I cannot do a binary search for a version because of this change: 
https://github.com/apache/tomcat/commit/c16d9d810a1f64cd768ff33058936cf8907e3117
which caused another memory leak and server failure between v9.0.16 and v9.0.21 
and was fixed in v9.0.24 (as far as I know)
* This is easily reproduced with the traffic in my farm, and all the servers 
suffer the same way. In a development environment it's trickier, so currently I 
don't have a test case

Thanks
Chen


Re: Tomcat 9.0.24/9.0.26 suspected memory leak

2019-09-26 Thread Mark Thomas
On 26/09/2019 18:22, Chen Levy wrote:
> Hello Experts
> 
> Several of my production servers were recently upgraded from Tomcat 9.0.14 to 
> 9.0.24; immediately after the upgrade the servers started accumulating memory 
> in a steady trend that was not observed before. In addition, CPU utilization 
> that used to hover around 2% now sits at 8%.
> For now the servers are still serving but I suspect they'll become 
> unresponsive in a few hours.
> I loaded a heap dump from one of the servers into MAT and received the 
> following Leak Suspect:
> 
> One instance of "org.apache.coyote.http11.Http11NioProtocol" loaded by 
> "java.net.URLClassLoader @ 0x503f02c40" occupies 9,282,972,608 (96.88%) 
> bytes. The memory is accumulated in one instance of 
> "java.util.concurrent.ConcurrentHashMap$Node[]" loaded by " loader>".
> 
> The HashMap referenced in the report appears to be "waitingProcessors" inside 
> AbstractProtocol which contain 262K entries.

OK. Those are asynchronous Servlets that are still in async mode.

While it is possible for an application to deliberately get itself into
a state like this (infinite async timeouts and never completing/dispatching
the async requests), given that it doesn't happen with 9.0.14 but does
with 9.0.24 (and .26), that suggests a Tomcat bug.

> The same issue was reproduced using v9.0.26 as well
> 
> Please let me know whether I should provide additional information

Can you do a binary search to determine which Tomcat 9.0.x release this
problem was introduced in?

How easily can you reproduce this? Do you have something approaching a
test case we could use to repeat the issue?

Meanwhile, I'll take a look at the changelog and see if anything jumps
out as a possible cause.

Thanks,

Mark


> 
> Current setup of the production servers:
> AdoptOpenJDK (build 11.0.3+7) 
> Amazon Linux 2
> 
> <Connector ... maxHttpHeaderSize="16384"
>            maxThreads="500" minSpareThreads="25"
>            enableLookups="false" disableUploadTimeout="true"
>            connectionTimeout="1"
>            compression="on"
>            SSLEnabled="true" scheme="https" secure="true">
>     <UpgradeProtocol className="org.apache.coyote.http2.Http2Protocol"
>                      keepAliveTimeout="2"
>                      overheadDataThreadhold="0"/>
>     <SSLHostConfig>
>         <Certificate ... certificateKeyAlias="tomcat"
>                      certificateKeystorePassword=""
>                      certificateKeystoreType="PKCS12"/>
>     </SSLHostConfig>
> </Connector>
> 
> 
> 
> Thanks
> Chen
> 

