Daniel,

I cannot access your flame graph on imgur, but what is happening in your code 
that leads to the jar scanning?  All of my apps have run on Linux since 
forever, so I don’t know what might be different with Windows, but I’ve found 
that anything that uses the Java service loader should be treated with care due 
to jar/zip scanning and synchronization that it does.  Many factories use the 
service loader to look up an implementation, but often factories are intended 
to be created once and reused.  However I often see developers creating them 
over and over, which leads to high CPU usage.  When we attach a profiler, we 
see threads spending a lot of time scanning jars.
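
As a rough illustration of "create once and reuse", here is a minimal sketch. It uses DocumentBuilderFactory only as an example of a JDK factory whose newInstance() may go through the service loader; I don't know which factories your code actually uses. The class name XmlParsers and its methods are made up for this example. The factory is not guaranteed to be thread-safe, so the sketch synchronizes on it when creating builders.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

// Hypothetical helper: holds one shared factory instead of calling
// DocumentBuilderFactory.newInstance() on every request.
final class XmlParsers {
    // newInstance() may trigger a service-loader lookup (and jar scanning),
    // so it runs once, when this class is initialized.
    private static final DocumentBuilderFactory FACTORY = createFactory();

    private XmlParsers() {}

    private static DocumentBuilderFactory createFactory() {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true);
        return factory;
    }

    // Builders are cheap compared to the factory lookup; create one per use.
    // The factory is not guaranteed to be thread-safe, hence the lock.
    static DocumentBuilder newBuilder() {
        synchronized (FACTORY) {
            try {
                return FACTORY.newDocumentBuilder();
            } catch (ParserConfigurationException e) {
                throw new IllegalStateException("Cannot create DocumentBuilder", e);
            }
        }
    }

    // Convenience method: parse an XML string with a fresh builder.
    static Document parse(String xml) {
        try {
            return newBuilder().parse(new InputSource(new StringReader(xml)));
        } catch (SAXException | IOException e) {
            throw new IllegalStateException("Cannot parse XML", e);
        }
    }
}
```

Request-handling code would then call XmlParsers.parse(...) rather than building a new factory each time. The same pattern should apply to other factories that are meant to be long-lived, such as a JAXBContext.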

There were some changes somewhere between Tomcat 7 and 9 that affected how 
resources were cached.  It was especially noticeable (worse in 9) for JAXB 
because JAXB was no longer included in the JDK and was therefore no longer 
loaded from the system classloader.

Unfortunately I don’t have anything in my notes about how large the caches are 
or when they might be flushed.

Thanks


From: Daniel Sheridan <daniel.sheri...@progress.com.INVALID>
Sent: Friday, July 11, 2025 10:57 AM
To: Tomcat Users List <users@tomcat.apache.org>
Subject: RE: Classloading has a long delay after idle period


>On 7/8/25 16:32, Christopher Schultz wrote:

>Daniel,

>

>On 7/8/25 11:15 AM, Daniel Sheridan wrote:

>> On 7/2/25 10:22 AM, Daniel Sheridan wrote:

>>> Hi folks,

>>>

>>> We're using Tomcat 10.1.40, but also seeing this issue with multiple Tomcat 
>>> 9 versions, running on Windows Server 2019 and Server 2022 machines. We're 
>>> hosting a web app with a REST API, and encounter delays on requests when 
>>> they hit our REST API of 9-12 seconds. Easiest way to reproduce it is 
>>> leaving the web page (which utilizes the REST API) idle for 10+ minutes and 
>>> then initiating a request, but have seen it happen rarely with shorter idle 
>>> periods as well.

>>>

>>> Have done extensive investigation into this, narrowed it down to a 
>>> classloading/jar scanning issue. Here's an excerpt from the Catalina logs 
>>> with loader logging set to ALL:

>>>

>>> 25-Jun-2025 15:00:10.475 FINER [https-openssl-nio-443-exec-10] 
>>> org.apache.catalina.loader.WebappClassLoaderBase.loadClass 
>>> loadClass(org.springframework.web.servlet.mvc.method.annotation.AbstractMessageConverterMethodArgumentResolver$EmptyBodyCheckingHttpInputMessage,
>>>  false)

>>> 25-Jun-2025 15:00:10.477 FINER [https-openssl-nio-443-exec-10] 
>>> org.apache.catalina.loader.WebappClassLoaderBase.loadClass Searching local 
>>> repositories

>>> 25-Jun-2025 15:00:10.478 FINER [https-openssl-nio-443-exec-10] 
>>> org.apache.catalina.loader.WebappClassLoaderBase.findClass 
>>> findClass(org.springframework.web.servlet.mvc.method.annotation.AbstractMessageConverterMethodArgumentResolver$EmptyBodyCheckingHttpInputMessage)

>>> 25-Jun-2025 15:00:10.478 FINER [https-openssl-nio-443-exec-10] 
>>> org.apache.catalina.loader.WebappClassLoaderBase.findClass 
>>> findClassInternal(org.springframework.web.servlet.mvc.method.annotation.AbstractMessageConverterMethodArgumentResolver$EmptyBodyCheckingHttpInputMessage)

>>> 25-Jun-2025 15:00:19.682 FINER [https-openssl-nio-443-exec-10] 
>>> org.apache.catalina.loader.WebappClassLoaderBase.findClass Returning class 
>>> class 
>>> org.springframework.web.servlet.mvc.method.annotation.AbstractMessageConverterMethodArgumentResolver$EmptyBodyCheckingHttpInputMessage

>>> 25-Jun-2025 15:00:19.682 FINER [https-openssl-nio-443-exec-10] 
>>> org.apache.catalina.loader.WebappClassLoaderBase.findClass Loaded by 
>>> ParallelWebappClassLoader

>>> context: ROOT

>>> delegate: false

>>> ----------> Parent Classloader:

>>> java.net.URLClassLoader@3b94d659

>>> 25-Jun-2025 15:00:19.683 FINER [https-openssl-nio-443-exec-10] 
>>> org.apache.catalina.loader.WebappClassLoaderBase.loadClass Loading class 
>>> from local repository

>>>

>>> As you can see, the majority of the delay is during the findClassInternal 
>>> call. I inspected the Tomcat process behaviour with ProcMon, and I can see 
>>> that the delay happens during the jar scan for the 
>>> AbstractMessageConverterMethodArgumentResolver class; we have 140+ 
>>> dependency jars and it's reading through most of them until it finds the 
>>> class; the scanning takes much longer than regular jar scans for some 
>>> reason. Mostly, the delay is between QueryEAFile and 
>>> QueryStandardInformationFile operations, of which there are 1 each per jar 
>>> scanned (though the operations themselves have a reasonable duration, there 
>>> is a delay there with no file operations occurring).

>>>

>>> AbstractMessageConverterMethodArgumentResolver IS loaded in when the 
>>> application is started, so it isn't the first time it's been loaded in. 
>>> Note that AbstractMessageConverterMethodArgumentResolver is not the only 
>>> class I've seen this happen with, it doesn't appear to be related to 
>>> specific classes/jars, I've seen the class that the jar scan is initiated 
>>> for be different depending on the endpoint the request is sent to.

>>>

>>> JVM classloading/unloading and GC logging doesn't reveal any 
>>> cleanup/unloading happening that might necessitate the new classloading. I 
>>> am not having this issue occur when I run our web app on a non-Tomcat web 
>>> server, which makes me think the issue is Tomcat related.

>>>

>>> My main question is, what inside Tomcat would force the classloading if 
>>> there isn't any sign of the classes being unloaded beforehand? Is there 
>>> some kind of periodic refresh of idle/'cold' classes I'm missing?

>>>

>>> Also, would there be any suggestions about why the jar scan in this case 
>>> takes so long compared to the other regular jar scans Tomcat does? I 
>>> understand this might have other causes (like on the I/O side), but maybe 
>>> there's something relevant here on the Tomcat side.

>>

>> Is there any chance you could produce a Java + native flame graph of this 
>> behavior? It will help home in on exactly where the problem is.

>>

>> It could be some poorly-performing locks in Tomcat or the JVM. Or it could 
>> be poorly-sized caches that need to be refilled with an expensive process. 
>> Or it could be a failing disk. A flame graph will likely pinpoint that.

>>

>> -chris

>>

>>

>> ---------------------------------------------------------------------

>> To unsubscribe, e-mail: mailto:users-unsubscr...@tomcat.apache.org

>> For additional commands, e-mail: mailto:users-h...@tomcat.apache.org

>>

>> I've put together a Java flame graph that covers slightly before,

>> during, and slightly after the delay using JDK Mission Control and a

>> JFR recording, see here: 
>> https://imgur.com/a/7cTmSBu.
>>  The first

>> image is the entire flame graph, the second is zoomed in on the

>> latter portion that contains the jar scan method calls.

>> Unfortunately, our web app is Windows only, and from what I

>> researched it didn't look like there was a good method on Windows to

>> generate a Java + native flame graph (I did try PerfView but the

>> resulting flame graph was fairly useless; if you need more info and

>> have suggestions on how to get a more detailed flame graph do let me

>> know).

>

>If I'm reading this correctly, then the issue is mostly with the delay

>associated with RandomAccessFile.open().

>

>There isn't much Tomcat can do about that, except maybe to reduce the

>number of files that are opened somehow. Are you using expanded-WAR

>deployment or non-expanded WAR deployment? How many JAR files do you

>have in WEB-INF/classes? How "big" are they (usually file size is less

>relevant than directory-size)?

>

>> We can rule out the issue being a failing disk, as I've reproduced

>> this on multiple different machines with no disk issues.

>Hmm. The flame graph points to either a slow disk or an unusually huge

>number of file-opens. Is this a physical server or virtual?

>Network-attached or local storage?

>

>Have you changed any default configuration for things like class

>loading, resource-watches, background-processing intervals, etc.? Are

>you running in development (check files for updates) or production

>(ignore on-disk artifact changes) mode?

>

>-chris

>

>




Correct, almost the entire delay is during the JAR scanning when the files

are being accessed.



We are using expanded-WAR deployment.



We have no JAR files in WEB-INF/classes, just class files for our proprietary

code. We have 160 jar files in WEB-INF/lib, mostly consisting of 3rd party

dependencies; it's this directory that's being scanned through during the

delay.



The size of the JAR files varies quite a bit, as big as 11MB to as small as

3 KB. Around 140 out of the 164 JARs are under 1MB in size, around 70 of

those 140 are under 150KB.



I've been running the web app on virtual machines, which I would assume use

network-attached storage.



>The flame graph points to either a slow disk or an unusually huge

>number of file-opens

From my previous ProcMon investigation, during the JAR scanning around 140

files are opened, in the case of loading the EmptyBodyCheckingHttpInputMessage

class. The delay happens once after the idle period, then doesn't happen

afterwards even if further JAR scanning of a similar amount of JARs is done

when a different class is being loaded.



We haven't changed any Tomcat defaults around anything that would be related;

I did mess around with some resource caching settings to see if it'd make

any difference, but it did not.



We're not running in development mode.



If I idle the virtual machine and do other file related tasks, I don't notice

a long delay like we see here, which makes me think it isn't an issue with

the disk being slow after an idle; however, maybe I'm missing something

there, and that is the case.



The other side of it is, why is Tomcat doing the classload when the classes

don't seem to be unloaded at any point? I don't understand why the

classload is needed at all at this point.



- Dan
