On 7/2/25 10:22 AM, Daniel Sheridan wrote:
> Hi folks,
> 
> We're using Tomcat 10.1.40, but also seeing this issue with multiple Tomcat 9 
> versions, running on Windows Server 2019 and Server 2022 machines. We're 
> hosting a web app with a REST API, and encounter delays on requests when they 
> hit our REST API of 9-12 seconds. Easiest way to reproduce it is leaving the 
> web page (which utilizes the REST API) idle for 10+ minutes and then 
> initiating a request, but have seen it happen rarely with shorter idle 
> periods as well.
> 
> Have done extensive investigation into this, narrowed it down to a 
> classloading/jar scanning issue. Here's an excerpt from the Catalina logs 
> with loader logging set to ALL:
> 
> 25-Jun-2025 15:00:10.475 FINER [https-openssl-nio-443-exec-10] 
> org.apache.catalina.loader.WebappClassLoaderBase.loadClass 
> loadClass(org.springframework.web.servlet.mvc.method.annotation.AbstractMessageConverterMethodArgumentResolver$EmptyBodyCheckingHttpInputMessage,
>  false)
> 25-Jun-2025 15:00:10.477 FINER [https-openssl-nio-443-exec-10] 
> org.apache.catalina.loader.WebappClassLoaderBase.loadClass   Searching local 
> repositories
> 25-Jun-2025 15:00:10.478 FINER [https-openssl-nio-443-exec-10] 
> org.apache.catalina.loader.WebappClassLoaderBase.findClass     
> findClass(org.springframework.web.servlet.mvc.method.annotation.AbstractMessageConverterMethodArgumentResolver$EmptyBodyCheckingHttpInputMessage)
> 25-Jun-2025 15:00:10.478 FINER [https-openssl-nio-443-exec-10] 
> org.apache.catalina.loader.WebappClassLoaderBase.findClass       
> findClassInternal(org.springframework.web.servlet.mvc.method.annotation.AbstractMessageConverterMethodArgumentResolver$EmptyBodyCheckingHttpInputMessage)
> 25-Jun-2025 15:00:19.682 FINER [https-openssl-nio-443-exec-10] 
> org.apache.catalina.loader.WebappClassLoaderBase.findClass       Returning 
> class class 
> org.springframework.web.servlet.mvc.method.annotation.AbstractMessageConverterMethodArgumentResolver$EmptyBodyCheckingHttpInputMessage
> 25-Jun-2025 15:00:19.682 FINER [https-openssl-nio-443-exec-10] 
> org.apache.catalina.loader.WebappClassLoaderBase.findClass       Loaded by 
> ParallelWebappClassLoader
>    context: ROOT
>    delegate: false
> ----------> Parent Classloader:
> mailto:java.net.URLClassLoader@3b94d659
> 25-Jun-2025 15:00:19.683 FINER [https-openssl-nio-443-exec-10] 
> org.apache.catalina.loader.WebappClassLoaderBase.loadClass   Loading class 
> from local repository
> 
> As you can see, the majority of the delay is during the findClassInternal 
> call. I inspected the Tomcat process behaviour with ProcMon, and I can see 
> that the delay happens during the jar scan for the 
> AbstractMessageConverterMethodArgumentResolver class; we have 140+ dependency 
> jars and it's reading through most of them until it find the class; the 
> scanning takes much longer than regular jar scans for some reason. Mostly, 
> the delay is between QueryEAFile and QueryStandardInformationFile operations, 
> of which there are 1 each per jar scanned (though the operations themselves 
> have a reasonable duration, there is a delay there with no file operations 
> occurring).
> 
> AbstractMessageConverterMethodArgumentResolver IS loaded in when the 
> application is started, so it isn't the first time it's been loaded in. Note 
> that AbstractMessageConverterMethodArgumentResolver is not the only class 
> I've seen this happen with, it doesn't appear to be related to specific 
> classes/jars, I've seen the class that the jar scan is initiated for be 
> different depending on the endpoint the request is sent to.
> 
> JVM classloading/unloading and GC logging doesn't reveal any 
> cleanup/unloading happening that might necessitate the new classloading. I am 
> not having this issue occur when I run our web app on a non-Tomcat web 
> server, which makes me think the issue is Tomcat related.
> 
> My main question is, what inside Tomcat would force the classloading if there 
> isn't any sign of the classes being unloaded beforehand? Is there some kind 
> of periodic refresh of idle/'cold' classes I'm missing?
> 
> Also, would there be any suggestions about why the jar scan in this case 
> takes so long compared to the other regular jar scans Tomcat does? I 
> understand this might have other causes (like on the I/O side), but maybe 
> there's something relevant here on the Tomcat side.

Is there any chance you could produce a Java + native flame graph of this 
behavior? It will help hone in on exactly where the problem is.

It could be some poorly-performing locks in Tomcat or the JVM. Or it could be 
poorly-sized caches that need to be refilled with an expensive process. Or it 
could be a failing disk. A flame graph will likely pinpoint that.

-chris


---------------------------------------------------------------------
To unsubscribe, e-mail: users-unsubscr...@tomcat.apache.org
For additional commands, e-mail: users-h...@tomcat.apache.org


I've put together a Java flame graph that covers slightly before, during, and 
slightly after the delay using JDK Mission Control and a JFR recording, see 
here: https://imgur.com/a/7cTmSBu. The first image is the entire flame graph, 
the second is zoomed in on the latter portion that contains the jar scan method 
calls. Unfortunately, our web app is Windows only, and from what I researched 
it didn't look like there was a good method on Windows to generate a Java + 
native flame graph (I did try PrefView but the resulting flame graph was fairly 
useless; if you need more info and have suggestions on how to get a more 
detailed flame graph do let me know). 

We can rule out the issue being a failing disk, as I've reproduced this on 
multiple different machines with no disk issues.

- Dan

---------------------------------------------------------------------
To unsubscribe, e-mail: users-unsubscr...@tomcat.apache.org
For additional commands, e-mail: users-h...@tomcat.apache.org

Reply via email to