Am 17.11.2019 um 17:21 schrieb Konstantin Kolinko:
вс, 17 нояб. 2019 г. в 18:40, Rainer Jung <rainer.j...@kippdata.de>:

I noticed a slowness during Webapp startup on the Winows platform.
Investigation showed, that about three seconds of the webapp startup
time - roughly 1/3 of the total time - was due to
File.getCanonicalPath(). That call seems to be especially expensive on
Windows, because it iterates over each component of the path in question
and does at least three system calls on each of them. So if you app is
installed in a somewhat deeply nested path, getCanonicalPath() gets
especially expensive.


Just as a reminder:
getCanonicalPath() is necessary to deal with file system on Windows
being case-insensitive. (Removing those calls is likely to lead to
exploitable security issues.) On case-sensitive filesystems one can
configure <Resources allowLinking="true"/> in Tomcat 8.5/9.0 and some
of those checks will be disabled.

http://tomcat.apache.org/tomcat-8.5-doc/config/resources.html

In Tomcat 7.0 the same configuration flag exists on Context,

http://tomcat.apache.org/tomcat-7.0-doc/config/context.html

Yes, setting allowLinking="true" fixes the performance impact but is not a real option on Windows, where the problem was observed. It wasn't observed on Linux, I guess the performance penalty of getCanonicalPath() on that platform is smaller (but didn't check).

One can check this with a simple test app and running the ProcessMonitpr
from SysInternals, which can nicely trace file system operations.

I noticed that by far the most occurrences happened during webapp class
loading. StandardRoot initializes resources to contain a DirResourceSet
first and then a JarResourceSet per WEB-INF/lib jar file.

Now any class loading (of not yet loaded classes) goes through the
following stack (this is for TC 8.5):

...
java.base@13.0.1/java.io.File.getCanonicalPath(File.java:618)
org.apache.catalina.webresources.AbstractFileResourceSet.file(AbstractFileResourceSet.java:90)
org.apache.catalina.webresources.DirResourceSet.getResource(DirResourceSet.java:101)
org.apache.catalina.webresources.StandardRoot.getResourceInternal(StandardRoot.java:281)
org.apache.catalina.webresources.Cache.getResource(Cache.java:62)
org.apache.catalina.webresources.StandardRoot.getResource(StandardRoot.java:216)
org.apache.catalina.webresources.StandardRoot.getClassLoaderResource(StandardRoot.java:225)
org.apache.catalina.loader.WebappClassLoaderBase.findClassInternal(WebappClassLoaderBase.java:2279)
org.apache.catalina.loader.WebappClassLoaderBase.findClass(WebappClassLoaderBase.java:857)
org.apache.catalina.loader.WebappClassLoaderBase.loadClass(WebappClassLoaderBase.java:1329)
- locked (a java.lang.Object@1797c4d) index 11 frame
org.apache.catalina.loader.WebappClassLoaderBase.loadClass(WebappClassLoaderBase.java:1329)
org.apache.catalina.loader.WebappClassLoaderBase.loadClass(WebappClassLoaderBase.java:1182)
...

In our case, the webapp does not have a WEB-INF/classes directory.
Nevertheless TC always goes through the DirResourceSet running the
expensive getCanonicalPath(), returning an EmptyResource and then
looking for the class and finding it in one of the jar file resources.
This happens about 5000 times (due to the number of classes to load).

I know that we have to do this in general, because WEB-INF/classes have
precedence over WEB-INF/lib, but maybe we could optimize in the case we
are looking for a class but WEB-INF/classes does not even exist? Would
this be a reasonable general optimization?

There are two scenarios:
(a) changes are expected: the web application classes can be updated
at runtime (with new classes being added),
(b) no changes: the web application classes are not updated at runtime.

(b) is certainly the case when the application runs in production mode
(when JSPs cannot be updated either)

(a) may be useful in some scenarios. Maybe in production if some
classes are added to a web application at run time as plugins.

I think that (b) is the case even when running in development mode, as
we are talking about classes, not JSPs here. IIRC Eclipse IDE restarts
a web application when its classes are updated and recompiled by the
IDE. I think that (a) is not used in development.
What do others think about it?
I remember some questions about hot-swapping of classes (for sake of
debugging), but Tomcat does not have such feature yet.

I think we can assume (b) by default, but have a flag to specify that
it is really (a).

If it is the (b) case, a single call to File.exists() /
File.isDirectory() is sufficient to remember "the classes directory
does not exist" and stop further lookups.

+1

If if is the (a) case, a call to call File.exists() /
File.isDirectory() can be added as an optimization before calling
getCanonicalPath().

+1

I think that we cannot remember "the classes directory does not exist"
fact without having a flag to disable such feature.

+1

Looking into AbstractFileResourceSet.file() from the above stacktrace,
some check are already there, e.g. "mustExist" flag. I see that
DirResourceSet.getResource(DirResourceSet.java:101) calls file(...,
false), so "mustExist" is false. It is just a first glance, I have not
dug further yet.

This is correct. A question is, whether all of the checks and normalizations in the default DirResourceSet are need for a resource with path starting with WEB-INF/classes, if WEB-INF/classes does not exist. AbstractFileResourceSet in my case does return a non-null File object after the (expensive) checks, which then is wrapped by DirResourceSet into an EmptyResource because the file does not exist.

Later is is found in the jar files, so the EmptyResource isn't used.

If the class does not get found in one of the jars, one would probably then have to create an appropriate EmptyResource to return.

Regards,

Rainer

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@tomcat.apache.org
For additional commands, e-mail: dev-h...@tomcat.apache.org

Reply via email to