Re: Possible optimization for class loading?
On 17/11/2019 16:34, Rainer Jung wrote: > Am 17.11.2019 um 17:21 schrieb Konstantin Kolinko: >> Looking into AbstractFileResourceSet.file() from the above stacktrace, >> some check are already there, e.g. "mustExist" flag. I see that >> DirResourceSet.getResource(DirResourceSet.java:101) calls file(..., >> false), so "mustExist" is false. It is just a first glance, I have not >> dug further yet. > > This is correct. A question is, whether all of the checks and > normalizations in the default DirResourceSet are need for a resource > with path starting with WEB-INF/classes, if WEB-INF/classes does not > exist. AbstractFileResourceSet in my case does return a non-null File > object after the (expensive) checks, which then is wrapped by > DirResourceSet into an EmptyResource because the file does not exist. > > Later is is found in the jar files, so the EmptyResource isn't used. > > If the class does not get found in one of the jars, one would probably > then have to create an appropriate EmptyResource to return. I think there is scope for some optimisation here. I only took a quick look but I couldn't see a way to do it without some changes to the API (I'm fine with changes but was hoping to be able to do it without). Mark - To unsubscribe, e-mail: dev-unsubscr...@tomcat.apache.org For additional commands, e-mail: dev-h...@tomcat.apache.org
Re: Possible optimization for class loading?
Am 17.11.2019 um 17:21 schrieb Konstantin Kolinko: вс, 17 нояб. 2019 г. в 18:40, Rainer Jung : I noticed a slowness during Webapp startup on the Winows platform. Investigation showed, that about three seconds of the webapp startup time - roughly 1/3 of the total time - was due to File.getCanonicalPath(). That call seems to be especially expensive on Windows, because it iterates over each component of the path in question and does at least three system calls on each of them. So if you app is installed in a somewhat deeply nested path, getCanonicalPath() gets especially expensive. Just as a reminder: getCanonicalPath() is necessary to deal with file system on Windows being case-insensitive. (Removing those calls is likely to lead to exploitable security issues.) On case-sensitive filesystems one can configure in Tomcat 8.5/9.0 and some of those checks will be disabled. http://tomcat.apache.org/tomcat-8.5-doc/config/resources.html In Tomcat 7.0 the same configuration flag exists on Context, http://tomcat.apache.org/tomcat-7.0-doc/config/context.html Yes, setting allowLinking="true" fixes the performance impact but is not a real option on Windows, where the problem was observed. It wasn't observed on Linux, I guess the performance penalty of getCanonicalPath() on that platform is smaller (but didn't check). One can check this with a simple test app and running the ProcessMonitpr from SysInternals, which can nicely trace file system operations. I noticed that by far the most occurrences happened during webapp class loading. StandardRoot initializes resources to contain a DirResourceSet first and then a JarResourceSet per WEB-INF/lib jar file. Now any class loading (of not yet loaded classes) goes through the following stack (this is for TC 8.5): ... java.base@13.0.1/java.io.File.getCanonicalPath(File.java:618) org.apache.catalina.webresources.AbstractFileResourceSet.file(AbstractFileResourceSet.java:90) org.apache.catalina.webresources.DirResourceSet.getResource(DirResourceSet.java:101) org.apache.catalina.webresources.StandardRoot.getResourceInternal(StandardRoot.java:281) org.apache.catalina.webresources.Cache.getResource(Cache.java:62) org.apache.catalina.webresources.StandardRoot.getResource(StandardRoot.java:216) org.apache.catalina.webresources.StandardRoot.getClassLoaderResource(StandardRoot.java:225) org.apache.catalina.loader.WebappClassLoaderBase.findClassInternal(WebappClassLoaderBase.java:2279) org.apache.catalina.loader.WebappClassLoaderBase.findClass(WebappClassLoaderBase.java:857) org.apache.catalina.loader.WebappClassLoaderBase.loadClass(WebappClassLoaderBase.java:1329) - locked (a java.lang.Object@1797c4d) index 11 frame org.apache.catalina.loader.WebappClassLoaderBase.loadClass(WebappClassLoaderBase.java:1329) org.apache.catalina.loader.WebappClassLoaderBase.loadClass(WebappClassLoaderBase.java:1182) ... In our case, the webapp does not have a WEB-INF/classes directory. Nevertheless TC always goes through the DirResourceSet running the expensive getCanonicalPath(), returning an EmptyResource and then looking for the class and finding it in one of the jar file resources. This happens about 5000 times (due to the number of classes to load). I know that we have to do this in general, because WEB-INF/classes have precedence over WEB-INF/lib, but maybe we could optimize in the case we are looking for a class but WEB-INF/classes does not even exist? Would this be a reasonable general optimization? There are two scenarios: (a) changes are expected: the web application classes can be updated at runtime (with new classes being added), (b) no changes: the web application classes are not updated at runtime. (b) is certainly the case when the application runs in production mode (when JSPs cannot be updated either) (a) may be useful in some scenarios. Maybe in production if some classes are added to a web application at run time as plugins. I think that (b) is the case even when running in development mode, as we are talking about classes, not JSPs here. IIRC Eclipse IDE restarts a web application when its classes are updated and recompiled by the IDE. I think that (a) is not used in development. What do others think about it? I remember some questions about hot-swapping of classes (for sake of debugging), but Tomcat does not have such feature yet. I think we can assume (b) by default, but have a flag to specify that it is really (a). If it is the (b) case, a single call to File.exists() / File.isDirectory() is sufficient to remember "the classes directory does not exist" and stop further lookups. +1 If if is the (a) case, a call to call File.exists() / File.isDirectory() can be added as an optimization before calling getCanonicalPath(). +1 I think that we cannot remember "the classes directory does not exist" fact without having a flag to disable such feature. +1 Looking into AbstractFileResourceSet.file() from the above stacktrace, some check are already there,
Re: Possible optimization for class loading?
вс, 17 нояб. 2019 г. в 18:40, Rainer Jung : > > I noticed a slowness during Webapp startup on the Winows platform. > Investigation showed, that about three seconds of the webapp startup > time - roughly 1/3 of the total time - was due to > File.getCanonicalPath(). That call seems to be especially expensive on > Windows, because it iterates over each component of the path in question > and does at least three system calls on each of them. So if you app is > installed in a somewhat deeply nested path, getCanonicalPath() gets > especially expensive. > Just as a reminder: getCanonicalPath() is necessary to deal with file system on Windows being case-insensitive. (Removing those calls is likely to lead to exploitable security issues.) On case-sensitive filesystems one can configure in Tomcat 8.5/9.0 and some of those checks will be disabled. http://tomcat.apache.org/tomcat-8.5-doc/config/resources.html In Tomcat 7.0 the same configuration flag exists on Context, http://tomcat.apache.org/tomcat-7.0-doc/config/context.html > One can check this with a simple test app and running the ProcessMonitpr > from SysInternals, which can nicely trace file system operations. > > I noticed that by far the most occurrences happened during webapp class > loading. StandardRoot initializes resources to contain a DirResourceSet > first and then a JarResourceSet per WEB-INF/lib jar file. > > Now any class loading (of not yet loaded classes) goes through the > following stack (this is for TC 8.5): > > ... > java.base@13.0.1/java.io.File.getCanonicalPath(File.java:618) > org.apache.catalina.webresources.AbstractFileResourceSet.file(AbstractFileResourceSet.java:90) > org.apache.catalina.webresources.DirResourceSet.getResource(DirResourceSet.java:101) > org.apache.catalina.webresources.StandardRoot.getResourceInternal(StandardRoot.java:281) > org.apache.catalina.webresources.Cache.getResource(Cache.java:62) > org.apache.catalina.webresources.StandardRoot.getResource(StandardRoot.java:216) > org.apache.catalina.webresources.StandardRoot.getClassLoaderResource(StandardRoot.java:225) > org.apache.catalina.loader.WebappClassLoaderBase.findClassInternal(WebappClassLoaderBase.java:2279) > org.apache.catalina.loader.WebappClassLoaderBase.findClass(WebappClassLoaderBase.java:857) > org.apache.catalina.loader.WebappClassLoaderBase.loadClass(WebappClassLoaderBase.java:1329) > - locked (a java.lang.Object@1797c4d) index 11 frame > org.apache.catalina.loader.WebappClassLoaderBase.loadClass(WebappClassLoaderBase.java:1329) > org.apache.catalina.loader.WebappClassLoaderBase.loadClass(WebappClassLoaderBase.java:1182) > ... > > In our case, the webapp does not have a WEB-INF/classes directory. > Nevertheless TC always goes through the DirResourceSet running the > expensive getCanonicalPath(), returning an EmptyResource and then > looking for the class and finding it in one of the jar file resources. > This happens about 5000 times (due to the number of classes to load). > > I know that we have to do this in general, because WEB-INF/classes have > precedence over WEB-INF/lib, but maybe we could optimize in the case we > are looking for a class but WEB-INF/classes does not even exist? Would > this be a reasonable general optimization? There are two scenarios: (a) changes are expected: the web application classes can be updated at runtime (with new classes being added), (b) no changes: the web application classes are not updated at runtime. (b) is certainly the case when the application runs in production mode (when JSPs cannot be updated either) (a) may be useful in some scenarios. Maybe in production if some classes are added to a web application at run time as plugins. I think that (b) is the case even when running in development mode, as we are talking about classes, not JSPs here. IIRC Eclipse IDE restarts a web application when its classes are updated and recompiled by the IDE. I think that (a) is not used in development. What do others think about it? I remember some questions about hot-swapping of classes (for sake of debugging), but Tomcat does not have such feature yet. I think we can assume (b) by default, but have a flag to specify that it is really (a). If it is the (b) case, a single call to File.exists() / File.isDirectory() is sufficient to remember "the classes directory does not exist" and stop further lookups. If if is the (a) case, a call to call File.exists() / File.isDirectory() can be added as an optimization before calling getCanonicalPath(). I think that we cannot remember "the classes directory does not exist" fact without having a flag to disable such feature. Looking into AbstractFileResourceSet.file() from the above stacktrace, some check are already there, e.g. "mustExist" flag. I see that DirResourceSet.getResource(DirResourceSet.java:101) calls file(..., false), so "mustExist" is false. It is just a first glance, I have not dug further yet. Best regards, Konstantin Kolinko
Possible optimization for class loading?
I noticed a slowness during Webapp startup on the Winows platform. Investigation showed, that about three seconds of the webapp startup time - roughly 1/3 of the total time - was due to File.getCanonicalPath(). That call seems to be especially expensive on Windows, because it iterates over each component of the path in question and does at least three system calls on each of them. So if you app is installed in a somewhat deeply nested path, getCanonicalPath() gets especially expensive. One can check this with a simple test app and running the ProcessMonitpr from SysInternals, which can nicely trace file system operations. I noticed that by far the most occurrences happened during webapp class loading. StandardRoot initializes resources to contain a DirResourceSet first and then a JarResourceSet per WEB-INF/lib jar file. Now any class loading (of not yet loaded classes) goes through the following stack (this is for TC 8.5): ... java.base@13.0.1/java.io.File.getCanonicalPath(File.java:618) org.apache.catalina.webresources.AbstractFileResourceSet.file(AbstractFileResourceSet.java:90) org.apache.catalina.webresources.DirResourceSet.getResource(DirResourceSet.java:101) org.apache.catalina.webresources.StandardRoot.getResourceInternal(StandardRoot.java:281) org.apache.catalina.webresources.Cache.getResource(Cache.java:62) org.apache.catalina.webresources.StandardRoot.getResource(StandardRoot.java:216) org.apache.catalina.webresources.StandardRoot.getClassLoaderResource(StandardRoot.java:225) org.apache.catalina.loader.WebappClassLoaderBase.findClassInternal(WebappClassLoaderBase.java:2279) org.apache.catalina.loader.WebappClassLoaderBase.findClass(WebappClassLoaderBase.java:857) org.apache.catalina.loader.WebappClassLoaderBase.loadClass(WebappClassLoaderBase.java:1329) - locked (a java.lang.Object@1797c4d) index 11 frame org.apache.catalina.loader.WebappClassLoaderBase.loadClass(WebappClassLoaderBase.java:1329) org.apache.catalina.loader.WebappClassLoaderBase.loadClass(WebappClassLoaderBase.java:1182) ... In our case, the webapp does not have a WEB-INF/classes directory. Nevertheless TC always goes through the DirResourceSet running the expensive getCanonicalPath(), returning an EmptyResource and then looking for the class and finding it in one of the jar file resources. This happens about 5000 times (due to the number of classes to load). I know that we have to do this in general, because WEB-INF/classes have precedence over WEB-INF/lib, but maybe we could optimize in the case we are looking for a class but WEB-INF/classes does not even exist? Would this be a reasonable general optimization? Unfortunately the ResourceSet classes can not be easily extended. Since most of their fields are private, one would have to fork lots of code. It would be much easier to add the feature to the existing implementations. Regards, Rainer - To unsubscribe, e-mail: dev-unsubscr...@tomcat.apache.org For additional commands, e-mail: dev-h...@tomcat.apache.org