Hi Axel,

Thanks for the explanation and the code to reproduce the problem.
I’m looking at it right now.

Hannes

> On 27.06.2016 at 23:53, Axel Faust <axel.faus...@googlemail.com> wrote:
>
> Hello,
>
> TL;DR: I use custom URL protocol schemes and stream handlers that are not
> globally registered. This causes excessive handler resolution overhead in
> URL.getURLStreamHandler(), called implicitly in Source.baseURL(). I can't
> find a way to avoid this overhead (in JDK 1.8.0_71) without resorting to
> one of two impossible choices: a complete refactoring or registering a
> JVM-global URLStreamHandlerFactory.
> A test case for sampling the overhead is provided in
> https://gist.github.com/AFaust/04ec0c65a560e306b6b547dcaf38fd21
>
>
>
> This is a follow-up to a tweet of mine from yesterday:
> https://twitter.com/ReluctantBird83/status/747145726703075328
> In this tweet I was commenting on an observation I made from CPU sampling
> the current state of my Nashorn-based script engine for the open source ECM
> platform Alfresco
> (https://github.com/AFaust/alfresco-nashorn-script-engine).
>
> What prompted the comment were the following hot spot methods from my
> jvisualvm sampling session, when I was testing a trivial ReST endpoint
> backed by a Nashorn-executed script:
>
> "Hot Spots - Method","Self Time [%]","Self Time","Self Time (CPU)","Total Time","Total Time (CPU)","Samples"
> "java.lang.invoke.LambdaForm$MH.771977685.linkToCallSite()","15.152575","793.365 ms","793.365 ms","1126.483 ms","1126.483 ms","63"
> "java.net.URL.<init>()","11.350913","594.316 ms","594.316 ms","594.316 ms","594.316 ms","33"
> "java.lang.Throwable.<init>()","7.248728","379.532 ms","379.532 ms","379.532 ms","379.532 ms","21"
> [...]
> "jdk.nashorn.internal.runtime.Source.baseURL()","0.0","0.0 ms","0.0 ms","594.316 ms","594.316 ms","33"
> [...]
>
> The 1st and 3rd hot spots are directly related to frequently called code in
> my scripts / my utilities and somewhat expected, but I was not expecting
> the URL constructor to be up there.
> The backtraces view of the snapshot showed Source.baseURL() as the
> immediate and only caller of the URL constructor, even though I have other
> calls in my code which apparently don't trigger the sampling threshold.
> The total time per execution of the script is around 50-60ms with a few
> outliers up to 90-100ms (sampling started only after a reasonably stable
> state was reached). Sampling was limited specifically to the jdk.nashorn.*,
> jdk.internal.* and de.* packages.
>
> A bit of background on my Alfresco Nashorn engine:
> - embedded into a web application that may potentially run in Tomcat or JEE
>   servers (JBoss, WebSphere...)
> - JavaScript in Alfresco is extensively used for embedded rules, policies
>   (event handling), ReST API endpoints and server-side UI pre-composition
> - use of an AMD-like module system allowing flexible extension of the script
>   API by 3rd party developers of Alfresco "addons"
> - one file per module, lazily loaded when required by another module or
>   an executed script
> - frequently used "core" modules will be pre-loaded and cached on startup
> - scripts are referenced via "logical" URLs using custom protocol schemes
>   to denote different script resolution and load scopes/mechanisms (example:
>   "webscript:///my/module/id" for a module in the lookup scope for ReST
>   endpoint scripts; some scripts may be user-managed within the content
>   repository / database itself)
> - custom protocol schemes are handled by custom URL stream handlers *NOT*
>   globally registered (to avoid interfering with other web applications or
>   other URL-related functionality in the same JVM)
>
>
> It turns out that the last two points are essential. I created a
> generalised test case in a GitHub gist:
> https://gist.github.com/AFaust/04ec0c65a560e306b6b547dcaf38fd21
> Essentially, it is URL.getURLStreamHandler() that is responsible for the
> overhead.
> Source.baseURL() creates a "base" name from the source URL, and if the
> protocol is not "file" then a new URL will be created. Since I use custom
> URL stream handlers and have not registered a global stream handler factory
> (and won't ever do so), the new URL will try to resolve the handler via
> URL.getURLStreamHandler(), jump through all the hoops and always fail in
> the end. A failed resolution is never cached, so every time
> Source.baseURL() is called this whole process / overhead is repeated.
>
>
> I am currently trying to reduce all global overheads of my script engine
> setup, but can't find a way to avoid this overhead without either
> registering a global URL stream handler factory, which is out of the
> question for various reasons (web application; 3rd party loaders;
> engine-specific semantics), or completely refactoring the engine so all
> scripts are copied to simple "file://" URLs before execution (requiring
> constant sync-checking with the original script in the source storage
> location).
>
> Ideally, I would like to see options to provide both a base URL myself as
> pre-resolved information via URLReader/Global.load() and register a custom
> stream handler factory with my Nashorn engine instance. This would allow
> "simple" loaders to use simple URL strings instead of real URL instances to
> load script files via Global.load(), as well as "complex" loaders to
> continue using stateful custom URL stream handlers where necessary. And it
> would allow Nashorn to resolve a potential custom URL stream handler before
> relegating to default JVM-global handling if no handler is found.
>
> I am sure I am not aware of all the implications - and certainly I am aware
> that such a change in a core class might be impossible - but
> URL.getURLStreamHandler() should really cache failed stream handler
> resolutions and avoid repeating the entire lookup routine...
>
>
> Kind regards, and sorry for this overly long "summary"
> Axel Faust
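
For reference, the lookup behaviour described above can be sketched roughly as follows. This is only an illustration, not code from the gist or from Nashorn: it assumes that the base URL is rebuilt through a URL constructor that takes no handler argument, and the names HandlerLookupSketch, NoOpHandler, withHandler and baseWithoutHandler are hypothetical. It shows that a URL with a custom scheme can be created when the handler is passed explicitly, while rebuilding it without a handler forces a lookup by protocol name that fails on every call.

```java
import java.io.IOException;
import java.net.MalformedURLException;
import java.net.URL;
import java.net.URLConnection;
import java.net.URLStreamHandler;

class HandlerLookupSketch {

    /** Hypothetical stand-in for a custom stream handler that is NOT globally registered. */
    static final class NoOpHandler extends URLStreamHandler {
        @Override
        protected URLConnection openConnection(URL u) throws IOException {
            throw new IOException("opening connections is out of scope for this sketch");
        }
    }

    /** Builds a URL with an explicitly supplied handler, so no global lookup is needed. */
    static URL withHandler(String spec, URLStreamHandler handler) {
        try {
            return new URL(null, spec, handler);
        } catch (MalformedURLException e) {
            throw new IllegalArgumentException(e);
        }
    }

    /**
     * Assumed approximation of the "base URL" step: strip the last path segment
     * and rebuild the URL WITHOUT a handler, which forces a handler lookup by
     * protocol name. Returns null when that lookup fails.
     */
    static String baseWithoutHandler(URL url) {
        String path = url.getPath();
        String basePath = path.substring(0, path.lastIndexOf('/') + 1);
        try {
            return new URL(url.getProtocol(), url.getHost(), url.getPort(), basePath).toString();
        } catch (MalformedURLException e) {
            return null; // unknown protocol: no handler could be resolved
        }
    }

    public static void main(String[] args) {
        URL script = withHandler("webscript:///my/module/id", new NoOpHandler());
        System.out.println(script.getProtocol() + " " + script.getPath());
        // Every iteration repeats the full handler resolution and fails again.
        for (int i = 0; i < 3; i++) {
            System.out.println(baseWithoutHandler(script));
        }
    }
}
```

If that assumption holds, the explicit-handler constructor is the only call in this sketch that avoids the global lookup, which is why an option to hand Nashorn a pre-resolved base URL or a handler factory would sidestep the repeated failures.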