Re: [basex-talk] Reflect.forName() / Performance
Hi Christian, > Your hint was helpful: If a function signature is not registered yet > when it is parsed, all kind of lookups are performed to locate the > function. This is e.g. the case if the function is recursive, or if > the invoked functions occurs after the currently parsed code. > > I have revised the code which is responsible for locating invoked > functions: In the JavaFunction class [1], the lookup will be avoided > if we can tell in advance that no Java class will be found. I did some non-scientific testing, the results are quite awesome. This is with one of our regression test suites, in this case we create 600 databases with exactly one (small) file, run a few queries against it (typically 3), and our business logic interprets the results (the use case is to check if some business rules are violated in the xml). Here are the results, the numbers are seconds to execute the full test suite: | | BaseX 8.6.1 | BaseX 9.0.2 | BaseX 9.1 Snapshot | |-+-+-+| | 1 |42.5 |44.5 | 20.3 | | 2 |42.5 |44.4 | 18.1 | | 3 |42.9 |44.6 | 18.1 | | 4 |42.5 |44.8 | 18.2 | | 5 |42.5 |43.8 | 18.6 | |-+-+-+| | Avg |42.6 |44.4 | 18.7 | So our case is now more than twice as fast as before :) I did some profiling/sampling with JProfiler as well, the Class.forName call is now almost completely gone, while it was the most-hit method before. I guess that means that improving the caching of classes is not needed, since you fixed the core problem. Thanks a lot, this really helps us a lot! -tom
Re: [basex-talk] Reflect.forName() / Performance
Hi Tom, Thanks for the XQuery example. Your hint was helpful: If a function signature is not registered yet when it is parsed, all kind of lookups are performed to locate the function. This is e.g. the case if the function is recursive, or if the invoked functions occurs after the currently parsed code. I have revised the code which is responsible for locating invoked functions: In the JavaFunction class [1], the lookup will be avoided if we can tell in advance that no Java class will be found. A new snapshot is online [2]; looking forward to your profiling results, Christian [1] https://github.com/BaseXdb/basex/commit/878c70fa3e1ef8ec6ab1f08b31f83e3157fc04c0#diff-2593cd6d13416ce7a77f1fba8f42a177R197 [2] http://files.basex.org/releases/latest/ > Given the following module (installed with module install foo.xqm): > > module namespace uc = 'http://unifits.com/common'; > declare function uc:remove-elements($input as element(), $remove-names as > xs:string*) as element() { > element {node-name($input) } > {$input/@*, > for $child in $input/node()[not(local-name() = $remove-names)] > return > if ($child instance of element()) > then uc:remove-elements($child, $remove-names) > else $child >} > }; > > > if I start BaseXGui in debug mode, and set a breakpoint in > Reflect.forName(), every time I execute a query such as > > import module namespace uc = 'http://unifits.com/common'; > / > > the breakpoint is hit, i.e. Class.forName() is called. > I *think* this might have to do with the fact that the function above is > recursive, but I have to admit that I don't really grasp the code that > does the module loading/parsing. > > Thanks, > -tom
Re: [basex-talk] Reflect.forName() / Performance
Hi Christian, thanks for your feedback, I hope you're doing well! > I have some concerns that the caching of non-existing classes could be > exploited and bloat the cache. Maybe we’d need to use WeakHashMap > (and/or soft references) instead? I didn't think about that, I'll try to come up with a better solution. >> I'm not sure why BaseX tries to load our xqm as Java Modules, but >> what I noticed is that Reflect.forName caches the positive case >> (i.e., the class is found), but not the negative case (i.e., the class is >> not found). > > Sounds like an interesting finding; maybe there’s something we can > optimize here. Could you possibly provide us a little self-contained > example that demonstrates the behavior? Sure. This is what I'm doing: Given the following module (installed with module install foo.xqm): module namespace uc = 'http://unifits.com/common'; declare function uc:remove-elements($input as element(), $remove-names as xs:string*) as element() { element {node-name($input) } {$input/@*, for $child in $input/node()[not(local-name() = $remove-names)] return if ($child instance of element()) then uc:remove-elements($child, $remove-names) else $child } }; if I start BaseXGui in debug mode, and set a breakpoint in Reflect.forName(), every time I execute a query such as import module namespace uc = 'http://unifits.com/common'; / the breakpoint is hit, i.e. Class.forName() is called. I *think* this might have to do with the fact that the function above is recursive, but I have to admit that I don't really grasp the code that does the module loading/parsing. Thanks, -tom
Re: [basex-talk] Reflect.forName() / Performance
Hi Tom, Thanks for your observation and your code proposal; always welcome. I have some concerns that the caching of non-existing classes could be exploited and bloat the cache. Maybe we’d need to use WeakHashMap (and/or soft references) instead? > I'm not sure why BaseX tries to load our xqm as Java Modules, but what I > noticed is that Reflect.forName caches the positive case (i.e., the class is > found), but not the negative case (i.e., the class is not found). Sounds like an interesting finding; maybe there’s something we can optimize here. Could you possibly provide us a little self-contained example that demonstrates the behavior? Thanks in advance, Christian On Wed, Oct 17, 2018 at 9:53 AM Tom Rauchenwald (UNIFITS) wrote: > > Hi BaseX-Team, > > > when profiling some of our tests i found that we spend some time in > Reflect.forName(). > > We have 2 xquery modules in the repo (we don't call java code directly). > > > I'm not sure why BaseX tries to load our xqm as Java Modules, but what I > noticed is that Reflect.forName caches the positive case (i.e., the class is > found), but not the negative case (i.e., the class is not found). > > I've changed the code to cache the negative case as well (see below), and > noticed an improvement of about 5 percent. > > Our tests create and query loads of small databases, so this is maybe quite > an artificial speedup. > > > I could provide a PR if this is a worthwhile improvement in your opinion (and > if I'm not missing something obvious). > > > We're still on BaseX 8.7.6 in case that matters, as far as I could see the > Code didn't change in BaseX 9. > > > Thanks, > > Tom > > > Code: > > > public static Class forName(final String name) throws > ClassNotFoundException { > Class c = CLASSES.get(name); > > if(c == null) { > if (CLASSES.containsKey(name)) { > throw new ClassNotFoundException(name); > } else { > try { > c = Class.forName(name); > } catch (ClassNotFoundException e) { > CLASSES.put(name, null); > throw e; > } > if (!Modifier.isPublic(c.getModifiers())) throw new > ClassNotFoundException(name); > CLASSES.put(name, c); > } > } > return c; > } >
[basex-talk] Reflect.forName() / Performance
Hi BaseX-Team, when profiling some of our tests i found that we spend some time in Reflect.forName(). We have 2 xquery modules in the repo (we don't call java code directly). I'm not sure why BaseX tries to load our xqm as Java Modules, but what I noticed is that Reflect.forName caches the positive case (i.e., the class is found), but not the negative case (i.e., the class is not found). I've changed the code to cache the negative case as well (see below), and noticed an improvement of about 5 percent. Our tests create and query loads of small databases, so this is maybe quite an artificial speedup. I could provide a PR if this is a worthwhile improvement in your opinion (and if I'm not missing something obvious). We're still on BaseX 8.7.6 in case that matters, as far as I could see the Code didn't change in BaseX 9. Thanks, Tom Code: public static Class forName(final String name) throws ClassNotFoundException { Class c = CLASSES.get(name); if(c == null) { if (CLASSES.containsKey(name)) { throw new ClassNotFoundException(name); } else { try { c = Class.forName(name); } catch (ClassNotFoundException e) { CLASSES.put(name, null); throw e; } if (!Modifier.isPublic(c.getModifiers())) throw new ClassNotFoundException(name); CLASSES.put(name, c); } } return c; }