Re: [basex-talk] Reflect.forName() / Performance

2018-10-23 Thread Tom Rauchenwald
Hi Christian,

> Your hint was helpful: If a function signature is not registered yet
> when it is parsed, all kind of lookups are performed to locate the
> function. This is e.g. the case if the function is recursive, or if
> the invoked functions occurs after the currently parsed code.
>
> I have revised the code which is responsible for locating invoked
> functions: In the JavaFunction class [1], the lookup will be avoided
> if we can tell in advance that no Java class will be found.

I did some non-scientific testing, the results are quite awesome.
This is with one of our regression test suites, in this case we create
600 databases with exactly one (small) file, run a few queries against
it (typically 3), and our business logic interprets the results (the use
case is to check if some business rules are violated in the xml).

Here are the results, the numbers are seconds to execute the full test suite:

| | BaseX 8.6.1 | BaseX 9.0.2 | BaseX 9.1 Snapshot |
|-+-+-+|
|   1 |42.5 |44.5 |   20.3 |
|   2 |42.5 |44.4 |   18.1 |
|   3 |42.9 |44.6 |   18.1 |
|   4 |42.5 |44.8 |   18.2 |
|   5 |42.5 |43.8 |   18.6 |
|-+-+-+|
| Avg |42.6 |44.4 |   18.7 |

So our case is now more than twice as fast as before :)

I did some profiling/sampling with JProfiler as well, the Class.forName
call is now almost completely gone, while it was the most-hit method
before.

I guess that means that improving the caching of classes is not needed,
since you fixed the core problem.

Thanks a lot, this really helps us a lot!
-tom



Re: [basex-talk] Reflect.forName() / Performance

2018-10-19 Thread Christian Grün
Hi Tom,

Thanks for the XQuery example.

Your hint was helpful: If a function signature is not registered yet
when it is parsed, all kind of lookups are performed to locate the
function. This is e.g. the case if the function is recursive, or if
the invoked functions occurs after the currently parsed code.

I have revised the code which is responsible for locating invoked
functions: In the JavaFunction class [1], the lookup will be avoided
if we can tell in advance that no Java class will be found.

A new snapshot is online [2]; looking forward to your profiling results,
Christian

[1] 
https://github.com/BaseXdb/basex/commit/878c70fa3e1ef8ec6ab1f08b31f83e3157fc04c0#diff-2593cd6d13416ce7a77f1fba8f42a177R197
[2] http://files.basex.org/releases/latest/




> Given the following module (installed with module install foo.xqm):
>
> module namespace uc = 'http://unifits.com/common';
> declare function uc:remove-elements($input as element(), $remove-names as 
> xs:string*) as element() {
>   element {node-name($input) }
>   {$input/@*,
> for $child in $input/node()[not(local-name() = $remove-names)]
> return
>   if ($child instance of element())
>   then uc:remove-elements($child, $remove-names)
>   else $child
>}
> };
>
>
> if I start BaseXGui in debug mode, and set a breakpoint in
> Reflect.forName(), every time I execute a query such as
>
> import module namespace uc = 'http://unifits.com/common';
> /
>
> the breakpoint is hit, i.e. Class.forName() is called.
> I *think* this might have to do with the fact that the function above is
> recursive, but I have to admit that I don't really grasp the code that
> does the module loading/parsing.
>
> Thanks,
> -tom


Re: [basex-talk] Reflect.forName() / Performance

2018-10-17 Thread Tom Rauchenwald
Hi Christian,

thanks for your feedback, I hope you're doing well!

> I have some concerns that the caching of non-existing classes could be
> exploited and bloat the cache. Maybe we’d need to use WeakHashMap
> (and/or soft references) instead?

I didn't think about that, I'll try to come up with a better solution.

>> I'm not sure why BaseX tries to load our xqm as Java Modules, but
>> what I noticed is that Reflect.forName caches the positive case
>> (i.e., the class is found), but not the negative case (i.e., the class is 
>> not found).
>
> Sounds like an interesting finding; maybe there’s something we can
> optimize here. Could you possibly provide us a little self-contained
> example that demonstrates the behavior?

Sure. This is what I'm doing:

Given the following module (installed with module install foo.xqm):

module namespace uc = 'http://unifits.com/common';
declare function uc:remove-elements($input as element(), $remove-names as 
xs:string*) as element() {
  element {node-name($input) }
  {$input/@*,
for $child in $input/node()[not(local-name() = $remove-names)]
return
  if ($child instance of element())
  then uc:remove-elements($child, $remove-names)
  else $child
   }
};


if I start BaseXGui in debug mode, and set a breakpoint in
Reflect.forName(), every time I execute a query such as

import module namespace uc = 'http://unifits.com/common';
/

the breakpoint is hit, i.e. Class.forName() is called. 
I *think* this might have to do with the fact that the function above is
recursive, but I have to admit that I don't really grasp the code that
does the module loading/parsing.

Thanks,
-tom



Re: [basex-talk] Reflect.forName() / Performance

2018-10-17 Thread Christian Grün
Hi Tom,

Thanks for your observation and your code proposal; always welcome.

I have some concerns that the caching of non-existing classes could be
exploited and bloat the cache. Maybe we’d need to use WeakHashMap
(and/or soft references) instead?

> I'm not sure why BaseX tries to load our xqm as Java Modules, but what I 
> noticed is that Reflect.forName caches the positive case (i.e., the class is 
> found), but not the negative case (i.e., the class is not found).

Sounds like an interesting finding; maybe there’s something we can
optimize here. Could you possibly provide us a little self-contained
example that demonstrates the behavior?

Thanks in advance,
Christian



On Wed, Oct 17, 2018 at 9:53 AM Tom Rauchenwald (UNIFITS)
 wrote:
>
> Hi BaseX-Team,
>
>
> when profiling some of our tests i found that we spend some time in 
> Reflect.forName().
>
> We have 2 xquery modules in the repo (we don't call java code directly).
>
>
> I'm not sure why BaseX tries to load our xqm as Java Modules, but what I 
> noticed is that Reflect.forName caches the positive case (i.e., the class is 
> found), but not the negative case (i.e., the class is not found).
>
> I've changed the code to cache the negative case as well (see below), and 
> noticed an improvement of about 5 percent.
>
> Our tests create and query loads of small databases, so this is maybe quite 
> an artificial speedup.
>
>
> I could provide a PR if this is a worthwhile improvement in your opinion (and 
> if I'm not missing something obvious).
>
>
> We're still on BaseX 8.7.6 in case that matters, as far as I could see the 
> Code didn't change in BaseX 9.
>
>
> Thanks,
>
> Tom
>
>
> Code:
>
>
> public static Class forName(final String name) throws 
> ClassNotFoundException {
> Class c = CLASSES.get(name);
>
> if(c == null) {
>   if (CLASSES.containsKey(name)) {
> throw new ClassNotFoundException(name);
>   } else {
> try {
>   c = Class.forName(name);
> } catch (ClassNotFoundException e) {
>   CLASSES.put(name, null);
>   throw e;
> }
> if (!Modifier.isPublic(c.getModifiers())) throw new 
> ClassNotFoundException(name);
> CLASSES.put(name, c);
>   }
> }
> return c;
>   }
>


[basex-talk] Reflect.forName() / Performance

2018-10-17 Thread Tom Rauchenwald (UNIFITS)
Hi BaseX-Team,


when profiling some of our tests i found that we spend some time in 
Reflect.forName().

We have 2 xquery modules in the repo (we don't call java code directly).


I'm not sure why BaseX tries to load our xqm as Java Modules, but what I 
noticed is that Reflect.forName caches the positive case (i.e., the class is 
found), but not the negative case (i.e., the class is not found).

I've changed the code to cache the negative case as well (see below), and 
noticed an improvement of about 5 percent.

Our tests create and query loads of small databases, so this is maybe quite an 
artificial speedup.


I could provide a PR if this is a worthwhile improvement in your opinion (and 
if I'm not missing something obvious).


We're still on BaseX 8.7.6 in case that matters, as far as I could see the Code 
didn't change in BaseX 9.


Thanks,

Tom


Code:


public static Class forName(final String name) throws ClassNotFoundException 
{
Class c = CLASSES.get(name);

if(c == null) {
  if (CLASSES.containsKey(name)) {
throw new ClassNotFoundException(name);
  } else {
try {
  c = Class.forName(name);
} catch (ClassNotFoundException e) {
  CLASSES.put(name, null);
  throw e;
}
if (!Modifier.isPublic(c.getModifiers())) throw new 
ClassNotFoundException(name);
CLASSES.put(name, c);
  }
}
return c;
  }