Re: [basex-talk] Performance issue with BaseX CLI

2024-04-19 Thread Liam R. E. Quin
On Fri, 2024-04-19 at 10:45 +0200, ANDRADE Antonio wrote:
> Hi,
>  
> For the purposes of European Water Framework Directive reporting, I
> compared the performances of the Saxon and BaseX XQuery engines.

First, you should consider (as I think Martin said) the Java runtime
startup time, typically a second or so.

Second, BaseX is a database. If you will process the same document many
times, first load it into a database and then use the Python BaseX
client. This will avoid startup time, and, more importantly, will allow
BaseX to make use of database indexes.

If you will only process any given document once, then Saxon may well
be the appropriate tool.

liam


-- 
Liam Quin, https://www.delightfulcomputing.com/
Available for XML/Document/Information Architecture/XSLT/
XSL/XQuery/Web/Text Processing/A11Y training, work & consulting.
Barefoot Web-slave, antique illustrations:  http://www.fromoldbooks.org


Re: [basex-talk] Performance issue with BaseX CLI

2024-04-19 Thread ANDRADE Antonio
Thanks for your feedback. I haven't found a Python API around the BaseX 
client. For convenience, I carried out my first tests with a bash script. In 
the meantime, I carried out other tests by creating Java processes from a 
Python script. I observe roughly identical performance differences. The 
Python/bash difference for the calling script does not seem to explain the 
observed performance differences.



From: Hans-Juergen Rennau
Sent: Friday, 19 April 2024 11:25
To: basex-talk@mailman.uni-konstanz.de; ANDRADE Antonio
Subject: Re: [basex-talk] Performance issue with BaseX CLI



Hi Antonio,



my experience is very different - quite comparable performance, except for 
very specific cases, e.g. massive use of fn:idref(). Furthermore, the 
performance of BaseX is often so stupendous that an improvement by an order 
of magnitude (not to mention two) appears to me very difficult to imagine.



It makes me suspicious that one of your scripts is .py, the other .sh. I 
believe the scripts used for comparing should be absolutely analogous.



Kind regards,

Hans-Jürgen

On Friday, 19 April 2024 at 10:46:00 CEST, ANDRADE Antonio
<antonio.andr...@ofb.gouv.fr> wrote:

Hi,



For the purposes of European Water Framework Directive reporting, I compared 
the performances of the Saxon and BaseX XQuery engines. I observe a 
performance gap of a factor of 100 to 200 depending on the use case (see 
functions test_xquery_monitoring() and test_xquery_multischema_2022() in 
scripts test_saxoncee.py and test_basex.sh available at 
https://outil-transferts.ofb.fr/?107ae461a144d0b). Can you please help me
understand the reasons for such gaps?



Thanks in advance,

Antonio Andrade

Data engineer





Re: [basex-talk] Performance issue with BaseX CLI

2024-04-19 Thread Martin Honnen


On 19.04.2024 at 10:45, ANDRADE Antonio wrote:

Hi,

For the purposes of European Water Framework Directive reporting, I
compared the performances of the Saxon and BaseX XQuery engines. I
observe a performance gap of a factor of 100 to 200 depending on the
use case (see functions test_xquery_monitoring() and
test_xquery_multischema_2022() in scripts test_saxoncee.py and
test_basex.sh available at
https://outil-transferts.ofb.fr/?107ae461a144d0b). Can you please help
me understand the reasons for such gaps?


I haven't looked at your files either, but I would add that SaxonC
called from Python is usually faster than Saxon Java run from a shell
script. Some of the difference you see may simply be the advantage of
the AOT-compiled SaxonC over a classic Java application launched from a
shell script, where JVM startup and warm-up make any single run look
relatively slow.
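
A rough way to check how much of the gap is process startup rather than
query evaluation is to time each command line several times as a fresh
process and compare it with a do-nothing baseline. The sketch below is
only an illustration: the helper names time_command() and summarize()
are made up for this example, and the command to measure is whatever
your own scripts actually invoke, passed on the command line.

```python
import statistics
import subprocess
import sys
import time


def time_command(cmd, runs=5):
    """Run cmd `runs` times, each as a fresh process; return seconds per run."""
    timings = []
    for _ in range(runs):
        start = time.perf_counter()
        # Output is discarded so that terminal I/O does not skew the timing.
        subprocess.run(cmd, check=True,
                       stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
        timings.append(time.perf_counter() - start)
    return timings


def summarize(timings):
    """Keep the first run separate: it often includes cold file caches."""
    return {
        "first": timings[0],
        "median": statistics.median(timings),
        "min": min(timings),
    }


if __name__ == "__main__":
    # Baseline: a process that starts and exits immediately.
    print("baseline:", summarize(time_command([sys.executable, "-c", "pass"])))
    # Assumption: the rest of the command line is the real invocation to
    # measure, copied unchanged from test_basex.sh or test_saxoncee.py.
    if len(sys.argv) > 1:
        print("command: ", summarize(time_command(sys.argv[1:])))
```

If the median for a trivial query is already close to the median for the
real one, most of the time is going into startup, not into the query.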


Re: [basex-talk] Performance issue with BaseX CLI

2024-04-19 Thread Hans-Juergen Rennau
Hi Antonio,

my experience is very different - quite comparable performance, except for very 
specific cases, e.g. massive use of fn:idref(). Furthermore, the performance of 
BaseX is often so stupendous that an improvement by an order of magnitude (not 
to mention two) appears to me very difficult to imagine.

It makes me suspicious that one of your scripts is .py, the other .sh. I 
believe the scripts used for comparing should be absolutely analogous.

Kind regards,
Hans-Jürgen

On Friday, 19 April 2024 at 10:46:00 CEST, ANDRADE Antonio wrote:

Hi,

  

For the purposes of European Water Framework Directive reporting, I compared 
the performances of the Saxon and BaseX XQuery engines. I observe a performance 
gap of a factor of 100 to 200 depending on the use case (see functions 
test_xquery_monitoring() and test_xquery_multischema_2022() in scripts 
test_saxoncee.py and test_basex.sh available at 
https://outil-transferts.ofb.fr/?107ae461a144d0b). Can you please help me 
understand the reasons for such gaps?

  

Thanks in advance,

Antonio Andrade

Data engineer

  
  

[basex-talk] Performance issue with BaseX CLI

2024-04-19 Thread ANDRADE Antonio
Hi,

 

For the purposes of European Water Framework Directive reporting, I
compared the performances of the Saxon and BaseX XQuery engines. I observe
a performance gap of a factor of 100 to 200 depending on the use case (see
functions test_xquery_monitoring() and test_xquery_multischema_2022() in
scripts test_saxoncee.py and test_basex.sh available at
https://outil-transferts.ofb.fr/?107ae461a144d0b). Can you please help me
understand the reasons for such gaps?

 

Thanks in advance,

Antonio Andrade

Data engineer