Hi Martin and others,
I just tested what part Pathfinder's code generation plays in this. I
generated MIL code with the Aug2008 (0.24), Nov2008, and Feb2009
release branches and ran all queries using the newest stable version
(Feb2009) on Mac OS X.
The observations are:
* The problem with gdk_heap.mx, mmap, and Mac OS X still persists (all
queries run in 10 seconds instead of 2 seconds). Peter knows what I'm
talking about.
* As Nils reported, the queries are getting slower.
* The main performance decrease in my scenario is the document loading.
* The problem does not stem from Pathfinder's MIL code generation.
For more details see the attached file...
q0.aug2008.out       | q0.nov2008.out       | q0.feb2009.out
Shred   472.523 msec | Shred   232.504 msec | Shred   229.330 msec
Query  8910.591 msec | Query  7416.427 msec | Query  7328.854 msec
Print     1.112 msec | Print     0.375 msec | Print     0.502 msec
                     |                      |
Shred  5714.028 msec | Shred  5618.727 msec | Shred  5754.495 msec
Query  8440.298 msec | Query 12710.321 msec | Query  7253.548 msec
Print     0.298 msec | Print     0.294 msec | Print     0.275 msec
                     |                      |
Shred  9539.667 msec | Shred 16729.746 msec | Shred  9003.976 msec
Query 10319.638 msec | Query 10528.145 msec | Query  8899.959 msec
Print     0.603 msec | Print     0.307 msec | Print     0.829 msec
                     |                      |
Shred 11082.784 msec | Shred 10780.859 msec | Shred 10527.369 msec
Query 10123.990 msec | Query  9661.684 msec | Query  9794.755 msec
Print     0.378 msec | Print     0.295 msec | Print     0.292 msec
                     |                      |
Shred 10419.874 msec | Shred 10272.143 msec | Shred  9725.076 msec
Query  9559.761 msec | Query  9089.654 msec | Query  9240.117 msec
Print     0.359 msec | Print     0.299 msec | Print     0.314 msec
q1.aug2008.out       | q1.nov2008.out       | q1.feb2009.out
Shred   399.395 msec | Shred   396.367 msec | Shred   388.671 msec
Query 11097.419 msec | Query  9754.142 msec | Query 11086.588 msec
Print     0.657 msec | Print     0.560 msec | Print     0.959 msec
                     |                      |
Shred  5746.966 msec | Shred  6401.255 msec | Shred  5735.158 msec
Query 10093.130 msec | Query 10372.052 msec | Query 11100.466 msec
Print     0.341 msec | Print     0.803 msec | Print     0.519 msec
                     |                      |
Shred 10141.365 msec | Shred 10549.271 msec | Shred  9458.204 msec
Query 11842.561 msec | Query 12012.305 msec | Query 12793.418 msec
Print     0.304 msec | Print     0.312 msec | Print     0.284 msec
                     |                      |
Shred 10849.639 msec | Shred 10504.439 msec | Shred 11353.584 msec
Query 12063.871 msec | Query 11690.745 msec | Query 10453.142 msec
Print     0.319 msec | Print     0.767 msec | Print     0.283 msec
                     |                      |
Shred  9661.377 msec | Shred 10630.150 msec | Shred 10004.209 msec
Query 11429.812 msec | Query 10841.888 msec | Query  9518.676 msec
Print     0.390 msec | Print     0.333 msec | Print     0.284 msec
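In case anyone wants to summarize such output files rather than read them side by side, here is a minimal Python sketch that averages the time per phase. It only assumes the "Shred/Query/Print <number> msec" line format shown in the listing above; the function name phase_means is made up for illustration, and the sample text is the first two runs of q0.feb2009.out.

```python
import re
from collections import defaultdict
from statistics import mean

# Matches lines such as "Query 8910.591 msec" as they appear in the listing.
TIMING_RE = re.compile(r"(Shred|Query|Print)\s+([0-9.]+)\s+msec")


def phase_means(text):
    """Return the mean time in msec per phase for the text of one .out file."""
    samples = defaultdict(list)
    for phase, value in TIMING_RE.findall(text):
        samples[phase].append(float(value))
    return {phase: mean(values) for phase, values in samples.items()}


# First two runs of q0.feb2009.out, copied from the listing.
sample = """Shred 229.330 msec
Query 7328.854 msec
Print 0.502 msec
Shred 5754.495 msec
Query 7253.548 msec
Print 0.275 msec
"""

for phase, avg in phase_means(sample).items():
    print(f"{phase}: {avg:.3f} msec")
```

Averaging over all runs hides the jump in Shred time between the first and later runs, so it may be more useful to apply this per run or to skip the first run.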
BTW: For today's HEAD version the results are even worse...
Jan
On Mar 9, 2009, at 18:08, Martin Kersten wrote:
For all interested: indeed there are performance differences
between the various releases. Some can be traced back to
functional enhancements, others result from internal
administrative activities.
Recent experiments with TPC-H at scale factor 2 on the Feb 2009
branch show a performance degradation compared to Aug 2008,
as reported on the website.
It appears that some low-level actions related to the allocation
of BATs and their management in memory-scarce situations are
to blame for this situation.
Solutions are integrated into the HEAD and may (depending
on our resources) be backported into a bugfix release
of the Feb 2009 version.
Nils Grimsmo wrote:
On Wed, Mar 04, 2009 at 11:08:40PM +0100, Jan Rittinger wrote:
Hi Nils,
I just ran your queries with the latest (not yet announced) Feb2009
release (http://monetdb.cwi.nl/downloads/sources/Feb2009/) and
received an answer in 1.5 (Q1) and 2.5 (Q2) seconds. If you still have
problems with the new version, then please let us know.
Thank you for your answer, Jan. Feb2009 is indeed faster than Nov2008,
but on my computer it is still slower than Aug2008. I also see some
strange and unfavorable performance characteristics on subsequent
queries for Nov2008 and Feb2009 (see below).
Aug2008:
# MonetDB Server v4.24.0
# based on GDK v1.24.0
# PF/Tijah module v0.5.0 loaded. http://dbappl.cs.utwente.nl/pftijah
# MonetDB/XQuery module v0.24.0 loaded (default back-end is 'algebra')
Nov2008-SP2:
# MonetDB Server v4.26.4
# based on GDK v1.26.4
# PF/Tijah module v0.9.0 loaded. http://dbappl.cs.utwente.nl/pftijah
# MonetDB/XQuery module v0.26.4 loaded (default back-end is 'algebra')
Feb2009:
# MonetDB Server v4.28.0
# Based on GDK v1.28.0
# PF/Tijah module v0.9.0 loaded. http://dbappl.cs.utwente.nl/pftijah
# MonetDB/XQuery module v0.28.0 loaded (default back-end is 'algebra')
I run the queries multiple times in different scenarios.
A - Have just indexed the document, first run.
B - Second run (subsequent have similar timing).
C - Restart the server (Mserver), then first run.
D - Second run (subsequent have similar timing).
Query Q0:
     Aug2008  Nov2008  Feb2009
A       1101     3687     1760
B       1031     4510     3015
C       1350     5216     3390
D       1035    12620     9533
Query Q1:
     Aug2008  Nov2008  Feb2009
A       2161    15119     3013
B       2099    19292     4072
C       2526    18523     4567
D       2117    42555    10602
This seems very strange to me. The timings make sense for Aug2008,
where the query is slightly slower right after restarting the server
(C). For Nov2008 and Feb2009, the second (and subsequent) runs are
slower than the first. How can this be? It can make sense for the
first run after restarting the server (C) to be slower (reading stuff
from disk etc.), but why is the second (D) so much slower? If I just
keep running the query, the timings are similar to D.
Note: If I start mixing Q0 and Q1 after step D, they are both as slow
as in step D.
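To put numbers on the scenario tables above, here is a small Python sketch that computes the slowdown of each release relative to Aug2008. The values are copied from the Q0 and Q1 tables; the unit is assumed to be milliseconds, as in the earlier message, and the name slowdowns is only illustrative.

```python
# Timings copied from the Q0 and Q1 scenario tables (assumed to be msec).
RELEASES = ("Aug2008", "Nov2008", "Feb2009")
TIMINGS = {
    "Q0": {
        "A": (1101, 3687, 1760),
        "B": (1031, 4510, 3015),
        "C": (1350, 5216, 3390),
        "D": (1035, 12620, 9533),
    },
    "Q1": {
        "A": (2161, 15119, 3013),
        "B": (2099, 19292, 4072),
        "C": (2526, 18523, 4567),
        "D": (2117, 42555, 10602),
    },
}


def slowdowns(timings, baseline="Aug2008"):
    """Return {query: {scenario: {release: time / baseline time}}}."""
    base = RELEASES.index(baseline)
    return {
        query: {
            scenario: {rel: row[i] / row[base] for i, rel in enumerate(RELEASES)}
            for scenario, row in scenarios.items()
        }
        for query, scenarios in timings.items()
    }


for query, scenarios in slowdowns(TIMINGS).items():
    for scenario, factors in scenarios.items():
        cells = "  ".join(f"{rel} {factor:5.2f}x" for rel, factor in factors.items())
        print(f"{query} {scenario}: {cells}")
```

Scenario D is where the factors are largest for both queries, which matches the observation that the second run after a restart is the problematic one.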
I hope this feedback is helpful. Is there something strange with my
setup, or is this a "bug"? (My timings in step (A) seem similar to
Jan's timings.)
If I want to compare MonetDB/XQuery to other implementations in a
scientific paper, I typically want to warm up the system, then run the
query multiple times to get an average timing. It is kind of
inconvenient not to be able to close down Mserver between
experiments...
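The warm-up-then-average procedure described above could be scripted roughly as follows. This is only a sketch under assumptions: time_runs and run_query_file are hypothetical helper names, the mclient options are the ones quoted elsewhere in this thread, and the measurement is client-side wall-clock time (including mclient startup), not the server-side numbers printed by --time.

```python
import statistics
import subprocess
import time


def time_runs(run_once, warmup=1, repeats=5):
    """Call run_once() `warmup` times untimed, then `repeats` times timed.

    Returns (mean_msec, stdev_msec) over the timed runs.
    """
    for _ in range(warmup):
        run_once()
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        run_once()
        samples.append((time.perf_counter() - start) * 1000.0)
    stdev = statistics.stdev(samples) if len(samples) > 1 else 0.0
    return statistics.mean(samples), stdev


def run_query_file(path):
    """Hypothetical helper: feed one query file to mclient on stdin."""
    with open(path, "rb") as query:
        subprocess.run(
            ["mclient", "--language=xquery", "--algebra", "--time"],
            stdin=query,
            stdout=subprocess.DEVNULL,
            check=True,
        )
```

A call such as time_runs(lambda: run_query_file("q0.xq"), warmup=2, repeats=10) would then give one mean and standard deviation per query, where "q0.xq" is a placeholder file name. Given the scenario D behaviour, it is probably worth reporting the first run after a restart separately instead of folding it into the average.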
P.S.: The E-Mail subject seems slightly off topic here :)
Yes, thought I'd avoid touching the mouse to copy the email address.
Cut away In-Reply-To:, but forgot to change Subject:...
Thank you for your assistance!
Klem fra Nils
On Mar 4, 2009, at 16:30, Nils Grimsmo wrote:
Hi, I just upgraded from the August to the November super-ball, and
the performance has worsened badly.
Example queries on dblp.xml (441 MB):
Q0: count(/dblp//author[text()="Michael Stonebraker"])
Q1: count(/dblp/*/author[text()="Michael Stonebraker"])
Query time in milliseconds:
     August  November
Q0     1100      4867
Q1     3993     17999
I have compiled with --enable-optimise both times. I query with:
mclient --language=xquery --algebra --time < $QUERYFILE
Is this performance degradation expected? If so, why?
BTW: Is there any way of finding how much disk space a collection
uses?
Thank you for contributing free software!
Klem fra Nils
--
Jan Rittinger
Lehrstuhl Datenbanken und Informationssysteme
Wilhelm-Schickard-Institut für Informatik
Eberhard-Karls-Universität Tübingen
http://www-db.informatik.uni-tuebingen.de/team/rittinger
_______________________________________________
Monetdb-developers mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/monetdb-developers