Hi Fabrice,
Thanks for your contribution.
The collection is the Facts database (factbook.xml) found in the
distribution of BaseX.
It's the same collection as used in Query 2.
And yes, I did try to run the slow query in the GUI. In fact, that is
the only place where I ran it, with the Facts database opened (it will
yield an error if the database is not opened since it does not specify a
context).
Paul
On 8/4/2014 11:35 AM, Fabrice Etanchaud wrote:
Dear Paul,
Is it a big collection? Could the difference be in opening the
collection?
Did you try running the slow query, for example in the GUI, with the
collection already opened?
Best regards,
Fabrice
*From:* basex-talk-boun...@mailman.uni-konstanz.de
[mailto:basex-talk-boun...@mailman.uni-konstanz.de] *On behalf of*
Paul Swennenhuis
*Sent:* Monday, 4 August 2014 11:22
*To:* H. Verweij; BaseX
*Subject:* Re: [basex-talk] Same query, huge difference in performance
Hi Huub,
Thank you for your reply.
I tried your suggestions, but it does not make any difference.
I changed Query 1 to this:
(: list waters and where they stream to (if any):)
for $source in //(sea|river|lake)
let $toId := $source/to/@water
let $to := //*[@id=$toId][1]
let $name := if (empty($to)) then "none" else $to/local-name()
return
element water {
element {$source/local-name()} {data($source/@name)},
if (not($name="none"))then
element streamsTo {
attribute {$name} {data($to/@name)}
}
else ()
}
but there is no performance gain. The query still executes at least 10
times slower than Query 2.
Thanks for the empty($to) suggestion.
As for the recursive algorithm: in the meantime I wrote the query for
that and it works like a charm!
Paul
Hi Paul,
On 4 Aug 2014, at 09:27, Paul Swennenhuis <p...@swennenhuis.nl> wrote:
Listings:
Query1
(: list waters and where they stream to (if any):)
for $source in //(sea|river|lake)
let $toId := $source/to/@water
let $to := (//sea|//river|//lake)[@id=$toId][1]
You start to search for “sea” elements at the very top of the db,
then, for “river” elements you start to search at the very top of
the db, then, for “lake” elements you start to search at the very
top of the db. And you do this for every $source you process. This
is different from "//(sea|river|lake)" where you start at the top
(once) and then match sea, river or lake elements. In the second
query you find all sea, river and lake elements once and then use
that sequence to search in; that should be (much) faster.
It might even be faster to just search all elements and filter on
@id (BaseX can then use the attribute index, and probably just needs
to use it once), e.g.:
let $toWaters := //*[@id = $toId]
and $toWaters contains all waters (sea, river and lake elements)
the $source streams to. Add a [1] if you just need the first one
like you did. (If you know that all waters mentioned in $source/to
exist in your db, wouldn't it be better to restrict $toId instead
of $to, i.e. just use the first $source/to element?)
let $name := if (empty($to/local-name())) then "none" else
$to/local-name()
I am not sure I understand but wouldn't empty($to) do the trick?
return
element water {
element {$source/local-name()} {data($source/@name)},
if (not($name="none"))then
element streamsTo {
attribute {$name} {data($to/@name)}
}
else ()
}
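The idea of finding all waters once and then searching within that sequence could be sketched roughly as follows. This is only an illustration, not a tested rewrite of Query 1: the variable name $waters is made up here, and it assumes that to/@water holds the @id of the target water, as in the listings above. Whether it is actually faster would have to be measured in the GUI.

```xquery
(: Sketch: collect the candidate waters once, then look up each target in that sequence. :)
(: $waters is a hypothetical name; to/@water is assumed to reference another water's @id. :)
let $waters := //(sea|river|lake)
for $source in $waters
(: first water whose @id matches one of the source's to/@water values, if any :)
let $to := $waters[@id = $source/to/@water][1]
return
  element water {
    element { local-name($source) } { data($source/@name) },
    if (exists($to)) then
      element streamsTo {
        attribute { local-name($to) } { data($to/@name) }
      }
    else ()
  }
```

Testing exists($to) directly also removes the need for the intermediate "none" string.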
As a side-question: I want to extend the query to make it
recursive: river “Bahr el-Djebel” streams into river “White
Nile” streams into river “Nile” streams into sea
“Mediterranean Sea”
I think I can find out how to do that, but how can I optimize
the recursion process? Would a recursive function be efficient?
Yes, that would do the trick. Generally, tail recursion is a
good thing, but in this case it probably wouldn't matter much.
Just watch out for those weird rivers that flow back into the lake
they originate from ;-).
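A recursive function with a guard against such cycles might look like the sketch below. The function name local:downstream and its $seen parameter are invented for illustration; it assumes, as in Query 1, that to/@water holds the @id of the next water, and it needs the Facts database opened as context. It is not written tail-recursively, which should not matter for chains this short.

```xquery
(: Hypothetical helper: follows to/@water links from $water and returns the
   waters downstream of it. $seen carries the ids already visited, so a
   water that flows back into an earlier one ends the recursion. :)
declare function local:downstream(
  $water as element(),
  $seen  as xs:string*
) as element()* {
  (: first target that has not been visited yet, if any :)
  let $next := (root($water)//(sea|river|lake)
                  [@id = $water/to/@water]
                  [not(@id = $seen)])[1]
  return
    if (empty($next)) then ()
    else ($next, local:downstream($next, ($seen, string($next/@id))))
};

(: Usage: print the chain of names starting at one river. :)
for $river in //river[@name = "Bahr el-Djebel"]
return string-join(
  ($river/@name, local:downstream($river, string($river/@id))/@name),
  " -> "
)
```

When a water lists several targets, this sketch follows only the first unvisited one; returning all branches would need a different result structure.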
Regards,
Huib Verweij.