Re: [basex-talk] Same query, huge difference in performance

Fabrice Etanchaud Mon, 04 Aug 2014 02:36:08 -0700

Dear Paul,

Is it a big collection? Could the difference come from opening the collection?
Did you try running the slow query in the GUI, for example, with the collection 
already opened?


Best regards,
Fabrice

From: basex-talk-boun...@mailman.uni-konstanz.de 
[mailto:basex-talk-boun...@mailman.uni-konstanz.de] On behalf of Paul 
Swennenhuis
Sent: Monday, 4 August 2014 11:22
To: H. Verweij; BaseX
Subject: Re: [basex-talk] Same query, huge difference in performance

Hi Huub,

Thank you for your reply.
I tried your suggestions, but it does not make any difference.
I changed Query 1 to this:

(: list waters and where they stream to (if any):)
for $source in  //(sea|river|lake)
let $toId := $source/to/@water
let $to := //*[@id=$toId][1]
let $name := if (empty($to)) then "none" else $to/local-name()
return
element water {
  element {$source/local-name()} {data($source/@name)},
  if (not($name="none"))then
  element streamsTo {
    attribute {$name} {data($to/@name)}
  }
  else ()
}

but there is no performance gain. The query still executes at least 10 times 
slower than Query 2.

Thanks for the empty($to) suggestion.

As for the recursive algorithm: in the meantime I wrote the query for that and 
it works like a charm!

Paul

Hi Paul,
On 4 Aug 2014, at 09:27, Paul Swennenhuis <p...@swennenhuis.nl> wrote:
Listings:

Query1

(: list waters and where they stream to (if any):)
for $source in //(sea|river|lake)
let $toId := $source/to/@water
let $to := (//sea|//river|//lake)[@id=$toId][1]

Here you search for “sea” elements starting at the very top of the db, then 
search for “river” elements starting at the very top again, and then do the 
same for “lake” elements. And you do this for every $source you process. This 
is different from "//(sea|river|lake)", where you start at the top once and 
match sea, river or lake elements in a single pass. In the second query you 
find all sea, river and lake elements once and then search within that 
sequence, which would be (much) faster.

It might even be faster to just search all elements and filter on @id (BaseX can 
then use the attribute index and probably needs to use it only once), for 
instance:



let $toWaters := //*[@id = $toId]



and $toWaters contains all waters (sea, river and lake elements) the $source 
streams to. Add a [1] if you just need the first one like you did. (If you know 
that all waters mentioned in $source/to exist in your db, wouldn't it be better 
to restrict $toId instead of $to, i.e. just use the first $source/to element?)
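
To make the "find them once, then search in that sequence" idea concrete, here is a minimal sketch. The inline $db document, its ids and its element layout are made-up sample data, not your actual database, and whether this is faster on real data is something you would have to measure.

```xquery
(: Sketch only: $db is hypothetical sample data standing in for the real db. :)
let $db :=
  <mondial>
    <river id="r1" name="White Nile"><to water="r2"/></river>
    <river id="r2" name="Nile"><to water="s1"/></river>
    <sea id="s1" name="Mediterranean Sea"/>
  </mondial>
(: Collect all candidate elements once, outside the loop. :)
let $waters := $db//(sea|river|lake)
for $source in $waters
(: Search only within the already collected sequence. :)
let $to := $waters[@id = $source/to/@water][1]
return
  element water {
    element { $source/local-name() } { data($source/@name) },
    if (exists($to)) then
      element streamsTo {
        attribute { $to/local-name() } { data($to/@name) }
      }
    else ()
  }
```

Testing exists($to) directly also removes the need for the intermediate $name variable.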


let $name := if (empty($to/local-name())) then "none" else $to/local-name()



I am not sure I understand but wouldn't empty($to) do the trick?


return
element water {
element {$source/local-name()} {data($source/@name)},
if (not($name="none"))then
element streamsTo {
attribute {$name} {data($to/@name)}
}
else ()
}

As a side-question: I want to extend the query to make it recursive: river 
“Bahr el-Djebel” streams into river “White Nile” streams into river “Nile” 
streams into sea “Mediterranean Sea”
I think I can find out how to do that, but how can I optimize the recursion 
process? Would a recursive function be efficient?

Yes, that would do the trick. Tail recursion is generally a good thing, but in 
this case it probably wouldn't matter much. Just watch out for those weird 
rivers that flow back into the lake they originate from ;-).
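
A rough sketch of such a recursive function, with a guard against waters that flow back into one already visited. The function name local:chain, the $visited parameter and the sample data (including all ids) are assumptions made for illustration, not taken from your query. Note that this version is not tail-recursive, since the result sequence is built after the recursive call returns.

```xquery
(: Sketch only: local:chain and the sample ids below are hypothetical. :)
declare function local:chain(
  $water   as element(),
  $waters  as element()*,
  $visited as xs:string*
) as element()* {
  (: Remember the current water so that a cycle stops the recursion. :)
  let $seen := ($visited, string($water/@id))
  let $next := $waters[@id = $water/to/@water][1]
  return (
    $water,
    if (empty($next) or $next/@id = $seen) then ()
    else local:chain($next, $waters, $seen)
  )
};

let $db :=
  <mondial>
    <river id="r0" name="Bahr el-Djebel"><to water="r1"/></river>
    <river id="r1" name="White Nile"><to water="r2"/></river>
    <river id="r2" name="Nile"><to water="s1"/></river>
    <sea id="s1" name="Mediterranean Sea"/>
  </mondial>
let $waters := $db//(sea|river|lake)
(: Follow the chain starting from the first river in the sample data. :)
for $w in local:chain($waters[@id = "r0"], $waters, ())
return string($w/@name)
```

The function follows only the first matching target at each step, as your original query does with [1]; a water with several outflows would need a different return shape.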

Regards,



Huib Verweij.
