Re: [basex-talk] Same query, huge difference in performance

Paul Swennenhuis Mon, 04 Aug 2014 02:42:14 -0700

Hi Fabrice,

Thanks for your contribution.

The collection is the Facts database (factbook.xml) found in the distribution of BaseX.

It's the same collection as used in Query 2.

And yes, I did try to run the slow query in the GUI. In fact, that is the only place where I ran it, with the Facts database opened (it will yield an error if the database is not opened, since it does not specify a context).


Paul

On 8/4/2014 11:35 AM, Fabrice Etanchaud wrote:


Dear Paul,

Is it a big collection? Could the difference be in opening the collection?

Did you try to run the slow request, for example in the GUI, with the collection already opened?


Best regards,

Fabrice

*From:* basex-talk-boun...@mailman.uni-konstanz.de [mailto:basex-talk-boun...@mailman.uni-konstanz.de] *On behalf of* Paul Swennenhuis
*Sent:* Monday, 4 August 2014 11:22
*To:* H. Verweij; BaseX
*Subject:* Re: [basex-talk] Same query, huge difference in performance

Hi Huub,

Thank you for your reply.
I tried your suggestions, but it does not make any difference.
I changed Query 1 to this:

(: list waters and where they stream to (if any):)
for $source in  //(sea|river|lake)
let $toId := $source/to/@water
let $to := //*[@id=$toId][1]
let $name := if (empty($to)) then "none" else $to/local-name()
return
element water {
  element {$source/local-name()} {data($source/@name)},
  if (not($name="none"))then
  element streamsTo {
    attribute {$name} {data($to/@name)}
  }
  else ()
}

but there is no performance gain. The query still executes at least 10 times slower than Query 2.


Thanks for the empty($to) suggestion.

As for the recursive algorithm: in the meantime I wrote the query for that and it works like a charm!


Paul

    Hi Paul,

        On 4 Aug 2014, at 09:27, Paul Swennenhuis
        <p...@swennenhuis.nl> wrote:
        Listings:

        Query1

        (: list waters and where they stream to (if any):)
        for $source in //(sea|river|lake)
        let $toId := $source/to/@water
        let $to := (//sea|//river|//lake)[@id=$toId][1]


    You start to search for “sea” elements at the very top of the db,
    then start again at the top for “river” elements, and once more
    for “lake” elements. And you do this for every $source you
    process. This is different from "//(sea|river|lake)", where you
    start at the top once and match sea, river or lake elements in a
    single pass. In the second query you find all sea, river and lake
    elements once and then use that sequence to search in, which
    would be (much) faster.
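
    A minimal sketch of that idea applied to Query 1, assuming the
    Facts database is opened as the context; the variable name $waters
    is only illustrative, and whether this beats an index lookup has
    not been measured here:

    ```xquery
    (: bind the candidate waters once, outside the loop :)
    let $waters := //(sea|river|lake)
    for $source in $waters
    (: search the already-bound sequence instead of the whole db :)
    let $to := $waters[@id = $source/to/@water][1]
    return
      element water {
        element { $source/local-name() } { data($source/@name) },
        if (exists($to)) then
          element streamsTo {
            attribute { $to/local-name() } { data($to/@name) }
          }
        else ()
      }
    ```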

    It might even be faster to just search all elements and filter on
    @id (BaseX can then use the attribute index and just needs to use
    it once, probably), e.g.:

    let $toWaters := //*[@id = $toId]

    and $toWaters contains all waters (sea, river and lake elements)
    the $source streams to. Add a [1] if you just need the first one
    like you did. (If you know that all waters mentioned in $source/to
    exist in your db, wouldn't it be better to restrict $toId instead
    of $to, i.e. just use the first $source/to element?)

        let $name := if (empty($to/local-name())) then "none" else
        $to/local-name()

    I am not sure I understand but wouldn't empty($to) do the trick?

        return
        element water {
        element {$source/local-name()} {data($source/@name)},
        if (not($name="none"))then
        element streamsTo {
        attribute {$name} {data($to/@name)}
        }
        else ()
        }

        As a side-question: I want to extend the query to make it
        recursive: river “Bahr el-Djebel” streams into river “White
        Nile” streams into river “Nile” streams into sea
        “Mediterranean Sea”
        I think I can find out how to do that, but how can I optimize
        the recursion process? Would a recursive function be efficient?


    Yes, that would do the trick. Generally, tail-recursiveness is a
    good thing, but in this case it wouldn't matter much probably.
    Just watch out for those weird rivers that flow back into the lake
    they originate from ;-).
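
    A rough sketch of such a recursive function, with a guard against
    cycles. The function name local:downstream and the $seen
    parameter are made up for illustration, and it assumes that
    to/@water holds the @id of a sea, river or lake in the same
    document:

    ```xquery
    (: returns the waters downstream of $water, nearest first;
       $seen holds the ids already visited, to stop on a cycle :)
    declare function local:downstream(
      $water as element(),
      $seen  as xs:string*
    ) as element()* {
      let $next := (root($water)//(sea|river|lake)[@id = $water/to/@water])[1]
      return
        if (empty($next) or $next/@id = $seen) then ()
        else ($next, local:downstream($next, ($seen, string($next/@id))))
    };

    for $river in //river[@name = "Bahr el-Djebel"]
    let $chain := ($river, local:downstream($river, string($river/@id)))
    (: an explicit loop keeps the chain order; a path step would
       re-sort the nodes into document order :)
    return string-join(for $w in $chain return string($w/@name), " -> ")
    ```

    This version is not tail-recursive, which should not matter for
    chains as short as these. It follows only the first target when a
    water has several.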

    Regards,

    Huib Verweij.
