Hi Christian,

I will try that. But first: I can confirm my suspicion. The offending line (let $to := //*[@id=$toId][1]) takes about 20 ms per hit, and since there are 249 hits, that adds up to 249 * 20 = 4980 ms in total, almost 5 seconds!

On a side note: I discovered that BaseX's query optimization is working too well :-) I wanted to profile the execution time of that offending line, so I assigned the current time to a variable $start before that line and to a variable $end after it:

let $start := (current-dateTime() - xs:dateTime('1970-01-01T00:00:00-00:00')) div xs:dayTimeDuration('PT0.001S')
let $to := //*[@id=$toId][1]
let $end := (current-dateTime() - xs:dateTime('1970-01-01T00:00:00-00:00')) div xs:dayTimeDuration('PT0.001S')

I then assigned the difference, $end - $start, to an attribute in the result fragment. But it turned out that BaseX pre-evaluated the $start and $end variables and converted them into constants, so I got the same $start and $end values in every result, and the difference was always 0.

The only way I saw to prevent that from happening was using xquery:eval, making it impossible for BaseX to pre-evaluate it:

let $start := xquery:eval("(current-dateTime() - xs:dateTime('1970-01-01T00:00:00-00:00')) div xs:dayTimeDuration('PT0.001S')")
let $to := //*[@id=$toId][1]
let $end := xquery:eval("(current-dateTime() - xs:dateTime('1970-01-01T00:00:00-00:00')) div xs:dayTimeDuration('PT0.001S')")

The complete profiling query:

(: list waters and where they stream to (if any):)
for $source in  /descendant::sea | /descendant::river | /descendant::lake
let $toId := $source/to/@water
let $start := xquery:eval("(current-dateTime() - xs:dateTime('1970-01-01T00:00:00-00:00')) div xs:dayTimeDuration('PT0.001S')")
let $to := //*[@id=$toId][1]
let $end := xquery:eval("(current-dateTime() - xs:dateTime('1970-01-01T00:00:00-00:00')) div xs:dayTimeDuration('PT0.001S')")
let $name := if (empty($to)) then "none" else $to/local-name()
return
element water {
  attribute took {$end - $start},
  element {$source/local-name()} {data($source/@name)},
  if (not($name="none"))then
  element streamsTo {
    attribute {$name} {data($to/@name)}
  }
  else ()
}
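
A possible alternative to the xquery:eval workaround, as a minimal sketch: per the XQuery specification, current-dateTime() returns the same value for the whole duration of a query, which would explain the constant timestamps. BaseX's Profiling Module offers prof:current-ms(), which is documented as non-deterministic. The inline $doc and its element names and values are made-up sample data, not the actual "facts" database, and the optimizer may still defer or move the lookup, so the numbers should be treated as rough.

```xquery
(: Sketch only: time the id lookup with prof:current-ms() (BaseX Profiling Module).
   $doc is hypothetical sample data standing in for the "facts" database. :)
let $doc := document {
  <facts>
    <sea id="s1" name="North Sea"/>
    <river id="r1" name="Rhine"><to water="s1"/></river>
    <lake id="l1" name="Lake Constance"><to water="r1"/></lake>
  </facts>
}
for $source in $doc/descendant::sea | $doc/descendant::river | $doc/descendant::lake
let $toId := $source/to/@water
(: milliseconds since 1970-01-01, re-read on every call :)
let $start := prof:current-ms()
let $to := $doc//*[@id = $toId][1]
let $end := prof:current-ms()
return element water {
  attribute took { $end - $start },
  element { $source/local-name() } { data($source/@name) },
  if (exists($to)) then
    element streamsTo { attribute { $to/local-name() } { data($to/@name) } }
  else ()
}
```

On a document this small the took attribute will most likely be 0; the point is only that the two timestamps are read separately for each iteration.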


For me the lesson is: use as many predefined selections as possible, particularly in "for" clauses.

Paul


Hi Paul,

thanks for your feedback. Are you working with 7.9? If it's not too
much of a hassle for you, I would be interested to hear if you get
better performance with the latest 8.0 snapshot [1]?

Christian

[1] http://files.basex.org/releases/latest/



On Mon, Aug 4, 2014 at 11:57 AM, Paul Swennenhuis <p...@swennenhuis.nl> wrote:
Hi Christian,

Sorry, that doesn't improve performance either.
I even tried to copy the optimized line for the selection, as found in the
Query Info pane:



(: list waters and where they stream to (if any):)
for $source in  ((db:open-pre("facts",0)/descendant::*:sea union
db:open-pre("facts",0)/descendant::*:river union
db:open-pre("facts",0)/descendant::*:lake))

let $toId := $source/to/@water
let $to := //*[@id=$toId][1]
let $name := if (empty($to)) then "none" else $to/local-name()
return
element water {
   element {$source/local-name()} {data($source/@name)},
   if (not($name="none"))then
   element streamsTo {
     attribute {$name} {data($to/@name)}
   }
   else ()
}

No improvement.
The problem seems to be in the line that assigns the $to variable.
If I reuse the main node selection there, the query executes fast.
Like this:



(: list waters and where they stream to (if any):)
let $sources :=  /descendant::sea | /descendant::river | /descendant::lake
for $source in $sources
let $toId := $source/to/@water
let $to := $sources[@id=$toId][1]

let $name := if (empty($to)) then "none" else $to/local-name()
return
element water {
   element {$source/local-name()} {data($source/@name)},
   if (not($name="none"))then
   element streamsTo {
     attribute {$name} {data($to/@name)}
   }
   else ()
}

The original line, let $to := //*[@id=$toId][1], apparently is very
expensive.
I could do some testing with the profiling tools to see if I'm right.
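
A further variation on reusing the selection, offered only as a sketch and not something tested against the "facts" database: index the selected nodes by id once, then look each target up in that index instead of filtering $sources on every iteration. It assumes a processor with XQuery 3.1 map support (map:merge, map:entry); $doc, $byId and the sample elements are hypothetical names and data.

```xquery
(: Sketch only: build an id-to-node map once, then do one lookup per source.
   $doc is hypothetical sample data; maps assume XQuery 3.1 support. :)
let $doc := document {
  <facts>
    <sea id="s1" name="North Sea"/>
    <river id="r1" name="Rhine"><to water="s1"/></river>
    <lake id="l1" name="Lake Constance"><to water="r1"/></lake>
  </facts>
}
let $sources := $doc/descendant::sea | $doc/descendant::river | $doc/descendant::lake
(: one pass over the selection; nodes without an id are skipped :)
let $byId := map:merge(
  for $w in $sources[@id]
  return map:entry(string($w/@id), $w)
)
for $source in $sources
(: empty if there is no to/@water or no matching id :)
let $to := head($source/to/@water ! $byId(string(.)))
return element water {
  element { $source/local-name() } { data($source/@name) },
  if (exists($to)) then
    element streamsTo { attribute { $to/local-name() } { data($to/@name) } }
  else ()
}
```

Whether this beats $sources[@id=$toId][1] in practice would need measuring; with only a few hundred nodes the difference may be negligible.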

Paul


Hi Paul,

//(sea|river|lake)
Due to the (somewhat peculiar) semantics of XPath, this path is identical
to...

    /descendant-or-self::node()/
      (child::sea | child::river | child::lake)

...and it creates a massive amount of intermediate results. You could
try to rewrite it to...

    /descendant::sea | /descendant::river |
      /descendant::lake

...or...

    /descendant::*[local-name() = ('sea', 'river', 'lake')]

...and I will try to tweak our optimizer to automatically do this for
you in future (it already works for single steps).
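
To illustrate the equivalence, here is a small sketch that checks on made-up sample data that the three formulations select the same nodes. $doc and its contents are hypothetical; the comparison uses node identity via "except".

```xquery
(: Sketch only: $doc is hypothetical sample data. :)
let $doc := document {
  <facts>
    <country><river id="r1"/><lake id="l1"/></country>
    <sea id="s1"/>
  </facts>
}
(: abbreviated form: descendant-or-self::node() followed by child steps :)
let $a := $doc//(sea | river | lake)
(: one descendant step per element name :)
let $b := $doc/descendant::sea | $doc/descendant::river | $doc/descendant::lake
(: a single descendant step with a name filter :)
let $c := $doc/descendant::*[local-name() = ('sea', 'river', 'lake')]
(: true if no node is selected by one path but missing from another :)
return empty(($a except $b, $b except $a, $a except $c, $c except $a))
```

Note that the local-name() variant ignores namespaces, so on namespaced documents it can match more elements than the plain name tests.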

Christian



