Hi Christian,
I will try that. But first: I can confirm my suspicion. The offending
line ( let $to := //*[@id=$toId][1] )
takes about 20 ms per hit, and since there are 249 hits, that adds up to
249 * 20 = 4980 ms in total, almost 5 seconds!
On a side note: I discovered that BaseX's query optimization is working
too well :-)
I wanted to profile the execution time of that offending line, so I
assigned the current time to variable $start before that line, and I
assigned the current time to variable $end after that line:
let $start := (current-dateTime() -
xs:dateTime('1970-01-01T00:00:00-00:00')) div xs:dayTimeDuration('PT0.001S')
let $to := //*[@id=$toId][1]
let $end := (current-dateTime() -
xs:dateTime('1970-01-01T00:00:00-00:00')) div xs:dayTimeDuration('PT0.001S')
And then I assigned the difference, $end-$start to an attribute in the
result fragment.
But it turned out that BaseX pre-evaluated the $start and $end variables
and converted them into constants, so I got the same $start and $end
values in every result, and the difference was always 0.
The only way I saw to prevent that from happening was using xquery:eval,
making it impossible for BaseX to pre-evaluate it:
let $start := xquery:eval("(current-dateTime() -
xs:dateTime('1970-01-01T00:00:00-00:00')) div
xs:dayTimeDuration('PT0.001S')")
let $to := //*[@id=$toId][1]
let $end := xquery:eval("(current-dateTime() -
xs:dateTime('1970-01-01T00:00:00-00:00')) div
xs:dayTimeDuration('PT0.001S')")
The complete profiling query:
(: list waters and where they stream to (if any) :)
for $source in /descendant::sea | /descendant::river | /descendant::lake
let $toId := $source/to/@water
let $start := xquery:eval("(current-dateTime() -
  xs:dateTime('1970-01-01T00:00:00-00:00')) div
  xs:dayTimeDuration('PT0.001S')")
let $to := //*[@id=$toId][1]
let $end := xquery:eval("(current-dateTime() -
  xs:dateTime('1970-01-01T00:00:00-00:00')) div
  xs:dayTimeDuration('PT0.001S')")
let $name := if (empty($to)) then "none" else $to/local-name()
return
  element water {
    attribute took {$end - $start},
    element {$source/local-name()} {data($source/@name)},
    if (not($name="none")) then
      element streamsTo {
        attribute {$name} {data($to/@name)}
      }
    else ()
  }
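A possible alternative to the xquery:eval workaround, as a sketch only: the XQuery specification defines current-dateTime() as stable within a single query execution, so identical $start and $end values are expected even without any optimization. BaseX's Profiling Module provides prof:current-ms(), which is non-deterministic and should therefore not be folded into a constant. The inline sample document and its names are made up for illustration, and the optimizer may still inline or reorder the $to binding, so the measured values should be treated with caution.

```xquery
(: Sketch: time the lookup with prof:current-ms() (BaseX-specific).
   The sample data below is hypothetical; replace $doc with the real database. :)
let $doc := document {
  <facts>
    <sea id="s1" name="North Sea"/>
    <river id="r1" name="Rhine"><to water="s1"/></river>
  </facts>
}
for $source in $doc/descendant::river
let $toId := $source/to/@water
let $start := prof:current-ms()          (: milliseconds before the lookup :)
let $to := $doc//*[@id = $toId][1]       (: the expression being measured :)
let $end := prof:current-ms()            (: milliseconds after the lookup :)
return
  element water {
    attribute took { $end - $start },
    element { $source/local-name() } { data($source/@name) },
    if (exists($to)) then
      element streamsTo {
        attribute { $to/local-name() } { data($to/@name) }
      }
    else ()
  }
```

On a document this small the difference will usually be 0; the pattern only becomes informative on the real data set.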
For me the lesson is: use as many predefined selections as possible,
particularly in "for" clauses.
Paul
Hi Paul,
thanks for your feedback. Are you working with 7.9? If it's not too
much of a hassle for you, I would be interested to hear whether you get
better performance with the latest 8.0 snapshot [1].
Christian
[1] http://files.basex.org/releases/latest/
On Mon, Aug 4, 2014 at 11:57 AM, Paul Swennenhuis <p...@swennenhuis.nl> wrote:
Hi Christian,
Sorry, that doesn't improve performance either.
I even tried to copy the optimized line for the selection, as found in the
Query Info pane:
(: list waters and where they stream to (if any) :)
for $source in ((db:open-pre("facts",0)/descendant::*:sea union
  db:open-pre("facts",0)/descendant::*:river union
  db:open-pre("facts",0)/descendant::*:lake))
let $toId := $source/to/@water
let $to := //*[@id=$toId][1]
let $name := if (empty($to)) then "none" else $to/local-name()
return
  element water {
    element {$source/local-name()} {data($source/@name)},
    if (not($name="none")) then
      element streamsTo {
        attribute {$name} {data($to/@name)}
      }
    else ()
  }
No improvement.
The problem seems to be in the line that assigns the $to variable.
If I reuse the main node selection there, the query executes fast.
Like so:
(: list waters and where they stream to (if any) :)
let $sources := /descendant::sea | /descendant::river | /descendant::lake
for $source in $sources
let $toId := $source/to/@water
let $to := $sources[@id=$toId][1]
let $name := if (empty($to)) then "none" else $to/local-name()
return
  element water {
    element {$source/local-name()} {data($source/@name)},
    if (not($name="none")) then
      element streamsTo {
        attribute {$name} {data($to/@name)}
      }
    else ()
  }
The original line, let $to := //*[@id=$toId][1], is apparently very
expensive.
I could do some testing with the profiling tools to see if I'm right.
Paul
Hi Paul,
//(sea|river|lake)
Due to the (somewhat peculiar) semantics of XPath, this path is identical
to...
/descendant-or-self::node()/
(child::sea | child::river | child::lake)
...and it creates a massive amount of intermediate results. You could
try to rewrite it to...
/descendant::sea | /descendant::river |
/descendant::lake
...or...
/descendant::*[local-name() = ('sea', 'river', 'lake')]
...and I will try to tweak our optimizer to automatically do this for
you in future (it already works for single steps).
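To illustrate that the three formulations select the same nodes, here is a minimal sketch run against a small inline document. The sample data is made up for illustration; only the path expressions come from the discussion above. It shows equivalence of results, not the performance difference, which only shows up on larger documents.

```xquery
(: Sketch: the three path variants return the same nodes on hypothetical sample data. :)
let $doc := document {
  <facts>
    <sea id="s1" name="North Sea"/>
    <river id="r1" name="Rhine"><to water="s1"/></river>
    <lake id="l1" name="Lake Constance"/>
  </facts>
}
(: abbreviated form: expands to descendant-or-self::node()/(child::sea | ...) :)
let $a := $doc//(sea | river | lake)
(: one descendant step per element name, merged with the union operator :)
let $b := $doc/descendant::sea | $doc/descendant::river | $doc/descendant::lake
(: a single descendant step with a name test in the predicate :)
let $c := $doc/descendant::*[local-name() = ('sea', 'river', 'lake')]
return (count($a), count($b), count($c))
(: expected result: 3 3 3 :)
```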
Christian