Re: [basex-talk] Spectactularly slow performance with db:open vs. db:text

2016-02-15 Thread Christian Grün
> Your second query executes in a fraction of a second - but only because the 
> optimizer successfully rewrites it to use the TEXT index.

Exactly. A single context item reference (".") can be rewritten for
index if the database statistics indicate that the addressed element –
in your case order-id – has no other descendant elements. If it did
have, you would get different results: see e.g.:

  XY = 'XY' → true

In your case, I assume that your database statistics have got outdated
(e.g. due to updates), or there may be order-id elements with elements
as children.


Re: [basex-talk] Spectactularly slow performance with db:open vs. db:text

2016-02-15 Thread Hondros, Constantine (ELS-AMS)
Hello Christian,

Your second query executes in a fraction of a second - but only because the 
optimizer successfully rewrites it to use the TEXT index. When I remove the 
TEXT index so that the optimizer uses db:open-pre(), then the query takes some 
88 minutes to run. (I did improve it to use a direct path).

On the other hand, I did some benchmarking against different sized lookups. 
I've discovered that I am averageing just over 10 lookups per second against 
DB2 using db:open-pre, which is independent of the number of lookups performed 
per query. Presumably this is just an expensive sort of query to repeat 
thousands of times, so the major learning for me is to double-check that 
queries are rewritten against the TEXT index where possible.

Thanks for helping me think this through.

C.

-Original Message-
From: Christian Grün [mailto:christian.gr...@gmail.com]
Sent: 15 February 2016 10:44
To: Hondros, Constantine (ELS-AMS)
Cc: BaseX
Subject: Re: [basex-talk] Spectactularly slow performance with db:open vs. 
db:text

Hi Constantine,

> for $a in (db:open('DB1')/item/order-id) return
>   if (db:open('DB2')//order-id[. = $a]) then
> $a
>   else
> ()

Do some of the order-id elements contain descendant elements?

  db:open('DB1')/item/order-id[*]

If yes, the following query might be faster:

  for $a in (db:open('DB1')/item/order-id)
  return
if (db:open('DB2')//order-id[text() = $a]) then
  $a
else
  ()

Here is another way to rewrite the query:

  for $a in (db:open('DB1')/item/order-id)
  where db:open('DB2')//order-id[text() = $a]
  return $a




>
>
> Note that the optimized query uses db:open-pre to access DB2. When I
> re-write the query myself to use the TEXT index then performance is
> excellent. But why such a difference?
>
>
>
>
>
> Query 2 [returns in 0.3 second]
>
> 
>
>
>
> for $a in (db:open('DB1')/item/order-id)
>
> return
>
>   if (db:text('DB2', $a)/parent::order-id) then
>
> $a
>
>   else
>
>()
>
>
> 
>
> Elsevier B.V. Registered Office: Radarweg 29, 1043 NX Amsterdam, The
> Netherlands, Registration No. 33156677, Registered in The Netherlands.



Elsevier B.V. Registered Office: Radarweg 29, 1043 NX Amsterdam, The 
Netherlands, Registration No. 33156677, Registered in The Netherlands.


Re: [basex-talk] Spectactularly slow performance with db:open vs. db:text

2016-02-15 Thread Christian Grün
Hi Constantine,

> for $a in (db:open('DB1')/item/order-id)
> return
>   if (db:open('DB2')//order-id[. = $a]) then
> $a
>   else
> ()

Do some of the order-id elements contain descendant elements?

  db:open('DB1')/item/order-id[*]

If yes, the following query might be faster:

  for $a in (db:open('DB1')/item/order-id)
  return
if (db:open('DB2')//order-id[text() = $a]) then
  $a
else
  ()

Here is another way to rewrite the query:

  for $a in (db:open('DB1')/item/order-id)
  where db:open('DB2')//order-id[text() = $a]
  return $a




>
>
> Note that the optimized query uses db:open-pre to access DB2. When I
> re-write the query myself to use the TEXT index then performance is
> excellent. But why such a difference?
>
>
>
>
>
> Query 2 [returns in 0.3 second]
>
> 
>
>
>
> for $a in (db:open('DB1')/item/order-id)
>
> return
>
>   if (db:text('DB2', $a)/parent::order-id) then
>
> $a
>
>   else
>
>()
>
>
> 
>
> Elsevier B.V. Registered Office: Radarweg 29, 1043 NX Amsterdam, The
> Netherlands, Registration No. 33156677, Registered in The Netherlands.


[basex-talk] Spectactularly slow performance with db:open vs. db:text

2016-02-15 Thread Hondros, Constantine (ELS-AMS)
Hello Basexers,

I'm getting such a low performance on a relatively simple join between two 
databases that I feel there must be something going wrong here. I can provide 
the sources if necessary, but basically DB1 is 26 MB, about 80,000 small 
documents; DB2 is 47 MB, about 18,500 small documents. I'm using 8.4, by the 
way, haven't tested on other releases.

Query 1 [returns in 144 minutes]


for $a in (db:open('DB1')/item/order-id)
return
  if (db:open('DB2')//order-id[. = $a]) then
$a
  else
()

Note that the optimized query uses db:open-pre to access DB2. When I re-write 
the query myself to use the TEXT index then performance is excellent. But why 
such a difference?


Query 2 [returns in 0.3 second]


for $a in (db:open('DB1')/item/order-id)
return
  if (db:text('DB2', $a)/parent::order-id) then
$a
  else
   ()



Elsevier B.V. Registered Office: Radarweg 29, 1043 NX Amsterdam, The 
Netherlands, Registration No. 33156677, Registered in The Netherlands.