Hi, Paul:

Semantically, the XPath expressions

*  retrieve a sequence of documents from the database
*  extract a sequence of nodes from the sequence of documents (in case 2)
*  filter to produce a final sequence by applying the predicate to each item

The engine tries to optimize XPath expressions by executing as much of the
expression as possible as a query against the indexes.  Not every expression
can be optimized into a query, and the optimizer may also miss some cases that
could be optimized.
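
One way to see how much of an expression the optimizer resolved from the
indexes is to ask for its plan.  The following is only a sketch: the value
bound to $mycol is a placeholder, and the exact shape of the output depends on
the server version and the index settings of the database.

    xquery version "1.0-ml";

    (: Placeholder collection URI; substitute your own. :)
    declare variable $mycol as xs:string := "mycol";

    (: Returns a description of the index lookups derived from the
       expression instead of evaluating the expression. :)
    xdmp:plan(collection($mycol)/aaa[.//myelem = "myval"])

If a step or predicate has no corresponding constraint in the plan, that part
of the expression is being handled by filtering rather than by the indexes.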

The best practice is to use explicit queries instead of XPath expressions when
retrieving documents from the database.  That way, the use of indexes is
unambiguous. In addition, you have access to index mechanisms (such as fields)
that aren't available in predicates.
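
As a sketch of the field case: the following assumes a field named "myfield"
has been configured on the database with field value searches enabled.  The
field name and the value bound to $mycol are hypothetical and only illustrate
the shape of the query.

    xquery version "1.0-ml";

    (: Placeholder collection URI; substitute your own. :)
    declare variable $mycol as xs:string := "mycol";

    (: "myfield" is a hypothetical field; it must exist in the database
       configuration for this query to run. :)
    cts:search(fn:doc(), cts:and-query((
        cts:collection-query($mycol),
        cts:field-value-query("myfield", "myval")
        )))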

To put it the other way, the best practice is to use XPath expressions only to
traverse nodes after they have been retrieved from the database.

In this particular case, the equivalent query for XPath expression 2 would
resemble the following:

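    (: fn:doc() is the searchable expression; the query narrows it to
       documents in $mycol containing aaa with myelem = "myval" :)
    let $ait := cts:search(fn:doc(), cts:and-query((
        cts:collection-query($mycol),
        cts:element-query(xs:QName("aaa"),
            cts:element-value-query(xs:QName("myelem"), "myval"))
        )))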

While the query is more verbose, it declares a carefully considered use of the
indexes.  That's a good thing for scalability and maintainability with most
production databases.
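
By the same reasoning, XPath expression 3 (the one with the large fragment
count) could be sketched as an explicit query along the following lines.  The
assumptions here are the placeholder value for $mycol and the use of "myval"
without the leading space that appears in the original expression.

    xquery version "1.0-ml";

    (: Placeholder collection URI; substitute your own. :)
    declare variable $mycol as xs:string := "mycol";

    (: Retrieve the documents with an explicit query against the indexes. :)
    let $ait := cts:search(fn:doc(), cts:and-query((
        cts:collection-query($mycol),
        cts:element-value-query(xs:QName("myelem"), "myval")
        )))
    (: Use XPath only to traverse the nodes that were retrieved. :)
    return $ait//someelem

Note that cts:element-value-query matches according to its options (case and
diacritic sensitivity, for example), so it is not guaranteed to behave exactly
like the XPath "=" comparison.  Passing the "exact" option brings it closer if
that matters.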

For more information, see:

    http://docs.marklogic.com/cts:collection-query
    http://docs.marklogic.com/cts:element-query
    http://docs.marklogic.com/cts:element-value-query


Hoping that helps,


Erik Hennum



________________________________________
From: general-boun...@developer.marklogic.com 
<general-boun...@developer.marklogic.com> on behalf of Paul M <pjm...@yahoo.com>
Sent: Tuesday, May 8, 2018 11:47:35 AM
To: general@developer.marklogic.com
Subject: [MarkLogic Dev General] collection function searching

I have the following three queries I am comparing

declare variable  $my:col as xs:string...
(:
let $ait := collection()[.//myelem="myval"]
:)
(:
let $ait := collection($mycol)/aaa[.//myelem="myval"]
:)

let $ait := collection($mycol)[.//myelem=" myval"]

return $ait//someelem

There may be 10mil total documents in the repository.
There may be 1mil documents that have root element of aaa.
Note: myelem can/should only be in documents with root element of aaa
There may be 3 mil total documents that are in mycol
There may be at most 100k documents that have root element of aaa which are in 
mycol

The following are the query-trace results for the above three statements:
1st statement: roughly 3000 fragments to filter. Seems reasonable: aaa
documents that have myelem = myval
2nd statement: roughly 1000 fragments to filter. Seems reasonable: narrowing
the search
Last statement: roughly 8 mil fragments to filter. Not certain why this occurs.

Any explanation to shed some light?

_______________________________________________
General mailing list
General@developer.marklogic.com
Manage your subscription at: 
http://developer.marklogic.com/mailman/listinfo/general
