Doh!  I even read the guide section on optimizing order-by and missed the
specifics of cts:index-order.  I'll give that a shot.
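
A minimal sketch of that suggestion applied to the query quoted below. It
assumes the wildcarded path range index on //*:created is already configured,
and that cts:index-order() is passed to cts:search as an ordering option in
place of the "order by" clause. It has not been run against this data.

```xquery
xquery version "1.0-ml";

(: Sketch only: assumes the path range index on //*:created exists.
   The thread does not state its scalar type, so none is specified here. :)
cts:search(
  fn:doc(),
  cts:word-query("bob"),
  (: Request index order explicitly instead of relying on "order by". :)
  cts:index-order(
    cts:path-reference("//*:created"),
    "ascending"
  )
)
```

Whether query-trace then reports the sort as resolved from the index is
something to confirm on the actual database.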

Thanks!

-Bob

On Fri, Oct 30, 2015 at 3:35 PM, Justin Makeig <[email protected]>
wrote:

> You can be explicit with cts:index-order()
> <http://docs.marklogic.com/cts:index-order>.
>
> Justin
>
>
> > On Oct 30, 2015, at 12:29 PM, Bob H. <[email protected]> wrote:
> >
> > All,
> >
> > We know that ML will optimize an "order by" clause with a range index
> > when it can. I've done this with element, attribute, and path range
> > indexes on many occasions.  However, I'm faced with a scenario for which
> > I can't seem to find a straightforward optimization.  Your help would be
> > appreciated!
> >
> > Consider a database full of documents.  The documents all contain an
> > element with local name "created", but the namespaces vary.  While
> > normalization may be an option for the future, I'm trying to see if
> > there is a way to avoid that for now.
> >
> > So, consider the following three documents:
> > <a:root xmlns:a="a"> bob <a:created>2015-10-28T00:00:00Z</a:created></a:root>
> > <b:root xmlns:b="b"> bob <b:created>2015-10-29T00:00:00Z</b:created></b:root>
> > <c:root xmlns:c="c"> bob <c:created>2015-10-30T00:00:00Z</c:created></c:root>
> >
> > I can define a path range index using the wildcarded path:
> > //*:created
> >
> > The path range index clearly works as expected, because I can retrieve
> > all three values above via this cts:values statement:
> > cts:values(cts:path-reference('//*:created'))
> >
> > However, if I run a cts:search to find the three docs above and order
> > the results by this path, query-trace shows that the sort is not optimized:
> > for $item in
> > cts:search(doc(), 'bob')
> > order by $item//*:created
> > return $item
> >
> > The query-trace output does not show that a range index was used to
> > optimize the order by:
> > 2015-10-30 15:26:01.909 Info: App-Services: at 4:11: Analyzing path for search: fn:doc()
> > 2015-10-30 15:26:01.909 Info: App-Services: at 4:11: Step 1 is searchable: fn:doc()
> > 2015-10-30 15:26:01.909 Info: App-Services: at 4:11: Path is fully searchable.
> > 2015-10-30 15:26:01.909 Info: App-Services: at 4:11: Gathering constraints.
> > 2015-10-30 15:26:01.910 Info: App-Services: at 4:11: Search query contributed 1 constraint: cts:word-query("bobh", ("lang=en"), 1)
> > 2015-10-30 15:26:01.910 Info: App-Services: at 4:11: Executing search.
> > 2015-10-30 15:26:01.910 Info: App-Services: at 4:11: Selected 3 fragments to filter.
> >
> > I've tried a number of other approaches to the path range index,
> > including paths that did not contain wildcards but instead included the
> > specific namespace possibilities:
> > //(a:created|b:created|c:created)
> > /(a:root|b:root|c:root)/(a:created|b:created|c:created)
> >
> > I've considered creating a field encompassing the three elements and a
> > field range index, but I don't know of a way to construct the order by
> > clause such that the field range index would be used.  Is that even
> > possible?
> >
> > Any suggestions are welcome!  If this sort of approach is a dead end,
> > we will need to wait until we have the opportunity to normalize the data
> > to optimize this.
> >
> > Thank you!
> >
> > -Bob
> > _______________________________________________
> > General mailing list
> > [email protected]
> > Manage your subscription at:
> > http://developer.marklogic.com/mailman/listinfo/general
>
>