Ah, I see now why I missed this.  We're on ML7.x, and an upgrade to ML8
isn't going to happen right now.  So, while cts:index-order will very
likely be the approach we use down the road, I've come up with an
alternative, relevance-ranking-based approach for now.

Thanks for the tip!

-Bob
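
For readers who are on MarkLogic 8 or later, the cts:index-order approach mentioned above could look roughly like the sketch below. It assumes the wildcarded path range index on //*:created described in the original question is already configured, and it relies on cts:search accepting a cts:order value in its options, which is not available on ML7. The ascending direction is an illustrative choice, not something stated in the thread.

```xquery
xquery version "1.0-ml";

(: Sketch only: needs MarkLogic 8+, where cts:search accepts a cts:order option. :)
(: Assumes a path range index on //*:created already exists, as in the thread.   :)
cts:search(
  fn:doc(),
  cts:word-query("bob"),
  (: Ask for results in index order instead of relevance order. :)
  cts:index-order(cts:path-reference("//*:created"), "ascending")
)
```

Passing "descending" instead would return the most recently created documents first.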

On Sat, Oct 31, 2015 at 12:39 AM, Bob H. <[email protected]> wrote:

> Doh!  I even read the guide section on optimizing order-by and missed the
> specifics of cts:index-order.  I'll give that a shot.
>
> Thanks!
>
> -Bob
>
> On Fri, Oct 30, 2015 at 3:35 PM, Justin Makeig <[email protected]> wrote:
>
>> You can be explicit with cts:index-order()
>> <http://docs.marklogic.com/cts:index-order>.
>>
>> Justin
>>
>>
>> > On Oct 30, 2015, at 12:29 PM, Bob H. <[email protected]> wrote:
>> >
>> > All,
>> >
>> > We know that ML will optimize an "order by" clause with a range index
>> > when it can. I've done this with element, attribute, and path range
>> > indexes on many occasions.  However, I'm faced with a scenario for
>> > which I can't seem to find a straightforward optimization.  Your help
>> > would be appreciated!
>> >
>> > Consider a database full of documents.  The documents all contain an
>> > element with local name "created", but the namespaces vary.  While
>> > normalization may be an option for the future, I'm trying to see if
>> > there is a way to avoid that for now.
>> >
>> > So, consider the following three documents:
>> > <a:root xmlns:a="a"> bob <a:created>2015-10-28T00:00:00Z</a:created></a:root>
>> > <b:root xmlns:b="b"> bob <b:created>2015-10-29T00:00:00Z</b:created></b:root>
>> > <c:root xmlns:c="c"> bob <c:created>2015-10-30T00:00:00Z</c:created></c:root>
>> >
>> > I can define a path range index using the wildcarded path:
>> > //*:created
>> >
>> > The path range index clearly works as expected, because I can retrieve
>> > all three values above via this cts:values statement:
>> > cts:values(cts:path-reference('//*:created'))
>> >
>> > However, if I run a cts:search to find the three docs above and order
>> > the results by this path, query-trace shows that the sort is not
>> > optimized:
>> > for $item in
>> > cts:search(doc(), 'bob')
>> > order by $item//*:created
>> > return $item
>> >
>> > The query-trace output does not show that a range index was used to
>> > optimize the order by:
>> > 2015-10-30 15:26:01.909 Info: App-Services: at 4:11: Analyzing path for search: fn:doc()
>> > 2015-10-30 15:26:01.909 Info: App-Services: at 4:11: Step 1 is searchable: fn:doc()
>> > 2015-10-30 15:26:01.909 Info: App-Services: at 4:11: Path is fully searchable.
>> > 2015-10-30 15:26:01.909 Info: App-Services: at 4:11: Gathering constraints.
>> > 2015-10-30 15:26:01.910 Info: App-Services: at 4:11: Search query contributed 1 constraint: cts:word-query("bobh", ("lang=en"), 1)
>> > 2015-10-30 15:26:01.910 Info: App-Services: at 4:11: Executing search.
>> > 2015-10-30 15:26:01.910 Info: App-Services: at 4:11: Selected 3 fragments to filter.
>> >
>> > I've tried a number of other approaches to the path range index,
>> > including paths that did not contain wildcards but instead included
>> > the specific namespace possibilities:
>> > //(a:created|b:created|c:created)
>> > /(a:root|b:root|c:root)/(a:created|b:created|c:created)
>> >
>> > I've considered creating a field encompassing the three elements and a
>> > field range index, but I don't know of a way to construct the order by
>> > clause such that the field range index would be used.  Is that even
>> > possible?
>> >
>> > Any suggestions are welcome!  If this sort of approach is a dead end,
>> > we will need to wait until we have the opportunity to normalize the
>> > data to optimize this.
>> >
>> > Thank you!
>> >
>> > -Bob
>> > _______________________________________________
>> > General mailing list
>> > [email protected]
>> > Manage your subscription at:
>> > http://developer.marklogic.com/mailman/listinfo/general