Ah, I see now why I missed this. We're on ML7.x, and an upgrade to ML8 isn't going to happen right now. So, while cts:index-order will very likely be the approach we use down the road, I've come up with an alternative approach, based on relevance ranking, for now.
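For reference, here is a minimal sketch of what the cts:index-order approach might look like once ML8 is available. It assumes the wildcarded path range index on //*:created from the original question is in place with scalar type dateTime, and that ascending order is wanted; neither detail is confirmed in this thread, and the query is untested.

```xquery
xquery version "1.0-ml";

(: Sketch only: relies on ML8, where cts:search accepts cts:order values
   in its options argument.
   Assumption: a dateTime path range index on //*:created exists.
   Assumption: ascending order is the desired sort direction. :)
cts:search(
  fn:doc(),
  cts:word-query("bob"),
  cts:index-order(cts:path-reference("//*:created"), "ascending")
)
```

Because the ordering is passed to cts:search as an option, the sort is requested from the range index directly instead of depending on the optimizer recognizing an "order by" expression.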
Thanks for the tip!

-Bob

On Sat, Oct 31, 2015 at 12:39 AM, Bob H. <[email protected]> wrote:

> Doh! I even read the guide section on optimizing order-by and missed the
> specifics of cts:index-order. I'll give that a shot.
>
> Thanks!
>
> -Bob
>
> On Fri, Oct 30, 2015 at 3:35 PM, Justin Makeig <[email protected]> wrote:
>
>> You can be explicit with cts:index-order()
>> <http://docs.marklogic.com/cts:index-order>.
>>
>> Justin
>>
>> > On Oct 30, 2015, at 12:29 PM, Bob H. <[email protected]> wrote:
>> >
>> > All,
>> >
>> > We know that ML will optimize an "order by" clause if it can with a
>> > range index. I've done this with element, attribute, and path range
>> > indexes on many occasions. However, I'm faced with a scenario for
>> > which I can't seem to find a straightforward optimization. Your help
>> > would be appreciated!
>> >
>> > Consider a database full of documents. The documents all contain an
>> > element with local name "created", but the namespaces vary. While
>> > normalization may be an option for the future, I'm trying to see if
>> > there is a way to avoid that for now.
>> >
>> > So, consider the following three documents:
>> > <a:root xmlns:a="a"> bob <a:created>2015-10-28T00:00:00Z</a:created></a:root>
>> > <b:root xmlns:b="b"> bob <b:created>2015-10-29T00:00:00Z</b:created></b:root>
>> > <c:root xmlns:c="c"> bob <c:created>2015-10-30T00:00:00Z</c:created></c:root>
>> >
>> > I can define a path range index using the wildcarded path:
>> > //*:created
>> >
>> > The path range index clearly works as expected, because I can
>> > retrieve all three values above via this cts:values statement:
>> > cts:values(cts:path-reference('//*:created'))
>> >
>> > However, if I run a cts:search to find the three docs above and
>> > order the results by this path, query-trace shows that the sort is
>> > not optimized:
>> > for $item in
>> > cts:search(doc(), 'bob')
>> > order by $item//*:created
>> > return $item
>> >
>> > The query-trace output does not show that a range index was used to
>> > optimize the order by:
>> > 2015-10-30 15:26:01.909 Info: App-Services: at 4:11: Analyzing path for search: fn:doc()
>> > 2015-10-30 15:26:01.909 Info: App-Services: at 4:11: Step 1 is searchable: fn:doc()
>> > 2015-10-30 15:26:01.909 Info: App-Services: at 4:11: Path is fully searchable.
>> > 2015-10-30 15:26:01.909 Info: App-Services: at 4:11: Gathering constraints.
>> > 2015-10-30 15:26:01.910 Info: App-Services: at 4:11: Search query contributed 1 constraint: cts:word-query("bobh", ("lang=en"), 1)
>> > 2015-10-30 15:26:01.910 Info: App-Services: at 4:11: Executing search.
>> > 2015-10-30 15:26:01.910 Info: App-Services: at 4:11: Selected 3 fragments to filter.
>> >
>> > I've tried a number of other approaches to the path range index,
>> > including paths that did not contain wildcards but instead included
>> > the specific namespace possibilities:
>> > //(a:created|b:created|c:created)
>> > /(a:root|b:root|c:root)/(a:created|b:created|c:created)
>> >
>> > I've considered creating a field encompassing the three elements and
>> > a field range index, but I don't know of a way to construct the
>> > order by clause such that the field range index would be used. Is
>> > that even possible?
>> >
>> > Any suggestions are welcome! If this sort of approach is a dead end,
>> > we will need to wait until we have the opportunity to normalize the
>> > data to optimize this.
>> >
>> > Thank you!
>> >
>> > -Bob
>> > _______________________________________________
>> > General mailing list
>> > [email protected]
>> > Manage your subscription at:
>> > http://developer.marklogic.com/mailman/listinfo/general
