On 5/15/2013 6:25 PM, Michael Kay wrote:
[about the optimizer that it's]
> making pure guesses based on observed behaviour rather than hard data
> - and by doing so, is reinforcing that behaviour. It's a black art.)
This is very insightful. We tend to think of the optimizer as
"go-faster sauce," and often underestimate the impact that optimizers
have, or should have, on program design when performance is critical.
A familiar (to me) example of this is the question of which indexes get
built in persistent data stores. MarkLogic, for example, automatically
builds indexes on all element names plus their words/values, and on all
element/attribute name pairs plus their words/values, and these enable
all kinds of optimizations. But they aren't always the best choices.
One thing we've had to grapple with is customers who use a particular
attribute ("id" comes to mind) that can appear on any element and be
the target of cross-references. In that case, we'd really want an index
on all attributes named "id", regardless of the element name they're
attached to. The ML indexes really do enforce a particular style of
markup (if you want good performance easily).
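To make that concrete, here's a minimal XQuery sketch of the pattern.
The element names ("section", "figure", "table") and the use of
MarkLogic's cts:element-attribute-value-query are illustrative
assumptions, not a recipe; the point is just that the pair-based
indexes force you to enumerate element names up front, where an
attribute-only index would not:

    xquery version "1.0-ml";

    (: $target is the id value a cross-reference is trying to resolve. :)
    declare variable $target as xs:string external;

    (: The natural form of the lookup: any element may carry the id.
       Without an attribute-only index, this is a scan over all
       elements. :)
    let $by-scan := //*[@id eq $target]

    (: The fast path through the element/attribute pair indexes needs
       every element name that might carry @id listed up front (the
       names here are hypothetical): :)
    let $by-index := cts:search(fn:collection(),
      cts:element-attribute-value-query(
        (xs:QName("section"), xs:QName("figure"), xs:QName("table")),
        xs:QName("id"),
        $target))

    return ($by-index, $by-scan)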
As another example, we tend to advise ML customers against using the
same element name in different contexts (say, a title element under
both chapter and section), since an index keyed on element name alone
conflates the two. I don't mean to beat up on ML here -- it now offers
XPath-based indexes, just like eXist! This is more in the way of
illustrating a broader point:
I wonder how much schema design has been / will be influenced by the
availability of various optimizations (and indexing options) in such
systems, and to what extent these schemas will be more or less tuned to
the indexing options available on the platform where they were first
used. Has there ever been any sort of attempt to study which kinds of
indexes are most effective across some wide swath of use cases? I can't
imagine how one would gather enough meaningful cases for that, so
perhaps it's a mere pipe dream. By the same token, has there been any
attempt to standardize the specification of XML indexes, as we have for
SQL indexes? I guess we have the example of xsl:key -- that's really
the only standard I know of.
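For what it's worth, xsl:key really is a declarative index
specification, roughly the analogue of SQL's CREATE INDEX: you declare
the key once, and the processor is free to build whatever lookup
structure it likes behind it. A two-line sketch (in XSLT, of course,
rather than XQuery), tying back to the "id" example above -- the key
name and the $target variable are arbitrary:

    <!-- Declared once at the top level of the stylesheet: -->
    <xsl:key name="by-id" match="*" use="@id"/>

    <!-- A cross-reference lookup through the declared key: -->
    <xsl:variable name="hit" select="key('by-id', $target)"/>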
To echo what Daniela said in an earlier message in this thread, I think
the key to helping users work with optimizers is twofold: make it
apparent to the user what optimizations are being performed (if they
ask), so they can tell whether the optimizer is working for or against
them; and provide tools for the user to specify particular
optimizations, or to constrain the optimizer, at least in critical
decisions. There are probably too many details to expose everything,
but in the case of indexing optimizations in particular, the correct
(or incorrect) choice can
have such an overwhelming effect on performance that it is really
important to give the user the ability to understand and control the
execution plan.
Query plans can often be opaque and difficult for all but the most
expert users to understand, though. This has historically been true for
SQL query plans as well, although I think visualization tools can
sometimes help. I like the approach of expressing all query
optimizations as built-in functions. In this way, an optimized query is
just another query in the same language the user is familiar with,
albeit with some special-purpose functions they have to learn in order
to understand what the optimizations are.
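A hedged sketch of what that could look like, again as XQuery
fragments. The idx:attribute-lookup function is entirely made up -- it
is no particular product's API -- but it shows the idea: the "plan" is
readable as a query in the language the user already knows:

    (: What the user writes: :)
    //section[@id eq $target]

    (: What the optimizer might report back as its plan, expressed in
       the same language (idx:attribute-lookup is a hypothetical
       index-probe function): :)
    idx:attribute-lookup(xs:QName("section"), xs:QName("id"), $target)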
-Mike