Uhm.... 

The DTM document is not "always sorted". Our base implementation of DTM 
happens to number nodes in document order, but doesn't have to "sort" them 
because that happens to be the order in which the models are being built. 
(For SAX, that's because this is the order in which events arrive; for 
DOM, it's related to the difficulty of associating DOM nodes with DTM Node 
Handles except via a sequential scan.) There isn't a requirement that this 
ordering be used, as long as the implementation can find some way to 
answer the question of whether one node is before another when the XPath 
or XSLT process requires that information.

Some XPath and XSLT operations do require that data be presented in 
document order (or, in a few cases, reverse document order) even when that 
isn't necessarily the order in which it was generated. To achieve that, we 
may have to perform a sort operation. 

I agree that we _shouldn't_ be sorting if the underlying data asserts that 
it is already in the correct order, for obvoius reasons of efficiency. My 
understanding was that there is code to address just that optimization, 
but I haven't been able to find it in a quick scan of the iterators. (And 
of course if we are merging two sorted sets into another set sorted on the 
same principle, we ought to do an incremental merge-sort...)


So we need to take a closer look at your example to understand why it's 
concluding that an explicit sort must be performed in one case and not the 
other. I would have expected/hopeed that in both cases we would realize 
that the results can be benerated andreturned in document order.  The 
question is why we're failing to draw that conclusion in the longer 
path... and whether that is, in fact, what's happening, or if that path is 
instead causing read-ahead (and disruption of the SQL extension) by some 
mechanism other than sorting.

______________________________________
Joe Kesselman  / IBM Research

Reply via email to