Re: logQueryPlan readability

Army Thu, 05 Oct 2006 13:19:03 -0700

Bryan Pendleton wrote:


I was thinking that there might be some smaller, easier, things that we
could do which would offer some smaller wins, but might still be worth it.

Thanks for taking an interest in this discussion, Bryan, and for writing-up some ideas!

I think that if these smaller wins also aid in the accomplishment of the bigger goal (i.e. emitting info as XML), then this is definitely a good approach.

Ideas that I had include:

 1) logQueryPlan output intermixes timing information ("we processed this
    many rows in this many milliseconds") with structural information
    ("we did a distinct scalar aggregate of a sort of a union"). I was
    thinking that we could refine the logQueryPlan behavior so that you
    could say things like:

     logQueryPlan=[nodeTree | estTime | actTime | estRows | actRows]

    so that the user could choose a subset of the information if they were
    only interested in the "shape" of the tree, say, and not the detailed
    performance numbers; the idea is just that we give the user a way to
    pick a smaller subset of the information to give them a more
    approachable amount of output.

If this same kind of "subsetting" will also be used for the XML formatting, then something like this could be useful. In the particular breakdown shown in the example, though, it seems like "nodeTree" could get to be pretty large. And if we're looking at time/row values for each node in the tree, would the values exist on top of the "nodeTree" structure? I.e. "estTime" is "nodeTree" with additional info?
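To make the "subsetting" question concrete, here is a minimal Java sketch of how such a property value might be parsed. It is only an illustration: the comma-separated syntax, the class and enum names, and the rule that any per-node value implies "nodeTree" are all assumptions, not anything Derby implements.

```java
import java.util.EnumSet;

public class PlanDetailSketch {
    /** Detail categories named in the proposal; this enum itself is hypothetical. */
    enum Detail {
        NODE_TREE("nodeTree"), EST_TIME("estTime"), ACT_TIME("actTime"),
        EST_ROWS("estRows"), ACT_ROWS("actRows");

        final String token;

        Detail(String token) { this.token = token; }

        static Detail fromToken(String token) {
            for (Detail d : values()) {
                if (d.token.equals(token)) return d;
            }
            throw new IllegalArgumentException("Unknown logQueryPlan detail: " + token);
        }
    }

    /** Parses a comma-separated value (an assumed syntax) into a set of details. */
    static EnumSet<Detail> parse(String value) {
        EnumSet<Detail> details = EnumSet.noneOf(Detail.class);
        for (String part : value.split(",")) {
            String token = part.trim();
            if (!token.isEmpty()) details.add(Detail.fromToken(token));
        }
        // Assumption: time/row values are attached to tree nodes, so asking
        // for any of them implies the node tree as well.
        if (!details.isEmpty()) details.add(Detail.NODE_TREE);
        return details;
    }

    public static void main(String[] args) {
        System.out.println(parse("estRows, actRows"));
    }
}
```

Under this reading, "estTime" really is "nodeTree" with additional info; a different design could keep the two independent.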

I think the big thing to figure out here is what "nodeTree" would be. It should theoretically be small enough to only display what's "relevant". But what's relevant for one query (or user) may not be relevant for another. So we'd have to either 1) dump anything which could potentially be "relevant", or 2) allow the person looking at the query plan to determine what s/he wants to look at. In the former case we end up back where we started with too large of a query plan (though use of the "brief | full" tag below could help); in the latter, we'd have to come up with a way to allow the user to retrieve bits and pieces of what is effectively an unstructured clob. The latter seems like it could require a lot of effort that might be better spent working on an XML format, which then inherently has a means of allowing a user to retrieve the pieces s/he thinks are important (namely, via XPath).

Of course, as I write all of that I'm thinking of the several-thousand line query plans that we get from queries like those in DERBY-1205 and DERBY-1777--which is probably not what the average person has in mind ;) For more manageable queries I agree with you that just being able to show a simple nodeTree with minimal info would be a big improvement over what we have now.
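As a rough illustration of the XPath point, the following Java sketch pulls just the "shape" out of an XML query plan using the standard javax.xml.xpath API. The XML layout (the plan and node elements and their type, estRows and actRows attributes) is invented for this example; no such Derby format exists, and the row counts are placeholders.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class PlanXPathSketch {
    // Hypothetical XML rendering of a small query plan.
    static final String PLAN =
          "<plan>"
        + "<node type=\"DistinctScalarAggregate\" estRows=\"1\" actRows=\"1\">"
        + "<node type=\"Sort\" estRows=\"1439201\" actRows=\"1400000\">"
        + "<node type=\"Union\" estRows=\"1439201\" actRows=\"1400000\"/>"
        + "</node>"
        + "</node>"
        + "</plan>";

    /** Returns the type of every plan node, in document order. */
    static List<String> nodeTypes(String xml) {
        XPath xpath = XPathFactory.newInstance().newXPath();
        try {
            NodeList hits = (NodeList) xpath.evaluate(
                "//node/@type",
                new InputSource(new StringReader(xml)),
                XPathConstants.NODESET);
            List<String> types = new ArrayList<>();
            for (int i = 0; i < hits.getLength(); i++) {
                types.add(hits.item(i).getNodeValue());
            }
            return types;
        } catch (XPathExpressionException e) {
            throw new IllegalArgumentException("Could not query plan XML", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(nodeTypes(PLAN));
    }
}
```

Changing the expression (for example to //node[@type='Sort']/@estRows) would retrieve a different piece without any change to how the plan is written.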

 2) The indentation for the qualification information seems to get lost,
    in my experience, making the display hard to read. Also, since "real"
    query plans are often quite deep, I wonder whether we should display
    things with a fixed 1-or-2 space indentation "step" rather than using
    tabs. The advantage of emitting hard tabs is that the user can reset
    their tab stops, but if they don't, the lines almost always wrap or
    get truncated. Avoiding line wrap could also take the form of slightly
    less wordy displays, so that instead of

      optimizer estimated row count:   1439201.17

    we could say

      est rows: 1439201

    so that as the indentation started to grow, we'd still not wrap lines.
    Of course, this trades off line-wrapping for less-self-evident output.

This is an idea that I think would be useful if we were planning to stick with the current (non-XML) format in the long term. But if the ultimate goal is to switch to XML with some related analysis/viewing tool, I'm not sure what the effect of such a change would be? I guess if we go with the simplest idea of using Derby's own SQL/XML operators to retrieve pieces of the query plan, then smaller indentation would indeed be useful (because Derby preserves whitespace). And without knowing anything about where or how the query plans are actually written to logs, I would guess (hope) that this particular indentation change would be straightforward...
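For what it's worth, the fixed-step indentation idea can be sketched in a few lines of Java. The Node class and render method below are hypothetical and not taken from Derby's code; they only show a tree printed with a configurable number of spaces per level instead of hard tabs.

```java
import java.util.ArrayList;
import java.util.List;

public class PlanIndentSketch {
    /** Hypothetical plan node: a label plus child nodes. */
    static final class Node {
        final String label;
        final List<Node> children = new ArrayList<>();

        Node(String label) { this.label = label; }

        Node add(Node child) {
            children.add(child);
            return this;
        }
    }

    /** Renders the tree, indenting each level by 'step' spaces. */
    static String render(Node root, int step) {
        StringBuilder out = new StringBuilder();
        render(root, 0, step, out);
        return out.toString();
    }

    private static void render(Node node, int depth, int step, StringBuilder out) {
        for (int i = 0; i < depth * step; i++) {
            out.append(' ');
        }
        out.append(node.label).append('\n');
        for (Node child : node.children) {
            render(child, depth + 1, step, out);
        }
    }

    public static void main(String[] args) {
        Node plan = new Node("DistinctScalarAggregate")
            .add(new Node("Sort (est rows: 1439201)")
                .add(new Node("Union")));
        System.out.print(render(plan, 2));
    }
}
```

With a step of 2, a node ten levels deep starts at column 20, which leaves room for the shorter "est rows" style labels before the line wraps.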

 3) There's a bunch of kind of detailed internal information in the query
    plan output, and in particular for many common queries we output a lot
    of information about settings which the user probably didn't set, but
    which are just set to their default value. I'm thinking about things
    like "Ordered nulls: false" and "Unknown return value: false" and
    the like. I wonder whether we could change the various output
    routines so that they only emitted information like this if it was
    set to a non-default value, which seems like in the common case it
    might make the query plan display substantially shorter. If we didn't
    want to do this unconditionally, I thought that maybe we might give
    the user a knob like:

      logQueryPlanFormat=[brief | full]

Seems like a good suggestion, and one that could be useful for an XML format, as well. The only potential drawback is the addition of yet another Derby property, but I don't know if that's really a concern or not?
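A minimal Java sketch of the "brief" behavior, assuming the output routines can look up a default for each setting. The DEFAULTS table and the format method are hypothetical; the two setting names and their "false" defaults are just the examples quoted above.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class PlanBriefSketch {
    // Assumed defaults, using the two settings quoted in the proposal.
    static final Map<String, String> DEFAULTS = new LinkedHashMap<>();
    static {
        DEFAULTS.put("Ordered nulls", "false");
        DEFAULTS.put("Unknown return value", "false");
    }

    /** Formats settings one per line; in brief mode, default values are skipped. */
    static String format(Map<String, String> settings, boolean brief) {
        StringBuilder out = new StringBuilder();
        for (Map.Entry<String, String> e : settings.entrySet()) {
            boolean isDefault = e.getValue().equals(DEFAULTS.get(e.getKey()));
            if (brief && isDefault) {
                continue;
            }
            out.append(e.getKey()).append(": ").append(e.getValue()).append('\n');
        }
        return out.toString();
    }

    public static void main(String[] args) {
        Map<String, String> settings = new LinkedHashMap<>();
        settings.put("Ordered nulls", "false");
        settings.put("Unknown return value", "true");
        System.out.print(format(settings, true));
    }
}
```

A setting with no entry in the defaults table is always printed, so brief mode never hides a value it knows nothing about.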

Do people think that things like this would be worth doing? Or would it be
better just to bite the bullet and pursue the full-on "XML format and a
separate analysis tool" proposals?

Insofar as these kinds of things can still be useful for the eventual (and currently just theoretical) XML format, I say Yes, they would be useful.

I do wonder if we have to worry about "backward compatibility" of the query plans? Are there people who may have written tools/utilities that expect the log query plan to keep its current shape? If so, then we'd probably have to add another property to toggle the new formatting on/off (with "off" being the default). And that would mean that we'd have to keep logic for generating two different query plans around in the code. I'm not sure how difficult that would be, nor whether use of a "full-on XML format" would make such logic better or worse...?

But all of that blabbering aside, if no one takes up the task of creating a full XML format, then anyone willing to implement changes like the ones mentioned above would still be doing query-plan-readers a favor.


So who's willing to fry what fish? :)

Army
