Robert Haas wrote:
The one significant representational choice that I'm aware of having
made is to use nested tags rather than attributes in the XML format.
This seems to me to offer several advantages.  First, it's clearly
impossible to standardize on attributes, because attributes can only
be text, and it seems to me that if we're going to try to output
structured data, we want to take that as far as we can, and we have
attributes (like sort keys) that are lists rather than scalars.  Using
tags means that they can have substructure when needed.  Second, it
seems likely to me that people will want to extend explain further in
the future: indeed, that was the whole point of the explain-options
patch which was already committed.  That's pretty simple in the
current design - just add a few more calls to ExplainPropertyText or
ExplainPropertyList in the appropriate place, and you're done.  I'm
pretty sure that splitting things up between attributes and nested
tags would complicate such modifications.


In general, in XML one uses an attribute for a named property of an object that can only have one value at a time. A classic example is the dimensions of an object - it can only have one width and one height. Children (nested tags, particularly) are used for things it can have an arbitrary number of, or things which in turn can have children. The HTML <p> and <body> elements are (respectively) examples of these. Attribute values in particular should generally be short - I recently saw an example that had an entire image hex-encoded in an XML attribute, which struck me as just horrible. Enumerations, date and time values, booleans, measurements - these are common types of attribute values. Using the XPath query language, extracting a value from an attribute is no more or less difficult than extracting it from a nested tag.

The XML Schema standard is a language for specifying the structure of a given XML document type, and while it is undoubtedly complex, it is also much more powerful than the older DTD mechanism. I think we should be creating (and publishing) an XML Schema specification for any XML documents we are producing. There are a number of members of the community who are equipped to help produce these.

There is probably a good case for using an explicit namespace with such docs. So we might have something like:

   <pg:explain
   xmlns:pg="http://www.postgresql.org/xmlspecs/explain/v1.xsd"> ....

BTW, has anyone tried validating the XML at all? I just looked very briefly at the patch at <http://archives.postgresql.org/pgsql-hackers/2009-07/msg01944.php> and I noticed this which makes me suspicious:

+       if (es.format == EXPLAIN_FORMAT_XML)
+               appendStringInfoString(es.str,
+                       "<explain xmlns=\"http://www.postgresql.org/2009/explain\";>\n");


That ";" after the attribute is almost certainly wrong. This is a classic case
of what I was talking about a month or two ago. Building up XML (or any
structured doc, really; XML is not special in this regard) by ad hoc methods is
horribly error prone. If you don't want to rely on libxml, then I think you
need to develop a lightweight abstraction rather than just appending to a
StringInfo.
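
To make that suggestion a bit more concrete, here is a rough sketch of what such a lightweight abstraction might look like. It is only an illustration: XmlWriter and all of the xw_* functions are hypothetical names, it writes into a fixed buffer instead of a StringInfo, and the <Filter> element is made up for the example. The point is that the writer tracks the open elements and does the quoting and escaping itself, so the caller never hand-writes a tag or an attribute.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define XW_MAX_DEPTH 32
#define XW_BUF_SIZE  1024

/* Hypothetical minimal XML writer; not part of PostgreSQL or libxml. */
typedef struct XmlWriter
{
    char        buf[XW_BUF_SIZE];    /* output accumulated so far */
    size_t      len;                 /* bytes used in buf */
    const char *open[XW_MAX_DEPTH];  /* stack of open element names */
    int         depth;               /* number of open elements */
    int         in_start_tag;        /* "<name ..." not yet closed by '>' */
} XmlWriter;

static void xw_init(XmlWriter *w) { memset(w, 0, sizeof(*w)); }

/* Append s verbatim; a real version would grow the buffer instead. */
static void xw_raw(XmlWriter *w, const char *s)
{
    size_t n = strlen(s);

    assert(w->len + n < XW_BUF_SIZE);
    memcpy(w->buf + w->len, s, n + 1);
    w->len += n;
}

/* Append s with the XML special characters escaped. */
static void xw_escaped(XmlWriter *w, const char *s)
{
    for (; *s; s++)
    {
        char c[2] = {*s, '\0'};

        switch (*s)
        {
            case '<': xw_raw(w, "&lt;"); break;
            case '>': xw_raw(w, "&gt;"); break;
            case '&': xw_raw(w, "&amp;"); break;
            case '"': xw_raw(w, "&quot;"); break;
            default:  xw_raw(w, c); break;
        }
    }
}

/* Emit the '>' that ends a pending start tag, if there is one. */
static void xw_finish_start_tag(XmlWriter *w)
{
    if (w->in_start_tag)
    {
        xw_raw(w, ">");
        w->in_start_tag = 0;
    }
}

/* Begin an element; name must outlive the writer (e.g. a literal). */
static void xw_open(XmlWriter *w, const char *name)
{
    assert(w->depth < XW_MAX_DEPTH);
    xw_finish_start_tag(w);
    xw_raw(w, "<");
    xw_raw(w, name);
    w->open[w->depth++] = name;
    w->in_start_tag = 1;
}

/* Add an attribute; only legal directly after xw_open. */
static void xw_attr(XmlWriter *w, const char *name, const char *value)
{
    assert(w->in_start_tag);
    xw_raw(w, " ");
    xw_raw(w, name);
    xw_raw(w, "=\"");
    xw_escaped(w, value);
    xw_raw(w, "\"");
}

static void xw_text(XmlWriter *w, const char *text)
{
    xw_finish_start_tag(w);
    xw_escaped(w, text);
}

/* Close the innermost open element; the writer supplies the name. */
static void xw_close(XmlWriter *w)
{
    assert(w->depth > 0);
    xw_finish_start_tag(w);
    xw_raw(w, "</");
    xw_raw(w, w->open[--w->depth]);
    xw_raw(w, ">");
}

/* Example caller: no hand-written tags, quotes or escapes. */
static void example_explain(XmlWriter *w)
{
    xw_open(w, "explain");
    xw_attr(w, "xmlns", "http://www.postgresql.org/2009/explain");
    xw_open(w, "Filter");            /* made-up element name */
    xw_text(w, "a < 1");
    xw_close(w);
    xw_close(w);
}
```

With something along these lines, a stray character after an attribute value cannot be produced by the caller at all, and mismatched or unclosed tags turn into assertion failures rather than silently invalid output.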





cheers

andrew


