Peter Eisentraut wrote:
On Mon, 2010-03-22 at 19:38 -0400, Andrew Dunstan wrote:
But if we are not comfortable about being able to do that safely, I
would be OK with just raising an error if a concatenation is attempted
where one value contains a DTD.  The impact in practice should be low.
Right. Can you find a way to do that using the libxml API? I haven't managed to, and I'm pretty sure I can construct XML that fails every simple string search test I can think of, either with a false negative or a false positive.

The documentation on that is terse as usual.  In any case, you will need
to XML parse the input values, and so you might as well resort to
parsing the output value to see if it is well-formed, which should catch
this mistake and possibly others.
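
To make the parse-based idea concrete, here is a rough sketch in Python using the standard library's expat-based minidom rather than libxml, so it illustrates the approach and is not a patch. The function names has_doctype, is_well_formed_content and checked_concat are made up for this example. It assumes any XML declaration has already been stripped from the inputs, and it checks a content fragment by wrapping it in a dummy root element.

```python
from xml.dom import minidom
from xml.parsers.expat import ExpatError


def has_doctype(xml_text):
    """Return True if xml_text parses as a document that carries a DOCTYPE.

    A value that does not parse as a document is treated as a content
    fragment; the "content" production has no place for a doctypedecl.
    """
    try:
        doc = minidom.parseString(xml_text)
    except ExpatError:
        return False
    found = doc.doctype is not None
    doc.unlink()
    return found


def is_well_formed_content(xml_text):
    """Check a content fragment by wrapping it in a dummy root element."""
    try:
        minidom.parseString("<wrapper>" + xml_text + "</wrapper>").unlink()
    except ExpatError:
        return False
    return True


def checked_concat(a, b):
    """Concatenate two XML values, refusing DTDs and malformed results."""
    if has_doctype(a) or has_doctype(b):
        raise ValueError("cannot concatenate XML values that contain a DTD")
    result = a + b
    if not is_well_formed_content(result):
        raise ValueError("concatenation result is not well-formed content")
    return result
```

Because the check looks at the parsed tree, a "<!DOCTYPE" string inside a comment or a CDATA section is not mistaken for a real DTD, which is the kind of false positive a plain string search would give.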


Actually, I have come to the conclusion that the biggest problem in this area is that we accept XML documents with a leading DOCTYPE node at all. Our docs state:

   The xml type can store well-formed "documents", as defined by the
   XML standard, as well as "content" fragments, which are defined by
   the production XMLDecl? content in the XML standard.

A document with a leading DOCTYPE node matches neither of these rules, and when we strip the XMLDecl from a piece of XML where it's followed by a DOCTYPE node, we turn something that is legal XML into something that isn't, even by our own (or possibly the standard's) relaxed definition. A doctypedecl can only follow an XMLDecl; see <http://www.w3.org/TR/2006/REC-xml11-20060816/#sec-prolog-dtd>.

So I think we need to go back to the drawing board a bit, rather than patch a particular reported error case. But these problems are not at all new to 9.0, and coming up to beta, as I hope we are, is not the time for it. I think it will have to wait until 9.1.

cheers

andrew



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
