Stephen Collyer wrote:
I have a SAX2 parser which is exhibiting odd behaviour.
If I give it some XML with an XML declaration like:
<?xml version="1.0" encoding="UTF-8" ?>
it fails with an "Invalid document structure" error.
If I remove the encoding declaration, then it parses correctly.
This is quite strange, since the parser will assume the encoding is
UTF-8 without an encoding declaration. The only case where I could
imagine this might happen is with a UTF-16 document with an encoding
declaration that indicates a byte-oriented encoding. You can verify
this by looking at a binary dump of the XML stream.
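If you don't have a hex dump tool handy, something like the following would do. This is only a rough sketch in Python, not anything from the parser itself: the function names sniff_encoding and hex_dump are made up for illustration, and the detection only looks at a byte order mark or at how the leading "<?" is laid out, roughly along the lines of Appendix F of the XML 1.0 recommendation.

```python
import codecs

# Byte order marks, UTF-32 first so its BOM is not mistaken for UTF-16's.
BOMS = [
    (codecs.BOM_UTF32_LE, "UTF-32LE"),
    (codecs.BOM_UTF32_BE, "UTF-32BE"),
    (codecs.BOM_UTF8, "UTF-8"),
    (codecs.BOM_UTF16_LE, "UTF-16LE"),
    (codecs.BOM_UTF16_BE, "UTF-16BE"),
]


def sniff_encoding(head: bytes) -> str:
    """Guess the encoding family from the first bytes of an XML stream.

    Hypothetical helper: a heuristic only, not what any particular
    parser does internally.
    """
    for bom, name in BOMS:
        if head.startswith(bom):
            return name + " (BOM)"
    # No BOM: check how the leading '<?' is encoded.
    if head.startswith(b"<\x00?\x00"):
        return "UTF-16LE (no BOM)"
    if head.startswith(b"\x00<\x00?"):
        return "UTF-16BE (no BOM)"
    if head.startswith(b"<?xm"):
        return "byte-oriented (UTF-8, ASCII, ISO-8859-x, ...)"
    return "unknown"


def hex_dump(data: bytes, width: int = 16) -> str:
    """Format bytes as offset, hex values, and printable ASCII."""
    pad = width * 3
    lines = []
    for off in range(0, len(data), width):
        chunk = data[off:off + width]
        hexpart = " ".join(f"{b:02x}" for b in chunk)
        text = "".join(chr(b) if 32 <= b < 127 else "." for b in chunk)
        lines.append(f"{off:08x}  {hexpart:<{pad}} {text}")
    return "\n".join(lines)
```

Read the first 64 bytes or so of your document in binary mode and pass them to both functions. If sniff_encoding reports UTF-16 while the declaration says encoding="UTF-8", that mismatch would explain the error.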
Can anyone suggest what the problem is? I'm assuming
that this is some interaction between the validator and
the encoding, but I'm baffled as to what, precisely.
Encoding detection and parsing happen at a lower level than validation.
That's also not an error from the validation code -- it's an
indication that the parser has found something wrong with the
fundamental structure of the XML document.
Can you post more details of what your code looks like, and how the
parser is configured? Also, if you can post a trivial document that
reproduces the error, that would help.
Dave