Stephen Collyer wrote:
I have a SAX2 parser which is exhibiting odd behaviour.

If I give it some XML with an XML declaration like:

<?xml version="1.0" encoding="UTF-8" ?>

it fails with an "Invalid document structure" error.
If I remove the encoding declaration, then it parses correctly.
This is quite strange, since without an encoding declaration the parser will assume the encoding is UTF-8 anyway. The only case where I can imagine this happening is a UTF-16 document whose encoding declaration names a byte-oriented encoding such as UTF-8. You can verify this by looking at a binary dump of the XML stream.
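
As a rough illustration of that check, here is a minimal Python sketch. It is not part of the parser, and the helper names (sniff_family, declared_encoding, hex_dump, check) are made up for this example. It dumps the leading bytes, guesses the encoding family from the byte-order mark or from the byte pattern of "<?", and compares that with the encoding named in the XML declaration. The comparison is deliberately crude (it only checks that the declared name starts with the detected family), so treat a reported mismatch as a hint rather than a verdict.

```python
import codecs
import re
import sys

# Byte-order marks. UTF-32 is listed first because the UTF-32 LE BOM
# begins with the same two bytes as the UTF-16 LE BOM.
_BOMS = [
    (codecs.BOM_UTF32_LE, "UTF-32"),
    (codecs.BOM_UTF32_BE, "UTF-32"),
    (codecs.BOM_UTF8, "UTF-8"),
    (codecs.BOM_UTF16_LE, "UTF-16"),
    (codecs.BOM_UTF16_BE, "UTF-16"),
]


def sniff_family(data):
    """Guess the encoding family from the first bytes of the stream."""
    for bom, name in _BOMS:
        if data.startswith(bom):
            return name
    # No BOM: look at how "<?" is laid out in memory.
    if data.startswith((b"<\x00\x00\x00", b"\x00\x00\x00<")):
        return "UTF-32"
    if data.startswith((b"<\x00?\x00", b"\x00<\x00?")):
        return "UTF-16"
    return "byte-oriented"


def declared_encoding(data):
    """Return the name in the encoding declaration, or None if absent."""
    # Dropping NUL bytes makes a UTF-16/UTF-32 declaration readable as ASCII.
    head = data[:400].replace(b"\x00", b"")
    m = re.search(rb'encoding\s*=\s*["\']([A-Za-z][A-Za-z0-9._-]*)["\']', head)
    return m.group(1).decode("ascii") if m else None


def hex_dump(data, width=16):
    """Format bytes as offset, hex values, and printable ASCII."""
    lines = []
    for off in range(0, len(data), width):
        chunk = data[off:off + width]
        hexpart = " ".join(f"{b:02x}" for b in chunk)
        text = "".join(chr(b) if 32 <= b < 127 else "." for b in chunk)
        lines.append(f"{off:08x}  {hexpart:<{width * 3}} {text}")
    return "\n".join(lines)


def check(data):
    """Return (detected family, declared name, mismatch flag)."""
    family = sniff_family(data)
    declared = declared_encoding(data)
    mismatch = (
        declared is not None
        and family != "byte-oriented"
        and not declared.upper().startswith(family)
    )
    return family, declared, mismatch


if __name__ == "__main__" and len(sys.argv) > 1:
    with open(sys.argv[1], "rb") as f:
        raw = f.read()
    print(hex_dump(raw[:64]))
    print(check(raw))
```

Run against the file that fails to parse, a result with a detected family of UTF-16 and a declared encoding of UTF-8 would point to the situation described above.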


Can anyone suggest what the problem is? I'm assuming
that this is some interaction between the validator and
the encoding, but I'm baffled as to what, precisely.
Encoding detection and parsing happen at a lower level than validation. That's also not an error from the validation code -- it's an indication that the parser has found something wrong with the fundamental structure of the XML document.

Can you post more details of what your code looks like, and how the parser is configured? Also, if you can post a trivial document that reproduces the error, that would help.

Dave
