No, all the parser sees is a stream of bytes, and it is up to the parser
to interpret those bytes properly. With no XML declaration (or a
declaration that provides no encoding) and no byte order mark, the parser
assumes UTF-8. Under that assumption your document is not well-formed XML,
because it contains byte sequences that are not valid UTF-8.
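
To make the point concrete, here is a minimal sketch using only the JDK's
java.nio.charset classes (not Xerces itself). The class name EncodingCheck
and the sample bytes are made up for illustration; the assumption is that
the accented character was saved as the single byte 246 (0xF6), as in
ISO-8859-1.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

public class EncodingCheck {
    // Returns true only if the bytes decode cleanly as UTF-8.
    public static boolean isValidUtf8(byte[] bytes) {
        CharsetDecoder decoder = StandardCharsets.UTF_8.newDecoder()
                .onMalformedInput(CodingErrorAction.REPORT)
                .onUnmappableCharacter(CodingErrorAction.REPORT);
        try {
            decoder.decode(ByteBuffer.wrap(bytes));
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Hypothetical document body: <a>ö</a> with ö stored as byte 0xF6.
        byte[] raw = {'<', 'a', '>', (byte) 0xF6, '<', '/', 'a', '>'};

        // Read as UTF-8, the lone 0xF6 byte is a malformed sequence.
        System.out.println("valid UTF-8: " + isValidUtf8(raw));

        // Read as ISO-8859-1, the same byte maps to U+00F6.
        System.out.println(new String(raw, StandardCharsets.ISO_8859_1));
    }
}
```

The same bytes are either garbage or a perfectly good character depending
on which encoding the reader assumes, which is why the parser has to be
told.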
Dave

Joseph Shraibman <jks@selectacast.net>
11/05/2001 10:08 PM
Please respond to general

To: [EMAIL PROTECTED]
cc: (bcc: David N Bertoni/CAM/Lotus)
Subject: Re: accented characters and xerces j

How can that be? Isn't Unicode conversion done before any of the contents
are looked at?
[EMAIL PROTECTED] wrote:
> This is not the best list for Xerces questions. There is a Xerces-J list
> that you should subscribe to.
>
> The problem is that your document is encoded incorrectly. There is no
> ASCII character 246, since ASCII only defines characters up to 127.
> However, there _is_ a character defined in ISO-8859-1 with such a value.
> Your document does not contain an XML declaration, so you need to add one
> and specify the correct encoding:
>
> <?xml version="1.0" encoding="ISO-8859-1"?>
>
> Dave
>
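
The effect of the suggested declaration can be sketched with the JAXP
parser that ships with the JDK (a stand-in here; this is not the Xerces-J
release discussed in the thread). The class DeclarationDemo, its helper
methods, and the element name are hypothetical, and the sample again
assumes the accented character is stored as the single byte 0xF6.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.xml.sax.SAXException;

public class DeclarationDemo {
    // Builds <name>ö</name> with ö as byte 0xF6, optionally preceded by
    // an XML declaration naming ISO-8859-1.
    public static byte[] sample(boolean withDeclaration) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        if (withDeclaration) {
            byte[] decl = "<?xml version=\"1.0\" encoding=\"ISO-8859-1\"?>"
                    .getBytes(StandardCharsets.US_ASCII);
            out.write(decl, 0, decl.length);
        }
        byte[] body = {'<', 'n', 'a', 'm', 'e', '>', (byte) 0xF6,
                       '<', '/', 'n', 'a', 'm', 'e', '>'};
        out.write(body, 0, body.length);
        return out.toByteArray();
    }

    // Returns the root element's text, or null if the parse fails.
    public static String parseText(byte[] xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml));
            return doc.getDocumentElement().getTextContent();
        } catch (SAXException | IOException e) {
            // A malformed byte sequence surfaces as one of these.
            return null;
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // No declaration: the parser falls back to UTF-8 and rejects 0xF6.
        System.out.println("without declaration: " + parseText(sample(false)));
        // With the declaration: the byte is decoded as ISO-8859-1.
        System.out.println("with declaration: " + parseText(sample(true)));
    }
}
```

The alternative fix is to leave the declaration out and re-save the file
so that its bytes really are UTF-8.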
--
Joseph Shraibman
[EMAIL PROTECTED]
Increase signal to noise ratio. http://www.targabot.com
---------------------------------------------------------------------
In case of troubles, e-mail: [EMAIL PROTECTED]
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]