RE: UTF-8 encoding errors are not always detected

DeSmet_Ringo 20 Feb 2004 15:22:49 -0000

Maybe that is because the bad character is inside a comment. I suspect the
parser skips everything up to the closing comment delimiter. What happens when
the bad character is in an attribute value, for example?
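
One way to try this is to build two otherwise identical test documents, one
with the raw 0xFC byte in a comment and one with it in an attribute value, and
feed both to the parser. The sketch below is in Python and is only an
illustration: the attribute name "note" and the helper name are invented. As a
cross-check it runs both documents through Python's bundled expat parser, which
is not Xerces and says nothing about how Xerces 1.4.2 behaves.

```python
import xml.parsers.expat

PROLOG = b'<?xml version="1.0" encoding="UTF-8"?>\n'
BAD = b"\xfc"  # a lone 0xFC byte, which is not valid UTF-8

# Variant 1: bad byte inside a comment (the reported case).
COMMENT_DOC = PROLOG + b"<Project>\n<!-- f" + BAD + b"r ONC -->\n</Project>\n"

# Variant 2: bad byte inside an attribute value ("note" is a made-up attribute).
ATTRIBUTE_DOC = PROLOG + b'<Project note="f' + BAD + b'r ONC">\n</Project>\n'


def rejected_by_expat(document: bytes) -> bool:
    """Return True if Python's expat binding refuses to parse the document."""
    parser = xml.parsers.expat.ParserCreate()
    try:
        parser.Parse(document, True)
    except xml.parsers.expat.ExpatError:
        return True
    return False


if __name__ == "__main__":
    for name, doc in (("comment", COMMENT_DOC), ("attribute", ATTRIBUTE_DOC)):
        print(name, "rejected by expat:", rejected_by_expat(doc))
```

Writing the two byte strings to files in binary mode gives inputs that can be
handed to the parser under test, so the comment and attribute cases can be
compared directly.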


Ringo

-----Original Message-----
From: Berchner Matthias ICM Berlin
[mailto:[EMAIL PROTECTED]
Sent: Friday, 20 February 2004 15:15
To: '[EMAIL PROTECTED]'
Subject: UTF-8 encoding errors are not always detected


Hi,

I'm using Xerces 1.4.2. Unfortunately, UTF-8 encoding errors are not always
detected:

Example: 

--------------------------------------------
<?xml version="1.0" encoding="UTF-8"?>
<Project>
        <!-- für ONC -->
</Project>
--------------------------------------------

<!-- für ONC --> corresponds to
        hex 3C 21 2D 2D 20 66 FC 72 20 4F 4E 43 20 2D 2D 3E

Non-UTF-8 character: ü (as a single ISO-8859-1 byte) <-> FC
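
To double-check the byte sequence above outside the parser, a strict UTF-8
decode can be run on it. The following is a minimal Python sketch, not part of
the original report; the helper name first_invalid_utf8_offset is made up for
illustration. It locates the first byte that a strict UTF-8 decoder refuses and
shows how the same bytes read when interpreted as ISO-8859-1.

```python
# Bytes of the comment line, copied from the hex dump above.
COMMENT_BYTES = bytes.fromhex("3C 21 2D 2D 20 66 FC 72 20 4F 4E 43 20 2D 2D 3E")


def first_invalid_utf8_offset(data: bytes):
    """Return the offset of the first byte that breaks strict UTF-8 decoding, or None."""
    try:
        data.decode("utf-8", errors="strict")
    except UnicodeDecodeError as err:
        return err.start
    return None


if __name__ == "__main__":
    print("first invalid offset:", first_invalid_utf8_offset(COMMENT_BYTES))
    print("as ISO-8859-1:", COMMENT_BYTES.decode("iso-8859-1"))
```

Running a check like this over a file before parsing is one way to catch
encoding errors regardless of where in the document they occur.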


Kind Regards,
Matthias 

---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]
