Hello:

So, is there an easy mechanism to identify characters that are being
transformed to '?'?

Thanks,
-RK

-----Original Message-----
From: Christopher Ebert [mailto:[EMAIL PROTECTED] 
Sent: Monday, September 22, 2003 4:39 PM
To: [EMAIL PROTECTED]
Subject: RE: invalid encoding character

        Hi,

>>Is there any way I can catch this using Xalan? (I am using the XPath
>>functionality of Xalan, so I just want to know whether I can extend any
>>Xalan feature to spot invalid encoding characters.)
        By the time the file has been parsed, its content has been converted
to Java strings, which are UTF-16 and no longer carry the encoding of the
original file. You need to address this in your original file, either by
setting the declared encoding to what it really is (if the file has
characters outside of ISO-8859-1, then it's not really encoded as ISO-8859-1)
or by writing the non-ISO characters as numeric character references
(for example, &#x20AC; for the euro sign).
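
As a rough illustration of the second option, the sketch below rewrites every
character above U+00FF as a hexadecimal numeric character reference, so the
result can be stored as ISO-8859-1 without loss. The class and method names
(NumericCharRefs, escapeNonLatin1) are made up for this example, and it
assumes the correctly decoded text is still available as a Java string. It
does not escape markup characters such as '<' or '&', and character
references are only legal in text content and attribute values, not in
element or attribute names.

```java
public class NumericCharRefs {

    /**
     * Hypothetical helper: replaces every code point above U+00FF
     * (that is, outside ISO-8859-1) with a reference of the form &#xHHHH;.
     */
    static String escapeNonLatin1(String text) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < text.length(); ) {
            // Work on code points so surrogate pairs become one reference.
            int cp = text.codePointAt(i);
            if (cp > 0xFF) {
                out.append("&#x")
                   .append(Integer.toHexString(cp).toUpperCase())
                   .append(';');
            } else {
                out.appendCodePoint(cp);
            }
            i += Character.charCount(cp);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // U+20AC is the euro sign, which ISO-8859-1 cannot represent.
        System.out.println(escapeNonLatin1("Price: 10 \u20AC"));
        // prints: Price: 10 &#x20AC;
    }
}
```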

>Alternatively, you should be able to use the getBytes("ISO-8859-1")
>method on a string in java that will throw an
>UnsupportedEncodingException and catch that.

        I don't think this does quite what you expect.
UnsupportedEncodingException is thrown only if the requested encoding
itself is unknown. Characters that a known encoding cannot represent
are silently turned into "?".

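To find those characters before they turn into "?", one option is to ask a
CharsetEncoder whether it can encode each character. The sketch below is
only an illustration: the class and method names (UnencodableFinder,
findUnencodable) are invented here, and it uses StandardCharsets and
generics, so it assumes a newer JDK than the 1.4 release that introduced
java.nio.charset.

```java
import java.nio.charset.Charset;
import java.nio.charset.CharsetEncoder;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

public class UnencodableFinder {

    /**
     * Hypothetical helper: returns the string indices at which a code point
     * starts that the given charset cannot represent.
     */
    static List<Integer> findUnencodable(String text, Charset charset) {
        CharsetEncoder encoder = charset.newEncoder();
        List<Integer> positions = new ArrayList<>();
        for (int i = 0; i < text.length(); ) {
            int cp = text.codePointAt(i);
            int len = Character.charCount(cp);
            // canEncode reports false instead of substituting '?'.
            if (!encoder.canEncode(text.subSequence(i, i + len))) {
                positions.add(i);
            }
            i += len;
        }
        return positions;
    }

    public static void main(String[] args) {
        String text = "10 \u20AC net"; // U+20AC (euro sign) sits at index 3

        System.out.println(findUnencodable(text, StandardCharsets.ISO_8859_1));
        // prints: [3]

        // For comparison: getBytes does not throw, it substitutes '?'.
        byte[] bytes = text.getBytes(StandardCharsets.ISO_8859_1);
        System.out.println(new String(bytes, StandardCharsets.ISO_8859_1));
        // prints: 10 ? net
    }
}
```

Note that this only tells you which characters of an already decoded string
do not fit the target encoding. It cannot tell you whether the original file
was decoded with the right encoding in the first place.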

        Chris

