Re: use of utf-8 with SAX

tbentley Fri, 22 Jun 2001 04:52:50 -0700

Not sure if this is accurate, but I thought some Asian languages could not be represented in UTF-8. Can someone confirm this? Is there a way to escape the problem character(s)?

Regards,
Thom Bentley
Iris Associates, 5 Technology Park Drive, Westford, MA 01886, 617-693-9210
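
For reference on the two questions above: UTF-8 can encode every Unicode scalar value (U+0000 through U+10FFFF, excluding surrogates), so Chinese, Japanese, and Korean text is representable. If a character needs to be escaped in XML, a numeric character reference such as &#x4E2D; is the standard mechanism. The following is only a minimal sketch in plain C++, not Xerces-C code; the helper names encodeUtf8 and toCharRef are hypothetical and chosen for illustration.

```cpp
#include <cstdio>
#include <string>

// Hypothetical helper: encode one Unicode scalar value as UTF-8 (1 to 4 bytes).
// Returns an empty string for surrogates and for values above U+10FFFF.
std::string encodeUtf8(char32_t cp) {
    std::string out;
    if ((cp >= 0xD800 && cp <= 0xDFFF) || cp > 0x10FFFF) {
        return out;  // not a valid scalar value
    }
    if (cp < 0x80) {
        out += static_cast<char>(cp);
    } else if (cp < 0x800) {
        out += static_cast<char>(0xC0 | (cp >> 6));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else if (cp < 0x10000) {
        // Most CJK ideographs fall in this three-byte range.
        out += static_cast<char>(0xE0 | (cp >> 12));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else {
        out += static_cast<char>(0xF0 | (cp >> 18));
        out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    }
    return out;
}

// Hypothetical helper: format a code point as an XML numeric character
// reference in hexadecimal form, e.g. U+4E2D becomes "&#x4E2D;".
std::string toCharRef(char32_t cp) {
    char buf[16];
    std::snprintf(buf, sizeof buf, "&#x%lX;", static_cast<unsigned long>(cp));
    return std::string(buf);
}
```

With these helpers, encodeUtf8(0x4E2D) yields the three bytes E4 B8 AD, and toCharRef(0x4E2D) yields the pure-ASCII string &#x4E2D;, which an XML parser expands back to the same character.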

"KELLEHER,KEVIN (Non-HP-Roseville,ex1)" <[EMAIL PROTECTED]>

06/21/2001 05:58 PM
Please respond to xerces-c-dev

To: "'[EMAIL PROTECTED]'" <[EMAIL PROTECTED]>
cc:
Subject: use of utf-8 with SAX

I am having some trouble with Asian-language data in the SAX parser. Specifically, some data that is originally in Taiwanese (roc15) is converted to utf-8 and embedded in an XML message. All the tags and attributes, etc. are in English, all the data is in Taiwanese.

The problem occurs when I use the SAX parser to validate the message: it hits a piece of data that it interprets as end-of-data, and complains that it can't find the end tag that should follow the data. I get this error in versions 1.3 and 1.5, in my own code and when I run my data through the sample programs (i.e., SAXPrint, SAX2Print, SAXCount, etc.). Several people familiar with the language have confirmed the fitness of the data.

My code is modeled after the SAXPrint example - is there anything missing there for processing Asian language data written in utf-8?

Kevin Kelleher

---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]
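
One way to narrow down a problem like the one described, offered as a diagnostic idea and not as a confirmed cause, is to check that the converted bytes are well-formed UTF-8 before handing them to the parser. The sketch below is plain C++ and does not use any Xerces-C API; the function name firstInvalidUtf8 is hypothetical.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: return the byte offset of the first ill-formed UTF-8
// sequence in `s`, or std::string::npos if the whole buffer is well-formed.
std::size_t firstInvalidUtf8(const std::string& s) {
    // Smallest code point that legitimately needs a sequence of each length;
    // anything lower is an overlong encoding.
    static const char32_t minForLen[] = {0, 0, 0x80, 0x800, 0x10000};

    const std::size_t n = s.size();
    std::size_t i = 0;
    while (i < n) {
        const unsigned char b0 = static_cast<unsigned char>(s[i]);
        std::size_t len = 0;
        char32_t cp = 0;
        if (b0 < 0x80)                { len = 1; cp = b0; }
        else if ((b0 & 0xE0) == 0xC0) { len = 2; cp = b0 & 0x1F; }
        else if ((b0 & 0xF0) == 0xE0) { len = 3; cp = b0 & 0x0F; }
        else if ((b0 & 0xF8) == 0xF0) { len = 4; cp = b0 & 0x07; }
        else return i;                // stray continuation or invalid lead byte

        if (i + len > n) return i;    // sequence truncated at end of buffer

        for (std::size_t k = 1; k < len; ++k) {
            const unsigned char b = static_cast<unsigned char>(s[i + k]);
            if ((b & 0xC0) != 0x80) return i;  // expected a continuation byte
            cp = (cp << 6) | (b & 0x3F);
        }

        if (cp < minForLen[len]) return i;            // overlong encoding
        if (cp >= 0xD800 && cp <= 0xDFFF) return i;   // surrogate code point
        if (cp > 0x10FFFF) return i;                  // beyond Unicode range
        i += len;
    }
    return std::string::npos;
}
```

If this reports an offset, the bytes at that position were most likely damaged or mis-converted before parsing. If it reports no error, the encoding declaration of the document and the transcoder in use would be the next things to look at.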

