Re: Problems with xerces-c version 1.7.0 and UTF-8

Anna Simbirtsev Fri, 19 Sep 2008 13:08:36 -0700

Thank you.
I think I can just take it out completely, since I want to keep the text in
UTF-8 and just display it to the user, not convert it to the local code page.
All I need the parser to do is parse a document that is in UTF-8, so it
should be OK.


On Fri, 2008-09-19 at 12:27 -0700, David Bertoni wrote:
> Anna Simbirtsev wrote:
> > Hi,
> > 
> > Could you give me an example of how to transcode a UTF-8 string to
> > Unicode and back? I think if I get the string in UTF-8 encoding, I
> > need to convert it to Unicode before I pass it into the Xerces
> > parser?
> UTF-8 is an encoding of Unicode, so I'm not sure I understand your 
> question.  Xerces-C uses UTF-16 internally, so you would need to 
> transcode strings from UTF-8 to UTF-16 for APIs that expect arrays of 
> UTF-16 code units, such as DOMDocument::createElement(const XMLCh* 
> tagName). You can, however, parse UTF-8 documents without transcoding them.
> 
> There was a thread last week that discussed some of the issues with 
> local code page transcoding and you will find a link to an earlier 
> thread that has some transcoding code snippets.
> 
> Dave
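
To make the UTF-8 versus UTF-16 distinction in the reply above concrete, here
is a minimal sketch of what "transcoding from UTF-8 to UTF-16 code units"
involves. It uses only the C++ standard library and is not the Xerces-C
transcoder; the function name utf8ToUtf16 is hypothetical, and the sketch
assumes the target is a sequence of 16-bit UTF-16 code units like the XMLCh
arrays the reply mentions.

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <string>

// Hypothetical helper (not part of Xerces-C): decode a UTF-8 byte string
// into UTF-16 code units. Throws std::invalid_argument on malformed input.
std::u16string utf8ToUtf16(const std::string& in) {
    std::u16string out;
    std::size_t i = 0;
    while (i < in.size()) {
        const unsigned char b0 = static_cast<unsigned char>(in[i]);
        std::uint32_t cp = 0;      // decoded Unicode code point
        std::size_t extra = 0;     // number of continuation bytes expected
        std::uint32_t minCp = 0;   // smallest value allowed for this length
        if (b0 < 0x80)                { cp = b0;        extra = 0; minCp = 0; }
        else if ((b0 & 0xE0) == 0xC0) { cp = b0 & 0x1F; extra = 1; minCp = 0x80; }
        else if ((b0 & 0xF0) == 0xE0) { cp = b0 & 0x0F; extra = 2; minCp = 0x800; }
        else if ((b0 & 0xF8) == 0xF0) { cp = b0 & 0x07; extra = 3; minCp = 0x10000; }
        else throw std::invalid_argument("invalid UTF-8 lead byte");

        if (i + extra >= in.size())
            throw std::invalid_argument("truncated UTF-8 sequence");

        // Each continuation byte has the form 10xxxxxx and adds 6 bits.
        for (std::size_t k = 1; k <= extra; ++k) {
            const unsigned char b = static_cast<unsigned char>(in[i + k]);
            if ((b & 0xC0) != 0x80)
                throw std::invalid_argument("invalid UTF-8 continuation byte");
            cp = (cp << 6) | (b & 0x3F);
        }
        i += extra + 1;

        // Reject overlong forms, surrogate code points, and out-of-range values.
        if (cp < minCp || cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
            throw std::invalid_argument("invalid code point in UTF-8 input");

        if (cp < 0x10000) {
            // Basic Multilingual Plane: one UTF-16 code unit.
            out.push_back(static_cast<char16_t>(cp));
        } else {
            // Supplementary planes: encode as a surrogate pair.
            cp -= 0x10000;
            out.push_back(static_cast<char16_t>(0xD800 + (cp >> 10)));
            out.push_back(static_cast<char16_t>(0xDC00 + (cp & 0x3FF)));
        }
    }
    return out;
}
```

A call such as utf8ToUtf16("abc") yields three code units, while a four-byte
UTF-8 sequence yields a surrogate pair. As the reply notes, none of this is
needed just to parse a UTF-8 document; it only matters when passing strings to
APIs that take UTF-16 code units directly.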
