Thank you. I think I can take the transcoding step out completely, since I want to keep the text in UTF-8 and just display it to the user, not convert it to the local code page. All I need the parser to do is parse a document that is in UTF-8, so it should be OK.
On Fri, 2008-09-19 at 12:27 -0700, David Bertoni wrote:
> Anna Simbirtsev wrote:
> > Hi,
> >
> > Do you know if you can give me an example of how to transcode utf-8
> > string to unicode and back? I think if I get the string in utf-8
> > encoding, I need to convert it to unicode before I pass it into xerces
> > parser?
> UTF-8 is an encoding of Unicode, so I'm not sure I understand your
> question. Xerces-C uses UTF-16 internally, so you would need to
> transcode strings from UTF-8 to UTF-16 for APIs that expect arrays of
> UTF-16 code units, such as DOMDocument::createElement(const XMLCh*
> tagName). You can, however, parse UTF-8 documents without transcoding them.
>
> There was a thread last week that discussed some of the issues with
> local code page transcoding and you will find a link to an earlier
> thread that has some transcoding code snippets.
>
> Dave
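
For reference, here is a rough sketch of what transcoding between UTF-8 and UTF-16 code units involves, as discussed above. It is plain standard C++ (C++11) and does not use any Xerces-C API; the function names utf8ToUtf16 and utf16ToUtf8 are made up for illustration, and std::u16string merely stands in for an array of UTF-16 code units. In real code the transcoders that ship with Xerces-C would normally be the better choice.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <string>

// Hypothetical helper (not part of Xerces-C): decode UTF-8 bytes into
// UTF-16 code units. Throws std::invalid_argument on malformed input.
std::u16string utf8ToUtf16(const std::string& in) {
    std::u16string out;
    std::size_t i = 0;
    while (i < in.size()) {
        const unsigned char b0 = static_cast<unsigned char>(in[i]);
        std::uint32_t cp = 0;
        std::size_t extra = 0;  // number of continuation bytes expected
        if (b0 < 0x80)                { cp = b0;        extra = 0; }
        else if ((b0 & 0xE0) == 0xC0) { cp = b0 & 0x1F; extra = 1; }
        else if ((b0 & 0xF0) == 0xE0) { cp = b0 & 0x0F; extra = 2; }
        else if ((b0 & 0xF8) == 0xF0) { cp = b0 & 0x07; extra = 3; }
        else throw std::invalid_argument("invalid UTF-8 lead byte");

        if (extra > in.size() - i - 1)
            throw std::invalid_argument("truncated UTF-8 sequence");
        for (std::size_t k = 1; k <= extra; ++k) {
            const unsigned char b = static_cast<unsigned char>(in[i + k]);
            if ((b & 0xC0) != 0x80)
                throw std::invalid_argument("invalid UTF-8 continuation byte");
            cp = (cp << 6) | (b & 0x3F);
        }

        // Reject overlong forms, surrogate code points, and values > U+10FFFF.
        static const std::uint32_t minForLen[] = {0, 0x80, 0x800, 0x10000};
        if (cp < minForLen[extra] || cp > 0x10FFFF ||
            (cp >= 0xD800 && cp <= 0xDFFF))
            throw std::invalid_argument("invalid code point in UTF-8 input");
        i += extra + 1;

        if (cp < 0x10000) {
            out.push_back(static_cast<char16_t>(cp));
        } else {
            // Code points above the BMP become a surrogate pair.
            cp -= 0x10000;
            out.push_back(static_cast<char16_t>(0xD800 + (cp >> 10)));
            out.push_back(static_cast<char16_t>(0xDC00 + (cp & 0x3FF)));
        }
    }
    return out;
}

// Hypothetical helper (not part of Xerces-C): encode UTF-16 code units
// as UTF-8 bytes. Throws std::invalid_argument on unpaired surrogates.
std::string utf16ToUtf8(const std::u16string& in) {
    std::string out;
    for (std::size_t i = 0; i < in.size(); ++i) {
        std::uint32_t cp = in[i];
        if (cp >= 0xD800 && cp <= 0xDBFF) {
            // High surrogate: must be followed by a low surrogate.
            if (i + 1 >= in.size() || in[i + 1] < 0xDC00 || in[i + 1] > 0xDFFF)
                throw std::invalid_argument("unpaired high surrogate");
            cp = 0x10000 + ((cp - 0xD800) << 10) + (in[i + 1] - 0xDC00);
            ++i;
        } else if (cp >= 0xDC00 && cp <= 0xDFFF) {
            throw std::invalid_argument("unpaired low surrogate");
        }
        assert(cp <= 0x10FFFF);  // guaranteed by the surrogate arithmetic

        if (cp < 0x80) {
            out.push_back(static_cast<char>(cp));
        } else if (cp < 0x800) {
            out.push_back(static_cast<char>(0xC0 | (cp >> 6)));
            out.push_back(static_cast<char>(0x80 | (cp & 0x3F)));
        } else if (cp < 0x10000) {
            out.push_back(static_cast<char>(0xE0 | (cp >> 12)));
            out.push_back(static_cast<char>(0x80 | ((cp >> 6) & 0x3F)));
            out.push_back(static_cast<char>(0x80 | (cp & 0x3F)));
        } else {
            out.push_back(static_cast<char>(0xF0 | (cp >> 18)));
            out.push_back(static_cast<char>(0x80 | ((cp >> 12) & 0x3F)));
            out.push_back(static_cast<char>(0x80 | ((cp >> 6) & 0x3F)));
            out.push_back(static_cast<char>(0x80 | (cp & 0x3F)));
        }
    }
    return out;
}
```

A string would only need to go through something like utf8ToUtf16 before being handed to an API that expects UTF-16 code units, and through utf16ToUtf8 on the way back out. Parsing a UTF-8 document needs neither step.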
