Also, do I need to use std::wstring to store UTF-8 strings, or will I be
OK with std::string?

Thank you
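
On the std::string question: std::string is a container of bytes, so it
can hold UTF-8 data as-is; size() then counts bytes rather than
characters. std::wstring holds wchar_t units and is not needed for
UTF-8. The 0xffffffd0 values in the hex output quoted below look like
single bytes (0xd0, 0xb1) that were sign-extended because char is signed
on that platform. The byte pair d0 b1 would be the UTF-8 encoding of
U+0431, assuming the data really is UTF-8. A minimal sketch of printing
the bytes without sign extension follows; hex_bytes is a hypothetical
helper name, not part of Xerces or domtools.

```cpp
#include <cstddef>
#include <cstdio>
#include <string>

// Hypothetical helper: format every byte of s as two lowercase hex digits,
// separated by single spaces.
std::string hex_bytes(const std::string& s)
{
    std::string out;
    char buf[4];
    for (std::size_t i = 0; i < s.size(); ++i) {
        // Go through unsigned char so that a byte such as 0xd0 is not
        // sign-extended to 0xffffffd0 when char is a signed type.
        unsigned value = static_cast<unsigned char>(s[i]);
        std::snprintf(buf, sizeof buf, "%02x", value);
        if (!out.empty())
            out += ' ';
        out += buf;
    }
    return out;
}
```

Calling hex_bytes on the string in question should show whether it
consists of valid UTF-8 byte sequences before any transcoding is
attempted.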

On Fri, 2008-09-19 at 09:40 -0400, Anna Simbirtsev wrote:
> Hi,
> 
> Could you give me an example of how to transcode a UTF-8 string to
> Unicode and back? I think that if I get the string in UTF-8 encoding, I
> need to convert it to Unicode before I pass it to the Xerces parser.
> Is that right?
> 
> On Wed, 2008-09-17 at 09:58 -0700, David Bertoni wrote:
> > Anna Simbirtsev wrote:
> > > When I print it in hex format, I get
> > > �: 0xffffffd0
> > > �: 0xffffffb1
> > > �: 0xffffffd0
> > > �: 0xffffffb1
> > > �: 0xffffffd0
> > > �: 0xffffffb1
> > > 
> > > I am not even sure what format this is, but maybe my shell does
> > > not know what it is.
> > You need to understand the limitations of any library you use.  Here is 
> > a snippet of the source code from the domtools library you're using:
> > 
> > string domtools::toString(const DOMString s)
> > {
> >     char * t = s.transcode();
> >     if (!t) return "";
> >     string tmp = t;
> >     delete [] t;
> >     return tmp;
> > }
> > 
> > You can see the call to DOMString::transcode().  This will fail when 
> > characters in the DOMString are not representable in the local code 
> > page.  This is likely what's happening, and I suggest you find another 
> > library to use, because this one is broken.
> > 
> > Alternatively, if you always want to transcode data to UTF-8, you can 
> > modify the library to use a UTF-8 transcoder.  There was another thread 
> > late last week and this week on this topic.
> > 
> > Dave
> 

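On the request for a transcoding example: Xerces-C++ stores text as
XMLCh, which holds UTF-16 code units, so "converting to Unicode" here
means going from UTF-8 bytes to UTF-16 and back. As far as I know, the
parser decodes the bytes of a document itself according to the declared
encoding, so manual conversion mainly matters for strings passed to or
read from the DOM API. The following is a minimal, library-independent
sketch of the two conversions, assuming a C++11 compiler. The names
utf8_to_utf16 and utf16_to_utf8 are hypothetical and are not Xerces
functions, and the decoder does not reject overlong encodings. In real
code a UTF-8 transcoder supplied by the library is the better choice.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>

// Decode UTF-8 bytes into UTF-16 code units (hypothetical helper).
std::u16string utf8_to_utf16(const std::string& in)
{
    std::u16string out;
    std::size_t i = 0;
    while (i < in.size()) {
        unsigned char b0 = static_cast<unsigned char>(in[i]);
        char32_t cp = 0;
        std::size_t extra = 0;  // number of continuation bytes expected
        if (b0 < 0x80)                { cp = b0;        extra = 0; }
        else if ((b0 & 0xE0) == 0xC0) { cp = b0 & 0x1F; extra = 1; }
        else if ((b0 & 0xF0) == 0xE0) { cp = b0 & 0x0F; extra = 2; }
        else if ((b0 & 0xF8) == 0xF0) { cp = b0 & 0x07; extra = 3; }
        else throw std::runtime_error("invalid UTF-8 lead byte");
        if (extra > in.size() - i - 1)
            throw std::runtime_error("truncated UTF-8 sequence");
        for (std::size_t k = 1; k <= extra; ++k) {
            unsigned char b = static_cast<unsigned char>(in[i + k]);
            if ((b & 0xC0) != 0x80)
                throw std::runtime_error("invalid UTF-8 continuation byte");
            cp = (cp << 6) | (b & 0x3F);
        }
        i += extra + 1;
        if (cp > 0x10FFFF)
            throw std::runtime_error("code point out of range");
        if (cp < 0x10000) {
            out.push_back(static_cast<char16_t>(cp));
        } else {
            // Code points above U+FFFF become a surrogate pair.
            cp -= 0x10000;
            out.push_back(static_cast<char16_t>(0xD800 + (cp >> 10)));
            out.push_back(static_cast<char16_t>(0xDC00 + (cp & 0x3FF)));
        }
    }
    return out;
}

// Encode UTF-16 code units as UTF-8 bytes (hypothetical helper).
std::string utf16_to_utf8(const std::u16string& in)
{
    std::string out;
    for (std::size_t i = 0; i < in.size(); ++i) {
        char32_t cp = in[i];
        if (cp >= 0xD800 && cp <= 0xDBFF) {
            // High surrogate: must be followed by a low surrogate.
            if (i + 1 >= in.size())
                throw std::runtime_error("unpaired high surrogate");
            char32_t lo = in[i + 1];
            if (lo < 0xDC00 || lo > 0xDFFF)
                throw std::runtime_error("unpaired high surrogate");
            cp = 0x10000 + ((cp - 0xD800) << 10) + (lo - 0xDC00);
            ++i;
        } else if (cp >= 0xDC00 && cp <= 0xDFFF) {
            throw std::runtime_error("unpaired low surrogate");
        }
        if (cp < 0x80) {
            out.push_back(static_cast<char>(cp));
        } else if (cp < 0x800) {
            out.push_back(static_cast<char>(0xC0 | (cp >> 6)));
            out.push_back(static_cast<char>(0x80 | (cp & 0x3F)));
        } else if (cp < 0x10000) {
            out.push_back(static_cast<char>(0xE0 | (cp >> 12)));
            out.push_back(static_cast<char>(0x80 | ((cp >> 6) & 0x3F)));
            out.push_back(static_cast<char>(0x80 | (cp & 0x3F)));
        } else {
            out.push_back(static_cast<char>(0xF0 | (cp >> 18)));
            out.push_back(static_cast<char>(0x80 | ((cp >> 12) & 0x3F)));
            out.push_back(static_cast<char>(0x80 | ((cp >> 6) & 0x3F)));
            out.push_back(static_cast<char>(0x80 | (cp & 0x3F)));
        }
    }
    return out;
}
```

With these helpers, the two bytes d0 b1 decode to the single UTF-16
unit 0x0431, and encoding that unit again gives back the same two bytes.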