Re: Problems with xerces-c version 1.7.0 and UTF-8

David Bertoni Tue, 16 Sep 2008 14:53:56 -0700

Anna Simbirtsev wrote:

I pass just a plain XML string to the DOMParser, so I don't use the
transcode function.


   const void * const buffer = str.c_str();

   ::DOMParser parser;
   parser.setDoNamespaces(true);
   parser.setToCreateXMLDeclTypeNode(false);
   MemBufInputSource* memBufIS = new MemBufInputSource
     (
      (const XMLByte*)buffer
      , length
      , "domtools"
      , false
      );

   try {
      parser.parse(*memBufIS);
      DOM_Document doc = parser.getDocument();
      delete memBufIS;
      if (!doc.isNull()) return new XercesNode(doc);
   } catch(...) {
      delete memBufIS;
   }
   return new XercesNode();

When I had no ICU, it was returning an empty string instead of the UTF-8
string. I just copy UTF-8 strings from wikipedia.org and paste them right
into the code to test. After I compiled the parser with ICU, it returns
the string, but shorter. My XML has the UTF-8 encoding set:
<?xml version='1.0' encoding='UTF-8'?>.

You just posted the exact reply to this list that you posted to Jesse on the developer list, but you've not included the necessary information so someone can help you.

There is nothing in the code snippet that you posted where you access any data in the document, so I don't understand how you can tell any strings are truncated.


Dave
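
Since the posted snippet never reads data back from the document, one way to establish whether a string is really truncated is to compare lengths in the same unit. The sketch below is not Xerces code and is not taken from the thread: the helper names utf8CodePoints and utf16CodeUnits are hypothetical, and the input is assumed to be well-formed UTF-8. It rests on the assumption that the parser returns strings as UTF-16 code units (the XMLCh representation), in which case non-ASCII text would report a smaller length than the byte count of the UTF-8 source even though nothing was lost.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: count Unicode code points in a UTF-8 byte string
// by counting every byte that is not a continuation byte (10xxxxxx).
// Assumes the input is well-formed UTF-8.
std::size_t utf8CodePoints(const std::string& s) {
    std::size_t n = 0;
    for (unsigned char c : s) {
        if ((c & 0xC0) != 0x80) ++n;
    }
    return n;
}

// Hypothetical helper: count the UTF-16 code units the same text would
// occupy. A 4-byte UTF-8 sequence (lead byte 11110xxx) encodes a code
// point above U+FFFF, which needs a surrogate pair (2 units) in UTF-16.
// Assumes the input is well-formed UTF-8.
std::size_t utf16CodeUnits(const std::string& s) {
    std::size_t n = 0;
    for (unsigned char c : s) {
        if ((c & 0xC0) == 0x80) continue;   // skip continuation bytes
        n += ((c & 0xF8) == 0xF0) ? 2 : 1;  // surrogate pair or single unit
    }
    return n;
}
```

If the length reported for a string read from the document matches utf16CodeUnits(str) rather than str.size(), the text is probably intact and only the unit of measurement differs. A smaller value than that would point to real truncation.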
