Hi Ganesh,

If you want to use the Xerces-C abstractions to do the transcoding for you, 
take a look at the XMLTransService class, declared in 
xercesc/util/TransService.hpp. The API is a little obtuse, so here's a code 
snippet showing how to use it:

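    // fString, fCharsWritten (initially 0) and fMemoryManager are members of
    // the class this was lifted from; declare your own equivalents.
    const char *data = "..."; // in UTF-8
    size_t length = strlen(data);
    const XMLByte *in = (const XMLByte*)data;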

    XMLTransService::Codes failReason;
    const XMLSize_t blockSize = 2048;

    XMLTranscoder* trans =
        XMLPlatformUtils::fgTransService->makeNewTranscoderFor("utf-8", failReason,
                                                               blockSize, fMemoryManager);
    Janitor<XMLTranscoder> janTrans(trans);

    XMLSize_t allocSize = length + 1;
    fString = (XMLCh*)fMemoryManager->allocate(allocSize * sizeof(XMLCh));

    XMLSize_t csSize = length;
    ArrayJanitor<unsigned char> charSizes(
        (unsigned char*)fMemoryManager->allocate(csSize * sizeof(unsigned char)),
        fMemoryManager);

    XMLSize_t bytesRead = 0;
    XMLSize_t bytesDone = 0;

    while(true) {
        fCharsWritten += trans->transcodeFrom(in + bytesDone, length - bytesDone,
                                              fString + fCharsWritten,
                                              allocSize - fCharsWritten,
                                              bytesRead, charSizes.get());
        // no progress means bad or truncated input; bail out rather than spin
        if(bytesRead == 0) break;
        bytesDone += bytesRead;
        if(bytesDone == length) break;

        allocSize *= 2;
        XMLCh *newBuf = (XMLCh*)fMemoryManager->allocate(allocSize * sizeof(XMLCh));
        // memcpy counts bytes, not XMLCh units
        memcpy(newBuf, fString, fCharsWritten * sizeof(XMLCh));
        fMemoryManager->deallocate(fString);
        fString = newBuf;

        if((allocSize - fCharsWritten) > csSize) {
            csSize = allocSize - fCharsWritten;
            charSizes.reset(
                (unsigned char*)fMemoryManager->allocate(csSize * sizeof(unsigned char)),
                fMemoryManager);
        }
    }

    // null terminate
    if(fCharsWritten == allocSize) {
        allocSize += 1;
        XMLCh *newBuf = (XMLCh*)fMemoryManager->allocate(allocSize * sizeof(XMLCh));
        // memcpy counts bytes, not XMLCh units
        memcpy(newBuf, fString, fCharsWritten * sizeof(XMLCh));
        fMemoryManager->deallocate(fString);
        fString = newBuf;
    }
    fString[fCharsWritten] = 0;

If you were using a more recent version of Xerces-C, you could use the 
TranscodeToStr and TranscodeFromStr classes, which do the transcoding and 
also manage the memory and lifetime of the resulting string. The snippet 
above is basically cut and pasted from TranscodeFromStr.
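
In case it helps to see what the transcoder is doing underneath, and why an 
initial buffer of length + 1 XMLCh is normally enough for UTF-8 input, here 
is a rough library-free sketch. utf8ToUtf16 is a hypothetical helper of my 
own naming, not a Xerces-C function, and it assumes XMLCh holds UTF-16 code 
units. It only does basic checks (it does not reject overlong forms or 
encoded surrogates), so treat it as an illustration rather than a replacement 
for the real transcoder.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <string>

// Hypothetical helper (not part of Xerces-C): decode UTF-8 bytes into
// UTF-16 code units.
std::u16string utf8ToUtf16(const std::string& in)
{
    std::u16string out;
    // Every UTF-8 sequence yields no more UTF-16 units than it has bytes
    // (1-3 bytes -> 1 unit, 4 bytes -> 2 units), so this never regrows.
    out.reserve(in.size());

    std::size_t i = 0;
    while (i < in.size()) {
        const unsigned char lead = static_cast<unsigned char>(in[i]);
        std::uint32_t cp = 0;   // code point being assembled
        std::size_t len = 0;    // total bytes in this sequence

        if (lead < 0x80)                { cp = lead;        len = 1; }
        else if ((lead & 0xE0) == 0xC0) { cp = lead & 0x1F; len = 2; }
        else if ((lead & 0xF0) == 0xE0) { cp = lead & 0x0F; len = 3; }
        else if ((lead & 0xF8) == 0xF0) { cp = lead & 0x07; len = 4; }
        else throw std::runtime_error("invalid UTF-8 lead byte");

        if (i + len > in.size())
            throw std::runtime_error("truncated UTF-8 sequence");

        // Fold in the low six bits of each continuation byte.
        for (std::size_t k = 1; k < len; ++k) {
            const unsigned char cont = static_cast<unsigned char>(in[i + k]);
            if ((cont & 0xC0) != 0x80)
                throw std::runtime_error("invalid UTF-8 continuation byte");
            cp = (cp << 6) | (cont & 0x3F);
        }
        i += len;

        if (cp > 0x10FFFF)
            throw std::runtime_error("code point out of range");

        if (cp >= 0x10000) {
            // Outside the BMP: emit a surrogate pair.
            cp -= 0x10000;
            out.push_back(static_cast<char16_t>(0xD800 + (cp >> 10)));
            out.push_back(static_cast<char16_t>(0xDC00 + (cp & 0x3FF)));
        } else {
            out.push_back(static_cast<char16_t>(cp));
        }
    }
    return out;
}
```

Calling utf8ToUtf16("\xE6\x97\xA5") gives the single unit 0x65E5, and a 
four-byte sequence such as "\xF0\x9F\x98\x80" gives two units, so the output 
is never longer than the input.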

John

On 29 Nov 2010, at 13:45, Ganesh Pagade wrote:

> Hi,
> 
> I have a C++ string which is in UTF-8, which I am passing to
> XMLString::transcode(). However XMLString::transcode() is expecting SJIS
> (Japanese). So it returns garbage characters in XMLCh*.
> 
> So I tried using XMLUTF8Transcoder. However XMLUTF8Transcoder's transcodeTo()
> expects XMLCh*.
> 
> How do I convert C++ string into XMLCh* without using the
> XMLString::transcode(),
> so that I can pass it to transcodeTo()?
> 
> Or how do I make XMLString::transcode() expect UTF-8?
> 
> Any suggestions/pointers would be highly appreciated.
> 
> Thanks,
> -Ganesh
