On 3/25/2010 12:03 AM, Swatilekha Doloi wrote:
Hi,
Sorry for the delay in responding. My usage of the word 'non-printable' is
probably incorrect. It displays something that looks like this:
æc¾Òw×s%#s1#ÔwÔi�...@õoqs#õ”3
Using XMLString::transcode before giving the buffer to the UTF-8 Transcoder
helped. This is my code now:
OK, this is a very bad idea if your data is not in the machine's local
code page. You need to provide more information about the encoding of
the data in szCSTABuffer.
/*******************************************************************/
if(szCSTABuffer)
{
/** Transcode the CSTA Buffer into XMLCh* */
xmlInput = XMLString::transcode(szCSTABuffer);
uiInLength = XMLString::stringLen(xmlInput);
uiOutLength = uiInLength * UTF8_BYTES_PER_CHARACTER;
//UTF8_BYTES_PER_CHARACTER is set to 4
/** Allocate memory for the output of the transcode operation*/
xmlTranscodedOutput = new XMLByte[uiOutLength + 1];
if(xmlTranscodedOutput)
{
/** Transcode */
// m_pUTF8Transcoder is of type XMLTranscoder*
uiTotalChars = m_pUTF8Transcoder->transcodeTo(
(const XMLCh* const)xmlInput,
This cast is not necessary.
uiInLength,
xmlTranscodedOutput,
uiOutLength,
uiCharsTranscoded,
XMLTranscoder::UnRep_RepChar);
xmlTranscodedOutput[uiTotalChars] = '\0';
XMLString::release(&xmlInput);
}
}
Variables are defined as follows:
char* szCSTABuffer = NULL;
XMLCh* xmlInput = NULL;
XMLByte* xmlTranscodedOutput = NULL;
unsigned int uiInLength = 0;
unsigned int uiOutLength = 0;
unsigned int uiCharsTranscoded = 0;
unsigned int uiTotalChars = 0;
/*******************************************************************/
I was wondering, is there a way to optimise this?
Without more information, it's hard to say what you should be doing.
However, my guess is you're trying to transcode from a single-byte or
variable-byte encoding to UTF-8.
The proper way to do this in Xerces-C is by pivoting through UTF-16.
You're almost there, but you're transcoding to UTF-16 through
XMLString::transcode(), which is only correct if your szCSTABuffer is
encoded in the local code page. You may need to create an explicit
transcoder for the encoding in szCSTABuffer, use that transcoder to get
to UTF-16, then use a UTF-8 transcoder to get from UTF-16 to UTF-8.
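
To make the pivot idea concrete, here is a minimal sketch that uses only the
standard C++ library rather than the Xerces-C transcoder classes. It assumes,
purely for illustration, that the source buffer is ISO-8859-1; the actual
encoding of szCSTABuffer is not known from this thread. The function names
(latin1ToUtf16, utf16ToUtf8, latin1ToUtf8) are hypothetical and not part of
any library.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// Step 1: decode the source bytes into UTF-16 (the pivot encoding).
// ASSUMPTION: the input is ISO-8859-1, where each byte value is the code point.
// A different source encoding needs a different decoder at this step.
std::u16string latin1ToUtf16(const std::string& in) {
    std::u16string out;
    out.reserve(in.size());
    for (unsigned char c : in) {
        out.push_back(static_cast<char16_t>(c));
    }
    return out;
}

// Step 2: encode the UTF-16 pivot as UTF-8, combining surrogate pairs.
std::string utf16ToUtf8(const std::u16string& in) {
    std::string out;
    for (std::size_t i = 0; i < in.size(); ++i) {
        std::uint32_t cp = in[i];
        if (cp >= 0xD800 && cp <= 0xDBFF && i + 1 < in.size() &&
            in[i + 1] >= 0xDC00 && in[i + 1] <= 0xDFFF) {
            // Valid surrogate pair: combine into one supplementary code point.
            cp = 0x10000 + ((cp - 0xD800) << 10) + (in[i + 1] - 0xDC00);
            ++i;
        } else if (cp >= 0xD800 && cp <= 0xDFFF) {
            // Unpaired surrogate: substitute U+FFFD, the replacement character.
            cp = 0xFFFD;
        }
        if (cp < 0x80) {
            out += static_cast<char>(cp);
        } else if (cp < 0x800) {
            out += static_cast<char>(0xC0 | (cp >> 6));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        } else if (cp < 0x10000) {
            out += static_cast<char>(0xE0 | (cp >> 12));
            out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        } else {
            out += static_cast<char>(0xF0 | (cp >> 18));
            out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
            out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        }
    }
    return out;
}

// The full pivot: source encoding -> UTF-16 -> UTF-8.
std::string latin1ToUtf8(const std::string& in) {
    return utf16ToUtf8(latin1ToUtf16(in));
}
```

In Xerces-C the two steps would instead go through one transcoder created
for the source encoding and a second one for UTF-8, but the shape is the
same: only the first step depends on what szCSTABuffer actually contains.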
Dave