Hi,
Sorry for the delay in responding. My use of the word 'non-printable' was
probably incorrect. The output looks something like this:
æc¾Òw×s%#s1#ÔwÔi�...@õoqs#õ”3
Calling XMLString::transcode to convert the buffer to XMLCh* before passing it
to the UTF-8 transcoder helped. This is my code now:
/*******************************************************************/
if(szCSTABuffer)
{
    /** Transcode the CSTA Buffer into XMLCh* */
    xmlInput = XMLString::transcode(szCSTABuffer);
    uiInLength = XMLString::stringLen(xmlInput);
    uiOutLength = uiInLength * UTF8_BYTES_PER_CHARACTER;
    //UTF8_BYTES_PER_CHARACTER is set to 4
    /** Allocate memory for the output of the transcode operation*/
    xmlTranscodedOutput = new XMLByte[uiOutLength + 1];
    if(xmlTranscodedOutput)
    {
        /** Transcode */
        // m_pUTF8Transcoder is of type XMLTranscoder*
        uiTotalChars = m_pUTF8Transcoder->transcodeTo(
            (const XMLCh* const)xmlInput,
            uiInLength,
            xmlTranscodedOutput,
            uiOutLength,
            uiCharsTranscoded,
            XMLTranscoder::UnRep_RepChar);
        xmlTranscodedOutput[uiTotalChars] = '\0';
        XMLString::release(&xmlInput);
    }
}
Variables are defined as follows:
char* szCSTABuffer = NULL;
XMLCh* xmlInput = NULL;
XMLByte* xmlTranscodedOutput = NULL;
unsigned int uiInLength = 0;
unsigned int uiOutLength = 0;
unsigned int uiCharsTranscoded = 0;
unsigned int uiTotalChars = 0;
/*******************************************************************/
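On the output buffer size used above: XMLCh holds UTF-16 code units, and a single 16-bit unit never needs more than 3 bytes in UTF-8 (a surrogate pair is 2 units and 4 bytes), so a factor of 4 looks safe and slightly generous. The following is a minimal standalone sketch of that arithmetic, not Xerces code; utf16ToUtf8 is a hypothetical helper written only for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical helper (not part of Xerces): encode UTF-16 code units as UTF-8.
std::string utf16ToUtf8(const std::vector<std::uint16_t>& in)
{
    std::string out;
    out.reserve(in.size() * 3);  // worst case: 3 bytes per 16-bit unit
    for (std::size_t i = 0; i < in.size(); ++i) {
        std::uint32_t cp = in[i];
        if (cp >= 0xD800 && cp <= 0xDBFF && i + 1 < in.size()
            && in[i + 1] >= 0xDC00 && in[i + 1] <= 0xDFFF) {
            // Valid surrogate pair: combine two units into one code point.
            cp = 0x10000 + ((cp - 0xD800) << 10) + (in[i + 1] - 0xDC00);
            ++i;
        } else if (cp >= 0xD800 && cp <= 0xDFFF) {
            cp = 0xFFFD;  // unpaired surrogate: substitute U+FFFD
        }
        if (cp < 0x80) {
            out += static_cast<char>(cp);
        } else if (cp < 0x800) {
            out += static_cast<char>(0xC0 | (cp >> 6));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        } else if (cp < 0x10000) {
            out += static_cast<char>(0xE0 | (cp >> 12));
            out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        } else {
            out += static_cast<char>(0xF0 | (cp >> 18));
            out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
            out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        }
    }
    // The bound the sizing argument relies on.
    assert(out.size() <= in.size() * 3);
    return out;
}
```

If that reasoning holds for the transcoder in use, uiInLength * 3 + 1 bytes would be enough for the output buffer; keeping the factor at 4 only costs some extra memory.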
I was wondering, is there a way to optimise this?
Regards,
Swati
-----Original Message-----
From: John Lilley
Sent: Tuesday, March 23, 2010 7:40 PM
To: [email protected]
Subject: RE: transcodeTo results in non-printable chars
What do you mean by "non-printable"? Not every program or console can display
all characters, and not every program will interpret UTF-8, and not every
system has the fonts installed to display all of the code points. I've found
that if you write UTF-8 to a text document and open it with MS Office Word
version 2003 or later it tends to display properly.
john
-----Original Message-----
From: Swatilekha Doloi
Sent: Tuesday, March 23, 2010 7:43 AM
To: [email protected]
Subject: transcodeTo results in non-printable chars
Hi,
I'm trying to convert non-UTF-8 data to UTF-8. I'm creating a transcoder
to do this. The output of transcodeTo is a bunch of non-printable
characters.
I'm using Xerces 2.8 on Visual Studio 2005.
Any help will be much appreciated!
//--------------------------------------------------------------//
szBuffer = "Blah here!";
sizeBuffer = strlen(szBuffer);
XMLTransService::Codes failReason;
XMLRecognizer::Encodings encodingEnum =
XMLRecognizer::encodingForName(XMLUni::fgUTF8EncodingString);
XMLTranscoder* utf8Transcoder =
XMLPlatformUtils::fgTransService->makeNewTranscoderFor(
encodingEnum,
failReason,
16*1024);
int InLength = XMLString::stringLen(szBuffer);
int OutLength = InLength * 7;
XMLByte* res = new XMLByte[OutLength]; // output string
// Source string is in Unicode, want to transcode to UTF-8
unsigned int charsEaten = 0;
unsigned int total_chars = 0;
total_chars = utf8Transcoder->transcodeTo((const XMLCh*)szBuffer,
InLength,
res,
OutLength - 1,
charsEaten,
XMLTranscoder::UnRep_Throw);
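
A note on why the (const XMLCh*) cast in the call above is the likely cause of the garbage: the cast does not convert anything, it only makes every two bytes of the char buffer get read as one 16-bit unit. With InLength set to 10, the call also reads 20 bytes from an 11-byte string literal, which runs past its end. The standalone sketch below illustrates the difference without using Xerces; reinterpretAsUtf16 and widenAscii are hypothetical names, and widenAscii only stands in for a real conversion such as XMLString::transcode on plain ASCII input.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// What the cast effectively does: each pair of bytes in the narrow string
// is read as a single 16-bit code unit (byte order depends on the platform).
std::vector<std::uint16_t> reinterpretAsUtf16(const char* s)
{
    std::size_t bytes = std::strlen(s);
    std::vector<std::uint16_t> units(bytes / 2);
    if (!units.empty())
        std::memcpy(units.data(), s, units.size() * sizeof(std::uint16_t));
    return units;
}

// What a real conversion does for plain ASCII input: one unit per character.
std::vector<std::uint16_t> widenAscii(const char* s)
{
    std::vector<std::uint16_t> units;
    for (; *s; ++s) {
        assert(static_cast<unsigned char>(*s) < 0x80);  // ASCII only
        units.push_back(static_cast<unsigned char>(*s));
    }
    return units;
}
```

For "Blah here!", the reinterpreted view has 5 units whose values are unrelated to the original text, while the widened view has 10 units starting with 0x42 ('B'). Encoding the former as UTF-8 would give valid but meaningless multi-byte sequences.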