Hi,

Apologies for reposting this. I'm hoping that this time it is readable.

I'm trying to transcode some data into UTF-8. The data I receive is not in UTF-16 format, so I can't use it as the input to the transcodeTo() function.
Please help me!

>-----Original Message-----
>From: David Bertoni [mailto:[email protected]]
>
>On 3/25/2010 12:03 AM, Swatilekha Doloi wrote:
>> Hi,
>>
>> Sorry for the delay in responding. My usage of the word
>> 'non-printable' is probably incorrect. It displays something that
>> looks like this: æc¾Òw×s %# S1# ÔwÔi�...@õ OQ S # õ”3
>>
>> Using XMLString::transcode before giving the buffer to the UTF-8
>> Transcoder helped. This is my code now:

>OK, this is a very bad idea if your data is not in the machine's local
>code page. You need to provide more information about what the encoding
>of the data in szCSTABuffer is.

I actually don't know - it comes from a different program running on a different computer. And yes, you're right, there's no guarantee that it would match my machine's locale settings.

I would have to create a UTF16 transcoder and use 'transcodeTo' to convert the buffer to UTF-16. But I'm a bit confused with the various options:

fgUTF16BEncodingString    'UTF-16 (BE)'
fgUTF16BEncodingString2   'UTF-16BE'
fgUTF16EncodingString     'UTF-16'
fgUTF16EncodingString2    'UCS2'
fgUTF16EncodingString3    'IBM1200'
fgUTF16EncodingString4    'IBM-1200'
fgUTF16EncodingString5    'UTF16'
fgUTF16EncodingString6    'UCS-2'
fgUTF16EncodingString7    'ISO-10646-UCS-2'
fgUTF16LEncodingString    'UTF-16 (LE)'
fgUTF16LEncodingString2   'UTF-16LE'

My target system is BE. Should I use the ones for BE (fgUTF16BEncodingString/ fgUTF16BEncodingString2)? Or would these be fine (fgUTF16EncodingString/ fgUTF16EncodingString5)?

Addendum:
Also, I would like to know how to do this cascading transcode?

    some-encoding-->UTF-16BE--->UTF-8

After I transcode the buffer to UTF-16, the output is of type XMLByte. The Transcoder for UTF-8 expects XMLCh* and not XMLByte* as the input.

One last addition:
transcodeTo for UTF-16 crashes sometimes. I don't know why this is happening. The call stack shows somewhere inside xercesc_2_8::XMLUTF16Transcoder::transcodeTo() a memcpy is crashing.
This does not happen every time, though.

    /** Transcode to UTF-8 */
    uiInLength = strlen(szCSTABuffer);
    uiOutLength = uiInLength * UTF16_BYTES_PER_CHARACTER;
    //UTF16_BYTES_PER_CHARACTER is set to 4

    /** Allocate memory for the output of the transcode operation*/
    xmlInput = new XMLByte[uiOutLength + 1];

    if(xmlInput)
    {
        /** Transcode */
        uiTotalChars = m_pUTF16Transcoder->transcodeTo(
                           (const XMLCh* const)szCSTABuffer,
                           uiInLength,
                           xmlInput,
                           uiOutLength,
                           uiCharsTranscoded,
                           XMLTranscoder::UnRep_RepChar);

        xmlInput[uiTotalChars] = '\0';
    }

What am I doing wrong? Is it the typecast to XMLCh* from char* when calling transcodeTo? Any other way to convert char* to XMLCh*?

Please help!

>
>> /*******************************************************************/
>> if(szCSTABuffer)
>> {
>>
>>     /** Transcode the CSTA Buffer into XMLCh* */
>>     xmlInput = XMLString::transcode(szCSTABuffer);
>>
>>     uiInLength = XMLString::stringLen(xmlInput);
>>     uiOutLength = uiInLength * UTF8_BYTES_PER_CHARACTER;
>>     //UTF8_BYTES_PER_CHARACTER is set to 4
>>
>>     /** Allocate memory for the output of the transcode operation*/
>>     xmlTranscodedOutput = new XMLByte[uiOutLength + 1];
>>
>>
>>     if(xmlTranscodedOutput)
>>     {
>>         /** Transcode */
>>         // m_pUTF8Transcoder is of type XMLTranscoder*
>>         uiTotalChars = m_pUTF8Transcoder->transcodeTo(
>>                            (const XMLCh* const)xmlInput,

>This cast is not necessary.
>
Thanks I will remove it.

>>                            uiInLength,
>>                            xmlTranscodedOutput,
>>                            uiOutLength,
>>                            uiCharsTranscoded,
>>
>>                            XMLTranscoder::UnRep_RepChar);
>>
>>         xmlTranscodedOutput[uiTotalChars] = '\0';
>>         XMLString::release(&xmlInput);
>>     }
>> }
>>
>> Variables are defined as follows:
>> char* szCSTABuffer = NULL;
>> XMLCh* xmlInput = NULL;
>> XMLByte* xmlTranscodedOutput = NULL;
>> unsigned int uiInLength = 0;
>> unsigned int uiOutLength = 0;
>> unsigned int uiCharsTranscoded = 0;
>> unsigned int uiTotalChars = 0;
>> /*******************************************************************/
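
For the cascade asked about above (some-encoding --> UTF-16BE --> UTF-8), here is a minimal sketch of the last two stages. It uses only the C++ standard library, not Xerces, and it assumes the intermediate buffer really holds UTF-16BE bytes. decodeUtf16BE and encodeUtf8 are hypothetical helper names invented for this illustration. The point it tries to show is that a byte buffer becomes 16-bit code units by assembling byte pairs, not by casting the pointer, and that the unit count is half the byte count. In the unquoted snippet above, strlen() returns a byte count while the cast makes the same buffer be read as 16-bit units, so the read could run past the end of the buffer, which may be what the intermittent memcpy crash reflects. Within Xerces itself, the direction from external bytes to XMLCh is, as far as I understand the API, XMLTranscoder::transcodeFrom() on a transcoder created for the source encoding, rather than transcodeTo().

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical helper: assemble UTF-16BE bytes into 16-bit code units.
// Each unit is built with an explicit shift, so the result does not
// depend on the byte order of the machine running the code.
std::u16string decodeUtf16BE(const std::vector<std::uint8_t>& bytes)
{
    std::u16string units;
    units.reserve(bytes.size() / 2);   // unit count is half the byte count
    // A trailing odd byte cannot form a code unit and is ignored here.
    for (std::size_t i = 0; i + 1 < bytes.size(); i += 2) {
        units.push_back(static_cast<char16_t>((bytes[i] << 8) | bytes[i + 1]));
    }
    return units;
}

// Hypothetical helper: encode UTF-16 code units as UTF-8.
// Surrogate pairs are combined into one code point; an unpaired
// surrogate is replaced with U+FFFD.
std::string encodeUtf8(const std::u16string& units)
{
    std::string out;
    for (std::size_t i = 0; i < units.size(); ++i) {
        std::uint32_t cp = units[i];
        const bool isHigh = cp >= 0xD800 && cp <= 0xDBFF;
        const bool isLow  = cp >= 0xDC00 && cp <= 0xDFFF;
        if (isHigh && i + 1 < units.size()
            && units[i + 1] >= 0xDC00 && units[i + 1] <= 0xDFFF) {
            cp = 0x10000 + ((cp - 0xD800) << 10) + (units[i + 1] - 0xDC00);
            ++i;                       // the low surrogate was consumed too
        } else if (isHigh || isLow) {
            cp = 0xFFFD;               // replacement character
        }

        if (cp < 0x80) {               // 1 byte
            out.push_back(static_cast<char>(cp));
        } else if (cp < 0x800) {       // 2 bytes
            out.push_back(static_cast<char>(0xC0 | (cp >> 6)));
            out.push_back(static_cast<char>(0x80 | (cp & 0x3F)));
        } else if (cp < 0x10000) {     // 3 bytes
            out.push_back(static_cast<char>(0xE0 | (cp >> 12)));
            out.push_back(static_cast<char>(0x80 | ((cp >> 6) & 0x3F)));
            out.push_back(static_cast<char>(0x80 | (cp & 0x3F)));
        } else {                       // 4 bytes
            out.push_back(static_cast<char>(0xF0 | (cp >> 18)));
            out.push_back(static_cast<char>(0x80 | ((cp >> 12) & 0x3F)));
            out.push_back(static_cast<char>(0x80 | ((cp >> 6) & 0x3F)));
            out.push_back(static_cast<char>(0x80 | (cp & 0x3F)));
        }
    }
    return out;
}
```

Calling encodeUtf8(decodeUtf16BE(bytes)) chains the two stages. The first stage, from the unknown source encoding to UTF-16, is left out on purpose: it cannot be written until the encoding of szCSTABuffer is known.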
