Hi,

Apologies for reposting this. I'm hoping that this time it is readable.
I'm trying to transcode some data into UTF-8. The data I receive is not in 
UTF-16 format, so I can't use it as the input to the transcodeTo() function.

Please help me!

>-----Original Message-----
>From: David Bertoni [mailto:[email protected]]
>
>On 3/25/2010 12:03 AM, Swatilekha Doloi wrote:
>> Hi,
>>
>> Sorry for the delay in responding. My usage of the word 
>> 'non-printable' is probably incorrect. It displays something that 
>> looks like this:  æc¾Òw×s %# S1# ÔwÔi�...@õ OQ  S # õ”3
>>
>> Using XMLString::transcode before giving the buffer to the UTF-8 
>> Transcoder helped. This is my code now:
>OK, this is a very bad idea if your data is not in the machine's local
>code page. You need to provide more information about what the encoding 
>of the data in szCSTABuffer is.

I actually don't know; it comes from a different program running on a 
different computer. And yes, you're right, there's no guarantee that it would 
match my machine's locale settings. I would have to create a UTF-16 transcoder 
and use transcodeTo() to convert the buffer to UTF-16, but I'm a bit confused 
by the various options:
fgUTF16BEncodingString                           'UTF-16 (BE)'
fgUTF16BEncodingString2                          'UTF-16BE'
fgUTF16EncodingString                            'UTF-16'
fgUTF16EncodingString2                           'UCS2'
fgUTF16EncodingString3                           'IBM1200'
fgUTF16EncodingString4                           'IBM-1200'
fgUTF16EncodingString5                           'UTF16'
fgUTF16EncodingString6                           'UCS-2'
fgUTF16EncodingString7                           'ISO-10646-UCS-2'
fgUTF16LEncodingString                           'UTF-16 (LE)'
fgUTF16LEncodingString2                          'UTF-16LE'

My target system is big-endian (BE).
Should I use the ones for BE (fgUTF16BEncodingString/ fgUTF16BEncodingString2)? 
Or would these be fine (fgUTF16EncodingString/ fgUTF16EncodingString5)? 
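
For reference, the BE and LE names describe the byte order of UTF-16 data once it is laid out as bytes. The following is a minimal sketch in plain C++, not Xerces code; the function name utf16ToBytes is hypothetical and only illustrates how the same code unit serializes differently in each order.

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical helper: serialize UTF-16 code units to bytes in the
// requested byte order. This is an illustration, not a Xerces API.
std::vector<std::uint8_t> utf16ToBytes(const std::u16string& units, bool bigEndian) {
    std::vector<std::uint8_t> out;
    out.reserve(units.size() * 2);
    for (char16_t u : units) {
        const std::uint8_t hi = static_cast<std::uint8_t>(u >> 8);
        const std::uint8_t lo = static_cast<std::uint8_t>(u & 0xFF);
        if (bigEndian) {
            out.push_back(hi);  // UTF-16BE: most significant byte first
            out.push_back(lo);
        } else {
            out.push_back(lo);  // UTF-16LE: least significant byte first
            out.push_back(hi);
        }
    }
    return out;
}
```

With this sketch, the code unit U+0041 becomes the bytes 00 41 in big-endian order and 41 00 in little-endian order.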

Addendum: I would also like to know how to do this cascading transcode:
  some-encoding-->UTF-16BE--->UTF-8
After I transcode the buffer to UTF-16, the output is of type XMLByte*. The 
transcoder for UTF-8 expects XMLCh* and not XMLByte* as its input.
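
To make the shape of the cascade concrete, here is a minimal sketch in plain C++ rather than Xerces. It assumes, purely for illustration, that the source encoding is ISO-8859-1 (the real encoding of the buffer is unknown), and the function names latin1ToUtf16 and utf16ToUtf8 are hypothetical. The point is that the first step decodes bytes into 16-bit code units and the second step encodes those code units into UTF-8 bytes, so the intermediate value is a string of code units and not a byte buffer.

```cpp
#include <cstddef>
#include <string>

// Step 1: decode source bytes into UTF-16 code units.
// ASSUMPTION: the source is ISO-8859-1, where each byte maps to the
// code point with the same numeric value.
std::u16string latin1ToUtf16(const std::string& bytes) {
    std::u16string out;
    out.reserve(bytes.size());
    for (unsigned char b : bytes) {
        out.push_back(static_cast<char16_t>(b));
    }
    return out;
}

// Step 2: encode UTF-16 code units as UTF-8 bytes.
std::string utf16ToUtf8(const std::u16string& in) {
    std::string out;
    for (std::size_t i = 0; i < in.size(); ++i) {
        char32_t cp = in[i];
        if (cp >= 0xD800 && cp <= 0xDBFF && i + 1 < in.size() &&
            in[i + 1] >= 0xDC00 && in[i + 1] <= 0xDFFF) {
            // Combine a valid surrogate pair into one code point.
            cp = 0x10000 + ((cp - 0xD800) << 10) + (in[i + 1] - 0xDC00);
            ++i;
        } else if (cp >= 0xD800 && cp <= 0xDFFF) {
            cp = 0xFFFD;  // unpaired surrogate: substitute U+FFFD
        }
        if (cp < 0x80) {
            out.push_back(static_cast<char>(cp));
        } else if (cp < 0x800) {
            out.push_back(static_cast<char>(0xC0 | (cp >> 6)));
            out.push_back(static_cast<char>(0x80 | (cp & 0x3F)));
        } else if (cp < 0x10000) {
            out.push_back(static_cast<char>(0xE0 | (cp >> 12)));
            out.push_back(static_cast<char>(0x80 | ((cp >> 6) & 0x3F)));
            out.push_back(static_cast<char>(0x80 | (cp & 0x3F)));
        } else {
            out.push_back(static_cast<char>(0xF0 | (cp >> 18)));
            out.push_back(static_cast<char>(0x80 | ((cp >> 12) & 0x3F)));
            out.push_back(static_cast<char>(0x80 | ((cp >> 6) & 0x3F)));
            out.push_back(static_cast<char>(0x80 | (cp & 0x3F)));
        }
    }
    return out;
}
```

Chaining the two, as in utf16ToUtf8(latin1ToUtf16(buffer)), gives the some-encoding to UTF-16 to UTF-8 path. A different source encoding would need a different first step.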

One last addition: transcodeTo() for UTF-16 crashes sometimes, and I don't 
know why. The call stack shows a memcpy crashing somewhere inside 
xercesc_2_8::XMLUTF16Transcoder::transcodeTo(). It does not happen every 
time, though.

/** Transcode to UTF-8 */
uiInLength  = strlen(szCSTABuffer);
uiOutLength = uiInLength * UTF16_BYTES_PER_CHARACTER;
// UTF16_BYTES_PER_CHARACTER is set to 4

/** Allocate memory for the output of the transcode operation */
xmlInput = new XMLByte[uiOutLength + 1];

if(xmlInput)
{
        /** Transcode */
        uiTotalChars = m_pUTF16Transcoder->transcodeTo(
                                (const XMLCh* const)szCSTABuffer,
                                uiInLength,
                                xmlInput,
                                uiOutLength,
                                uiCharsTranscoded,
                                XMLTranscoder::UnRep_RepChar);

        xmlInput[uiTotalChars] = '\0';
}

What am I doing wrong? Is it the typecast from char* to XMLCh* when calling 
transcodeTo()? Is there another way to convert char* to XMLCh*? Please help!
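
One plausible explanation for the intermittent memcpy crash, offered as an assumption and not a confirmed diagnosis: strlen() counts bytes, but a routine that takes a count of XMLCh values treats that number as 16-bit units and so reads roughly twice as many bytes as the char buffer holds. The sketch below is plain C++ with hypothetical function names, and it uses char16_t as a stand-in for XMLCh.

```cpp
#include <cstddef>
#include <cstring>

// Bytes touched by a routine that reads `unitCount` 16-bit code units.
// ASSUMPTION: char16_t stands in for XMLCh here.
std::size_t bytesReadAsUtf16(std::size_t unitCount) {
    return unitCount * sizeof(char16_t);
}

// True when passing strlen(buffer) as a count of 16-bit units would make
// the reader go past the end of the NUL-terminated char buffer.
bool wouldOverread(const char* buffer) {
    const std::size_t byteLength = std::strlen(buffer);
    const std::size_t bytesAvailable = byteLength + 1;  // include terminator
    return bytesReadAsUtf16(byteLength) > bytesAvailable;
}
```

Under this assumption, any buffer longer than one byte is read past its end, and whether that actually faults depends on what memory happens to follow the buffer, which would be consistent with a crash that does not occur every time.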
>
>> /*******************************************************************/
>> if(szCSTABuffer)
>> {
>>
>>      /** Transcode the CSTA Buffer into XMLCh* */
>>      xmlInput = XMLString::transcode(szCSTABuffer);
>>
>>      uiInLength      =       XMLString::stringLen(xmlInput);
>>      uiOutLength     =       uiInLength * UTF8_BYTES_PER_CHARACTER;
>>      //UTF8_BYTES_PER_CHARACTER is set to 4
>>
>>      /** Allocate memory for the output of the transcode operation*/
>>      xmlTranscodedOutput = new XMLByte[uiOutLength + 1];
>>
>>
>>      if(xmlTranscodedOutput)
>>      {
>>              /** Transcode */
>>              // m_pUTF8Transcoder is of type  XMLTranscoder*
>>              uiTotalChars = m_pUTF8Transcoder->transcodeTo(
>>                                                 (const XMLCh* const)xmlInput,
>This cast is not necessary.
>
Thanks, I will remove it.
>>                                                  uiInLength,
>>                                                  xmlTranscodedOutput,
>>                                                  uiOutLength,
>>                                                  uiCharsTranscoded,
>>                                                  XMLTranscoder::UnRep_RepChar);
>>
>>              xmlTranscodedOutput[uiTotalChars] = '\0';
>>              XMLString::release(&xmlInput);
>>      }
>> }
>>              
>> Variables are defined as follows:
>> char*                szCSTABuffer            =       NULL;
>> XMLCh*               xmlInput                        =       NULL;
>> XMLByte*             xmlTranscodedOutput     =       NULL;
>> unsigned int uiInLength                      =       0;
>> unsigned int uiOutLength                     =       0;
>> unsigned int uiCharsTranscoded               =       0;
>> unsigned int uiTotalChars            =       0;
>> /*******************************************************************/
