shift_jis encoded xml

Shi Lu Fri, 08 Mar 2002 23:17:02 -0800

Hello, everybody,

I am parsing an XML file with DOMParser on Windows 2000 (w2k);
the file is Shift_JIS encoded.
With the Apache xerces-c1_6 library, I can't get the correct
Japanese strings.
So I switched to IBM XML4C, which includes the ICU module (or a
subset of it). Now I get the correct Japanese string, except
that it is truncated to half its length. That is, if I have
<name>abcdef</name> in the XML file, what I get is "abc"
instead of "abcdef", where "abcdef" stands for some Japanese
characters. However, if "abcdef" are all ordinary English
letters, there is no problem.
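
The symptom above (Japanese text cut to half, ASCII text intact) is what one
would see if an output buffer were sized by character count while each
Japanese character needs two bytes in Shift_JIS. The sketch below only
reproduces that pattern in isolation. It is an assumption about the cause, not
a confirmed diagnosis of Xerces-C or XML4C, and the names isSjisLeadByte,
sjisCharCount and copySizedByCharCount are hypothetical helpers, not library
calls.

```cpp
#include <cstddef>
#include <string>

// True if b is the first byte of a two-byte Shift_JIS character.
bool isSjisLeadByte(unsigned char b) {
    return (b >= 0x81 && b <= 0x9F) || (b >= 0xE0 && b <= 0xFC);
}

// Count characters (not bytes) in a Shift_JIS byte string.
std::size_t sjisCharCount(const std::string& bytes) {
    std::size_t count = 0;
    std::size_t i = 0;
    while (i < bytes.size()) {
        // Two-byte characters advance by 2, single-byte ones by 1.
        i += isSjisLeadByte(static_cast<unsigned char>(bytes[i])) ? 2 : 1;
        ++count;
    }
    return count;
}

// Hypothetical buggy copy (assumed for illustration): it keeps only as many
// BYTES as there are CHARACTERS, so two-byte text loses its second half.
std::string copySizedByCharCount(const std::string& bytes) {
    return bytes.substr(0, sjisCharCount(bytes));
}

// Six Japanese characters in Shift_JIS (12 bytes), standing in for "abcdef".
std::string sampleJapanese() {
    const unsigned char raw[] = {0x93, 0xFA, 0x96, 0x7B, 0x8C, 0xEA,
                                 0x83, 0x65, 0x83, 0x58, 0x83, 0x67};
    return std::string(reinterpret_cast<const char*>(raw), sizeof raw);
}
```

With these helpers, copySizedByCharCount(sampleJapanese()) keeps 6 of the 12
bytes, which is the first three of the six characters, while
copySizedByCharCount("abcdef") returns all six letters unchanged.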


The XML file is otherwise ordinary. Its declaration is:
<?xml version="1.0" encoding="shift_jis"?>

The code I use to get the string is:

DOM_Node node;
CString strName = node.getNodeValue().transcode();


I suspect that the transcode() method loses some data. Am I
doing something wrong, or is there a workaround? If I run the
sample program domprint.exe, both Apache's version and IBM's
version print the Shift_JIS encoded XML file (which contains
Japanese characters) correctly, so there must be something I
can do with the transcoding to make it right. Thanks for your
help.

Ivan Lu
