I'm still quite confused by this encoding problem.
In fact, my problem is just to get a char array (a C-style string: char[] or
char*) from a XalanDOMString (on Linux Mandrake 8.2 with Xalan-C and
gcc-2.96).

Should I transcode the XalanDOMString to the local code page to get a char*?

I have no problem with accented characters in my char*, but it seems that
the transcode function of Xalan doesn't like these characters and can't
transcode them (perhaps because it doesn't know the extended character table?).


During my experiments I tried copying the Unicode value of each character
in my XalanDOMString directly, one by one, into a char array (so an
unsigned short goes straight into a char...).
It's not a very good idea, since XalanDOMString characters (unsigned short)
can use 16 bits and a char only 8 bits.
I can limit this to values no greater than xFF (255) if I only use Basic Latin
and Latin-1 Supplement characters, which need no more than 8 bits
(x00 to xFF in hexadecimal) in the Unicode encoding.
But the biggest problem is that I am assuming Unicode from 00 to FF matches
the ANSI or ISO Latin-1 encoding, and I don't know if that's right, or which
one is used in my char.
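
For illustration, the direct copy described above can be made safe by
refusing any code unit above xFF. The first 256 Unicode code points are the
same as ISO-8859-1 (ISO Latin-1); the Windows "ANSI" code page 1252 differs
from it in the range x80 to x9F, so the result below should be read as
ISO-8859-1. This is only a minimal sketch: the std::vector of unsigned short
stands in for the data of a XalanDOMString, and Utf16Unit and narrowToLatin1
are hypothetical names, not part of Xalan.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Stand-in for the 16-bit code units held by a XalanDOMString.
// Assumption: unsigned short is used here purely for illustration.
typedef unsigned short Utf16Unit;

// Hypothetical helper: copies each code unit into a char only if it fits
// in ISO-8859-1 (U+0000..U+00FF). Returns false and leaves `out` empty as
// soon as a code unit above 0xFF is found.
bool narrowToLatin1(const std::vector<Utf16Unit>& in, std::string& out)
{
    out.clear();
    out.reserve(in.size());
    for (std::size_t i = 0; i < in.size(); ++i) {
        if (in[i] > 0xFF) {
            out.clear();
            return false;  // not representable in ISO-8859-1
        }
        out.push_back(static_cast<char>(static_cast<unsigned char>(in[i])));
    }
    // Every input unit produced exactly one output byte.
    assert(out.size() == in.size());
    return true;
}
```

If the function returns false, the string contains a character outside
Latin-1 and this shortcut can't be used for it.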

So can someone tell me the best way to get from a XalanDOMString to a char*,
and how to find out which encoding the local code page uses
and/or which extended encoding my C chars are in (ANSI or ISO Latin-1, in my
case)?
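
On a POSIX system such as Linux, one way to see which encoding the current
locale uses is nl_langinfo(CODESET). A minimal sketch follows; localCodeset
is a hypothetical helper name, and whether Xalan's transcode() relies on the
same locale setting depends on the transcoding service Xerces was built
with, which is only assumed here.

```cpp
#include <cassert>
#include <clocale>
#include <langinfo.h>
#include <string>

// Hypothetical helper: returns the name of the character encoding used by
// the current locale, for example "ISO-8859-1" or "UTF-8".
std::string localCodeset()
{
    // Adopt the character-type locale from the environment
    // (LC_ALL, LC_CTYPE, LANG). If this fails, the program simply stays
    // in the default "C" locale.
    std::setlocale(LC_CTYPE, "");

    // POSIX: nl_langinfo returns a pointer to a string, never a null pointer.
    const char* name = nl_langinfo(CODESET);
    assert(name != 0);
    return std::string(name);
}
```

Printing the returned string at program start shows which encoding a
transcode to the "local code page" would have to target in that environment.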

Thanks for your help
Vincent

Subject:  Re: accent
From:     "David N Bertoni/Cambridge/IBM" <[EMAIL PROTECTED]>
Date:     2002-04-11 15:45:28
>You can transcode to any encoding, but the transcode() call on
>XalanDOMString transcodes to the local code page.  As I said before, if the
>local code page does not support that character, you cannot transcode the
>string to it.  If a code page cannot represent a character, there's no
>other solution.
>If you want to transcode to something else, like iso-8859-1, you'll need to
>get a transcoder for the encoding and transcode the string.  Whether or not
>your environment supports and can display that encoding, I don't know.  See
>the Xerces documentation for more information on transcoding, or search the
>source files for examples.

>Xalan has a collection of serializers that you can use if you want to
>serialize an entire document, or sub-tree, but it's overkill for simple
>string transcoding.

>Dave



Tell me if I'm wrong:
 a XalanDOMString (its m_data) is a UTF-16 string (stored as unsigned short
on most platforms...)


