Hi Dave(s), 
First of all, regarding the point about changing the configuration of the 
xercesc headers: unfortunately, we do not control that part of the environment.

> For those with the L operator, then
> const XMLCh XMLUni::fgAnyString[] = { L'A', L'N', L'Y', L'\0' }
> const XMLCh XMLUni::fgAnyString[] = L"ANY";

As I understand it, the two declarations above may not generate the same 
results, as the width of wchar_t (the element type of an L"..." literal) is 
sometimes more than two bytes, hence the need (in my OP) for the compile-time 
flag -fshort-wchar.

Likewise on GCC, 
 std::basic_string<XMLCh> my_string = L"the string that I wish to declare";
(i.e., without a cast) will generate an error message: invalid 
conversion from 'const wchar_t*' to 'const short unsigned int*'

And the example above 
> const XMLCh XMLUni::fgAnyString[] = L"ANY";

generates the error "array must be initialized with a brace-enclosed 
initializer" - which is understandable.

Not using a basic_string construct still generates the same invalid conversion 
error.
> const XMLCh* XMLUni::fgAnyString = L"ANY";
Produces the same effect (invalid conversion)

This is why I need to use a C-style cast as follows:
std::basic_string<XMLCh> my_string = (const XMLCh*)(L"the string that I wish to declare");
Using preprocessor macros (yechh) I can tidy that up somewhat, of course.
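
For comparison, here is a minimal sketch of an alternative that copies the literal element by element, so it does not depend on -fshort-wchar. This is not something from the Xerces-C code base: toXMLCh is a hypothetical helper name, the XMLCh typedef is a stand-in for the one in the xercesc headers, and it assumes BMP-only literals (as in my case).

```cpp
#include <cstddef>
#include <vector>

// Assumption: stand-in for the typedef that the xercesc headers provide.
typedef unsigned short XMLCh;

// Hypothetical helper: copies a wide string literal, including its
// terminating null, into a buffer of XMLCh. Each element is narrowed
// with static_cast, so this is only meaningful for BMP characters.
template <std::size_t N>
std::vector<XMLCh> toXMLCh(const wchar_t (&literal)[N]) {
    std::vector<XMLCh> out;
    out.reserve(N);
    for (std::size_t i = 0; i < N; ++i) {
        out.push_back(static_cast<XMLCh>(literal[i]));
    }
    return out;
}
```

The cost is a run-time copy per string, which the cast avoids; the benefit is that it behaves the same whether wchar_t is two or four bytes wide.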

Dave (Bertoni), your question as to whether -fshort-wchar guarantees UTF-16 
code points is a good one, even though we are already using that flag.

I was not aware that the Xerces-C XMLCh implementation was UTF-16; I guess I 
erroneously thought that it was UCS-2. 
(The UCS-2 encoding form is identical to that of UTF-16, except that it does 
not support surrogate pairs and therefore can only encode characters in the BMP 
range U+0000 through U+FFFF. As a consequence it is a fixed-length encoding 
that always encodes characters into a single 16-bit value.)
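
To make the difference concrete, here is a small sketch of the standard UTF-16 encoding rule. The function name encodeUtf16 is my own for illustration, and it does no validation of its input:

```cpp
#include <cstdint>
#include <vector>

// Illustrative only: encodes one Unicode code point as UTF-16 code units.
// No validation is done (e.g. values above U+10FFFF or lone surrogates).
std::vector<std::uint16_t> encodeUtf16(std::uint32_t cp) {
    std::vector<std::uint16_t> units;
    if (cp < 0x10000) {
        // BMP: a single 16-bit unit, identical to the UCS-2 encoding.
        units.push_back(static_cast<std::uint16_t>(cp));
    } else {
        // Outside the BMP: split the 20-bit offset into a surrogate pair.
        std::uint32_t v = cp - 0x10000;
        units.push_back(static_cast<std::uint16_t>(0xD800 + (v >> 10)));
        units.push_back(static_cast<std::uint16_t>(0xDC00 + (v & 0x3FF)));
    }
    return units;
}
```

For example, U+0041 yields the single unit 0x0041, whereas U+1D11E yields the pair 0xD834 0xDD1E, which UCS-2 cannot represent.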

My string declarations only use characters that are in the UCS-2 / BMP range, 
so I am not so concerned about the need to encode surrogate pairs as constants. 
Regardless, the proposal of using the method in src/xercesc/util/XMLUni.cpp 
does not support non-BMP characters.

More to the point of your question, though: regarding the GCC C++ flag 
-fshort-wchar, the documentation at
http://gcc.gnu.org/onlinedocs/gcc-3.4.0/gcc/Code-Gen-Options.html#Code%20Gen%20Options
tells us this flag "overrides the underlying type for wchar_t to be short 
unsigned int instead of the default for the target. This option is useful for 
building programs to run under WINE."

What is salient to us is that IIRC (by default) XMLCh is defined to be a short 
unsigned int also.

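Therefore XMLCh == short unsigned int, and wchar_t has the same underlying 
representation (when the -fshort-wchar flag is used in GCC). wchar_t is still 
a distinct type in C++, which is why the conversion errors above persist.
If this is the case then, as I understand it, using the cast (const 
XMLCh*)(L"the string that I wish to declare") should be perfectly fine.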
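
A sketch of how that assumption could be guarded in code, so the cast is only used when the sizes really do match. The names kWideLiteralsAre16Bit and asXMLCh are hypothetical, and the XMLCh typedef is again a stand-in for the xercesc one. Note that this only checks size; strictly speaking the language does not bless reading wchar_t data through an XMLCh pointer, even if it works in practice on GCC.

```cpp
// Assumption: stand-in for the typedef that the xercesc headers provide.
typedef unsigned short XMLCh;

// True when wide literals have the same element size as XMLCh,
// e.g. on GCC with -fshort-wchar.
const bool kWideLiteralsAre16Bit = (sizeof(wchar_t) == sizeof(XMLCh));

// Hypothetical helper: reinterprets a wide literal as XMLCh data when the
// element sizes match, and returns a null pointer otherwise so that a
// mismatched build is caught instead of silently reading garbage.
inline const XMLCh* asXMLCh(const wchar_t* literal) {
    return kWideLiteralsAre16Bit
        ? reinterpret_cast<const XMLCh*>(literal)
        : nullptr;
}
```

Built without -fshort-wchar on a platform with a four-byte wchar_t, asXMLCh returns a null pointer rather than a pointer to mis-sized data.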

Ben
