Hi Dave(s),
First of all, regarding the suggestion to change the configuration of the
xercesc headers: unfortunately we do not control that part of the environment.
> For those with the L operator, then
> const XMLCh XMLUni::fgAnyString[] = { L'A', L'N', L'Y', L'\0' }
> const XMLCh XMLUni::fgAnyString[] = L"ANY";
As I understand it, the two forms above may not generate the same results, as
wchar_t (the element type of an L literal) is sometimes wider than two bytes;
hence the need (in my OP) for the compile-time flag -fshort-wchar.
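To make the difference concrete, here is a minimal sketch. The XMLCh typedef is a stand-in for the one in the Xerces-C headers, and kAnyString, wideLiteralIsXMLChSized and xmlLength are hypothetical names used only for illustration. The element-wise form converts each character constant separately, so it does not depend on the width of wchar_t, whereas a whole L"..." literal only lines up with XMLCh when wchar_t is 16 bits wide.

```cpp
#include <cassert>
#include <cstddef>

// Stand-in for the Xerces-C typedef (assumption; the real one comes from the
// Xerces-C headers).
typedef unsigned short XMLCh;

// Each element is converted on its own, so this compiles and holds the same
// values whether wchar_t is 2 or 4 bytes wide.
const XMLCh kAnyString[] = { L'A', L'N', L'Y', L'\0' };

// A wide string literal only has XMLCh-sized elements when this is true
// (for example GCC with -fshort-wchar).
inline bool wideLiteralIsXMLChSized() {
    return sizeof(wchar_t) == sizeof(XMLCh);
}

// Hypothetical helper: number of XMLCh units before the terminating zero.
inline std::size_t xmlLength(const XMLCh* s) {
    std::size_t n = 0;
    while (s[n] != 0) ++n;
    return n;
}

// Small self-check of the element-wise array.
inline void checkAnyString() {
    assert(xmlLength(kAnyString) == 3);
    assert(kAnyString[0] == 0x41);  // 'A'
}
```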
Likewise on GCC,
std::basic_string<XMLCh> my_string = L"the string that I wish to declare";
(i.e., without a cast) will generate an error message: invalid
conversion from 'const wchar_t*' to 'const short unsigned int*'
And the example above
> const XMLCh XMLUni::fgAnyString[] = L"ANY";
generates the error "array must be initialized with a brace-enclosed
initializer" - which is understandable.
Not using a basic_string construct still generates the same invalid conversion
error.
> const XMLCh* XMLUni::fgAnyString = L"ANY";
Produces the same effect (invalid conversion)
This is why I need to use a C-style cast as follows:
std::basic_string<XMLCh> my_string = (const XMLCh*)(L"the string that I wish to declare");
Using preprocessor macros (yechh) I can tidy that up somewhat of course.
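Something along these lines is what I have in mind for the macro; treat it as a sketch rather than a recommendation. XML_LIT, wcharIsXMLChSized and xmlLen are hypothetical names, and the XMLCh typedef is again a stand-in for the Xerces-C one. The macro pastes the L prefix onto the literal and applies the cast, so it is only meaningful in a build where wchar_t is 16 bits wide, and it relies on the compiler tolerating the read of wchar_t storage through an XMLCh pointer.

```cpp
#include <cassert>
#include <cstddef>

// Stand-in for the Xerces-C typedef (assumption).
typedef unsigned short XMLCh;

// Hypothetical helper macro: prefix the literal with L and cast it to XMLCh.
// Only meaningful when wchar_t is 16 bits wide, e.g. GCC with -fshort-wchar.
#define XML_LIT(s) (reinterpret_cast<const XMLCh*>(L##s))

// Reports whether this build actually has an XMLCh-sized wchar_t.
inline bool wcharIsXMLChSized() {
    return sizeof(wchar_t) == sizeof(XMLCh);
}

// Length in XMLCh units up to the terminating zero.
inline std::size_t xmlLen(const XMLCh* s) {
    std::size_t n = 0;
    while (s[n] != 0) ++n;
    return n;
}

// Checks the macro result, but only in builds where the cast makes sense.
inline void checkLiteral() {
    if (wcharIsXMLChSized()) {
        assert(xmlLen(XML_LIT("ANY")) == 3);
    }
}
```

With that in place the declaration would shrink to std::basic_string<XMLCh> my_string = XML_LIT("the string that I wish to declare");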
Dave (Bertoni), your question about whether -fshort-wchar guarantees UTF-16
code points is a good one, given that we are using that flag.
I was not aware that the Xerces-C XMLCh implementation was UTF-16; I guess I
erroneously thought that it was UCS-2.
(The UCS-2 encoding form is identical to that of UTF-16, except that it does
not support surrogate pairs and therefore can only encode characters in the BMP
range U+0000 through U+FFFF. As a consequence it is a fixed-length encoding
that always encodes characters into a single 16-bit value.)
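As a rough illustration of the difference, the sketch below encodes a single code point into 16-bit units. encodeUtf16 is a hypothetical helper (not part of Xerces-C), XMLCh is a stand-in typedef, and no validation of the input is attempted. A BMP character comes out as one unit, exactly as in UCS-2; anything above U+FFFF comes out as a surrogate pair, which UCS-2 cannot express.

```cpp
#include <cassert>
#include <vector>

// Stand-in for the Xerces-C typedef (assumption).
typedef unsigned short XMLCh;

// Hypothetical helper: encode one Unicode code point as UTF-16 code units.
// Sketch only: it does not reject surrogate values or code points above
// U+10FFFF.
inline std::vector<XMLCh> encodeUtf16(unsigned long cp) {
    std::vector<XMLCh> out;
    if (cp < 0x10000UL) {
        // BMP: a single 16-bit unit, identical to UCS-2.
        out.push_back(static_cast<XMLCh>(cp));
    } else {
        // Beyond the BMP: split the 20-bit offset into a surrogate pair.
        cp -= 0x10000UL;
        out.push_back(static_cast<XMLCh>(0xD800UL + (cp >> 10)));
        out.push_back(static_cast<XMLCh>(0xDC00UL + (cp & 0x3FFUL)));
    }
    return out;
}

// Small self-check: 'A' is one unit, U+1D11E needs two.
inline void checkEncoding() {
    assert(encodeUtf16(0x41UL).size() == 1);
    assert(encodeUtf16(0x1D11EUL).size() == 2);
}
```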
My string declarations only use characters that are in the UCS-2 / BMP range,
so I am not so concerned about the need to encode surrogate pairs as constants.
Regardless, the proposed method from src/xercesc/util/XMLUni.cpp
does not support non-BMP characters.
More to the point of your question, though: the GCC documentation for the C++
flag -fshort-wchar
http://gcc.gnu.org/onlinedocs/gcc-3.4.0/gcc/Code-Gen-Options.html#Code%20Gen%20Options
tells us this flag "overrides the underlying type for wchar_t to be short
unsigned int instead of the default for the target. This option is useful for
building programs to run under WINE."
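What is salient to us is that IIRC (by default) XMLCh is also defined to be a
short unsigned int.
Therefore XMLCh and wchar_t share the same underlying type, short unsigned int
(when the -fshort-wchar flag is used in GCC), even though the compiler still
treats them as distinct types, which is why the conversion is rejected without
a cast.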
If this is the case then, as I understand it, using the cast (const
XMLCh*)(L"the string that I wish to declare") should be perfectly fine.
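A quick way to check that assumption in a given build might look like the following. This is only a sketch: XMLCh is a stand-in typedef, and kSameType, kSameSize and checkTypes are hypothetical names. It separates the two questions, namely whether the types are identical (which decides if a cast is needed) and whether they have the same size (which decides if the cast lines up element for element).

```cpp
#include <cassert>
#include <type_traits>

// Stand-in for the Xerces-C typedef (assumption).
typedef unsigned short XMLCh;

// In C++ wchar_t is a built-in type of its own, so on GCC this is false
// with or without -fshort-wchar; hence the need for a cast.
const bool kSameType = std::is_same<wchar_t, XMLCh>::value;

// True on GCC with -fshort-wchar; false where wchar_t is 4 bytes wide.
const bool kSameSize = sizeof(wchar_t) == sizeof(XMLCh);

// Small self-check of the type relationship.
inline void checkTypes() {
    assert(!kSameType);
}
```

If kSameSize comes out false, the cast still compiles but the resulting XMLCh data will not match the literal.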
Ben