David,

Thanks for your suggestion. That would probably work in most cases, but not in ours.
We are building a wrapper library over Xerces, and our interface exposes only
std::wstring; we don't expose internal Xerces types. On Solaris in particular, we
want to link against the STLport library. Is there any requirement in Xerces that
forces XMLCh to be 2 bytes? If all Xerces code uses sizeof(XMLCh), then the change
should probably be OK, but if any hard-coded value assumes 2 bytes, the change
won't work.
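
If the wrapper has to keep exposing std::wstring, the conversion at the boundary
could look roughly like the sketch below. It is only an illustration, not Xerces
code: the XMLCh typedef is restated here as described in this thread (the real one
comes from the Xerces headers), toWString and toXMLCh are hypothetical helper
names, and it assumes a 4-byte wchar_t holds Unicode code points.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Assumption: mirrors the typedef described in this thread. In real code
// this comes from the Xerces headers, not from a local typedef.
typedef unsigned short XMLCh;

// Hypothetical helper: null-terminated UTF-16 -> std::wstring.
// When wchar_t is at least 4 bytes, surrogate pairs are combined into a
// single code point; otherwise code units are copied one to one.
std::wstring toWString(const XMLCh* src) {
    std::wstring out;
    if (src == 0) return out;
    for (std::size_t i = 0; src[i] != 0; ++i) {
        unsigned long unit = src[i];
        bool isPair = sizeof(wchar_t) >= 4 &&
                      unit >= 0xD800 && unit <= 0xDBFF &&
                      src[i + 1] >= 0xDC00 && src[i + 1] <= 0xDFFF;
        if (isPair) {
            unsigned long cp = 0x10000 + ((unit - 0xD800) << 10)
                                       + (src[i + 1] - 0xDC00);
            out.push_back(static_cast<wchar_t>(cp));
            ++i;  // skip the low surrogate just consumed
        } else {
            out.push_back(static_cast<wchar_t>(unit));
        }
    }
    return out;
}

// Hypothetical helper: std::wstring -> null-terminated UTF-16 buffer.
// Code points above 0xFFFF are split into a surrogate pair.
std::vector<XMLCh> toXMLCh(const std::wstring& src) {
    std::vector<XMLCh> out;
    for (std::size_t i = 0; i < src.size(); ++i) {
        unsigned long cp = static_cast<unsigned long>(src[i]);
        if (cp > 0xFFFF) {
            cp -= 0x10000;
            out.push_back(static_cast<XMLCh>(0xD800 + (cp >> 10)));
            out.push_back(static_cast<XMLCh>(0xDC00 + (cp & 0x3FF)));
        } else {
            out.push_back(static_cast<XMLCh>(cp));
        }
    }
    out.push_back(0);  // terminator expected by XMLCh* consumers
    return out;
}
```

The reverse direction returns a std::vector<XMLCh> only to keep the sketch
independent of whether the standard library provides char_traits for unsigned
short.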


Qi Chen


-----Original Message-----
From: David N Bertoni/Cambridge/IBM [mailto:[EMAIL PROTECTED]
Sent: Tuesday, March 04, 2003 10:08 AM
To: [EMAIL PROTECTED]
Subject: Re: wchar_t and XMLCh






> Basically I need to convert the XMLCh* to a std::wstring and vice versa.
> In Xerces, XMLCh is typedef-ed to unsigned short (2 bytes). Under win32,
> there is no need for conversion since wchar_t is also typedef-ed to
> unsigned short. In Solaris/Linux/VMS, however, wchar_t is typedef-ed to
> unsigned long (4 bytes), so the conversion seems to be inevitable.

There are two reasons there's no need for conversion on Win32. One is
that Visual C++ 6.0 does not implement wchar_t as a proper type, which
is not correct.  Most of the platforms to which you refer, depending on the
age of the compiler, _do_ implement wchar_t as a proper type, and not as a
typedef.  The other, and more important, reason is that Win32 uses
Unicode, so wide characters are known to be UCS-2/UTF-16 code points.

> My question is: Does the Xerces implementation require the size of XMLCh
> to be 2 bytes? If I change the typedef of XMLCh to wchar_t and recompile
> Xerces, would it work? I know the answer is probably no, but I just want
> to make sure. Of course the memory usage will be doubled if we change
> XMLCh to 4 bytes, but that is not a concern for me.

For any given operating system, the issue is not really the size of XMLCh,
it's whether the operating system assumes wide characters are UCS-2/UTF-16
code points.  If not, there's no point in making XMLCh and wchar_t
compatible, because the OS cannot process them.
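
As a rough illustration of checking this per platform, the sketch below reports
the width of wchar_t and whether the implementation promises that wide characters
are ISO 10646 (Unicode) code points. The function names are hypothetical;
__STDC_ISO_10646__ is the standard C macro for that promise, and its absence only
means no promise is made (Win32, for example, may not define it).

```cpp
#include <climits>
#include <cstddef>

// Number of bits in wchar_t on this platform.
inline std::size_t wcharBits() {
    return sizeof(wchar_t) * CHAR_BIT;
}

// True only if the implementation defines __STDC_ISO_10646__, i.e. it
// promises that wchar_t values are ISO 10646 code points.
// False means "no promise", not "definitely not Unicode".
inline bool wcharPromisedUnicode() {
#if defined(__STDC_ISO_10646__)
    return true;
#else
    return false;
#endif
}
```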

You should re-examine why you're storing UTF-16 encoded characters, like
Xerces produces, in std::wstring.  std::basic_string<XMLCh> might be a
better choice.

Dave


---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]
