Jesse Pelton <[EMAIL PROTECTED]> writes:

> That assumes wchar_t holds UTF-16 (as XMLCh does). It might not. See
> http://www.losingfight.com/blog/2006/07/28/wchar_t-unsafe-at-any-size/
> for a wchar_t story that would be amusing if it were fiction.

What most people fail to realize is that wchar_t holds whatever you put
into it. If you want portable UTF-16 in wchar_t, then put UTF-16 into it,
even on platforms where wchar_t is 4 bytes long and can hold UTF-32.

Alternatively, you can use UTF-16 on platforms where wchar_t is 2 bytes
long and UTF-32 on the rest. The only parts that need to know about this
arrangement are those responsible for converting to/from wchar_t strings
(e.g., XMLCh to/from wchar_t).

If the application does not need to do anything special with (e.g.,
search for) characters outside the BMP (Basic Multilingual Plane), then
it can use wchar_t strings that contain either UTF-16 or UTF-32 without
caring which one it is. I am pretty sure this covers 99.9% of
applications.

We use this approach in our XML data binding tool when the user requests
wchar_t as the underlying character type.

Boris

-- 
Boris Kolpackov
Code Synthesis Tools CC
http://www.codesynthesis.com
Open-Source, Cross-Platform C++ XML Data Binding
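
To illustrate the second arrangement (UTF-16 where wchar_t is 2 bytes, UTF-32 elsewhere), here is a minimal sketch of the conversion layer. It is not the code from the data binding tool mentioned above. The name to_wide is hypothetical, char16_t and std::u16string stand in for XMLCh strings, and unpaired surrogates are assumed to be passed through unchanged rather than reported as errors.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: convert UTF-16 code units (char16_t stands in for
// XMLCh) to a wchar_t string. The result holds UTF-16 when wchar_t is
// 2 bytes long and UTF-32 when it is 4 bytes long.
std::wstring to_wide (const std::u16string& in)
{
  std::wstring out;
  out.reserve (in.size ());

  for (std::size_t i = 0; i < in.size (); ++i)
  {
    char32_t c = in[i];

    // Only a 4-byte wchar_t needs surrogate pairs combined into a single
    // code point; a 2-byte wchar_t keeps the code units as they are.
    if (sizeof (wchar_t) >= 4 &&
        c >= 0xD800 && c <= 0xDBFF && i + 1 < in.size ())
    {
      char32_t lo = in[i + 1];

      if (lo >= 0xDC00 && lo <= 0xDFFF)
      {
        c = 0x10000 + ((c - 0xD800) << 10) + (lo - 0xDC00);
        ++i; // The low surrogate has been consumed.
      }
    }

    // Assumption: an unpaired surrogate is copied through unchanged.
    out.push_back (static_cast<wchar_t> (c));
  }

  return out;
}
```

Code that only deals with BMP characters sees the same wchar_t values on both kinds of platform; the results differ only for characters outside the BMP, which occupy two elements with a 2-byte wchar_t and one element with a 4-byte wchar_t.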
