- Migrating from UCS-2 to UTF-16: Doable, and has been done for many applications and libraries.
- Difficult to handle UTF-16? Use ICU - it handles all of Unicode for collation, regular expressions, string casing, codepage conversion, and many other things.
- Support for supplementary characters only for Chinese? Japan has defined JIS X 0213, which has characters that map to supplementary characters as well as characters that map to sequences of multiple BMP characters. (ICU 2.8 will support codepage conversion involving multiple characters on either side.)
CJKV ideographs, used in several languages, are driving support for supplementary characters.
- Case mappings can be modified to return a 32-bit Unicode code point instead of a 16-bit BMP value? This works, but only for "simple" case mappings. Full Unicode case mappings are defined on strings, so single-character APIs cannot express them at all. Full string mappings map 1:n and are context- and language-sensitive.
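
To make the last point concrete, here is a rough sketch in Python (not ICU, and not anybody's actual implementation). The helper names utf16_units, simple_upper and full_upper are made up for illustration. simple_upper only approximates a "simple" mapping by falling back to the input whenever the full mapping expands to more than one character. Python's built-in str.upper() and str.lower() apply the language-independent full mappings, so language-sensitive cases such as Turkish dotted/dotless i are not shown here.

```python
def utf16_units(s: str) -> int:
    """Number of 16-bit code units needed to store s in UTF-16."""
    return len(s.encode("utf-16-be")) // 2


def simple_upper(cp: int) -> int:
    """Approximate a 'simple' (1:1) uppercase mapping on a 32-bit code point.

    Assumption for illustration: if the full mapping yields more than one
    character, return the input unchanged, as a single-character API must.
    """
    mapped = chr(cp).upper()
    return ord(mapped) if len(mapped) == 1 else cp


def full_upper(s: str) -> str:
    """Full (string-based) uppercase mapping; the result may be longer than s."""
    return s.upper()


if __name__ == "__main__":
    deseret_small = 0x10428  # DESERET SMALL LETTER LONG I, a supplementary character

    # One code point, but a surrogate pair (two code units) in UTF-16.
    print(utf16_units(chr(deseret_small)))

    # A code-point-based simple mapping handles supplementary characters fine.
    print(hex(simple_upper(deseret_small)))

    # A single-character API cannot turn U+00DF (sharp s) into "SS".
    print(hex(simple_upper(0xDF)))

    # The string-based mapping can: the result grows by one character.
    print(full_upper("stra\u00dfe"))

    # Context sensitivity: a capital sigma at the end of a word lowercases
    # to the final form U+03C2, not to U+03C3.
    print("\u039f\u0394\u039f\u03a3".lower())
```

The design point is the function signatures: simple_upper takes and returns one code point, so it can never produce "SS", while full_upper works on whole strings and can see both the expansion and the surrounding context.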
markus
http://oss.software.ibm.com/icu/
-- Opinions expressed here may not reflect my company's positions unless otherwise noted.