John Crenshaw <johncrens...@priacta.com> wrote:
> 2. MultiByteToWideChar supports a "MB_COMPOSITE" flag, which appears to
> give UTF-16 output.

MB_COMPOSITE has nothing to do with surrogate pairs, and everything to do with whether, say, the Latin-1 character Á (A with acute) is converted to a single character U+00C1, or to two characters U+0041 U+0301 (capital A + combining acute accent). The latter is "composite", the former is "precomposed".

Do you believe _that's_ what differentiates UTF-16 and UCS-2? If so, you are mistaken. The difference between the two is in how Unicode characters U+10000 and up are represented: as surrogate pairs in UTF-16, and not at all in UCS-2. U+0041 U+0301 is a valid UCS-2 sequence and a valid UTF-16 sequence.

> Microsoft never seems to clearly identify whether the wide APIs should
> be given UTF-16 or UCS-2.

You mean which Unicode normalization form they expect (see http://en.wikipedia.org/wiki/Unicode_equivalence), which, again, has absolutely nothing to do with UTF-16 vs UCS-2. The answer is that the Win32 API can handle any normalization form, as well as denormalized strings. The FoldString API is provided to normalize strings to various normalization forms if desired.

Igor Tandetnik
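
To illustrate the two separate distinctions made above, here is a minimal sketch in Python. It does not call the Win32 API: the standard unicodedata module stands in for what MultiByteToWideChar or FoldString would do on Windows, and the helper names utf16_units and has_surrogates are made up for this example.

```python
import unicodedata

# Two spellings of the same text, A with acute.
precomposed = "\u00C1"   # single code point U+00C1
composite = "A\u0301"    # U+0041 followed by combining acute U+0301


def utf16_units(s):
    """Return the UTF-16 code units of s as a list of integers."""
    data = s.encode("utf-16-le")
    return [int.from_bytes(data[i:i + 2], "little")
            for i in range(0, len(data), 2)]


def has_surrogates(s):
    """True if encoding s as UTF-16 requires any surrogate code unit."""
    return any(0xD800 <= unit <= 0xDFFF for unit in utf16_units(s))


# Composite vs precomposed: both forms stay below U+10000, so neither
# needs surrogates, and both are valid UCS-2 as well as valid UTF-16.
print([hex(u) for u in utf16_units(precomposed)])
print([hex(u) for u in utf16_units(composite)])
print(has_surrogates(precomposed), has_surrogates(composite))

# Normalization converts between the two forms (NFC composes, NFD decomposes).
print(unicodedata.normalize("NFC", composite) == precomposed)
print(unicodedata.normalize("NFD", precomposed) == composite)

# UTF-16 vs UCS-2: only a character at U+10000 or above needs a surrogate pair.
print([hex(u) for u in utf16_units("\U00010000")])
```

The first half shows the normalization question (one code unit or two, no surrogates either way); the last line shows the only case where UTF-16 and UCS-2 actually differ.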