Yung-Fong Tang wrote:
> ... But you
> still need to know what U+4ff3a to define such mapping table, right?

Wrong. You just need to know the mapping between code points, whether they are assigned, used, or whatever.

> ... So, whatever the software the user currently have today, without an
> upgrade (either upgrade the code or mapping table) still won't know how to
> convert U+4ff3a to lower case or upper case, right ?

No, but that's irrelevant for character conversion. Once you update the Unicode character database in your product, your software will do it - if it knows how to deal with supplementary characters in general. (That part is a technicality which is, again, independent of whether there _are_ assigned characters.)

> But how can you generate such mapping table without knowing that character ?

By specifying which _code point_ in one encoding gets mapped to which _code point_ in the other encoding. Character conversion never looks at whether the code points that it maps are actual _characters_. When you map between the GBK or Shift-JIS user-defined areas and the Unicode PUA or similar, you also map code points that don't have characters. What's new?

> ...
> How many years does it take for people to realize that give a new mapping to
> their customer still need a complete life cycle of QA and distribution? And
> there will be a new version number attach to the software for that.

Is this about the existence of supplementary characters again? They have existed since 1996, and a vendor who followed the UTC/ISO negotiations could have seen them coming since 1993. Surely almost everyone had the time to roll out a new release of their software with support for them - in more than five years? (I know that few actually worked on this in time. But time there was.)

markus
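
To illustrate the point about mapping code points rather than characters, here is a minimal Python sketch. The table `LEGACY_TO_UNICODE`, its legacy byte values, and the helper names `convert` and `to_utf16_units` are all hypothetical and made up for this example; they are not taken from any real GBK or Shift-JIS mapping table. The conversion is a plain table lookup, and the UTF-16 step is plain arithmetic, so neither needs to know whether U+4FF3A is an assigned character.

```python
# Hypothetical mapping table: legacy code -> Unicode code point.
# The legacy values below are invented for illustration only.
LEGACY_TO_UNICODE = {
    0xF040: 0xE000,   # user-defined area -> BMP private use area
    0xF041: 0xE001,
    0x9001: 0x4FF3A,  # supplementary code point, currently unassigned
}


def convert(legacy_codes, table):
    """Map legacy codes to Unicode code points by table lookup only.

    No character property (assigned, letter, case, ...) is consulted.
    """
    return [table[code] for code in legacy_codes]


def to_utf16_units(cp):
    """Encode one code point as UTF-16 code units.

    Supplementary code points (above U+FFFF) become a surrogate pair;
    this works the same way for assigned and unassigned code points.
    """
    if cp < 0x10000:
        return [cp]
    v = cp - 0x10000
    return [0xD800 + (v >> 10), 0xDC00 + (v & 0x3FF)]


code_points = convert([0xF040, 0x9001], LEGACY_TO_UNICODE)
units = [unit for cp in code_points for unit in to_utf16_units(cp)]
print([f"{u:04X}" for u in units])
```

Case mapping is a different matter: that does need updated character data, which is why it is kept separate from conversion here.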