Here's some feedback.
> Republic of Korea (South Korea; simply "Korea" as follows) has set
> KS C 5601 in 1989. They are both based upon JIS C 6226, could be one
> of the

KS C 5601 was first issued in 1987 and revised in 1989 and 1992. It was
then renamed and reissued as KS X 1001:1998 in 1998.

> Though there are escape-based encodings for these two (ISO-2022-CN
> and ISO-2022-KR, respectively), they are hardly used in favor of EUC.

ISO-2022-KR used to be widely used for Korean email exchange, as
ISO-2022-JP still is. ISO-2022-KR is hardly used now, but it was widely
used until the late 1990s (see IETF RFC 1557).

> When you say gb2312 and ksc5601, EUC-based encoding is assumed.

Please don't help spread this misuse. It might be all right for the
(ignorant) public to say KS C 5601 in place of EUC-KR, but Perl
programmers should learn the difference between KS C 5601/KS X 1001
(a coded character set) and an encoding/MIME charset/character set
encoding scheme/character coding. As I wrote before, GB 2312 has been
so widely (mis)used that there's no way to replace it with EUC-CN. The
Korean situation is much better, although not as good as the Japanese
case.

BTW, I don't find any reference to the Microsoft code pages (CP949 for
Korean, CP950, CP936, and CP932), JOHAB (Korean), or Big5-HKSCS. Is
that because they're not yet supported (well, Shift-JIS and Big5 are
supported)?

Another BTW: don't you think your description of Unicode and Han
Unification is a bit too negative and biased? I know you feel strongly
about the subject, but I'm not sure CJK-Guide is the best place to
express your personal opinion on it. If you'd rather not tone it down
or change it, you could add a disclaimer like 'some people have
reservations about Han Unification and Unicode because ......' or 'the
following is my personal opinion shared by some people but not
universally accepted'.

> As a result, something funny has happed. For example, U+673A means
> "a machine" in Simplified Chinese but "a desk" in Japanese.
> "a machine" in Japanese. U+6A5F.

Do you really believe this is a strong case against Han Unification? I
don't see any problem with this. There are a number of Chinese
characters with multiple meanings even without Han Unification. Do
those 'meanings' have to be assigned separate code points?

> So you can't tell what it means just by looking at the code.

Why does a coded character set have to care about what computational
linguists have to do? You can't tell the meaning of any English word
with multiple meanings just by looking at its computer representation,
without context and grammatical/linguistic/lexical analysis, can you?
How do you know what 'fly' means without context?

Jungshik Shin