Note to subscribers of the Unicode and Linux-utf8 mailing lists:

The following message is about two new characters added to the South Korean (ROK) national coded character set standard KS X 1001 in December 1998. Although this change is not directly related to the two lists I'm copying this to, I'm doing so in the hope that this 'news' will reach as many engineers in charge of supporting Korean at as many companies as possible. According to Bruno Haible (the maintainer of libiconv), this change doesn't seem to have been reflected in a number of platforms and products. It'd be nice if you could direct your questions and replies to me rather than to the two lists.

Thank you,

On Sat, 20 Apr 2002, Bruno Haible wrote:

Hi Bruno,

Thank you for your reply.

> Jungshik Shin writes:
>
> > I've just found that xc/lib/X11/lcUniConv/ksc5601.h had not been
> > updated to reflect the change made in the standard at the end of 1998.
> > Could you update it in both XF86 and libiconv? I thought you had already
> > done that in libiconv because two characters had been added to glibc
> > 2.2.x. They are:
> >
> > U+20AC at row 2, column 70 (0x2266)
> > U+00AE at row 2, column 71 (0x2267)
> >
> > KSX1001.TXT.gz and JOHAB.TXT.gz at http://jshin.net/faq/ have been
> > updated to reflect this change.

> Thanks for telling me about problems in ksc5601.h, contained in both
> libX11 and GNU libiconv.
>
> Can you give a little more evidence/details about the standard change that
> you mention? The charmaps of EUC-KR on AIX, Solaris, Java don't
> contain the change you mention
> (see http://oss.software.ibm.com/cvs/icu/charset/data/ucm/),

Oops. Then I have to report the change to Sun (Solaris and JDK) and IBM (ICU and AIX). As a shortcut, I'm copying this message to the Unicode mailing list, hoping that engineers from Sun, IBM, Oracle, Sybase, Apple and so forth will take note. If necessary, I'll contact them separately. (A week or so ago, I wrote to a Sun engineer, but I haven't heard back yet.)

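For anyone patching a table by hand, the arithmetic behind the two positions quoted above can be sketched as follows. This is only an illustration, not code from libiconv or glibc; the function name ksx1001_codes is my own, and it assumes the usual 94x94 layout in which the 7-bit code is (row + 0x20, column + 0x20) and the EUC-KR form sets the high bit of both bytes.

```python
def ksx1001_codes(row, col):
    """Return (7-bit code, EUC-KR code) for a KS X 1001 row/column pair.

    Hypothetical helper for illustration; rows and columns run from 1 to 94.
    """
    if not (1 <= row <= 94 and 1 <= col <= 94):
        raise ValueError("row and column must be in 1..94")
    # 7-bit (GL) form: add 0x20 to the row and to the column.
    gl = ((row + 0x20) << 8) | (col + 0x20)
    # EUC-KR (GR) form: the same two bytes with the high bit set.
    euc = gl | 0x8080
    return gl, euc


# The two additions reported in this message.
for row, col, ucs in [(2, 70, 0x20AC), (2, 71, 0x00AE)]:
    gl, euc = ksx1001_codes(row, col)
    print(f"0x{gl:04X}  0x{euc:04X}  U+{ucs:04X}")
```

Under those assumptions, row 2, column 70 comes out as 0x2266 (0xA2E6 in EUC-KR) and row 2, column 71 as 0x2267 (0xA2E7 in EUC-KR).
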
Anyway, Perl::Encode already has them in EUC-KR, JOHAB, CP949 and ISO-2022-KR. I filed a Mozilla bug (134749) and made a patch for it, and Ken Lunde at Adobe was notified of the change so that the CMap files for Adobe Korean fonts will have them soon. In addition to Solaris, JDK, ICU and AIX, various DBs (commercial or not), MacOS (apparently, Apple hasn't updated its Korean mapping), Python, and Tcl have to update their mapping tables as well.

> and I can find no trace of such a change on various websites. All info
> about these 2 character additions appears to originate from you.
> Unfortunately I have learned that in this table patchwork business I
> have to rely on several independent sources.

The story goes like this. Sometime last fall, PARK Won-kyu <[EMAIL PROTECTED]> noticed that the EUC-KR charmap in Glibc 2.2.x has two additional characters not found in my copy of KSX1001.TXT (at http://jshin.net/faq/KSX1001.TXT.gz). He asked me about them on the [EMAIL PROTECTED] mailing list. I forwarded his message to Prof. GIM Geongseog (KIM Keyongseok) at Pusan Nat'l Univ. (<[EMAIL PROTECTED]>), who represents South Korea (ROK) in ISO/IEC JTC1 SC2/WG2 and SC22/WG20. (On the JTC1 SC2/WG2 web page you can find several of his responses, on behalf of the ROK, to North Korean requests to reshuffle the code points of Korean Hangul syllables in ISO/IEC 10646 and Unicode according to the DPRK dictionary sorting order and to add conjoining Jamos to the U+1100 block.) He replied to me that two characters were indeed added to KS X 1001 in December 1998. He also mentioned that one more character (the Korean zip code sign) would be added sometime this year. I can assure you that he is definitely an authoritative source on the KS X series of standards.

Another piece of 'evidence'(?) is the Windows-949 mapping table maintained by Microsoft (available at ftp://ftp.unicode.org/Public/MAPPINGS/VENDORS/MICSFT/WINDOWS/CP949.TXT).
It seems to have undergone a few revisions, and the newest version includes the two characters at the corresponding EUC-KR positions. As you know well, CP949 (Windows-949, Unified Hangul Code) is upward compatible with EUC-KR and has 8,822 additional Hangul syllables outside the EUC range (1st and 2nd byte 0xA1-0xFE). Therefore, subtracting the 8,822 additional Hangul syllables from the CP949<->Unicode mapping, we should get the EUC-KR<->Unicode mapping. (I'm aware that the result is different from the EUC-KR portion of the MacKorean<->Unicode mapping, which uses 'half-width' characters whenever possible.) Sure, Microsoft could have added other characters to some unused slots in the EUC-KR range, as Apple did in MacOS Korean. However, if we trust Prof. Gim (which I'm sure we can), that's not the case.

To be 100% sure, I'd love to have a PDF version of KS X 1001:1998 with these two characters added. Unfortunately, <http://standard.ksa.or.kr> doesn't sell KS X 1001 in PDF. (I have a hard copy of KS C 5601-1992 / KS X 1001:1997.) I will probably have to ask my friend in Seoul to buy a paper version of KS X 1001 and send it to me.

Now it becomes interesting. Who added the two characters to the EUC-KR charmap in Glibc 2.2.x? I thought you had done that, and I was 'ashamed' of not having noticed a change you had already found out about. ;-) Apparently, you didn't. Then Ulrich must have done it, right? I can't think of anyone else....

Anyway, I hope I have presented enough evidence to convince you.

Regards,

Jungshik Shin
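
P.S. The "subtract the extension from CP949" idea above can be sketched roughly as follows. This is a minimal sketch, not a vetted tool: the function name euc_kr_subset is my own, and it assumes the usual layout of the mapping files on ftp.unicode.org (a hex code, a hex Unicode value, then a '#' comment). It keeps single-byte codes below 0x80 and two-byte codes whose bytes both fall in 0xA1-0xFE. As noted above, the result still includes anything a vendor may have placed in otherwise unused slots inside that range, so it should be checked against the standard itself.

```python
def euc_kr_subset(lines):
    """Yield (code, ucs) pairs from CP949.TXT-style lines in the EUC-KR range.

    Hypothetical helper; assumes "0xCODE<whitespace>0xUCS<whitespace>#comment".
    """
    for line in lines:
        fields = line.split("#", 1)[0].split()
        if len(fields) < 2:
            continue  # blank line, comment-only line, or undefined entry
        code, ucs = int(fields[0], 16), int(fields[1], 16)
        lead, trail = code >> 8, code & 0xFF
        # Keep ASCII and two-byte codes with both bytes in 0xA1..0xFE.
        if code < 0x80 or (0xA1 <= lead <= 0xFE and 0xA1 <= trail <= 0xFE):
            yield code, ucs


# A few sample lines in the assumed format (not a copy of the real file).
sample = [
    "# sample header comment",
    "0x8141\t0xAC02\t#HANGUL SYLLABLE (CP949 extension, outside the EUC range)",
    "0xA2E6\t0x20AC\t#EURO SIGN",
    "0xB0A1\t0xAC00\t#HANGUL SYLLABLE GA",
]
for code, ucs in euc_kr_subset(sample):
    print(f"0x{code:04X}\t0x{ucs:04X}")
```

Run on the real CP949.TXT, the output could then be diffed against KSX1001.TXT to see whether the two positions in question are the only differences.
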