On Tue, 18 Feb 2003, Markus Scherer wrote:

> Jungshik Shin wrote:
> > On Mon, 17 Feb 2003, Markus Scherer wrote:
> >>Other examples: There are EUC-JP (1/2/3 bytes per character) and
> >>EUC-CN (1/2/4 BpC) which are quite "old" (much older than GB 18030).
> >
> > Markus's fingers made a mistake here :-). It's EUC-TW (not EUC-CN)
> > that encodes CNS 11643 plane 2(1) thru plane 7 using SS2.
> MBCS. By the way, the encoding scheme for EUC-TW has space for 16 CNS
> planes, and some vendor implementations use higher planes than 7.

Yup. BTW, EUC-KR also uses more than 2 bytes. 8-byte sequences can be
used to represent the 8,822 precomposed modern Korean syllables that
cannot be represented with 2 bytes in EUC-KR (ref. KS X 1001:1998/
KS C 5601-1987 annex 2). So the full set of 11,172 precomposed
syllables in Unicode can be round-tripped between Unicode and EUC-KR.
This is used by the most popular web mail service in Korea (well, they
should switch to UTF-8 instead of lengthening the life of EUC-KR this
way) and is implemented in Mozilla/Netscape and in hanterm, a variant
of xterm for Korean.

Jungshik
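
For illustration only, here is a rough Python sketch of how such an 8-byte sequence could be put together. It assumes the layout is the Hangul filler (0xA4 0xD4) followed by the initial, medial and final letters as 2-byte KS X 1001 row-4 jamo, with the filler standing in for a missing final; check the annex before relying on that. The figure 2,350 is the number of syllables KS X 1001 encodes directly in 2 bytes, and all function and constant names are made up for this example.

```python
# Sketch only: the sequence layout is an assumption (see the note above).
S_BASE = 0xAC00                            # first precomposed syllable in Unicode
V_COUNT, T_COUNT = 21, 28                  # medials, finals (incl. "no final")
S_COUNT = 19 * V_COUNT * T_COUNT           # 11,172 syllables in Unicode
KS_X_1001_SYLLABLES = 2350                 # syllables with a plain 2-byte code
NEEDING_8_BYTES = S_COUNT - KS_X_1001_SYLLABLES   # 8,822

# Compatibility jamo (KS X 1001 row 4) in Unicode's initial/final order.
INITIALS = "ㄱㄲㄴㄷㄸㄹㅁㅂㅃㅅㅆㅇㅈㅉㅊㅋㅌㅍㅎ"
FINALS = "ㄱㄲㄳㄴㄵㄶㄷㄹㄺㄻㄼㄽㄾㄿㅀㅁㅂㅄㅅㅆㅇㅈㅊㅋㅌㅍㅎ"
FIRST_VOWEL = 0x314F                       # the 21 medials are contiguous from here
FILLER = 0x3164                            # HANGUL FILLER, 0xA4 0xD4 in EUC-KR


def row4_bytes(cp):
    """Map U+3131..U+318E linearly onto EUC-KR 0xA4A1..0xA4FE."""
    if not 0x3131 <= cp <= 0x318E:
        raise ValueError("not a KS X 1001 row-4 jamo")
    return bytes([0xA4, 0xA1 + (cp - 0x3131)])


def eight_byte_sequence(syllable):
    """Build filler + initial + medial + final for one precomposed syllable."""
    index = ord(syllable) - S_BASE
    if not 0 <= index < S_COUNT:
        raise ValueError("not a precomposed Hangul syllable")
    lead, rest = divmod(index, V_COUNT * T_COUNT)
    vowel, tail = divmod(rest, T_COUNT)
    final_cp = ord(FINALS[tail - 1]) if tail else FILLER
    return (row4_bytes(FILLER)
            + row4_bytes(ord(INITIALS[lead]))
            + row4_bytes(FIRST_VOWEL + vowel)
            + row4_bytes(final_cp))


if __name__ == "__main__":
    print(NEEDING_8_BYTES)
    print(eight_byte_sequence("똠").hex(" "))
```

U+B620 (똠) is one of the syllables with no 2-byte code in KS X 1001, so it makes a handy test input. The sketch does not check whether a syllable already has a 2-byte code; a real converter would try the 2-byte table first and fall back to the long form only when that lookup fails.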