Re: DBCS and Unicode 3.1

Jungshik Shin Wed, 19 Feb 2003 21:13:53 -0800

On Tue, 18 Feb 2003, Markus Scherer wrote:

> Jungshik Shin wrote:
> > On Mon, 17 Feb 2003, Markus Scherer wrote:
> >>Other examples: There are EUC-JP (1/2/3 bytes per character) and
> >>EUC-CN (1/2/4 BpC) which are quite  "old" (much older than GB 18030).
> >
> >   Markus's fingers made a mistake here :-). It's EUC-TW (not EUC-CN)
> > that encodes CNS 11643 plane 2(1) thru plane 7 using SS2.


> MBCS. By the way, the encoding scheme for EUC-TW has space for 16 CNS
> planes, and some vendor implementations use higher planes than 7.

  Yup. BTW, EUC-KR also uses more than 2 bytes. 8-byte sequences
can be used to represent the 8,822 precomposed modern Korean syllables
that are not representable with 2 bytes in EUC-KR (ref.
KS X 1001:1998/KS C 5601-1987 annex 2). So, the full set
of 11,172 precomposed syllables in Unicode can be round-tripped
between Unicode and EUC-KR. This is used by the most popular
web mail service in Korea (well, they should switch to UTF-8
instead of lengthening the life of EUC-KR this way) and is implemented
in Mozilla/Netscape and in hanterm, a variant of xterm for Korean.
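
  For the curious, here is a rough Python sketch of how such an 8-byte
sequence could be built. It rests on my understanding of the annex, which
is an assumption here and not spelled out above: the sequence is the
Hangul filler (A4 D4) followed by three two-byte jamo from row 4 of
KS X 1001 (initial, vowel, final), with the filler standing in for a
missing final. The names eight_byte_sequence and row4_bytes are made up
for illustration and are not taken from Mozilla or hanterm.

```python
HANGUL_BASE = 0xAC00      # first precomposed syllable in Unicode
SYLLABLE_COUNT = 11172    # 19 initials * 21 vowels * 28 finals
FILLER = 0x3164           # HANGUL FILLER; assumed to map to A4 D4 in row 4

# Compatibility jamo code points, indexed the same way as the Unicode
# syllable arithmetic (initial, vowel, final).
CHO = [0x3131, 0x3132, 0x3134, 0x3137, 0x3138, 0x3139, 0x3141, 0x3142,
       0x3143, 0x3145, 0x3146, 0x3147, 0x3148, 0x3149, 0x314A, 0x314B,
       0x314C, 0x314D, 0x314E]
JUNG = list(range(0x314F, 0x3164))
# Finals: the filler for "no final", then every consonant letter except
# the three doubled consonants that never occur as a final.
JONG = [FILLER] + [cp for cp in range(0x3131, 0x314F)
                   if cp not in (0x3138, 0x3143, 0x3149)]


def row4_bytes(cp):
    """Map a compatibility jamo (U+3131..U+3164) to its two EUC-KR bytes.

    Assumes row 4 of KS X 1001 lists these characters in Unicode order,
    so U+3131 + k becomes A4, A1 + k.
    """
    return bytes([0xA4, 0xA1 + (cp - 0x3131)])


def eight_byte_sequence(syllable):
    """Return filler + initial + vowel + final for one precomposed syllable."""
    index = ord(syllable) - HANGUL_BASE
    if not 0 <= index < SYLLABLE_COUNT:
        raise ValueError("not a precomposed Hangul syllable")
    lead, rest = divmod(index, 21 * 28)
    vowel, final = divmod(rest, 28)
    parts = (FILLER, CHO[lead], JUNG[vowel], JONG[final])
    return b"".join(row4_bytes(cp) for cp in parts)
```

  For example, eight_byte_sequence("\ub620") gives the bytes
A4 D4 A4 A8 A4 C7 A4 B1. A real converter would first try the ordinary
2-byte code and fall back to this form only for the syllables that have
no 2-byte code; that table lookup is left out here.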

  Jungshik
