-----BEGIN PGP SIGNED MESSAGE-----

Keld Jørn Simonsen wrote:
> On Thu, Feb 14, 2002 at 03:57:34PM +0000, Juliusz Chroboczek wrote:
> > MK> What we are trying to establish is the exact meaning that UNICODE
> > MK> ought to have - that is, if it can have one at all.
> >
> > In the Unix-like world, the term ``UTF-8'' has been used quite
> > consistently, and most documentation avoids using Unicode for a disk
> > format (using it for the consortium, er., the Consortium, the
> > character repertoire and, when useful, for the coded character set).
> >
> > The Unix-like public is used to thinking of UTF-8 as the format in
> > which Unicode text is saved on disk, and ``UTF-8 (Unicode)'' or
> > perhaps ``Unicode (UTF-8)'' should be the preferred user-interface
> > item.
>
> I would rather recommend that you write ISO 10646 UTF-8 as the
> ISO standard is a standard in many countries while Unicode is not.

But ISO 10646 is not the same as Unicode:

 - no character properties
 - no definition of normalisation or Hangul standard syllables
 - no BiDi algorithm
 - no collation algorithm
 - no algorithms for grapheme cluster, word, or line breaking
 - different treatment of requirement levels and subsetting
 - characters added at different times
 - differences in UTF validity conditions (hopefully to be fixed).

In most cases, Unicode is meant, not ISO 10646. That's especially true
when the text is NFC-normalised before saving, which is required in some
cases (HTML, XML), and probably a good idea in many other cases.

- --
David Hopwood <[EMAIL PROTECTED]>

Home page & PGP public key: http://www.users.zetnet.co.uk/hopwood/
RSA 2048-bit; fingerprint 71 8E A6 23 0E D3 4C E5 0F 69 8C D4 FA 66 15 01
Nothing in this message is intended to be legally binding. If I revoke a
public key but refuse to specify why, it is because the private key has been
seized under the Regulation of Investigatory Powers Act; see www.fipr.org/rip

-----BEGIN PGP SIGNATURE-----
Version: 2.6.3i
Charset: noconv

iQEVAwUBPGt/4jkCAxeYt5gVAQFVKgf/becUlwinVnL+MZsVHSUwMzkrvRMOIotK
ZaIqFR1Zc/vly9TM04RcLLYX9zFWRHdtC8W9dp/dMuljhxd+KSmd81HV0NAvpYLB
QnlO4r+omWR9bPWwmArRqJbsFJrelZJhvD4LgLcDcsJgs87UvGbyX1RCupnAwoFO
YQqMUuIta+Kfw2hqdqwY7Cifo8EKOOywzpmZZCoP4HZmYG/tPnXEz09W+abQNCVz
qiuBpBqwUj7/I4C4+nm9A/Wv+uKPVRy/6zMZ37COZqs4jqqcTVXkvqx0GgMAgiTa
xoTiukW7B+bT0hFVW/PQcoHw1v5vtuDQoo0E9vYZaSiUzErkaOkHKQ==
=RsRJ
-----END PGP SIGNATURE-----
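
As an illustration of the point about NFC-normalising text before saving it, here is a minimal sketch in Python. It assumes the standard library's `unicodedata` module; the helper name `to_nfc_utf8` is hypothetical and does not come from the thread. It normalises a string to NFC (a form defined by Unicode, not by ISO 10646) and then encodes it as UTF-8.

```python
import unicodedata


def to_nfc_utf8(text: str) -> bytes:
    """Normalise text to NFC, then encode it as UTF-8 for saving.

    Hypothetical helper for illustration only.
    """
    # Normalisation forms are specified by the Unicode Standard;
    # the encoding step is ordinary UTF-8.
    normalised = unicodedata.normalize("NFC", text)
    return normalised.encode("utf-8")


if __name__ == "__main__":
    # "e" followed by COMBINING ACUTE ACCENT composes to U+00E9.
    assert to_nfc_utf8("e\u0301") == "\u00e9".encode("utf-8")
    # Conjoining Hangul jamo compose to the precomposed syllable U+D55C.
    assert to_nfc_utf8("\u1112\u1161\u11ab") == "\ud55c".encode("utf-8")
```

Both inputs look identical to their composed forms on screen but differ byte-for-byte until normalised, which is why normalising before saving can matter for comparison and searching.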