Unicode FAQ addendum

John Cowan Wed, 19 Jul 2000 10:13:04 -0700
The new Unicode FAQ (like the old) supplies the panting world with
John's Own Version of Unicode Conformance:

1) Unicode code units are 16 bits long; deal with it.
2) Byte order is only an issue in files.
3) If you don't have a clue, assume big-endian.
4) Loose surrogates don't mean jack.
5) Neither do U+FFFE and U+FFFF.
6) Leave the unassigned codepoints alone.
7) It's OK to be ignorant about a character, but not plain wrong.
8) Subsets are strictly up to you.
9) Canonical equivalence matters.
10) Don't garble what you don't understand.
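
As a rough illustration of rules 3 and 9, here is a minimal Python sketch. The function names decode_utf16 and canonically_equal are made up for this example and do not come from the FAQ or the standard. The sketch assumes the input is UTF-16 data read from a file, and it uses NFD comparison as one common way to test canonical equivalence.

```python
import codecs
import unicodedata


def decode_utf16(data: bytes) -> str:
    """Decode UTF-16 bytes, honoring a BOM if one is present.

    Hypothetical helper: with no BOM we have no clue about byte order,
    so we assume big-endian (rule 3). Python's plain "utf-16" codec
    would assume the machine's native order instead.
    """
    if data.startswith(codecs.BOM_UTF16_BE):
        return data[len(codecs.BOM_UTF16_BE):].decode("utf-16-be")
    if data.startswith(codecs.BOM_UTF16_LE):
        return data[len(codecs.BOM_UTF16_LE):].decode("utf-16-le")
    return data.decode("utf-16-be")


def canonically_equal(a: str, b: str) -> bool:
    """Compare two strings up to canonical equivalence (rule 9).

    Hypothetical helper: both sides are normalized to NFD first, so a
    precomposed character and its decomposed spelling compare equal.
    """
    return unicodedata.normalize("NFD", a) == unicodedata.normalize("NFD", b)


if __name__ == "__main__":
    print(decode_utf16(b"\x00A"))                   # no BOM, read as big-endian
    print(canonically_equal("\u00e9", "e\u0301"))   # e-acute, two spellings
```

Stripping the BOM after using it is a choice made for this sketch; a process that must round-trip the file exactly would need to remember which byte order it found.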

But for 3.0 I will add:

11) Process UTF-* by the book.
12) Treat bogus encodings as junk.
13) Right-to-left scripts have to go by bidi rules.
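
Rule 12 can be sketched the same way. The helper name decode_utf8_or_reject is hypothetical, and the sketch leans on Python's strict UTF-8 decoder, which follows the current definition of UTF-8 and may be stricter than the 3.0 text required. The point is only that a bogus byte sequence is rejected outright rather than guessed at.

```python
from typing import Optional


def decode_utf8_or_reject(data: bytes) -> Optional[str]:
    """Return the decoded text, or None if the bytes are not valid UTF-8.

    Hypothetical helper: strict decoding refuses overlong forms and
    encoded surrogates instead of quietly interpreting them.
    """
    try:
        return data.decode("utf-8", errors="strict")
    except UnicodeDecodeError:
        return None


if __name__ == "__main__":
    samples = {
        "valid text": b"caf\xc3\xa9",
        "overlong '/'": b"\xc0\xaf",
        "encoded surrogate": b"\xed\xa0\x80",
    }
    for label, raw in samples.items():
        result = decode_utf8_or_reject(raw)
        print(label, "->", "junk" if result is None else repr(result))
```

Returning None is just one way to signal junk; raising an error or substituting U+FFFD are other choices, depending on what the caller can tolerate.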

These conformance sentences match up one-for-one with the conformance
clauses in Chapter 3 (TUS3.0, pp. 37-39).

-- 

Schlingt dreifach einen Kreis um dies! || John Cowan <[EMAIL PROTECTED]>
Schliesst euer Aug vor heiliger Schau,  || http://www.reutershealth.com
Denn er genoss vom Honig-Tau,           || http://www.ccil.org/~cowan
Und trank die Milch vom Paradies.            -- Coleridge (tr. Politzer)