Doug Ewell scripsit: > "When faced with [an] ill-formed code unit sequence while transforming > or interpreting text, a conformant process must treat the first code > unit... as an illegally terminated code unit sequence -- for example, by > signaling an error, filtering the code unit out, or representing the > code unit with a marker such as U+FFFD REPLACEMENT CHARACTER."
Plan 9, the original all-UTF-8 environment (it was translated in a single day from Latin-1 to UTF-8), represents ill-formed code unit sequences with the otherwise useless U+0080, on the grounds that an ill-formed code is semantically different from an untranslatable character, which is the purpose of U+FFFD. -- LEAR: Dost thou call me fool, boy? John Cowan FOOL: All thy other titles http://www.ccil.org/~cowan thou hast given away: [EMAIL PROTECTED] That thou wast born with. http://www.reutershealth.com