Philippe, instead of trying to sound authoritative by making up a whole-cloth
definition -- one that is completely and utterly wrong -- and thereby confuse
and mislead a beginner, you should either be silent or simply point the person
to the Unicode glossary:

http://www.unicode.org/glossary/#compatibility_character

Mark
__________________________________
http://www.macchiato.com

----- Original Message ----- 
From: "Philippe Verdy" <[EMAIL PROTECTED]>
To: "Alexandre Arcouteil" <[EMAIL PROTECTED]>
Cc: <[EMAIL PROTECTED]>
Sent: Fri, 2003 Nov 14 03:28
Subject: Re: compatibility characters (in XML context)


> ----- Original Message ----- 
> From: "Alexandre Arcouteil" <[EMAIL PROTECTED]>
> To: <[EMAIL PROTECTED]>
> Sent: Friday, November 14, 2003 10:41 AM
> Subject: compatibility characters (in XML context)
>
>
> > This is a beginner question:
> >
> > In the XML 1.1 Proposed Recommendation 05 November 2003
> > (http://www.w3.org/TR/xml11), it is said that "Document authors are
> > encouraged to avoid "compatibility characters", as defined in section
> > 6.8 of [Unicode]" so relating to Unicode 2.0.
> >
> > I don't see any online documentation about explicit definition of
> > "compatibility characters" according to 2.0.
>
> Compatibility characters can be defined as the characters whose canonical
> decomposition mapping is either:
>
>     (1) a singleton (for example the Angström symbol, canonically mapped to A
> with ring above, or the list of unified Han ideographs, only included for
> compatibility with legacy charsets or because of assignment errors in
> Unicode 1.0) and that are implicitly restricted from being recomposed in all
> NF* forms, or
>
>     (2) a two-code _canonical_ decomposition mapping, but are excluded from
> canonical composition (for example the Hebrew letter shin with shin dot).
>
> These characters will never be part of any string in a normalized form (NFC,
> NFD, NFKC, NFKD).
>
> > At least I'd like to know if characters like "Ã" "Ã" or "œ" are
> > concerned.
>
> No: "Ã" and "Ã" have canonical decompositions, but are not excluded from
> recomposition.
> And the "oe ligature" has only a compatibility decomposition, and so is not
> a compatibility character.
>
> > Is there a complete chart of "compatibility characters" somewhere?
>
>
> Look at the Unicode data file which lists composition exclusions...
>
>
>
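
For anyone who wants to check the decomposition claims in the quoted text
rather than take them on trust, here is a minimal sketch using Python's
standard unicodedata module. The helper name describe() and the choice of
sample characters are mine, not from the thread, and the results reflect
whatever Unicode version your Python build ships with, not Unicode 2.0.
It only reports what the character database says about each character; it
does not settle which characters the glossary counts as "compatibility
characters".

```python
import unicodedata


def describe(ch):
    """Return (decomposition field, NFC form, NFKC form) for one character.

    The decomposition field is the raw string from the Unicode Character
    Database: empty if the character has no decomposition, prefixed with a
    tag such as <compat> if the decomposition is a compatibility one.
    """
    return (
        unicodedata.decomposition(ch),
        unicodedata.normalize("NFC", ch),
        unicodedata.normalize("NFKC", ch),
    )


# Sample characters chosen for illustration (an assumption, not from the thread).
samples = [
    "\u212B",  # ANGSTROM SIGN: singleton canonical decomposition
    "\u00C9",  # LATIN CAPITAL LETTER E WITH ACUTE: ordinary precomposed letter
    "\u0153",  # LATIN SMALL LIGATURE OE
    "\uFB01",  # LATIN SMALL LIGATURE FI: compatibility decomposition
    "\uFB2A",  # HEBREW LETTER SHIN WITH SHIN DOT: composition exclusion
]

for ch in samples:
    decomp, nfc, nfkc = describe(ch)
    print(
        f"U+{ord(ch):04X} {unicodedata.name(ch)}\n"
        f"    decomposition: {decomp or '(none)'}\n"
        f"    NFC:  {' '.join(f'U+{ord(c):04X}' for c in nfc)}\n"
        f"    NFKC: {' '.join(f'U+{ord(c):04X}' for c in nfkc)}"
    )
```

A decomposition field that starts with a tag in angle brackets marks a
compatibility mapping; one without a tag is canonical. Comparing a
character with its NFC form shows whether normalization keeps it or
replaces it.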

