On 18/08/2003 09:06, Jim Allan wrote:

Jill Ramonsky posted:

I would really like it if these, and
every single other character which is "only there for reasons of round trip
compatibility" with something else, were explicitly marked in the
machine-readable charts with something meaning "Don't introduce this
character, at all, ever. Don't try to interpret it. Just preserve it, in
case it ever gets turned back to its original character set".


That would probably be too strong.

If characters are available then some people will use them. :-(

See section 2.3 at http://www.unicode.org/versions/Unicode4.0.0/ch02.pdf

Unicode 3.0 contained under section D21 on compatibility characters:

<< Their use is discouraged other than for legacy data. >>

I don't know whether this statement was intentionally removed or was accidentally dropped in the changes in 4.0 which distinguish "compatibility character" from "compatibility composite character".

In any case people can't be prevented from doing things that are officially discouraged, especially as for some particular use it might be wrong to discourage them. So if you are handling Roman numerals in an application and wish your handling to be complete then unfortunately you do have to take the compatibility Roman numerals into account.
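
As an illustration of taking the compatibility Roman numerals into account, here is a minimal sketch in Python. It assumes that folding them to ordinary Latin letters with NFKC normalization is acceptable for the application, which will not be true everywhere. The function name fold_compat is hypothetical, and a few Roman numeral characters (for example U+2180) have no decomposition and are left unchanged.

```python
import unicodedata

def fold_compat(text: str) -> str:
    """Replace compatibility decomposable characters with their NFKC form.

    Assumption: the application is happy to lose the distinction between
    U+2167 ROMAN NUMERAL EIGHT and the letter sequence "VIII".
    """
    return unicodedata.normalize("NFKC", text)

sample = "Chapter \u2167"      # ends with U+2167 ROMAN NUMERAL EIGHT
folded = fold_compat(sample)   # the numeral becomes the letters V, I, I, I
```

After folding, the usual letter-based Roman numeral handling sees the same text whichever form was typed.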

Yes, but people can be clearly discouraged from using them, and that is not currently happening. It seems that currently if you come across a character by browsing through the charts and want to discover if use of it is officially discouraged you have to wade through huge databases and hundreds of pages of text to find out if a particular set of properties implies that use is discouraged. Well, even that won't tell me definitively, for I read, "The compatibility decomposable characters are precisely defined in the Unicode Character Database, whereas the compatibility characters in the more inclusive sense are not." (from section 2.3) - and it is the latter whose use is discouraged. But is it in fact safe to assume that the list of such characters includes, but is not limited to, those which have defined compatibility mappings?


It would be much simpler if each such character were clearly labelled in the code charts etc. DO NOT USE!, and with its glyph presented on a grey background or in some other way to indicate its special status.
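
In the meantime, the part that is precisely defined can at least be queried mechanically. The following is a rough sketch, not a definitive test: it uses Python's unicodedata module, which reflects whatever version of the Unicode Character Database ships with the interpreter, and it only detects compatibility decomposable characters. Compatibility characters in the more inclusive sense of section 2.3 will not be found this way. The function name compatibility_tag is hypothetical.

```python
import unicodedata

def compatibility_tag(ch: str):
    """Return the compatibility formatting tag of ch (e.g. "compat"), or None.

    In the Unicode Character Database a compatibility decomposition is
    written with a leading tag in angle brackets; a canonical
    decomposition has no tag.
    """
    decomp = unicodedata.decomposition(ch)
    if decomp.startswith("<"):
        # "<compat> 0049" -> "compat"
        return decomp.split(">", 1)[0][1:]
    return None

# A few probes: a Roman numeral, a ligature, a precomposed letter, plain ASCII.
tags = {name: compatibility_tag(ch) for name, ch in [
    ("ROMAN NUMERAL ONE", "\u2160"),
    ("LATIN SMALL LIGATURE FI", "\ufb01"),
    ("LATIN SMALL LETTER E WITH ACUTE", "\u00e9"),
    ("LATIN CAPITAL LETTER A", "A"),
]}
```

A None result means only that the character has no compatibility mapping; it says nothing about whether its use is discouraged for other reasons.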

--
Peter Kirk
[EMAIL PROTECTED] (personal)
[EMAIL PROTECTED] (work)
http://www.qaya.org/




