On 11/23/2011 2:38 AM, Jeremie Hornus wrote:

On 23 Nov 2011, at 00:21, Asmus Freytag wrote:

On 11/22/2011 1:22 PM, Jeremie Hornus wrote:

Wouldn't "Unicode Character Glyph Description" be more accurate than "Unicode Character Name"?
And just "Unicode Character Description" for those pointing to no glyph.

These are "names" in the sense of an ID. That they are, in many cases, created by deriving them from a description of the character's appearance does not alter that fact.


I was thinking of the ID as being the code point value itself, and the "name" as a human-readable description of it.

J.


Jeremie,

what matters is how the standard happens to define these things. And the standard is clear about the name being an *identifier*, not a "description".

Your question suggests that you are wondering why one would need a human-readable identifier. Well, for one, the use of raw hex codes (even if spelled out as sequences of four to six alphanumeric characters) is very error-prone. The hex digit "8" is very often confused with the hex digits "B" (visual confusion) or "A" (they sound alike in English). Second, if you just see a code such as 2329, only a small number of people (even on this list) have any idea what character is hiding behind it without looking it up. So they wouldn't know whether to accept that character, or whether it represents a mistake and should have been 27E9 instead. Or was that 27E8?

With the names in front of you, it's an easy matter to decide which substitution would be the correct one, so human readable IDs are helpful.
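
To make the point concrete, here is a minimal Python sketch that prints the names behind the three codes mentioned above. It assumes the standard-library unicodedata module; the helper name describe is purely illustrative.

```python
import unicodedata


def describe(code_point_hex: str) -> str:
    """Return 'U+XXXX NAME' for a code point given as a hex string."""
    ch = chr(int(code_point_hex, 16))
    # Fall back to a placeholder for code points that have no name.
    name = unicodedata.name(ch, "<no name>")
    return f"U+{code_point_hex.upper()} {name}"


for cp in ("2329", "27E8", "27E9"):
    print(describe(cp))
```

Seen side by side, the names show that 2329 and 27E8 are both left brackets while 27E9 is a right bracket, which is hard to tell from the hex codes alone.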

For the purpose of an ID, names must be unique. Where possible, they should help identify a character(*). There's no requirement that they be based on the "best" name for a character, that they follow any single prescription for their construction, or that they satisfy any arbitrary condition of correctness. Where names have, by accident, turned out to be actively misleading, alternate IDs (formal aliases) have been created (check the UCD).
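
As a rough illustration of a formal alias, the sketch below uses U+FE18, whose name contains a well-known misspelling and which has a corrected alias. It assumes Python 3.3 or later, where unicodedata.lookup() also accepts name aliases; the variable names are illustrative only.

```python
import unicodedata

ch = "\ufe18"

# The character name is immutable, misspelling ("BRAKCET") included.
original_name = unicodedata.name(ch)

# The formal alias carries the corrected spelling.
alias = "PRESENTATION FORM FOR VERTICAL RIGHT WHITE LENTICULAR BRACKET"

# Both the original name and the alias resolve to the same character.
via_name = unicodedata.lookup(original_name)
via_alias = unicodedata.lookup(alias)

print(original_name)
print(via_name == via_alias == ch)
```

The name itself never changes; the alias is simply a second identifier for the same code point.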

A./

(*) For Han ideographs, it's not been possible to come up with a reasonable scheme for human readable IDs, so here the names are based on the character code.
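
A short Python sketch of the footnote, again assuming the standard-library unicodedata module: for CJK unified ideographs, the name is just a fixed prefix followed by the code point in hex. The helper name is hypothetical.

```python
import unicodedata


def name_is_code_based(ch: str) -> bool:
    """Check that a Han ideograph's name is derived from its code point."""
    expected = f"CJK UNIFIED IDEOGRAPH-{ord(ch):04X}"
    return unicodedata.name(ch) == expected


print(unicodedata.name("\u4e00"))
print(name_is_code_based("\u4e00"))
```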
