Let me reiterate:
Unicode is ***NOT*** a glyph encoding!
Unicode is ***NOT*** a glyph encoding!
and never will be. The same character can be displayed as
a variety of glyphs, depending not only on the font/style,
but also, and this is the important point, on the characters
surrounding a particular instance of the character. Also,
a sequence of characters can be displayed as a single glyph,
and a character can be displayed as a sequence of glyphs.
Which of these is the case is often font-dependent.
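For instance, the accented letter é can be one character or two
while rendering as a single glyph. A small Haskell sketch (my own
illustration, runnable in any recent GHC, where Char is a Unicode
code point) makes the distinction visible:

```haskell
-- Two character sequences that typically render as the same glyph:
-- precomposed U+00E9 versus 'e' followed by combining acute U+0301.
main :: IO ()
main = do
  let precomposed = "\x00E9"   -- one character
      decomposed  = "e\x0301"  -- two characters, one glyph
  print (map fromEnum precomposed)  -- [233]
  print (map fromEnum decomposed)   -- [101,769]
```

The two strings differ as character sequences, yet a renderer is free
to draw them identically, which is exactly the character/glyph split.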
This is not something unique to Unicode. It is
just that most people are used to ASCII, Latin-1 and similar,
where the distinction between characters and glyphs is
blurred.
I would be interested in knowing why you think
"the idea of it as a character encoding thoroughly
breaks down in a mathematical context". Deciding
what gets encoded as a character is more an
international social process than a mathematical
process...
/kent k
PS This may be getting too deep into Unicode to be a good
fit for the Haskell list... In particular, any argumentation
regarding the last paragraph above should *not* be sent to
the Haskell list, but may be sent to me personally.
PPS I don't know what you mean by "semantics of glyphs".
Hans Aberg wrote:
> I leave it to the experts to figure out what exactly Unicode is.
> I can only note that the idea of it as a character encoding
> thoroughly breaks down in a mathematical context. I think the
> safest thing is to only regard it as a set of glyphs, which are
> better, because ampler, than other encodings. I think figuring out
> the exact involved semantics of those glyphs is a highly complex
> issue which cannot fully be resolved.
>