On Mon, Nov 01, 2021 at 11:41:06AM -0700, Toshio Kuratomi wrote:

> Unicode specifies the mapping of glyphs to code points.  Then a second
> mapping from code points to sequences of bytes is what is actually
> recorded by the computer.  The second mapping is what programmers
> using Python will commonly think of as the encoding while the majority
> of what you're writing about has more to do with the first mapping.

I don't think that is correct.

According to the Unicode consortium -- and I hope that they would know 
*wink* -- Unicode is the universal character encoding. In other words:

"Unicode provides a unique number for every character"

https://www.unicode.org/standard/WhatIsUnicode.html

Not glyphs.

("Character" in natural language is a bit of a fuzzy concept, so I think 
that Unicode here is referring to what their glossary calls an abstract 
character.)
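
As a rough illustration of "a unique number for every character", here
is a minimal Python sketch using only the standard library. The helper
name code_point_label is made up for this example, and the sample
characters are arbitrary picks, not anything from the Unicode site:

```python
import unicodedata


def code_point_label(ch: str) -> str:
    """Return the code point and Unicode name of a single character.

    Hypothetical helper for illustration only.
    """
    # ord() gives the code point: a number that does not depend on any
    # font or on how the character is later stored as bytes.
    return f"U+{ord(ch):04X} {unicodedata.name(ch)}"


# Escapes are used so the example does not depend on the mail encoding.
for ch in ("A", "\u00e9", "\u20ac", "\U0001d11e"):
    print(code_point_label(ch))

# The mapping is reversible: chr() goes from the number to the character.
assert chr(0x20AC) == "\u20ac"
```

Nothing in that mapping says anything about what the character looks
like on screen.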

The usual meaning of glyph is the graphical image used by a font 
(typeface) for display. That is sense 2 in the Unicode glossary here:

https://www.unicode.org/glossary/#glyph

I'm not really sure what they mean by sense 1, unless they mean a 
representative glyph, which is intended to stand in as an example of the 
entire range of glyphs.

Unicode does not specify what the glyphs for code points are, although 
it does provide representative samples. See, for example, their comment 
on emoji:

"The Unicode Consortium provides character code charts that show a 
representative glyph"

http://www.unicode.org/faq/emoji_dingbats.html

Their code point charts likewise show representative glyphs for other 
letters and symbols, not authoritative ones. And of course, many code 
points do not have glyphs at all, e.g. invisible joiners, control 
characters, variation selectors, noncharacters, etc.
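
To make the glyph-less cases concrete, here is a small sketch using
Python's unicodedata module. The four sample code points and the
describe helper are assumptions chosen for this example; the general
categories come from the Unicode Character Database that ships with
Python:

```python
import unicodedata

# One sample from each group mentioned above (arbitrary picks).
SAMPLES = {
    "invisible joiner": "\u200d",
    "control character": "\u0007",
    "variation selector": "\ufe0f",
    "noncharacter": "\uffff",
}


def describe(ch: str) -> tuple[str, str, str]:
    """Return (code point, general category, name or a placeholder).

    Hypothetical helper for illustration only.
    """
    # Control characters and noncharacters have no name in the database,
    # so a default is passed to avoid a ValueError.
    name = unicodedata.name(ch, "<no name>")
    return (f"U+{ord(ch):04X}", unicodedata.category(ch), name)


for label, ch in SAMPLES.items():
    print(label, *describe(ch))
```

Each of these is a perfectly good code point with a category, yet none
has a visible shape of its own for a font to draw.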

The mapping from bytes to code points and abstract characters is also 
part of Unicode. The UTF encodings are part of Unicode:

https://www.unicode.org/faq/utf_bom.html#gen2

The "U" in UTF literally stands for Unicode :-)
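
A minimal sketch of that second mapping in Python, assuming the codec
names "utf-8", "utf-16-be" and "utf-32-be" from the standard library
(the big-endian variants are used so that no byte order mark is added).
The choice of the euro sign is arbitrary:

```python
ch = "\u20ac"  # EURO SIGN, code point U+20AC

# One code point, three different byte sequences.
encoded = {name: ch.encode(name)
           for name in ("utf-8", "utf-16-be", "utf-32-be")}

for name, data in encoded.items():
    print(f"{name:10} {data.hex()}")

# Decoding with the matching UTF gets back to the same code point.
decoded = {name: data.decode(name) for name, data in encoded.items()}
assert all(ord(text) == 0x20AC for text in decoded.values())
```

The code point stays U+20AC throughout; only the bytes differ from one
UTF to the next.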


-- 
Steve
_______________________________________________
Python-Dev mailing list -- python-dev@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/I7ZRNIHSQ7UL4NSKOXFRYBYHQEXGNBPA/
Code of Conduct: http://python.org/psf/codeofconduct/