On Mon, Nov 01, 2021 at 11:41:06AM -0700, Toshio Kuratomi wrote:

> Unicode specifies the mapping of glyphs to code points. Then a second
> mapping from code points to sequences of bytes is what is actually
> recorded by the computer. The second mapping is what programmers
> using Python will commonly think of as the encoding while the majority
> of what you're writing about has more to do with the first mapping.

I don't think that is correct. According to the Unicode consortium --
and I hope that they would know *wink* -- Unicode is the universal
character encoding. In other words:

"Unicode provides a unique number for every character"

https://www.unicode.org/standard/WhatIsUnicode.html

Not glyphs.

("Character" in natural language is a bit of a fuzzy concept, so I
think that Unicode here is referring to what their glossary calls an
abstract character.)

The usual meaning of glyph is the graphical image used by a font
(typeface) for display. That is sense 2 in the Unicode glossary here:

https://www.unicode.org/glossary/#glyph

I'm not really sure what they mean by sense 1, unless they mean a
representative glyph, which is intended to stand in as an example of
the entire range of glyphs.

Unicode does not specify what the glyphs for code points are, although
it does provide representative samples. See, for example, their comment
on emoji:

"The Unicode Consortium provides character code charts that show a
representative glyph"

http://www.unicode.org/faq/emoji_dingbats.html

Their code point charts likewise show representative glyphs, not
authoritative ones, for other letters and symbols.

And of course, many abstract characters do not have glyphs at all, e.g.
invisible joiners, control characters, variation selectors,
noncharacters, etc.

The mapping between code points and bytes is also part of Unicode. The
UTF encodings are part of Unicode:

https://www.unicode.org/faq/utf_bom.html#gen2

The "U" in UTF literally stands for Unicode :-)

-- 
Steve
_______________________________________________
Python-Dev mailing list -- [email protected]
Message archived at https://mail.python.org/archives/list/[email protected]/message/I7ZRNIHSQ7UL4NSKOXFRYBYHQEXGNBPA/
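
For anyone who wants to poke at the two layers from Python, here is a
minimal sketch using only the standard library. It shows an abstract
character mapped to a code point, that code point serialised by three
of the UTF encodings, and a character (ZERO WIDTH JOINER) that has a
code point and a name but no glyph to draw. The helper names describe()
and encodings() are made up for illustration and are not part of any
library.

```python
import unicodedata


def describe(ch):
    """Return (code point, character name, general category) for one character."""
    code_point = f"U+{ord(ch):04X}"
    # unicodedata.name() raises ValueError for unnamed characters
    # unless a default is supplied.
    name = unicodedata.name(ch, "<unnamed>")
    return (code_point, name, unicodedata.category(ch))


def encodings(ch):
    """Serialise the same code point with three of the UTF encodings."""
    return {codec: ch.encode(codec) for codec in ("utf-8", "utf-16-be", "utf-32-be")}


e_acute = "\u00e9"   # LATIN SMALL LETTER E WITH ACUTE
zwj = "\u200d"       # ZERO WIDTH JOINER: a format character, nothing to draw

print(describe(e_acute))
print(encodings(e_acute))
print(describe(zwj))

# Decoding goes the other way: bytes back to the same code point.
print(b"\xc3\xa9".decode("utf-8") == e_acute)
```

The same code point comes out as a different byte sequence under each
UTF, while its number and name stay the same. Which glyph ends up on
screen is decided by the font, and none of these calls touch that.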
