Hello,
On 2023-08-29 20:29, Joseph Wright wrote:
> On 29/08/2023 19:27, Heiko Oberdiek via luatex wrote:
>> using LuaTeX to review the glyphs of a font, I discovered an oddity
>> about U+0387 ANO TELEIA. LuaTeX shows U+00B7 MIDDLE DOT instead.
>>
>> \symbol{"00B7}% MIDDLE DOT
>> \symbol{"0387}% ANO TELEIA
>
> From UnicodeData.txt:
>
> 0387;GREEK ANO TELEIA;Po;0;ON;00B7;;;;N;;;;;
>
> so it looks like it's a simple normalisation.
Start of the UnicodeData.txt format description
(https://www.unicode.org/reports/tr44/#UnicodeData.txt):
[0] Code value
[1] Character name
[2] General category
[3] Canonical combining class
[4] Bidirectional category
[5] Character decomposition
...
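
The field list above can be checked against the quoted record: field [5] is 00B7, that is, U+0387 carries a canonical decomposition to U+00B7. The following is only a minimal sketch in Python, not anything LuaTeX itself runs; the variable names and the shortened field labels are my own. It splits the record and compares field [5] with what Python's unicodedata module reports for the same character:

```python
import unicodedata

# Record for U+0387 as quoted from UnicodeData.txt above.
record = "0387;GREEK ANO TELEIA;Po;0;ON;00B7;;;;N;;;;;"
fields = record.split(";")

# Labels for the first six fields, following the list above.
labels = [
    "code value",
    "character name",
    "general category",
    "canonical combining class",
    "bidirectional category",
    "character decomposition",
]
for index, (label, value) in enumerate(zip(labels, fields)):
    print(f"[{index}] {label}: {value}")

ano_teleia = chr(int(fields[0], 16))

# Field [5] has no <tag> prefix, so the decomposition is canonical.
decomposition = unicodedata.decomposition(ano_teleia)

# A canonical singleton decomposition is applied by all four
# normalization forms, so each of them turns U+0387 into U+00B7.
normalized = {
    form: unicodedata.normalize(form, ano_teleia)
    for form in ("NFC", "NFD", "NFKC", "NFKD")
}
for form, result in normalized.items():
    print(form, f"U+{ord(result):04X}")
```

This only confirms what the Unicode data says about the character; it does not show where in the LuaTeX tool chain the replacement takes place.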
In the LuaTeX manual, I found:
| Normalization of the Unicode input is on purpose not built-in and
| can be handled by a macro package during callback processing.
| We have made some practical choices and the user has to
| live with those.
The TeX input above, however, is plain ASCII, so normalizing the
file contents cannot change it and should not matter here.
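
To illustrate the point about ASCII input, here is a small Python sketch (my own illustration with made-up variable names, not LuaTeX code). It normalizes the two input lines as typed and, separately, the character that \symbol{"0387} requests. Only the latter is affected, which suggests that any replacement would have to happen after the input file has been read:

```python
import unicodedata

# The two input lines from the report, exactly as typed (plain ASCII).
source = '\\symbol{"00B7}% MIDDLE DOT\n\\symbol{"0387}% ANO TELEIA\n'

source_is_ascii = source.isascii()

# NFC leaves ASCII-only text untouched.
source_unchanged = unicodedata.normalize("NFC", source) == source

# The character requested by \symbol{"0387}, by contrast, is changed.
requested = chr(0x0387)
after_nfc = unicodedata.normalize("NFC", requested)

print("source is ASCII:", source_is_ascii)
print("source unchanged by NFC:", source_unchanged)
print(f"U+{ord(requested):04X} after NFC: U+{ord(after_nfc):04X}")
```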
Of course, I do not want to have any decomposition that replaces
the glyph with a different character. That would make reviewing
the original glyph impossible.
Yours sincerely,
Heiko