Hello,

On 2023-08-29 20:29, Joseph Wright wrote:
On 29/08/2023 19:27, Heiko Oberdiek via luatex wrote:

using LuaTeX to review the glyphs of a font, I discovered an oddity about U+0387 ANO TELEIA. LuaTeX shows U+00B7 MIDDLE DOT instead.

         \symbol{"00B7}% MIDDLE DOT
         \symbol{"0387}% ANO TELEIA

From UnicodeData.txt:

     0387;GREEK ANO TELEIA;Po;0;ON;00B7;;;;N;;;;;

so it looks like it's a simple normalisation.
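
The mapping can also be checked outside TeX. A minimal sketch, assuming Python and its standard unicodedata module (my choice for illustration; nothing in LuaTeX is implied to work this way):

```python
import unicodedata

ano_teleia = "\u0387"  # GREEK ANO TELEIA
middle_dot = "\u00b7"  # MIDDLE DOT

# Decomposition field as recorded in the Unicode Character Database.
decomposition = unicodedata.decomposition(ano_teleia)

# U+0387 has a singleton canonical decomposition, so both the
# decomposed (NFD) and the composed (NFC) form end up as U+00B7.
nfd = unicodedata.normalize("NFD", ano_teleia)
nfc = unicodedata.normalize("NFC", ano_teleia)

print("decomposition:", decomposition)
print("NFD gives MIDDLE DOT:", nfd == middle_dot)
print("NFC gives MIDDLE DOT:", nfc == middle_dot)
```

Since neither normalization form preserves U+0387, any step that normalizes would produce exactly the substitution described above.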

Start of the UnicodeData.txt format description (https://www.unicode.org/reports/tr44/#UnicodeData.txt):
  [0] Code value
  [1] Character name
  [2] General category
  [3] Canonical combining classes
  [4] Bidirectional category
  [5] Character decomposition
  ...
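
To make the field layout concrete, here is a small sketch in Python (an assumption for illustration only) that splits the record quoted above. The dictionary keys are hypothetical labels of mine for fields [0] to [5]; they are not official property names.

```python
# The UnicodeData.txt record for U+0387, as quoted in this thread.
RECORD = "0387;GREEK ANO TELEIA;Po;0;ON;00B7;;;;N;;;;;"

# Hypothetical labels for the first six fields listed above.
FIELD_NAMES = [
    "code_value",
    "name",
    "general_category",
    "combining_class",
    "bidi_category",
    "decomposition",
]

fields = RECORD.split(";")
parsed = dict(zip(FIELD_NAMES, fields))  # zip stops after the six labels

# A decomposition without a leading <tag> is a canonical one;
# compatibility decompositions start with a tag such as <compat>.
decomp = parsed["decomposition"]
is_canonical = bool(decomp) and not decomp.startswith("<")

# Convert the hexadecimal code points of the mapping to characters.
targets = [chr(int(cp, 16)) for cp in decomp.split() if not cp.startswith("<")]

print(parsed)
print("canonical:", is_canonical)
print("maps to:", ["U+%04X" % ord(ch) for ch in targets])
```

Field [5] holds a bare 00B7 without a tag, i.e. a canonical decomposition to a single code point.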

In the LuaTeX manual, I found:

| Normalization of the Unicode input is on purpose not built-in and
| can be handled by a macro package during callback processing.
| We have made some practical choices and the user has to
| live with those.

The TeX input above, however, is plain ASCII. Therefore, any normalization of the file contents should not matter.

Of course, I do not want to have any decomposition that replaces
the glyph with a different character. That would make reviewing
the original glyph impossible.

Yours sincerely
  Heiko
