Guenter Milde wrote:
> I added the mapping for the "macron below" after research into combining
> diacritical characters in connection with the comma below fix.
>
>> Looking further at change 268bd00, I see that both 0x0320 and 0x0331 do
>> now map to \b in lib/unicodesymbols. Which one is correct?
>
> * \b for "combining macron below" is definitely not wrong:
>
> - both produce an underline that does not connect with neighbours (as
> opposed to "line below") https://en.wikipedia.org/wiki/Macron_below
>
> The German version explicitly mentions the TeX representation \b
> https://de.wikipedia.org/wiki/Makron_%28Unterzeichen%29
>
> - \b is described as "underbar accent"
> (http://www.ams.org/membership/texcodes)
> or "macron below (line below)" (http://vjimc.osu.cz/TeXform.html)
>
> - Unicode has a number of precomposed characters with "macron below",
> e.g.
> 1E06 LATIN CAPITAL LETTER B WITH LINE BELOW
> : 0042 0331
> (note the canonical decomposition despite the different name!)
>
> These characters can be reliably expressed as LICRs with \b.
> Compare the output of "Ḇḇ, Ḏḏ" vs. "\b{B}\b{b}, \b{D}\b{d}" in a XeTeX
> document (see below).
This is convincing. I also found that 0x02cd MODIFIER LETTER LOW MACRON is
already mapped to \b{ }, so this would be consistent as well.
Please update the tex2lyx test references, which would make change 268bd00
complete. If you don't know how to do that, please have a look at
lib/doc/Development.lyx, section 3.2.2.
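
As a quick sanity check of the canonical decompositions quoted above, something
like the following Python sketch can be used. It relies only on the standard
unicodedata module; the helper names macron_below_base and to_licr are made up
for illustration and are not part of LyX or tex2lyx.

```python
import unicodedata


def macron_below_base(ch):
    """Return the base letter if ch canonically decomposes to
    <base> + U+0331 COMBINING MACRON BELOW, otherwise None."""
    parts = unicodedata.decomposition(ch).split()
    # A canonical decomposition has no "<tag>" entry, so exactly two
    # code points are expected here: the base letter and the accent.
    if len(parts) == 2 and parts[1] == "0331":
        return chr(int(parts[0], 16))
    return None


def to_licr(ch):
    """Hypothetical helper: express ch with the \\b accent macro if possible."""
    base = macron_below_base(ch)
    return "\\b{%s}" % base if base is not None else ch


if __name__ == "__main__":
    # The precomposed characters from the XeTeX comparison.
    for ch in "\u1e06\u1e07\u1e0e\u1e0f":
        print(unicodedata.name(ch), "->", to_licr(ch))
    # The two combining characters under discussion.
    for cp in (0x0320, 0x0331):
        print("0x%04x" % cp, unicodedata.name(chr(cp)))
```

This only confirms what the Unicode database says about the characters; it
says nothing about which LaTeX macro is the better match for 0x0320.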
> * \b for combining minus sign below may be correct (but I did not add
> it!):
>
> - the minus below is used in the phonetic transcription for "retracted"
> https://www.internationalphoneticassociation.org/sites/default/files/phonsymbol.pdf
> https://en.wikipedia.org/wiki/Relative_articulation#Advanced_and_retracted
>
> - The output is identical in the XeTeX example below (this may differ
> when a different font is selected).
>
> - The Unicode standard says:
>
> COMBINING MINUS SIGN BELOW
> • IPA: retracted or backed articulation
> • glyph may have small end-serifs
>
> -- http://unicode.org/charts/PDF/U0300.pdf
>
> while \b does not produce small end-serifs.
>
> - The tipa-manual says:
>
> Tiefgestellter Balken Usage: rückverlagert
> Input1 : \textsubbar{e} Input2 : \=*e
> Sources: IPA ’49–’96
>
> i.e. the equivalent to "combining minus sign below" according to tipa
> would be the "\textsubbar" accent macro.
>
>
> There are many cases where Unicode has code points that map to the same
> LICR. Sometimes Unicode itself "merged" the code points
> (e.g. ` 1FEF GREEK VARIA == ` 0060 GRAVE ACCENT)
But the semantic information is different. In this particular case, this is
taken into account by using \textgreek{`v} for GREEK VARIA.
> sometimes different Unicode points have the same LaTeX counterpart:
> (\~ is both accent tilde and accent perispomeni, but Unicode keeps the
> difference).
OK, for cases like this, where LaTeX is not as expressive as Unicode, I agree
that we need non-unique mappings.
Thanks for the detailed analysis! To me the information you gathered about
0x0320 indicates that \textsubbar should be used instead of \b.
Georg