Dear Georg,

thank you for looking into this issue.

On 2016-01-03, Georg Baum wrote:
> Guenter Milde wrote:

>> Both, \r{A} and \AA (rsp. \r{a} and \aa) are equivalent standard LICR
>> macros
>> for Aring/aring  as well as the deprecated "angstrom sign" character
>> (212B).

>> However, with \AA for 212B and \r{A} for 00C5, tex2lyx converts \AA to the
>> deprecated "angstrom sign" which is missing in many fonts including the
>> Unicode version of Latin Modern.

>> See also
>> https://en.wikipedia.org/wiki/%C3%85#Symbol_for_.C3.A5ngstr.C3.B6m ---

> What we have here is a non-uniqueness in the unicode standard (U+00c5 vs. 
> U+212b), and in LaTeX as well (\AA vs \r{A}).

True.

There is one difference, though: while in LaTeX, \AA and \r{A} are both
valid and "current best practice", the Unicode standard is clear about 212B:

        * preferred representation is 00C5

> This double non-uniqueness cannot be fixed properly by just exchanging
> LaTeX macros in lib/unicodesymbols: 

Still, the current mappings in lib/unicodesymbols wrongly suggest the
unique mappings

   \r{A} <=> 00C5  (&Aring;)
   \AA   <=> 212B  (Angstrom sign)
   
while the actual relationship is closer to

   \r{A} <=> 00C5 <= 212B
   \AA   <=> 00C5 <= 212B

IMO, it is important to have one common LICR for both 00C5 and 212B
in lib/unicodesymbols. Whether this is \AA or \r{A} is of secondary
importance.

> With the proposed patch, tex2lyx would translate \AA in the desired
> way, but it would not translate \r{A} in the desired way anymore.

True. 

However, with the proposed patch, \r{A} would translate to "0041 030A" 
(A + combining ring above), which is canonically equivalent to 00C5 and
exported to the correct "Aring" character in any LaTeX export route
except LuaTeX with system fonts.
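
The canonical equivalence can be checked outside LyX. The following is only
a sketch using Python's standard unicodedata module; the helper name
canonically_equivalent is my own and nothing here is tex2lyx code:

```python
import unicodedata

ARING = "\u00c5"        # LATIN CAPITAL LETTER A WITH RING ABOVE
ANGSTROM = "\u212b"     # ANGSTROM SIGN
DECOMPOSED = "A\u030a"  # A + COMBINING RING ABOVE

def canonically_equivalent(a, b):
    """Hypothetical helper: two strings are canonically equivalent
    if their NFD forms are identical."""
    return unicodedata.normalize("NFD", a) == unicodedata.normalize("NFD", b)

# All three spellings reduce to the same decomposed sequence.
print(canonically_equivalent(DECOMPOSED, ARING))
print(canonically_equivalent(ANGSTROM, ARING))
```

Both checks should report equivalence, which matches the
"\r{A} <=> 00C5 <= 212B" picture above.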


> Since lib/unicodesymbols does not really support non-unique symbols, it
> is better to add special code for \AA and \aa to tex2lyx IMHO. Then it
> does not matter anymore whether U+00c5 maps to \AA or \r{A} 

Special code for \r{A} and \r{a} or \AA and \aa (currently not supported)
could/should be added in addition to, not instead of, a unique replacement
for 00C5 and 212B in unicodesymbols.


> (but I would keep the current mapping, since it fits more nicely to the
> other symbols with rings).

Without special code for tex2lyx, \AA and \aa are the better
lib/unicodesymbols replacements, as there is a fallback for \r{A} and \r{a}
(base letter + combining ring above) but not for \AA and \aa.

With the special code, it is a matter of taste - maybe best decided by a
native speaker of a Scandinavian language using the character.

I favour \AA and \aa, as these are closer to the Latin transliteration ("aa")
and the old spelling in Danish.

Also, instead of special tex2lyx code just for \aa and \AA, we could use
NFC (or NFKC) normalization¹ after the "LyXification" - this would not
only map \r{A} to 00C5 (via 0041 030A) but also other similar cases of
non-unique LICRs for pre-composed accented characters.

¹ https://en.wikipedia.org/wiki/Unicode_equivalence#Normal_forms
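
To illustrate the idea (not the actual tex2lyx code path), here is a
minimal Python sketch. The function lyxify, the table COMBINING and the
regular expression are assumptions made up for this example; only
unicodedata.normalize is a real library call. Accent macros are first
turned into base letter + combining character, then one NFC pass composes
whatever has a pre-composed form:

```python
import re
import unicodedata

# Hypothetical, deliberately tiny accent table: macro name -> combining char.
COMBINING = {
    "r": "\u030a",  # combining ring above
    '"': "\u0308",  # combining diaeresis
    "'": "\u0301",  # combining acute accent
}

# Matches \r{X}, \"{X} and \'{X} with a single letter X (toy grammar only).
ACCENT_MACRO = re.compile(r"""\\([r"'])\{(\w)\}""")

def lyxify(tex):
    """Replace accent macros by base + combining char, then normalize to NFC."""
    text = ACCENT_MACRO.sub(
        lambda m: m.group(2) + COMBINING[m.group(1)], tex
    )
    return unicodedata.normalize("NFC", text)

print(lyxify(r'\r{A}ngstr\"{o}m'))
```

The same NFC pass also turns a literal 212B into 00C5, so no per-character
special case is needed. NFKC would additionally fold compatibility
characters (ligatures, for example), which may be more than we want here.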


Thanks,

Günter
