On Sun, Feb 10, 2019 at 16:42, James Kass via Unicode <[email protected]> wrote:
> > Philippe Verdy wrote,
> >
> >> ...[one font file having both italic and roman]...
> >
> > The only case where it happens in real fonts is for the mapping of
> > Mathematical Symbols which have a distinct encoding for some
> > variants ...
>
> William Overington made a proof-of-concept font using the VS14 character
> to access the italic glyphs which were, of course, in the same real
> font. Which means that the developer of a font such as Deja Vu Math TeX
> Gyre could set up an OpenType table mapping the Basic Latin in the font
> to the italic math letter glyphs in the same font using the VS14
> characters. Such a font would work interoperably on modern systems.
> Such a font would display italic letters both if encoded as math
> alphanumerics or if encoded as ASCII plus VS14. Significantly, the
> display would be identical.
>
> > ...[math alphanumerics]...
> >
> > These were allowed in Unicode because of their specific contextual
> > use as distinctive symbols from known standards, and not for general
> > use in human languages
>
> They were encoded for interoperability and round-tripping because they
> existed in character sets such as STIX. They remain Latin letter form
> variants. If they had been encoded as the variant forms which
> constitute their essential identity, it would have broken the character
> vs. glyph encoding model of that era. Arguing that they must not be
> used other than for scientific purposes is just so much semantic
> quibbling in order to justify their encoding.
>
> Suppose we started using the double-struck ASCII variants on this list
> in order to note Unicode character numbers such as 𝕌+𝔽𝔼𝔽𝔽 or
> 𝕌+𝟚𝟘𝟞𝟘? Hexadecimal notation is certainly math, and Unicode can be
> considered a science. Would that be “math abuse” if we did it? (Is
> linguistics not a science?)
>
> > (because these encodings are defective and don't have the necessary
> > coverage, notably for the many diacritics,
>
> The combining diacritics would be used.
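For what it's worth, the equivalence behind Kass's double-struck notation is easy to check with Python's standard unicodedata module (a minimal sketch; the math alphanumerics carry <font> compatibility decompositions, so NFKC folds them back to plain ASCII):

```python
import unicodedata

# 𝕌+𝔽𝔼𝔽𝔽 written with Mathematical Double-Struck capitals
# (U+1D54C, U+1D53D, U+1D53C, ...) plus an ordinary "+".
fancy = "\U0001D54C+\U0001D53D\U0001D53C\U0001D53D\U0001D53D"

# NFKC applies the <font> compatibility decompositions, yielding
# the plain-ASCII spelling of the same character number.
plain = unicodedata.normalize("NFKC", fancy)
print(plain)  # U+FEFF
```

So any NFKC-normalizing search or identifier check already treats the two spellings as the same text, which is part of what makes such "math abuse" mostly harmless in practice.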
Not for the many precombined characters that exist in Latin: do you intend to propose re-encoding them all with the same variants encoded for math? Or to allow diacritics to be added on the math symbols? (Hint: this does not work correctly with the specific mathematical conventions on diacritics and their stacking rules: they are NOT reorderable through canonical equivalence, because the order is significant in math, so you would also need CGJ to fix the expected logical, semantic, and visual stacking order.)

> > case mappings,
>
> Adjust them as needed.

Not so easy: case mappings cannot be fixed; they are stabilized in Unicode. You would need special casing rules under a specific "locale" for math. Really, math is a specific script, even if it borrows some symbols from Latin, Greek, or Hebrew, and only in specific glyph variants. These symbols should not even be considered part of the script they originate from (just as Latin A is not the same as Cyrillic А or Greek Alpha, though all three have the same form and the same origin). I can argue the same thing about IPA notation: it is NOT the Latin script either; it also borrows letter forms from Latin and Greek, but without any case mappings (only lowercase is used) and with specific glyph variants. Both are technical notations which do not obey the linguistic rules and normal processing of the scripts they originate from. They are specific "writing systems", unfortunately conflated with "Unicode scripts", and then abused.
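Both points above are visible in the character properties themselves. A minimal sketch with Python's unicodedata (assuming only the standard library): the math alphanumerics have Lu/Ll general categories but empty, stabilized case-mapping fields, and the "same shape, different script" precedent already exists for A/А/Alpha.

```python
import unicodedata

# Math alphanumerics carry no case mappings in UnicodeData,
# so simple casing leaves them unchanged:
assert "\U0001D53D".lower() == "\U0001D53D"  # 𝔽 has no lowercase
assert "F".lower() == "f"                    # plain Latin F does

# Identical glyph shapes, three distinct encoded characters:
for ch in "A\u0410\u0391":  # Latin A, Cyrillic А, Greek Alpha
    print(f"U+{ord(ch):04X} {unicodedata.name(ch)}")
```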
Note that some Latin letters have been borrowed from IPA too, for use in African languages, and case mappings were then needed: these should have been re-encoded as plain letter pairs with a basic case mapping (not the special casing rules now needed for African languages). For example, open o looks much like a mirrored Latin c, and open e was borrowed from the lowercase Greek epsilon but does not use the uppercase Greek Epsilon, using another shape instead; this means the Latin open e should have been encoded as a plain letter pair, distinct from the Greek epsilon, but IPA already used the epsilon-like symbol...

In the end these exceptions just cause many inconsistencies and complexities. Applications and libraries cannot adapt easily and are not downward compatible, because stable properties are immutable and specific tailorings are needed each time in applications: the more such exceptions we add, the harder the standard is to adapt and the more difficult compatibility is to preserve. In summary, I don't like the dual encodings at all, or the encoding of additional letters that cannot use the normal stable properties (and this remark is also true for emoji: what a mess, full of exceptions and incoherent encoding models!)
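The open o / open e situation can also be checked directly (a minimal sketch with standard Python casing, which uses the UCD's simple case mappings): the IPA-derived letters did eventually get uppercase partners encoded as Latin letters, distinct from the Greek pair.

```python
# IPA-derived letters adopted by African orthographies, with the
# uppercase partners that were encoded for them, vs. the Greek pair:
pairs = {
    "\u0254": "\u0186",  # ɔ → Ɔ  LATIN (CAPITAL) LETTER OPEN O
    "\u025B": "\u0190",  # ɛ → Ɛ  LATIN (CAPITAL) LETTER OPEN E
    "\u03B5": "\u0395",  # ε → Ε  the Greek pair stays Greek
}
for lower, upper in pairs.items():
    assert lower.upper() == upper
    print(f"U+{ord(lower):04X} uppercases to U+{ord(upper):04X}")
```

So the casing itself works today; Verdy's complaint is about the detour it took, since the letters were first encoded caseless for IPA and the capitals were bolted on later.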

