On Sat, 10 Aug 2019 11:22:01 +0100 Andrew West via Unicode <unicode@unicode.org> wrote:
> On Sat, 10 Aug 2019 at 08:29, Richard Wordingham via Unicode
> <unicode@unicode.org> wrote:
> >
> > There are similar issues with Tibetan; some fonts do not work
> > properly if a vowel below (ccc=132) is separated from the base of
> > the consonant stack by a vowel above (ccc=130).
>
> It's not that the fonts don't work, it's that some of the rendering
> engines do not apply the OpenType features in the font that support
> both sequences of vowels (vowel-above followed by vowel-below, and
> vowel-below followed by vowel-above).

My observation was based on a Tibetan font that failed when pre-USE
HarfBuzz added or changed the normalisation for Tibetan.

> Just retested on Windows 10 with
> a Tibetan font that supports both sequences of vowels, and both
> sequences display correctly under Harfbuzz (as expected), but only
> vowel-below followed by vowel-above displays correctly when using
> built-in Windows rendering.

Does vowel above before vowel below yield a dotted circle? According
to the documentation - and the USE may have been improved in
undocumented ways - the blwf feature will not apply across a Tibetan
sequence of vowel above (VBlw) followed by vowel below (Vabv or CMBlw),
but the blws feature will, even if a dotted circle has been added at
the boundary.

> It is very frustrating that Windows cannot correctly support the
> display of Tibetan in normalized form, yet Harfbuzz does not have any
> problems. Personally, I think USE is a failed experiment, and I wish
> Microsoft would simply adopt Harfbuzz as the default rendering engine.

From what I've seen from discussions on HarfBuzz, the USE seems to
work well for non-Indic scripts and Devanagari clones - possibly even
for Bengali clones. It's also a definition that HarfBuzz can fall back
on. The problem is that it doesn't address the quirks of scripts, and
its anti-spoofing measures are draconian and overdone. There may well
be an issue of funding for the USE - for all I know, it may in part be
charity work.
If Microsoft gave up on rendering engines, who would write the
rendering specifications for HarfBuzz?

I was wondering how the USE might be modified to handle canonical
equivalence. The simplest way may be to permute the canonical
combining classes, normalise (NFD) according to these classes, and
process the rearranged string. That's roughly what HarfBuzz does.

Another technique would be to derive regular expressions that would
match any string canonically equivalent to a string matching the
original regular expressions and use them instead. (It may be simpler
to derive a regular expression that finds matches from amongst
normalised strings - that's what my canonical equivalence respecting
regular expression does.)

Using a different canonical equivalent to the present one could 'break'
fonts whose sets of properly handled strings were not closed under
canonical equivalence - which is why I asked the original question.

Richard.
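
For illustration, the "permute the combining classes, then normalise"
idea can be sketched in Python as below. This is a minimal sketch under
stated assumptions, not HarfBuzz's or the USE's actual code: the
function names are hypothetical, and the permutation shown (swapping
ccc=130 and ccc=132 so that a Tibetan vowel below sorts before a vowel
above) is chosen only as an example.

```python
import unicodedata

# Illustrative permutation only (an assumption, not a real shaper table):
# swap Tibetan vowel-above (ccc=130) and vowel-below (ccc=132).
CCC_PERMUTATION = {130: 132, 132: 130}


def permuted_ccc(ch):
    """Canonical combining class of ch after applying the permutation."""
    ccc = unicodedata.combining(ch)
    return CCC_PERMUTATION.get(ccc, ccc)


def reorder_with_permuted_ccc(text):
    """NFD-normalise, then re-sort each run of marks by permuted class.

    sorted() is stable, so marks of the same class keep their relative
    order and the result stays canonically equivalent to the input.
    """
    out = []
    run = []  # pending run of non-starters (ccc != 0)
    for ch in unicodedata.normalize("NFD", text):
        if permuted_ccc(ch) == 0:
            out.extend(sorted(run, key=permuted_ccc))
            run = []
            out.append(ch)
        else:
            run.append(ch)
    out.extend(sorted(run, key=permuted_ccc))
    return "".join(out)


# KA + vowel sign I (above) + vowel sign U (below), in both orders.
above_first = "\u0f40\u0f72\u0f74"
below_first = "\u0f40\u0f74\u0f72"
same = reorder_with_permuted_ccc(above_first) == reorder_with_permuted_ccc(below_first)
```

Both canonically equivalent spellings end up as the same string, with
the vowel below first, so whatever processes the rearranged string only
ever sees one order.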