Ah, looking at that, the OpenType `pstf` feature seems relevant, though I cannot get it to crash with Gurmukhi (where the consonant ya is a postform).
-Manish

On Sat, Feb 17, 2018 at 4:40 PM, Philippe Verdy <verd...@wanadoo.fr> wrote:

> An interesting read:
>
> https://docs.microsoft.com/fr-fr/typography/script-development/bengali#reor
>
>
> 2018-02-18 1:30 GMT+01:00 Philippe Verdy <verd...@wanadoo.fr>:
>
>> My opinion about this bug is that Apple's text renderer dynamically
>> allocates a glyph buffer only when needed (lazily), but a test is missing
>> for the lazy construction of this buffer (which is not needed for most
>> texts, which need no glyph substitution or reordering because a single
>> accessor can find the glyph data directly from the code point by lookup
>> in the font tables), and this is causing a null pointer exception at run
>> time.
>>
>> The bug effectively occurs when processing the vowel that follows the
>> ZWNJ, if the code assumes that a glyph buffer has already been
>> constructed for the cluster, in order to place the vowel over the correct
>> glyph (which may have been reordered in that buffer).
>>
>> Microsoft's text renderer and other engines do not delay the
>> construction of the glyph buffer, which can be reused for processing the
>> rest of the text, provided it is correctly reset after processing a
>> cluster.
>>
>>
>> 2018-02-17 21:54 GMT+01:00 Manish Goregaokar <man...@mozilla.com>:
>>
>>> Heh, I wasn't aware of the word "phala-form", though that seems
>>> Bengali-specific?
>>>
>>> Interesting observation about the vowel glyphs, I'll mention this in the
>>> post. Initially I missed this because I hadn't realized that the Bengali o
>>> vowel crashed (which made me discount this).
>>>
>>>
>>> Thanks!
>>>
>>> -Manish
>>>
>>> On Sat, Feb 17, 2018 at 12:22 PM, Philippe Verdy <verd...@wanadoo.fr>
>>> wrote:
>>>
>>>> I would have preferred that your invented term "left-joining consonants"
>>>> used the usual name "phala forms" (to represent RA or JA/JO after a
>>>> virama, generally named "raphala" or "japhala/jophala").
>>>>
>>>> And the reason this bug does not occur with some vowels is that these
>>>> are vowels in two parts, which are first decomposed into two separate
>>>> glyphs reordered in the glyph buffer, while other vowels do not need this
>>>> prior mapping and keep their initial direct mapping from their codepoints
>>>> in fonts. This means that it has to do with the way the ZWNJ handling
>>>> looks for the glyphs of the vowels in the glyph buffer and not in the
>>>> initial codepoint buffer: there's some desynchronization, and more
>>>> probably an uninitialized data field (for the lookup made in handling
>>>> ZWNJ) if no vowel decomposition was done (the same data field is
>>>> correctly initialized when it is the first consonant which takes an
>>>> alternate form before a virama, as in most Indic consonant clusters,
>>>> because a glyph buffer is created).
>>>>
>>>> Now we have some hints about why the bug does not occur in Kannada or
>>>> Khmer: a glyph buffer is always created, but there was some shortcut made
>>>> in Devanagari, Bengali, and Telugu to allow processing clusters faster
>>>> by working directly on the codepoint streams, without always having to
>>>> create a glyph buffer (which allows reordering glyphs before positioning
>>>> them).
>>>>
>>>> So it seems related to the fact that OpenType fonts do not need to
>>>> include rules for glyph substitution, but the PHALA forms are represented
>>>> without any glyph substitution, by mapping directly the phala forms in a
>>>> separate table for the consonants. Because there has been no code-to-glyph
>>>> substitution, the glyph buffer is not created, but then when processing
>>>> the ZWNJ, it looks for data in a glyph buffer that has not yet been
>>>> initialized (and this is specific to the renderers implemented by Apple
>>>> in iOS and MacOS). This bug does not occur if another text rendering
>>>> engine is used (e.g. in non-Apple web browsers).
>>>>
>>>>
>>>> 2018-02-16 19:44 GMT+01:00 Manish Goregaokar <man...@mozilla.com>:
>>>>
>>>>> FWIW I dissected the crashing strings, it's basically all <consonant,
>>>>> virama, consonant, zwnj, vowel> sequences in Telugu, Bengali, Devanagari
>>>>> where the second consonant is suffix-joining (ra in Devanagari, jo and
>>>>> ro in Bengali, and all Telugu consonants), the vowel is not Bengali au
>>>>> or o / Telugu ai, and if the second consonant is ra/ro the first one is
>>>>> not also ra/ro (or ro-with-line-through-it).
>>>>>
>>>>> https://manishearth.github.io/blog/2018/02/15/picking-apart-the-crashing-ios-string/
>>>>>
>>>>> -Manish
>>>>>
>>>>> On Thu, Feb 15, 2018 at 10:58 AM, Philippe Verdy via Unicode <
>>>>> unicode@unicode.org> wrote:
>>>>>
>>>>>> That's probably not a bug of Unicode but of MacOS/iOS text renderers
>>>>>> with some fonts using advanced composition features.
>>>>>>
>>>>>> Similar bugs could as well affect the new advanced features added in
>>>>>> Windows or Android to support multicolored emojis, variable fonts,
>>>>>> contextual glyph transforms, style variants, or more font formats (not
>>>>>> just OpenType); the bug may also be in the graphic renderer (incorrect
>>>>>> clipping when drawing the glyph into the glyph cache, with buffer
>>>>>> overflows possibly caused by incorrectly computed splines), and it
>>>>>> could be in the display driver (or in the hardware accelerator having
>>>>>> some limitations on the complexity of multipolygons to fill and to
>>>>>> antialias), causing some infinite recursion loop, or too deep a
>>>>>> recursion exhausting the stack limit.
>>>>>>
>>>>>> Finally the bug could be in the OpenType hinting engine moving some
>>>>>> points outside the clipping area (the math theory may say that such
>>>>>> placement of a point outside the clipping area is impossible, but
>>>>>> various mathematical simplifications and shortcuts are used to simplify
>>>>>> or accelerate the rendering, at the price of some quirks). Even the SVG
>>>>>> standard (in constant evolution) could be affected as well in its
>>>>>> implementation.
>>>>>>
>>>>>> There are tons of possible bugs here.
>>>>>>
>>>>>> 2018-02-15 18:21 GMT+01:00 James Kass via Unicode <
>>>>>> unicode@unicode.org>:
>>>>>>
>>>>>>> This article:
>>>>>>> https://techcrunch.com/2018/02/15/iphone-text-bomb-ios-mac-crash-apple/?ncid=mobilenavtrend
>>>>>>>
>>>>>>> The single Unicode symbol referred to in the article results from a
>>>>>>> string of Telugu characters. The article doesn't list or display the
>>>>>>> characters, so Mac users can visit the above link. A link in one of
>>>>>>> the comments leads to a page which does display the characters.
>>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>
>
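
For reference, the <consonant, virama, consonant, zwnj, vowel> shape summarized in the thread can be written down as a small check. This is only a rough sketch in Python: the function name `matches_described_pattern` and the `SCRIPTS` table are made up for illustration, the code point ranges assume the shared layout of the Devanagari, Bengali and Telugu blocks, and the rules encode just the one-paragraph summary quoted above, not the fuller analysis in the linked blog post and nothing about how any renderer actually works. The thread names the ra/ro exception only for Devanagari and Bengali, so the sketch assumes it does not apply to Telugu.

```python
import unicodedata

ZWNJ = 0x200C  # ZERO WIDTH NON-JOINER

# Hypothetical table encoding only what the thread states.
# name: (block start, suffix-joining second consonants,
#        excluded vowel signs, ra-like consonants for the ra/ro exception)
SCRIPTS = {
    "Devanagari": (0x0900, {0x0930}, set(), {0x0930}),
    "Bengali": (0x0980, {0x09AF, 0x09B0}, {0x09CB, 0x09CC}, {0x09B0, 0x09F0}),
    # Assumption: the ra/ro exception is not applied to Telugu.
    "Telugu": (0x0C00, set(range(0x0C15, 0x0C3A)), {0x0C48}, set()),
}


def matches_described_pattern(s: str) -> bool:
    """Return True if s is exactly <consonant, virama, consonant, ZWNJ, vowel sign>
    and satisfies the conditions summarized in the thread."""
    if len(s) != 5:
        return False
    c1, virama, c2, zwnj, vowel = (ord(ch) for ch in s)
    if zwnj != ZWNJ:
        return False
    for base, joining, excluded_vowels, ra_like in SCRIPTS.values():
        # In these blocks: consonants at +0x15..+0x39, vowel signs at
        # +0x3E..+0x4C, virama at +0x4D.
        consonants = set(range(base + 0x15, base + 0x3A)) | ra_like
        vowel_signs = set(range(base + 0x3E, base + 0x4D)) - excluded_vowels
        if (
            c1 in consonants
            and virama == base + 0x4D
            and c2 in joining
            and vowel in vowel_signs
            # "if the second consonant is ra/ro the first one is not also ra/ro"
            and not (c2 in ra_like and c1 in ra_like)
            # Skip unassigned code points that fall inside the ranges.
            and all(unicodedata.category(chr(cp)) != "Cn" for cp in (c1, c2, vowel))
        ):
            return True
    return False
```

Strings are best passed as escapes, e.g. `matches_described_pattern("\u0915\u094d\u0930\u200c\u093e")` for Devanagari ka, virama, ra, ZWNJ, vowel sign aa, so that the source file itself does not contain the sequences being tested.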