Heh, I wasn't aware of the word "phala-form", though that seems Bengali-specific?
Interesting observation about the vowel glyphs, I'll mention this in the post. Initially I missed this because I hadn't realized that the Bengali o vowel crashed (which made me discount this).

Thanks!

-Manish

On Sat, Feb 17, 2018 at 12:22 PM, Philippe Verdy <verd...@wanadoo.fr> wrote:

> I would have preferred that your invented term "left-joining consonants"
> used the usual name "phala forms" (to represent RA or JA/JO after a virama,
> generally named "raphala" or "japhala/jophala").
>
> The reason this bug does not occur with some vowels is that these are
> vowels in two parts, which are first decomposed into two separate glyphs
> reordered in the buffer of glyphs, while other vowels do not need this
> prior mapping and keep their initial direct mapping from their codepoints
> in fonts. This means that it has to do with the way the ZWNJ looks for the
> glyphs of the vowels in the glyph buffer and not in the initial codepoint
> buffer: there's some desynchronization, and more probably an uninitialized
> data field (for the lookup made in handling ZWNJ) if no vowel decomposition
> was done (the same data field is correctly initialized when it is the first
> consonant which takes an alternate form before a virama, like in most
> Indic consonant clusters, because a glyph buffer is created).
>
> Now we have some hints about why the bug does not occur in Kannada or
> Khmer: a glyph buffer is always created, but there was some shortcut made
> in Devanagari, Bengali, and Telugu to allow processing clusters faster
> without always having to create a glyph buffer (to allow reordering glyphs
> before positioning them), and working directly on the codepoint stream.
>
> So it seems related to the fact that OpenType fonts do not need to include
> rules for glyph substitution, but the PHALA forms are represented without
> any glyph substitution, by mapping directly the phala forms in a separate
> table for the consonants.
> Because there has been no code-to-glyph substitution, the glyph buffer is
> not created, but then when processing the ZWNJ, it looks for data in a
> glyph buffer that has still not been initialized (and this is specific to
> the renderers implemented by Apple in iOS and MacOS). This bug does not
> occur if another text rendering engine is used (e.g. in non-Apple web
> browsers).
>
>
> 2018-02-16 19:44 GMT+01:00 Manish Goregaokar <man...@mozilla.com>:
>
>> FWIW I dissected the crashing strings, it's basically all <consonant,
>> virama, consonant, zwnj, vowel> sequences in Telugu, Bengali, Devanagari
>> where the consonant is suffix-joining (ra in Devanagari, jo and ro in
>> Bengali, and all Telugu consonants), the vowel is not Bengali au or o /
>> Telugu ai, and if the second consonant is ra/ro the first one is not also
>> ra/ro (or ro-with-line-through-it).
>>
>> https://manishearth.github.io/blog/2018/02/15/picking-apart-the-crashing-ios-string/
>>
>> -Manish
>>
>> On Thu, Feb 15, 2018 at 10:58 AM, Philippe Verdy via Unicode <
>> unicode@unicode.org> wrote:
>>
>>> That's probably not a bug of Unicode but of MacOS/iOS text renderers
>>> with some fonts using advanced composition features.
>>>
>>> Similar bugs could as well affect the new advanced features added in
>>> Windows or Android to support multicolored emojis, variable fonts,
>>> contextual glyph transforms, style variants, or more font formats (not
>>> just OpenType). The bug may also be in the graphic renderer (incorrect
>>> clipping when drawing the glyph into the glyph cache, with buffer
>>> overflows possibly caused by incorrectly computed splines), and it could
>>> be in the display driver (or in the hardware accelerator having some
>>> limitations on the complexity of multipolygons to fill and to antialias),
>>> causing some infinite recursion loop, or too deep recursion exhausting
>>> the stack limit.
>>>
>>> Finally the bug could be in the OpenType hinting engine moving some
>>> points outside the clipping area (the math theory may say that such
>>> placement of a point outside the clipping area may be impossible, but
>>> various mathematical simplifications and shortcuts are used to simplify
>>> or accelerate the rendering, at the price of some quirks). Even the SVG
>>> standard (in constant evolution) could be affected as well in its
>>> implementation.
>>>
>>> There are tons of possible bugs here.
>>>
>>> 2018-02-15 18:21 GMT+01:00 James Kass via Unicode <unicode@unicode.org>:
>>>
>>>> This article:
>>>> https://techcrunch.com/2018/02/15/iphone-text-bomb-ios-mac-crash-apple/?ncid=mobilenavtrend
>>>>
>>>> The single Unicode symbol referred to in the article results from a
>>>> string of Telugu characters. The article doesn't list or display the
>>>> characters, so Mac users can visit the above link. A link in one of
>>>> the comments leads to a page which does display the characters.
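
For reference, the sequence shape described in the 2018-02-16 message (<consonant, virama, consonant, zwnj, vowel>, with the listed exceptions) can be sketched as a small Python check. This is only an illustration under assumptions: the codepoint ranges, the table layout, and the name matches_reported_pattern are hypothetical choices made here, not taken from the thread or from any renderer, and "vowel" is read as a dependent vowel sign. It says nothing about whether a given system actually crashes.

```python
import unicodedata

ZWNJ = "\u200c"

# Assumed per-script tables (approximate ranges, not from the thread).
# "suffix_joining" = allowed second consonants; None means "any consonant".
# "ra" / "ra_blockers" encode the "first one is not also ra/ro" exception.
SCRIPTS = {
    "Devanagari": dict(consonants=(0x0915, 0x0939), extra=set(),
                       vowel_signs=(0x093E, 0x094C), virama="\u094d",
                       suffix_joining={"\u0930"}, excluded_vowels=set(),
                       ra={"\u0930"}, ra_blockers={"\u0930"}),
    "Bengali": dict(consonants=(0x0995, 0x09B9), extra={"\u09f0"},
                    vowel_signs=(0x09BE, 0x09CC), virama="\u09cd",
                    suffix_joining={"\u09af", "\u09b0"},      # jo, ro
                    excluded_vowels={"\u09cb", "\u09cc"},     # o, au
                    ra={"\u09b0"}, ra_blockers={"\u09b0", "\u09f0"}),
    "Telugu": dict(consonants=(0x0C15, 0x0C39), extra=set(),
                   vowel_signs=(0x0C3E, 0x0C4C), virama="\u0c4d",
                   suffix_joining=None,                       # all consonants
                   excluded_vowels={"\u0c48"},                # ai
                   ra=set(), ra_blockers=set()),
}


def _in_range(ch, bounds, categories):
    """True if ch is an assigned character of the wanted category in bounds."""
    lo, hi = bounds
    return lo <= ord(ch) <= hi and unicodedata.category(ch) in categories


def _is_consonant(script, ch):
    return ch in script["extra"] or _in_range(ch, script["consonants"], ("Lo",))


def _window_matches(script, c1, virama, c2, vowel):
    if virama != script["virama"]:
        return False
    if not (_is_consonant(script, c1) and _is_consonant(script, c2)):
        return False
    allowed = script["suffix_joining"]
    if allowed is not None and c2 not in allowed:
        return False
    if not _in_range(vowel, script["vowel_signs"], ("Mn", "Mc")):
        return False
    if vowel in script["excluded_vowels"]:
        return False
    # Exception: second consonant is ra/ro and the first one is ra/ro too.
    if c2 in script["ra"] and c1 in script["ra_blockers"]:
        return False
    return True


def matches_reported_pattern(text):
    """Scan every 5-codepoint window for the sequence shape from the thread."""
    for i in range(len(text) - 4):
        c1, virama, c2, joiner, vowel = text[i:i + 5]
        if joiner != ZWNJ:
            continue
        if any(_window_matches(s, c1, virama, c2, vowel)
               for s in SCRIPTS.values()):
            return True
    return False


# Telugu ja + virama + nya + ZWNJ + vowel sign aa
print(matches_reported_pattern("\u0c1c\u0c4d\u0c1e\u200c\u0c3e"))  # True
```

The tables are kept as plain data so that a different reading of the exceptions (for example, applying the ra/ro rule to Telugu as well) only needs a change to the dictionary, not to the matching logic.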