Oh, also `vatu`. Seems like that ordering algorithm is indeed relevant.

-Manish

On Sat, Feb 17, 2018 at 11:57 PM, Manish Goregaokar <[email protected]> wrote:

> Ah, looking at that, the OpenType `pstf` feature seems relevant, though I
> cannot get it to crash with Gurmukhi (where the consonant ya is a postform)
>
> -Manish
>
> On Sat, Feb 17, 2018 at 4:40 PM, Philippe Verdy <[email protected]>
> wrote:
>
>> An interesting read:
>>
>> https://docs.microsoft.com/fr-fr/typography/script-development/bengali#reor
>>
>>
>> 2018-02-18 1:30 GMT+01:00 Philippe Verdy <[email protected]>:
>>
>>> My opinion about this bug is that Apple's text renderer dynamically
>>> allocates a glyph buffer only when needed (lazily), but a test is missing
>>> for the lazy construction of this buffer (which is not needed for most
>>> texts, those needing no glyph substitution or reordering, where a single
>>> accessor from the code point can find the glyph data directly by lookup
>>> in font tables), and this is causing a null pointer exception at run time.
>>>
>>> The bug effectively occurs when processing the vowel that occurs after
>>> the ZWNJ, if the code assumes that there's a glyph buffer already
>>> constructed for the cluster, in order to place the vowel over the correct
>>> glyph (which may have been reordered in that buffer).
>>>
>>> Microsoft's text renderer and other engines do not delay the
>>> construction of the glyph buffer, which can be reused for processing the
>>> rest of the text, provided it is correctly reset after processing a
>>> cluster.
>>>
>>>
>>> 2018-02-17 21:54 GMT+01:00 Manish Goregaokar <[email protected]>:
>>>
>>>> Heh, I wasn't aware of the word "phala-form", though that seems
>>>> Bengali-specific?
>>>>
>>>> Interesting observation about the vowel glyphs, I'll mention this in
>>>> the post. Initially I missed this because I hadn't realized that the
>>>> Bengali o vowel crashed (which made me discount this).
>>>>
>>>>
>>>> Thanks!
>>>>
>>>> -Manish
>>>>
>>>> On Sat, Feb 17, 2018 at 12:22 PM, Philippe Verdy <[email protected]>
>>>> wrote:
>>>>
>>>>> I would have liked your invented term "left-joining consonants" to
>>>>> take the usual name "phala forms" (to represent RA or JA/JO after a
>>>>> virama, generally named "raphala" or "japhala/jophala").
>>>>>
>>>>> And the reason this bug does not occur with some vowels is that these
>>>>> are vowels in two parts, which are first decomposed into two separate
>>>>> glyphs reordered in the glyph buffer, while other vowels do not need
>>>>> this prior mapping and keep their initial direct mapping from their
>>>>> codepoints in fonts. This means that it has to do with the way the ZWNJ
>>>>> looks for the glyphs of the vowels in the glyph buffer and not in the
>>>>> initial codepoint buffer: there's some desynchronization, and more
>>>>> probably an uninitialized data field (for the lookup made in handling
>>>>> ZWNJ) if no vowel decomposition was done (the same data field is
>>>>> correctly initialized when it is the first consonant which takes an
>>>>> alternate form before a virama, as in most Indic consonant clusters,
>>>>> because a glyph buffer is created).
>>>>>
>>>>> Now we have some hints about why the bug does not occur in Kannada or
>>>>> Khmer: a glyph buffer is always created, but some shortcut was made in
>>>>> Devanagari, Bengali, and Telugu to allow processing clusters faster,
>>>>> working directly on the codepoint streams without always having to
>>>>> create a glyph buffer (which allows reordering glyphs before
>>>>> positioning them).
>>>>>
>>>>> So it seems related to the fact that OpenType fonts do not need to
>>>>> include rules for glyph substitution, but the PHALA forms are
>>>>> represented without any glyph substitution, by mapping directly the
>>>>> phala forms in a separate table for the consonants.
>>>>> Because there has been no code-to-glyph substitution, the glyph buffer
>>>>> is not created, but then when processing the ZWNJ, it looks for data in
>>>>> a glyph buffer that has still not been initialized (and this is
>>>>> specific to the renderers implemented by Apple in iOS and MacOS). This
>>>>> bug does not occur if another text rendering engine is used (e.g. in
>>>>> non-Apple web browsers).
>>>>>
>>>>>
>>>>> 2018-02-16 19:44 GMT+01:00 Manish Goregaokar <[email protected]>:
>>>>>
>>>>>> FWIW I dissected the crashing strings, it's basically all <consonant,
>>>>>> virama, consonant, zwnj, vowel> sequences in Telugu, Bengali, Devanagari
>>>>>> where the consonant is suffix-joining (ra in Devanagari, jo and ro in
>>>>>> Bengali, and all Telugu consonants), the vowel is not Bengali au or o /
>>>>>> Telugu ai, and if the second consonant is ra/ro the first one is not also
>>>>>> ra/ro (or ro-with-line-through-it).
>>>>>>
>>>>>> https://manishearth.github.io/blog/2018/02/15/picking-apart-the-crashing-ios-string/
>>>>>>
>>>>>> -Manish
>>>>>>
>>>>>> On Thu, Feb 15, 2018 at 10:58 AM, Philippe Verdy via Unicode
>>>>>> <[email protected]> wrote:
>>>>>>
>>>>>>> That's probably not a bug of Unicode but of MacOS/iOS text renderers
>>>>>>> with some fonts using advanced composition features.
>>>>>>>
>>>>>>> Similar bugs could as well affect the new advanced features added in
>>>>>>> Windows or Android to support multicolored emojis, variable fonts,
>>>>>>> contextual glyph transforms, style variants, or more font formats (not
>>>>>>> just OpenType); the bug may also be in the graphic renderer (incorrect
>>>>>>> clipping when drawing the glyph into the glyph cache, with buffer
>>>>>>> overflows possibly caused by incorrectly computed splines), and it
>>>>>>> could be in the display driver (or in the hardware accelerator having
>>>>>>> some limitations on the complexity of multipolygons to fill and to
>>>>>>> antialias), causing some infinite recursion loop, or too deep
>>>>>>> recursion exhausting the stack limit.
>>>>>>>
>>>>>>> Finally the bug could be in the OpenType hinting engine moving some
>>>>>>> points outside the clipping area (the math theory may say that such
>>>>>>> placement of a point outside the clipping area is impossible, but
>>>>>>> various mathematical simplifications and shortcuts are used to
>>>>>>> simplify or accelerate the rendering, at the price of some quirks).
>>>>>>> Even the SVG standard (in constant evolution) could be affected as
>>>>>>> well in its implementation.
>>>>>>>
>>>>>>> There are tons of possible bugs here.
>>>>>>>
>>>>>>> 2018-02-15 18:21 GMT+01:00 James Kass via Unicode
>>>>>>> <[email protected]>:
>>>>>>>
>>>>>>>> This article:
>>>>>>>> https://techcrunch.com/2018/02/15/iphone-text-bomb-ios-mac-crash-apple/?ncid=mobilenavtrend
>>>>>>>>
>>>>>>>> The single Unicode symbol referred to in the article results from a
>>>>>>>> string of Telugu characters. The article doesn't list or display
>>>>>>>> the characters, so Mac users can visit the above link. A link in
>>>>>>>> one of the comments leads to a page which does display the
>>>>>>>> characters.
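
For reference, the <consonant, virama, consonant, zwnj, vowel> description quoted above can be written down as a small predicate. This is only a rough sketch of that description, not anything taken from a renderer: the function and table names are made up for illustration, the code point ranges are approximations, "vowel" is assumed to mean a dependent vowel sign, the suffix-joining test is assumed to apply to the second consonant, and the ra/ro exclusion is assumed to apply to Devanagari and Bengali only. It classifies a sequence of code points and does nothing else.

```python
ZWNJ = "\u200C"  # ZERO WIDTH NON-JOINER

# Per-script tables (hypothetical layout for this sketch). The ranges are
# approximations that span a few unassigned code points, and "vowel" is
# assumed to mean a dependent vowel sign.
SCRIPTS = {
    "Devanagari": {
        "consonants": ("\u0915", "\u0939"),
        "extra_consonants": set(),
        "virama": "\u094D",
        "suffix_joining": {"\u0930"},             # ra
        "vowel_signs": ("\u093E", "\u094C"),
        "excluded_vowels": set(),
        "ra_like": {"\u0930"},
    },
    "Bengali": {
        "consonants": ("\u0995", "\u09B9"),
        "extra_consonants": {"\u09F0"},           # ro with a line through it
        "virama": "\u09CD",
        "suffix_joining": {"\u09AF", "\u09B0"},   # jo (ya), ro
        "vowel_signs": ("\u09BE", "\u09CC"),
        "excluded_vowels": {"\u09CB", "\u09CC"},  # o, au
        "ra_like": {"\u09B0", "\u09F0"},
    },
    "Telugu": {
        "consonants": ("\u0C15", "\u0C39"),
        "extra_consonants": set(),
        "virama": "\u0C4D",
        "suffix_joining": None,                   # None = every consonant
        "vowel_signs": ("\u0C3E", "\u0C4C"),
        "excluded_vowels": {"\u0C48"},            # ai
        "ra_like": set(),                         # assumption: rule not applied
    },
}


def _in_range(ch, bounds):
    lo, hi = bounds
    return lo <= ch <= hi


def matches_described_pattern(s):
    """Return the script name if s is exactly one described sequence, else None."""
    if len(s) != 5:
        return None
    c1, virama, c2, zwnj, vowel = s
    if zwnj != ZWNJ:
        return None
    for name, t in SCRIPTS.items():
        def is_consonant(ch):
            return _in_range(ch, t["consonants"]) or ch in t["extra_consonants"]

        if virama != t["virama"] or not is_consonant(c1) or not is_consonant(c2):
            continue
        # The second consonant has to be suffix-joining in this script.
        if t["suffix_joining"] is not None and c2 not in t["suffix_joining"]:
            continue
        # The vowel must be a vowel sign that is not on the excluded list.
        if not _in_range(vowel, t["vowel_signs"]) or vowel in t["excluded_vowels"]:
            continue
        # If the second consonant is ra/ro, the first must not be ra/ro either.
        if c2 in t["ra_like"] and c1 in t["ra_like"]:
            continue
        return name
    return None
```

With these tables, `matches_described_pattern("\u0C1C\u0C4D\u0C1E\u200C\u0C3E")` returns `"Telugu"`, while the same sequence ending in the Telugu ai sign (`"\u0C48"`) returns `None`.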

