On Wed, 16 Sep 2015 22:56:42 +0100 Daniel Bünzli <daniel.buen...@erratique.ch> wrote:
> On Wednesday, 16 September 2015 at 22:14, Asmus Freytag (t) wrote:
> > "N" doesn't mean "narrow" but "neutral" - that is, the width is
> > given by other considerations.
>
> Ah right! Thanks. Narrow is Na.
>
> So a refined algorithm would be to actually do the summation in each
> grapheme cluster, as I initially wanted to do, with the mapping (F, W
> -> 2), (Na, H -> 1), (N -> 0), and if I get a 0, fall back on 1 or maybe
> try to make an educated guess according to the script/block.

I think you have a problem with U+302E HANGUL SINGLE DOT TONE MARK and U+302F HANGUL DOUBLE DOT TONE MARK, contrary to what I said earlier. They are preposed combining marks with Grapheme_Extend=Yes and EAW=Wide. I'm not sure whether the (legacy & extended) grapheme cluster <U+AC00, U+302E> should occupy 2, 3 or 4 cells. I think 2 cells is wrong, so summation works better after all.

Does anyone know how EAW=Wide was derived for these characters? Apparently they were wide even when they were non-spacing marks (gc=Mn), e.g. in Unicode Version 5.0, so I suspect they were not given individual consideration. I suspect they should be EAW=A(mbiguous).

Richard.
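
For reference, the per-cluster summation from the quoted message could be sketched roughly as below. This is only an illustration, not anyone's actual implementation: the name cluster_width is made up, grapheme cluster segmentation is assumed to have been done elsewhere, and treating EAW=A as 0 (like N) is an assumption, since the quoted mapping does not cover that class.

```python
import unicodedata

# Cells per East_Asian_Width class, following the quoted mapping:
# (F, W -> 2), (Na, H -> 1), (N -> 0).
# "A" (ambiguous) is not in that mapping; counting it as 0 is an assumption.
_EAW_CELLS = {"F": 2, "W": 2, "Na": 1, "H": 1, "N": 0, "A": 0}


def cluster_width(cluster: str) -> int:
    """Estimate the cell width of one grapheme cluster (hypothetical helper).

    `cluster` must already be a single grapheme cluster; this function
    does no segmentation of its own.
    """
    total = sum(_EAW_CELLS[unicodedata.east_asian_width(ch)] for ch in cluster)
    # A sum of 0 falls back on 1, as suggested in the quoted message.
    return total if total > 0 else 1


if __name__ == "__main__":
    # <U+AC00, U+302E>: the result depends on the EAW values in the
    # Unicode database shipped with the Python version in use.
    print(cluster_width("\uAC00\u302E"))
```

With both characters listed as Wide, the cluster <U+AC00, U+302E> sums to 4 cells rather than 2, which is exactly the case in question above.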