Re: [HarfBuzz] hangul shaper patches

Jonathan Kew Mon, 20 Jan 2014 12:31:39 -0800

On 20/1/14 15:26, Dohyun Kim wrote:


> I have just tested this kind of input string and the result is a
> little disappointing:
> Input string <U+1107,U+1109,U+1110,U+1161> is not rendered well. The
> output of current (patched) harfbuzz with UnBatang font is
> [uni1121=0+1024|uniD0C0=2+1024], the expected output being
> [uniA972.xxxx|uni1161.xxxx]

IIRC, <U+1107,U+1109,U+1110> is *not* canonically equivalent to U+A972,
even though it may be perfectly logical to spell the complex jamo as a
sequence of simpler jamo letters.
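
The equivalence claim can be checked against the Unicode Character Database. The following is a minimal sketch using Python's standard `unicodedata` module; the variable names and the `canonically_equivalent` helper are illustrative assumptions, and the result reflects whatever Unicode version is bundled with the interpreter. It is not part of HarfBuzz.

```python
import unicodedata

# Three simple leading jamo, spelled out one after another.
spelled_out = "\u1107\u1109\u1110"
# The single code point for the complex leading jamo.
precomposed = "\uA972"


def canonically_equivalent(a, b):
    """Two strings are canonically equivalent if their NFD forms match."""
    return unicodedata.normalize("NFD", a) == unicodedata.normalize("NFD", b)


# An empty decomposition field means the character has no decomposition mapping.
has_decomposition = unicodedata.decomposition(precomposed) != ""
equivalent = canonically_equivalent(spelled_out, precomposed)

print("U+A972 has a decomposition:", has_decomposition)
print("sequence is canonically equivalent to U+A972:", equivalent)
```

Both values come out false, so no normalization form will turn one spelling into the other.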


> The reason seems to be that we are currently applying "ccmp" opentype
> feature too late. If "ccmp" feature could be applied before the
> process of hangul shaper, the issue would disappear.

Currently, this example fails because the pair <U+1110,U+1161> gets
composed to U+D0C0 during the preprocess_text function, and so by the
time any OpenType features are applied, it's too late.
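
To make the composition step concrete, here is a small sketch of the standard Unicode Hangul syllable composition arithmetic for a leading consonant plus vowel. It is not the HarfBuzz code; `compose_lv` and the constant names are illustrative assumptions, and the cross-check against NFC uses Python's `unicodedata` module.

```python
import unicodedata

# Constants from the Unicode Hangul syllable composition algorithm.
S_BASE, L_BASE, V_BASE = 0xAC00, 0x1100, 0x1161
L_COUNT, V_COUNT, T_COUNT = 19, 21, 28


def compose_lv(lead, vowel):
    """Compose a modern leading jamo and vowel jamo into an LV syllable.

    Returns None when either character is outside the modern ranges.
    """
    l_index = ord(lead) - L_BASE
    v_index = ord(vowel) - V_BASE
    if not (0 <= l_index < L_COUNT and 0 <= v_index < V_COUNT):
        return None
    return chr(S_BASE + (l_index * V_COUNT + v_index) * T_COUNT)


text = "\u1107\u1109\u1110\u1161"

# The last two characters form a composable pair.
composed = compose_lv(text[2], text[3])
print("U+%04X" % ord(composed))

# Plain NFC normalization makes the same choice for the whole string.
nfc = unicodedata.normalize("NFC", text)
print(" ".join("U+%04X" % ord(ch) for ch in nfc))

# U+A972 lies outside the modern leading-jamo range, so it does not
# take part in this arithmetic.
print(compose_lv("\uA972", "\u1161"))
```

Once <U+1110,U+1161> has been replaced by U+D0C0, the U+1110 that a later substitution would need in order to form U+A972 is no longer in the buffer.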

Fixing this is tricky within the current structure of the shaper, as the
main hangul shaper function needs to run before we map the Unicode
characters to glyphs, but the ccmp feature needs to run after the
default char-to-glyph mapping has been done.

Is this actually important? Note that Windows behaves similarly, and so
data that has "spelled-out" representations of complex jamos won't work
there either. AIUI, the recommended practice is to use the precomposed
Unicode characters such as U+A972 directly - and because these do *not*
have decompositions, mixing the two forms will lead to confusion and
problems for users. Perhaps it's better that the non-preferred spelling
does not render "correctly".


JK

_______________________________________________
HarfBuzz mailing list
HarfBuzz@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/harfbuzz
