On 01/29/2015 08:19 PM, Philippe Verdy wrote:

2015-01-29 19:52 GMT+01:00 Karl Williamson <[email protected]>:

    Rule WB4 is

    "Ignore Format and Extend characters, except when they appear at the
    beginning of a region of text.".

    Not clearly stated, but it appears to me that the ZWJ must be
    considered here to be the beginning of a region of text, as we are
    looking at the boundary between it and the "A".  No rule
    specifically mentions ALetter followed by an Extend, so by the
    default rule, WB14

    "Otherwise, break everywhere (including around ideographs)"


All the text is aimed at finding candidate positions for breaks. It
is not very clear that "ignore" is definitive and means that there
cannot be any further breaks before the Format and Extend characters,
except at the beginning of text. So all the remaining rules are ignored:
there was a match and you stop there, with no break before:

   Any  × (Format | Extend)

This is confirmed by the other rules that use the word "otherwise",
including the last one (WB14) that you quote, which is explicitly not applicable.

I don't understand you here. I understand all the words, but I don't see what you're trying to say. My claim is that there should be a rule such as the one you give:

 Any  × (Format | Extend)

but there isn't. I think you may be trying to say that the word "ignore" in this UAX is tantamount to such a rule. I am a native English speaker, and would never have drawn that inference from the text. There are a lot of passages in the Standard that sound like gibberish to me. I know the words' meanings, but the combinations don't make any sense. I don't recall ever having this issue in other standards I've looked at.
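
To make the reading under discussion concrete, here is a minimal Python sketch of what treating "ignore" as the rule "Any × (Format | Extend)" would do. Everything in it is my own illustration, not text from the UAX: wb_class() is a crude stand-in for the real Word_Break property data, break_positions() is a hypothetical name, and only WB1, WB2, WB5 and WB14 are modeled besides WB4 (the newline rules and all the others are left out).

```python
import unicodedata

ZWNJ, ZWJ = "\u200c", "\u200d"


def wb_class(ch):
    """Rough stand-in for the Word_Break property (an assumption, not the real data)."""
    cat = unicodedata.category(ch)
    if ch in (ZWNJ, ZWJ) or cat in ("Mn", "Me"):
        return "Extend"
    if cat == "Cf":
        return "Format"
    if ch.isalpha():
        return "ALetter"
    return "Other"


def break_positions(text):
    """Return the break offsets under the 'Any x (Format | Extend)' reading of WB4."""
    if not text:
        return []
    classes = [wb_class(c) for c in text]
    breaks = [0]                      # WB1: break at start of text
    prev = classes[0]                 # class of the last character that was not ignored
    for i in range(1, len(text)):
        cur = classes[i]
        if cur in ("Extend", "Format"):
            # WB4 as a rule: never break before Format/Extend, and keep
            # 'prev' unchanged so later rules see through the ignored character.
            continue
        if not (prev == "ALetter" and cur == "ALetter"):
            breaks.append(i)          # WB14: otherwise, break everywhere
        prev = cur                    # WB5 (ALetter x ALetter) adds no break
    breaks.append(len(text))          # WB2: break at end of text
    return breaks


print(break_positions("A\u200d"))     # [0, 2]: no break between "A" and ZWJ
print(break_positions("A\u200dB"))    # [0, 3]: the ZWJ is seen through by WB5
print(break_positions("\u200dA"))     # [0, 1, 2]: ZWJ at start of text is not ignored
```

Without the "continue" branch, the ZWJ after "A" would fall through to WB14 and produce a break, which is the outcome I described at the top of the thread.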
_______________________________________________
Unicode mailing list
[email protected]
http://unicode.org/mailman/listinfo/unicode
