At 05:52 AM 11/20/2003, Philippe Verdy wrote:
We need a comprehensive new technical report that lists all the exceptions
to the general category system, as these line-breaking or word-breaking or
grapheme cluster breaking properties are orthogonal to the basic GC system
and to the combining class system.

No we don't.


The GC is quite limited. It can at best capture the 'primary' classification
of a character. For many characters, especially in category Cf, all it knows is
that the character has some behavior that could be interesting, but it is silent
on what that behavior is. The same is largely true for all the P* and Z*
classes, where the rules for line and word breaking are more fine-grained.
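
To make this concrete, here is a minimal Python sketch. It assumes the
standard unicodedata module for the GC lookup. The Line_Break values in the
table are hand-copied from the UAX #14 data file for a few sample characters,
purely for illustration, since the standard library does not expose that
property. The helper name is hypothetical.

```python
import unicodedata
from collections import defaultdict

# Assumption: Line_Break classes hand-copied from the UAX #14 data file
# (LineBreak.txt) for a handful of characters. Illustration only.
SAMPLES = {
    "\u0020": "SP",  # SPACE
    "\u00A0": "GL",  # NO-BREAK SPACE
    "\u2009": "BA",  # THIN SPACE
    "\u0021": "EX",  # EXCLAMATION MARK
    "\u002F": "SY",  # SOLIDUS
    "\u0025": "PO",  # PERCENT SIGN
    "\u00AD": "BA",  # SOFT HYPHEN
    "\u2060": "WJ",  # WORD JOINER
    "\u200B": "ZW",  # ZERO WIDTH SPACE
}


def line_break_classes_by_gc(samples):
    """Group the sample Line_Break classes by General Category."""
    grouped = defaultdict(set)
    for ch, lb in samples.items():
        grouped[unicodedata.category(ch)].add(lb)
    return dict(grouped)


if __name__ == "__main__":
    # Each GC value maps to several different Line_Break classes.
    for gc, lbs in sorted(line_break_classes_by_gc(SAMPLES).items()):
        print(gc, sorted(lbs))
```

Each of the three categories (Zs, Po, Cf) ends up with three distinct
line breaking classes, so the GC alone cannot tell you where a line may break.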

We have two UAXs that deal in detail with these two subjects. Adding a third
UAX on top does not solve a thing.

The expectation that you can derive useful knowledge of text and line boundary
detection from just GC and CC is misguided. You need additional information.

A./




