Some issues with TR14:

1) The version linked to from http://www.unicode.org/versions/Unicode4.0.0/ is an old version, http://www.unicode.org/reports/tr14/tr14-13.html.

2) I note from the latest version of TR14 (http://www.unicode.org/reports/tr14/) and the line breaking data (http://www.unicode.org/Public/UNIDATA/LineBreak.txt) that the characters 200C-200F, ZWNJ/ZWJ/LRM/RLM, have line breaking class CM. This has the rather peculiar consequence that a space followed by any of these characters is treated as ID, so there is no line break opportunity at the beginning of the following word. That might be desirable with ZWJ, if it is taken as requesting some kind of ligature of the space and the following character. But it seems highly undesirable with RLM and LRM. These may be used at the beginning of a "word" made up of characters of undefined directionality, to ensure that it is rendered with the intended directionality even when separated from its context. The line break before such "words" should not be prohibited.
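To make the effect concrete, here is a rough Python sketch of the behaviour described in point 2. It is not an implementation of TR14. The class table is hard-coded from the values reported above rather than read from LineBreak.txt, every other character is assumed to be AL, and the only break opportunity modelled is the one after a space. The function names are hypothetical and exist only for this illustration.

```python
# Illustrative sketch only. The class assignments are the ones reported in
# this message, hard-coded here instead of being parsed from LineBreak.txt.
LINE_BREAK_CLASS = {
    0x0020: "SP",  # SPACE
    0x200C: "CM",  # ZWNJ
    0x200D: "CM",  # ZWJ
    0x200E: "CM",  # LRM
    0x200F: "CM",  # RLM
}


def line_break_class(ch):
    """Hypothetical lookup: anything not in the table is assumed to be AL."""
    return LINE_BREAK_CLASS.get(ord(ch), "AL")


def resolve_combining_marks(text):
    """Return one resolved class per character.

    A CM takes the class of its base character, and a space that acts as
    the base of a CM is changed to ID, as described in the message.
    """
    resolved = []
    for ch in text:
        cls = line_break_class(ch)
        if cls == "CM" and resolved:
            if resolved[-1] == "SP":
                resolved[-1] = "ID"
            cls = resolved[-1]
        resolved.append(cls)
    return resolved


def break_opportunities(text):
    """Return the indices i at which a break before text[i] is allowed.

    Simplified model: a break is allowed only after a space, and never
    between a base character and a following CM. All other TR14 rules
    are deliberately ignored.
    """
    raw = [line_break_class(ch) for ch in text]
    return [
        i
        for i in range(1, len(text))
        if raw[i - 1] == "SP" and raw[i] != "CM"
    ]


if __name__ == "__main__":
    plain = "abc def"
    marked = "abc \u200edef"  # LRM at the start of the second word
    print(resolve_combining_marks(marked))
    print(break_opportunities(plain))
    print(break_opportunities(marked))
```

Under these assumptions, the plain string has a break opportunity before "def", while the string with LRM at the start of the second word has none at that position, because the LRM attaches to the preceding space.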

--
Peter Kirk
[EMAIL PROTECTED] (personal)
[EMAIL PROTECTED] (work)
http://www.qaya.org/




