On Saturday, August 09, 2003 3:11 PM, Kent Karlsson <[EMAIL PROTECTED]> wrote:
> Michael wrote:
> > The Name Police reject this utterly. ZERO WIDTH cannot have an
> > expanding dynamic width.
>
> Then what about ZERO WIDTH SPACE, which, according to TUS3, p. 238,
> "can grow to have a visible width when justified"? And it has the
> NamesList comment:
>     * nominally zero width, but may expand in justification
>
> (But U+0082, BREAK PERMITTED HERE, which otherwise is very similar
> to ZWSP according to 6429, does apparently not allow such
> stretching...)
>
> /kent k

- ZERO WIDTH SPACE would be suitable only if it did not have the "Zs" general category, which qualifies it as whitespace and as a word breaker. The same problem arises with the general category of SPACE and NBSP, which is a good reason to criticize them as base characters for word-like sequences: even with an NBSP there is still a word boundary, which may matter for orthographic and grammatical analysis, since the main difference between SPACE and NBSP lies in the line-breaking behavior, not in the word-breaking behavior.

- BREAK PERMITTED HERE is a control and does not qualify as a base character.

In fact, the gaps to fill depend on the usage:

1) When the isolated diacritic is to be used as a spacing symbol that should not be forcibly glued to the surrounding characters, the NBSP base character is a problem. It also has the wrong character properties, because a combining sequence as a whole should normally inherit the properties of its base character. For this usage, we need something like an "INVISIBLE SYMBOL" base character (with gc=Sk, as for other existing spacing diacritics, and probably with neutral directionality). The combining sequence would have its width adjusted to the widest diacritic(s) applied to that "INVISIBLE SYMBOL" base character. The nearest existing character for this function is ZWSP, but it is whitespace, not a symbol.
2) When the isolated diacritic is to be used as a regular letter within words (e.g.: in Traditional Hebrew), we need something like an "INVISIBLE LETTER" base character (with gc=Lo and neutral directionality). Its width is not necessarily supposed to be adjusted, but rendering engines may adjust it depending on the left or right context. One could then place an isolated circumflex between the two letters of the pair "oo": either the diacritic would be centered over the touching edges of the surrounding spacing base characters, or the renderer would create a sufficient margin on either side to make the isolated diacritic fit. The resulting combining sequence, made of the INVISIBLE LETTER and its non-spacing diacritics, would be mostly non-spacing. But this rendering may be tricky to implement in many cases, and the renderer should be allowed to render it as a spacing diacritic, as for the invisible symbol, except that it would not be a symbol but really a letter that can fit within a word (with applications for elided letters in the middle of an unbreakable word). This function is partially implementable with CGJ, but only if there is a preceding combining sequence or base letter, or with WJ (Word Joiner), but that is a format control and not usable as a base character.

For texts that want to present the isolated diacritic for its normal function as a diacritic, the current best solution is to use the existing (spacing) dotted circle symbol as the base character.
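To make the property argument above easier to check, here is a minimal sketch using Python's standard unicodedata module. It lists the general category and bidirectional class of the existing characters mentioned in this thread, then builds an isolated circumflex on two of them. The names CANDIDATES, describe and isolated are only illustrative. Note that the values come from whatever version of the Unicode Character Database ships with the Python build: current versions list U+200B as Cf rather than Zs, so the output for that character will differ from the classification discussed here.

```python
import unicodedata

# Existing characters discussed in this thread as possible bases or joiners:
# SPACE, NBSP, ZWSP, BREAK PERMITTED HERE, CGJ, WORD JOINER, DOTTED CIRCLE.
CANDIDATES = [0x0020, 0x00A0, 0x200B, 0x0082, 0x034F, 0x2060, 0x25CC]

def describe(cp):
    """Return (code point, general category, bidi class, name) for cp."""
    ch = chr(cp)
    return (
        f"U+{cp:04X}",
        unicodedata.category(ch),
        unicodedata.bidirectional(ch),
        # C1 controls such as U+0082 have no formal character name.
        unicodedata.name(ch, "<no name>"),
    )

def isolated(base, mark="\u0302"):
    """Build a combining sequence: base followed by a combining mark
    (COMBINING CIRCUMFLEX ACCENT by default)."""
    return base + mark

for cp in CANDIDATES:
    print(*describe(cp))

# The sequence as a whole is usually treated according to its base character,
# so the category of the first code point is what matters here.
for base in ("\u00A0", "\u25CC"):
    seq = isolated(base)
    print([f"U+{ord(c):04X}" for c in seq], unicodedata.category(seq[0]))
```

The last sequence printed is the one built on the dotted circle, the visible base that presentation-oriented texts use today.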
However this usage is quite technical and too Unicode-specific, and it is not appropriate everywhere: the dotted circle base character may conflict with other uses of this symbol in a document. (Some other documents prefer, for such presentation forms, a gray-coloured Latin small letter o in rich text such as HTML or RTF. But this still has the problem that a rich-text format like HTML will break the plain text into separate sequences, and the non-grayed diacritic must still be rendered on top of this separate sequence. Which base character can be used in that case? There is currently none, except trying ZWSP, which does not always work. It should preferably be a non-spacing INVISIBLE LETTER rather than a spacing INVISIBLE SYMBOL, which by itself has no defined width, only a minimum width of 0.)

-- Philippe.
Spam not tolerated: any unsolicited message will be reported to your Internet service providers.