On Saturday, August 09, 2003 3:11 PM, Kent Karlsson <[EMAIL PROTECTED]> wrote:

> Michael wrote:
> > The Name Police reject this utterly. ZERO WIDTH cannot have an
> > expanding dynamic width.
> 
> Then what about ZERO WIDTH SPACE, which, according to TUS3, p. 238,
> "can grow to have a visible width when justified"? And it has the
> NamesList comment:
> * nominally zero width, but may expand in justification
> 
> (But U+0082, BREAK PERMITTED HERE, which otherwise is very similar
> to ZWSP according to 6429, does apparently not allow such
> stretching...) 
> 
> /kent k

- ZERO WIDTH SPACE would be suitable only if it did not have the "Zs"
general category, which qualifies it as whitespace and as a word breaker.
In fact the same problem occurs with the general category of SPACE and
NBSP, which is a good reason to criticize them as base characters for
word-like sequences: even with an NBSP there is still a word boundary,
which may matter for orthographic and grammatical analysis, since the
main difference between SPACE and NBSP lies in the line-breaking
behavior, not in the word-breaking behavior.

- BREAK PERMITTED HERE is a control and does not qualify as a base
character.
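
To make the two points above concrete, here is a minimal sketch that prints the
General Category of the characters named so far, using Python's standard
unicodedata module. The names SPACE_LIKE and general_categories are only
illustrative. The values come from whatever version of the Unicode data ships
with the interpreter: later versions of the standard reclassified ZERO WIDTH
SPACE from Zs to Cf, so a current Python will probably not show the "Zs"
discussed here.

```python
import unicodedata

# Characters discussed above; the labels are for display only.
SPACE_LIKE = {
    "SPACE": "\u0020",
    "NO-BREAK SPACE": "\u00A0",
    "ZERO WIDTH SPACE": "\u200B",
    "BREAK PERMITTED HERE": "\u0082",
}


def general_categories(chars):
    """Map each label to the General Category reported by unicodedata."""
    return {label: unicodedata.category(ch) for label, ch in chars.items()}


if __name__ == "__main__":
    # The result depends on the Unicode data bundled with this Python.
    print("Unicode data version:", unicodedata.unidata_version)
    for label, gc in general_categories(SPACE_LIKE).items():
        print(f"{label:22} gc={gc}")
```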

In fact, the gaps to fill depend on the usage:

1) When the isolated diacritic is to be used as a spacing symbol that
should not be forcibly glued to the surrounding characters, the NBSP base
character is a problem. It also has the wrong character properties, since
the whole combining sequence normally inherits the properties of its
base character. For this usage, we need something like an "INVISIBLE SYMBOL"
base character (with gc=Sk, as for the existing spacing diacritics,
and probably with neutral directionality). The combining sequence
would have its width adjusted to the largest diacritic(s) applied to that
"INVISIBLE SYMBOL" base character. The nearest existing character
to fit this function is ZWSP, but it is whitespace, not a symbol.

2) When the isolated diacritic is to be used as a regular letter within
words (e.g. in Traditional Hebrew), we need something like an "INVISIBLE
LETTER" base character (with gc=Lo and neutral directionality). Its
width is not necessarily supposed to be adjusted, but rendering engines may
adjust it depending on the left or right context. One could then use an
isolated circumflex between the two characters of the pair "oo", with the
diacritic either centered on the touching edges of the surrounding spacing
base characters, or given a sufficient margin on either side to make
it fit. The resulting combining sequence of the INVISIBLE
LETTER and its non-spacing diacritics would be mostly non-spacing.
But this rendering may be tricky to implement in many cases, and the
renderer should be allowed to render it as a spacing diacritic, as for the
invisible symbol, except that it would not be a symbol but really a letter that
can fit within a word (with applications for elided letters in the middle of
an unbreakable word). This function is only partially implementable today:
with CGJ, only if there is a preceding combining sequence or base letter; or
with WJ (Word Joiner), but that is a format control and cannot serve as a
base character.
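
The property arguments in the two cases above can be checked with a small
sketch. It assumes Python's unicodedata module. The helper is_possible_base is
hypothetical: it approximates "base character" as any graphic character that
is not a combining mark (categories L, N, P, S and Zs). It verifies that a
sample of existing spacing diacritics report gc=Sk, and that CGJ (a mark) and
WJ (a format control) both fail the test, while NBSP passes it but only as
whitespace.

```python
import unicodedata

# A sample of existing spacing diacritics, all expected to report gc=Sk.
SPACING_DIACRITICS = "\u005E\u0060\u00A8\u00AF\u00B4\u00B8\u02D8\u02DC"


def is_possible_base(ch):
    """Hypothetical helper: True for graphic characters that are not marks."""
    gc = unicodedata.category(ch)
    return gc[0] in "LNPS" or gc == "Zs"


def describe(ch):
    """Return (code point, name, general category, possible base?)."""
    name = unicodedata.name(ch, "<unnamed>")
    return (f"U+{ord(ch):04X}", name, unicodedata.category(ch), is_possible_base(ch))


if __name__ == "__main__":
    # Spacing diacritics first, then NBSP, CGJ and WJ for comparison.
    for ch in SPACING_DIACRITICS + "\u00A0\u034F\u2060":
        print(describe(ch))
```

Neither proposed character exists, so the sketch can only show what the
existing candidates lack: none of them combines "not whitespace" with
"usable as a base" and the category (Sk or Lo) wanted for each case.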

For texts that want to present the isolated diacritic in its normal
function as a diacritic, the current best solution is to use the existing
(spacing) dotted circle symbol as the base character. However, this usage
is quite technical and too Unicode-specific, and it is not appropriate
for all usages, since the dotted circle base character may conflict
with other uses of this symbol in a document. Some documents
prefer to use, for such presentation forms, a gray-coloured Latin small
letter o in rich text like HTML or RTF. This still has the problem
that a rich-text format like HTML breaks the plain text into separate
sequences, and the non-grayed diacritic must still be rendered on top
of this separate sequence: which base character can be used in that
case? There is currently none, except trying ZWSP (which does not always
work). It should rather be a non-spacing INVISIBLE LETTER than
a spacing INVISIBLE SYMBOL (which by itself has no defined width,
just a minimum width of 0).
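
As an illustration of the dotted circle convention, here is a minimal sketch
that builds such a display sequence and inspects it, again assuming Python's
unicodedata module. The function name isolated_mark is only illustrative. The
base reports gc=So, so the sequence as a whole behaves as a symbol, and NFC
leaves it alone because no precomposed form exists for the pair.

```python
import unicodedata

DOTTED_CIRCLE = "\u25CC"


def isolated_mark(mark, base=DOTTED_CIRCLE):
    """Build a display sequence: a base character followed by one combining mark."""
    if not unicodedata.category(mark).startswith("M"):
        raise ValueError("expected a combining mark")
    return base + mark


if __name__ == "__main__":
    seq = isolated_mark("\u0302")  # COMBINING CIRCUMFLEX ACCENT
    print([f"U+{ord(c):04X} gc={unicodedata.category(c)}" for c in seq])
    # True when NFC normalization leaves the two-character sequence unchanged.
    print(unicodedata.normalize("NFC", seq) == seq)
```

Passing another base, for example isolated_mark("\u0302", base="\u00A0"),
gives the NBSP variant criticized above, with whitespace properties instead
of symbol properties.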

-- 
Philippe.
Spam is not tolerated: any unsolicited message will be
reported to your Internet service providers.

