On Friday, August 08, 2003 9:54 PM, Peter Kirk <[EMAIL PROTECTED]> wrote:

> On 08/08/2003 08:54, Philippe Verdy wrote:
> 
> But I'm not sure that ZERO WIDTH SYMBOL is the best name, unless you
> are suggesting other uses in which it really has zero width. Well, it
> might have in a case like line initial holam which shifts on to a
> following silent alef, but that is a rather special case.

I picked "SYMBOL" only to match the property that the other spacing variants
of diacritics already have. "ZERO WIDTH" is probably confusing, but it just
marks the fact that the character has no associated glyph and a null *minimum*
width (which expands to the width of the largest diacritic(s) it is combined with).

Its main role would be to fill the gap for missing spacing versions of existing 
diacritics.

What about the name "INVISIBLE CARRIER SYMBOL"? Note that I avoid any occurrence of 
the term "COMBINING" in the name, because there would be no requirement for this 
character to be followed by any diacritic(s). The character would itself be 
handled as a symbol, in a way similar to the existing spacing diacritics. Those are 
already of category Sk, and are conceptually a combination of the INVISIBLE CARRIER 
SYMBOL and a diacritic; for compatibility purposes they are defined as an 
approximation of the sequence SPACE+diacritic.

It is worth noting that for now it is quite tricky to get an isolated diacritic 
without disappointing results. In some cases the only way to do it is to use what 
Unicode describes as "defective" combining sequences, which are not illegal by 
themselves but whose rendering and interpretation are not guaranteed.

Conversely, Unicode offers a standard way to force the appearance of the dotted 
circle for an isolated diacritic, by using a dotted circle symbol as the base 
character, but that behaviour is not always desirable.
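
To make the three ways of writing an "isolated" diacritic concrete, here is a 
minimal Python sketch using the standard unicodedata module. The helper name 
starts_with_combining_mark is only illustrative, and the check is a deliberate 
simplification of "defective combining sequence": it only looks at whether the 
first character of the string has a Mark general category.

```python
import unicodedata

DOTTED_CIRCLE = "\u25CC"    # DOTTED CIRCLE, a visible base character
COMBINING_ACUTE = "\u0301"  # COMBINING ACUTE ACCENT

def starts_with_combining_mark(text):
    """True if the first character has a Mark category (Mn, Mc or Me).

    Simplified assumption: a string that starts this way has no base
    character for its first mark, so it is treated here as defective.
    """
    return bool(text) and unicodedata.category(text[0]).startswith("M")

isolated = COMBINING_ACUTE                   # no base character at all
on_circle = DOTTED_CIRCLE + COMBINING_ACUTE  # forces a visible dotted circle
on_space = " " + COMBINING_ACUTE             # SPACE used as the base

for label, s in (("isolated", isolated),
                 ("dotted circle", on_circle),
                 ("space", on_space)):
    codes = " ".join(f"U+{ord(c):04X}" for c in s)
    print(f"{label:14} {codes:14} defective={starts_with_combining_mark(s)}")
```

Only the first string is flagged; the other two have a base character, but 
neither gives an invisible base that is itself a symbol, which is the gap the 
proposed character is meant to fill.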

As someone corrected me on this list, SPACE+combining diacritic is allowed by the 
standard, but only as a compatibility equivalent of the spacing diacritics. The 
isolated spacing diacritic is really a symbol (gc=Sk), unlike the base SPACE 
character used in its compatibility decomposition (which has gc=Zs). This means that 
SPACE+combining diacritic does not have the same textual semantics as the already 
encoded spacing diacritics. All of them seem to have the property gc=Sk and are not 
considered letters with gc=Lo, which is why I thought the name "SYMBOL" was 
accurate.
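
The category mismatch can be checked directly. The following is a small Python 
sketch using the standard unicodedata module, with U+00B4 ACUTE ACCENT taken as 
one example of an already encoded spacing diacritic; the helper name describe is 
only illustrative, and the values reported depend on the Unicode data shipped 
with the Python build.

```python
import unicodedata

SPACING_ACUTE = "\u00B4"    # ACUTE ACCENT, an encoded spacing diacritic
SPACE = "\u0020"            # SPACE, the base used in the decomposition
COMBINING_ACUTE = "\u0301"  # COMBINING ACUTE ACCENT

def describe(ch):
    """Return (name, general category, decomposition mapping) of a character."""
    return (unicodedata.name(ch),
            unicodedata.category(ch),
            unicodedata.decomposition(ch))

for ch in (SPACING_ACUTE, SPACE, COMBINING_ACUTE):
    name, gc, decomp = describe(ch)
    print(f"U+{ord(ch):04X} {name:24} gc={gc} decomposition={decomp!r}")

# The equivalence is compatibility only: canonical normalization (NFD)
# leaves U+00B4 alone, while NFKD replaces it with SPACE + U+0301.
canonical = unicodedata.normalize("NFD", SPACING_ACUTE)
compat = unicodedata.normalize("NFKD", SPACING_ACUTE)
print([unicodedata.category(c) for c in canonical])
print([unicodedata.category(c) for c in compat])
```

After NFKD the symbol (Sk) has turned into a space separator (Zs) followed by a 
nonspacing mark (Mn), so the general category of the base is not preserved.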

I also tried to justify a possible code point assignment at U+20CF, where it would 
group more logically, given that the U+02XX block is already full and U+20XX is used 
for both symbols (including currencies) and a set of additional combining diacritics. 
Of course U+20CF is just a suggestion, not something approved or documented.

-- 
Philippe.
Spam not tolerated: any unsolicited message will be
reported to your Internet service providers.

