Re: Display of Isolated Nonspacing Marks (was Re: Questions on ZWNBS...)

Peter Kirk Tue, 05 Aug 2003 18:07:23 -0700

On 05/08/2003 15:09, Mark Davis wrote:

<< Zs, Zl, and Zp are considered format characters, but their membership in the Z (separator) class takes precedence over their membership in the Cf class, because the General Category assigns only a single value to each character. >>
Whenever you have a question about the status of a character, you need
to look it up in the UCD. You can either do that by going through the
Unicode website, or, if you want a more readable interface, use the ICU
character browser, which formats that data.
Look at space, U+0020.

http://oss.software.ibm.com/cgi-bin/icu/ub/utf-8/?go=0020&ch.x=4&ch.y=7

The general category is Space_Separator, *not* a format character.
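
The same lookup can be sketched in code. This is only an illustration, assuming Python's standard unicodedata module as the data source rather than the ICU browser mentioned above; the helper name general_category and the sample list are made up for the example, and the values reported depend on the Unicode version bundled with the Python build. Python prints the short aliases (Zs, Zl, Zp, Cf) rather than long names such as Space_Separator.

```python
import unicodedata


def general_category(code_point):
    """Return the short General Category alias for a code point, e.g. 'Zs'."""
    return unicodedata.category(chr(code_point))


# Hypothetical sample: the three separator categories plus two format characters.
SAMPLES = {
    0x0020: "SPACE",
    0x2028: "LINE SEPARATOR",
    0x2029: "PARAGRAPH SEPARATOR",
    0x200D: "ZERO WIDTH JOINER",
    0xFEFF: "ZERO WIDTH NO-BREAK SPACE",
}

for cp, name in SAMPLES.items():
    # Each character carries exactly one General Category value.
    print(f"U+{cp:04X} {name}: {general_category(cp)}")
```

Each character comes back with a single value, and U+0020 is reported as Zs, not Cf.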
Now the wording there could definitely be clearer, but the operative phrase
is:
...but their membership in the Z (separator) class *takes precedence* over their membership in the Cf class...

So it would be clearer to say something like:
In many ways the characters, Zs, Zl, and Zp, are similar to format
characters, but because their general usage is significantly different
they are broken out into a separate General Category, as Separator
characters.
Mark
__________________________________
http://www.macchiato.com
► “Eppur si muove” ◄

Thank you, Mark. This helps to clarify things, but it still doesn't explicitly answer my question: how do I encode a sentence like "In this language the diacritic ^ may appear above the letters ..." when, instead of ^, I want to use a combining character and display exactly one space before it? Do I encode two spaces or one?
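
To make the two candidates concrete, here is a minimal sketch, assuming Python's standard unicodedata module and using U+0302 COMBINING CIRCUMFLEX ACCENT as a stand-in for the ^ in the example sentence. The strings and the helper before_mark are hypothetical, and the sketch does not settle which encoding is right or how any given renderer will display it; it only shows what precedes the mark in each case.

```python
import unicodedata

MARK = "\u0302"  # COMBINING CIRCUMFLEX ACCENT, standing in for the visible ^

# Hypothetical candidate encodings of the sentence fragment.
one_space = "the diacritic " + MARK + " may appear"
two_spaces = "the diacritic  " + MARK + " may appear"


def before_mark(text, count=2):
    """Return the names of the `count` characters immediately before the mark."""
    i = text.index(MARK)
    return [unicodedata.name(c) for c in text[i - count:i]]


# General Category and canonical combining class of the mark itself.
print(unicodedata.category(MARK), unicodedata.combining(MARK))
print(before_mark(one_space))
print(before_mark(two_spaces))
```

In the one-space candidate the mark directly follows the only space after "diacritic", so that space is the character it would combine with. In the two-space candidate one space remains ahead of the space that carries the mark.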

--
Peter Kirk
[EMAIL PROTECTED]
http://web.onetel.net.uk/~peterkirk/
