> << Zs, Zl, and Zp are considered format characters, but their
> membership in the Z (separator) class takes precedence over their
> membership in the Cf class, because the General Category assigns only
> a single value to each character. >>

Whenever you have a question about the status of a character, you need to look it up in the UCD. You can either do that by going through the Unicode website or, if you want a more readable interface, use the ICU character browser, which formats that data. Look at space, U+0020.

http://oss.software.ibm.com/cgi-bin/icu/ub/utf-8/?go=0020&ch.x=4&ch.y=7

The general category is Space_Separator, *not* a format character.

Now, the wording there could definitely be clearer, but the operative phrase is:

> ...but their
> membership in the Z (separator) class *takes precedence* over their
> membership in the Cf class...

So it would be clearer to say something like:

In many ways the Zs, Zl, and Zp characters are similar to format characters, but because their general usage is significantly different they are broken out into a separate General Category, as Separator characters.

Mark
__________________________________
http://www.macchiato.com
► “Eppur si muove” ◄

----- Original Message -----
From: "Peter Kirk" <[EMAIL PROTECTED]>
To: "Mark Davis" <[EMAIL PROTECTED]>
Cc: "Unicode List" <[EMAIL PROTECTED]>
Sent: Tuesday, August 05, 2003 14:50
Subject: Re: Display of Isolated Nonspacing Marks (was Re: Questions on ZWNBS...)

> On 05/08/2003 14:40, Mark Davis wrote:
>
> >Where did you get the notion that space is not a base character? And
> >base characters include those that are not control or format
> >characters. Space is neither one.
> >
> >The standard specifically states in a number of places that to exhibit
> >a combining mark in isolation you use a space (or NBSP).
> >
> >Mark
> >__________________________________
> >http://www.macchiato.com
> >► “Eppur si muove” ◄
> >
> I got this from the Unicode Standard 4.0, as quoted by Jim Allan:
>
> > In http://www.unicode.org/book/preview/ch03.pdf the space characters
> > in general are given class Zs:
> >
> > << Zs, Zl, and Zp are considered format characters, but their
> > membership in the Z (separator) class takes precedence over their
> > membership in the Cf class, because the General Category assigns only
> > a single value to each character. >>
> >
> > So the various space characters (class Zs) are also classified as
> > format characters.
> >
> > From http://www.unicode.org/book/ch04.pdf:
> >
> > << _D13 Base character:_ a character that does not graphically
> > combine with preceding character, and that is neither control nor a
> > format character. >>
> >
> > Accordingly, by definition, spaces are not base characters.
>
> --
> Peter Kirk
> [EMAIL PROTECTED]
> http://web.onetel.net.uk/~peterkirk/
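
For anyone who would rather check the General Category from code than through the character browser, here is a minimal sketch using Python's standard unicodedata module. Some assumptions to keep in mind: the helper names general_category and is_control_or_format are made up for this illustration, the data comes from whatever UCD version ships with your Python interpreter (not necessarily Unicode 4.0), and testing only for Cc and Cf is one simple reading of the "neither control nor a format character" clause in D13, based on General Category alone.

```python
import unicodedata

# Code points from this thread, plus two characters whose General Category
# really is Cf, for contrast with the separators.
SAMPLES = [
    0x0020,  # SPACE
    0x00A0,  # NO-BREAK SPACE
    0x2028,  # LINE SEPARATOR
    0x2029,  # PARAGRAPH SEPARATOR
    0x200D,  # ZERO WIDTH JOINER
    0xFEFF,  # ZERO WIDTH NO-BREAK SPACE
]


def general_category(cp: int) -> str:
    """Return the two-letter General Category of a code point (hypothetical helper)."""
    return unicodedata.category(chr(cp))


def is_control_or_format(cp: int) -> bool:
    """True if the General Category is Cc or Cf.

    Assumption: this looks only at the single General Category value,
    so the Z (separator) classes are never reported as format characters.
    """
    return general_category(cp) in ("Cc", "Cf")


if __name__ == "__main__":
    for cp in SAMPLES:
        name = unicodedata.name(chr(cp), "<unnamed>")
        print(f"U+{cp:04X}  {general_category(cp)}  "
              f"control/format={is_control_or_format(cp)}  {name}")
```

Because each character carries exactly one General Category value, the space characters come back as Zs rather than Cf, which is the point made above about the Z class taking precedence.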