points, but all the other planes are still present in ISO/IEC 10646,
some of them still being allocated to PUAs that don't have equivalents in
Unicode,
From: Kent Karlsson [EMAIL PROTECTED]
The PUAs above Plane 10 (hex) have been removed from 10646.
Furthermore, the UTF-8 specification for IETF protocol use has been
updated to refer to the Unicode specification of UTF-8. At some
point, the formal specification of UTF-8 in 10646 should be
http://www.dkuug.dk/jtc1/sc2/wg2/docs/n2677
N2677
Proposal for six Hexadecimal digits
Ricardo Cancho Niemietz - individual contribution
2003-10-21
Oh my. Other fun things to think about in relation to this:
http://web.andrewg.com/hexdigits/
Discussed on this list in December 2000:
Are the characters ZWJ, ZWNJ and CGJ base characters, combining
characters, neither, or even both? Which specific character properties
should I look at to decide this?
Are these characters legal within combining character sequences? Can ZWJ
and ZWNJ be used to control ligation of combining
I'm not sure what sort of publicity and outreach
goes with setting up the IUC, but with the next one being held in Washington, it
would seem to be a great opportunity to involve various actors in ICT for
development that are based in the DC area. Also representatives of various
governments from
When I first heard about hexadecimal, I thought that using A-F for
digits lacked imagination, and risked confusion with letters besides. I
made up a set of digits, as I recall, and even names for them.
I'm not completely convinced this is a bad idea. But it's likely.
~mark
Michael Everson
Philippe Verdy verdy underscore p at wanadoo dot fr wrote:
OK but you probably did not notice that the Corrigendum #1, published
after Unicode 4.0, refers to a backward version of ISO/IEC
10646-1:2000, which is not the version ISO/IEC 10646:2003 referred to in
Unicode 4.0. This added reference in
The UTC just approved a clarification
of the base character definition, as follows:
D13a Graphic character: a character with the General Categories of
Letter (L), Combining Mark (M), Number (N), Punctuation (P), Symbol (S), or
Space Separator (Zs).
Graphic characters specifically exclude
(followup) And for checking character properties without having to delve into
the UCD data files, try the ICU Demo at:
http://oss.software.ibm.com/cgi-bin/icu/ub/utf-8/?ch=200B
Mark
__
http://www.macchiato.com
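
For anyone who would rather check locally than through the demo page, here is a minimal sketch using Python's standard `unicodedata` module. The helper name `is_graphic` is made up for illustration, and it covers only the positive part of D13a quoted above (General Category L, M, N, P, S, or Zs). Note that Python ships a later UCD than the 4.0 data discussed in this thread, so individual property values may differ from what was current then.

```python
import unicodedata

# Major General Category classes named in D13a: Letter, Mark, Number,
# Punctuation, Symbol. Space Separator (Zs) is checked separately.
GRAPHIC_MAJOR_CLASSES = {"L", "M", "N", "P", "S"}

def is_graphic(ch):
    """Hypothetical helper: True if ch falls under the D13a categories."""
    gc = unicodedata.category(ch)
    return gc[0] in GRAPHIC_MAJOR_CLASSES or gc == "Zs"

# ZWNJ, ZWJ, CGJ, and a plain letter for comparison.
for cp in (0x200C, 0x200D, 0x034F, 0x0041):
    ch = chr(cp)
    print(f"U+{cp:04X} {unicodedata.name(ch, '?')} "
          f"gc={unicodedata.category(ch)} graphic={is_graphic(ch)}")
```

With current data, ZWJ and ZWNJ report Cf and so fall outside the definition, while CGJ reports Mn and falls inside it.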
- Original Message -
From: Peter Kirk
On 08/11/2003 15:52, Mark Davis wrote:
The UTC just approved a clarification of the base character
definition, as follows:
...
Thank you, Mark. This clarification is useful.
So I conclude that ZWJ and ZWNJ, General Category Cf, are not graphic
characters and so neither base characters nor
I agree with the first part of your analysis. By the phrase "requesting ligation
of combining characters" it is unclear to me what you mean, and whether that is
the right solution to whatever problem you are referring to.
Mark
__
http://www.macchiato.com
-
I'm curious about what name you would give to it.
The name COMBINING CHARACTER JOINER is already used...
In all our discussions we should have used the term "starter" (instead of
just "base character", which is ambiguous) for any characters of combining
class 0, which include:
Base characters
You are stating many things as if they were facts, when they are simply not
true. You should verify them against the definitions before stating them in such
a 'definitive' way.
Examples:
- VS1 is a combining character, and not a base character.
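
One way to verify claims like this against the data rather than from memory is to look at both the General Category and the canonical combining class. The sketch below uses Python's standard `unicodedata` module; the helper name `describe` and its dictionary keys are made up for illustration, and the values come from whatever UCD version the Python build bundles, not necessarily Unicode 4.0.

```python
import unicodedata

def describe(ch):
    """Hypothetical helper: report the properties relevant to this thread."""
    gc = unicodedata.category(ch)    # General Category, e.g. "Mn"
    ccc = unicodedata.combining(ch)  # canonical combining class, 0..254
    return {
        "gc": gc,
        "ccc": ccc,
        "starter": ccc == 0,                 # combining class 0
        "combining_mark": gc.startswith("M"),  # General Category M*
    }

# VS1, CGJ, COMBINING ACUTE ACCENT, and a plain letter for comparison.
for cp in (0xFE00, 0x034F, 0x0301, 0x0041):
    print(f"U+{cp:04X}", describe(chr(cp)))
```

VS1 comes out as a combining mark (Mn) that is nevertheless a starter (combining class 0), which is why the two notions need to be kept apart.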