FW to Unicode ml

From: ernestvandenbooga...@hotmail.com
To: jsb...@mimuw.edu.pl
Subject: RE: statistics
Date: Tue, 12 Oct 2010 10:13:17 +0200

In 5.2, Section 2.4, Table 2-3 lists which General Categories count as "characters". Excluded are: Surrogates, Private Use, Noncharacters and Reserved code points. Note that Format characters (Cf) are included as characters. The code points with formatting aspects in C0 and C1 are Controls (Cc), so they are excluded.

The total number of characters in 6.0 is 109,242 (graphic) + 142 (format) = 109,384.

Regards,
Ernest van den Boogaard

> From: jsb...@mimuw.edu.pl
> To: asm...@ix.netcom.com
> CC: unicode@unicode.org
> Subject: Re: statistics
> Date: Tue, 12 Oct 2010 09:14:21 +0200
>
> On Mon, 11 Oct 2010 Asmus Freytag <asm...@ix.netcom.com> wrote:
>
> > On 10/11/2010 9:49 PM, Janusz S. "Bień" wrote:
> >> On Mon, 11 Oct 2010 announceme...@unicode.org wrote:
> >>
> >>> The newly finalized Unicode Version 6.0 adds 2,088 characters,
> >> What is the current total? Is other statistical information available
> >> somewhere?
> > The announcement gives a link to click through.
> >
> > There you will find more statistics.
>
> I guess you mean "Character Assignment Overview" at
>
> http://www.unicode.org/versions/Unicode6.0.0/
>
> However, it does not provide a precise answer to my primary question,
> which is not purely arithmetic but depends on the definition of a
> character. In particular, do noncharacters count as characters?
>
> Regards
>
> JSB
>
> --
> ,
> dr hab. Janusz S. Bien, prof. UW - Uniwersytet Warszawski (Katedra
> Lingwistyki Formalnej)
> Prof. Janusz S. Bien - Warsaw University (Department of Formal Linguistics)
> jsb...@uw.edu.pl, jsb...@mimuw.edu.pl, http://fleksem.klf.uw.edu.pl/~jsbien/
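
As a rough cross-check of the counting rule in the reply at the top of this thread, the sketch below tallies code points by General Category and leaves out Cc, Cs, Co and Cn. It is only an illustration: Python's `unicodedata` module reflects whatever Unicode version ships with the interpreter, which is unlikely to be 6.0, so the total it prints will normally be larger than 109,384. The names `category_counts`, `character_total` and `EXCLUDED` are made up for this example.

```python
import unicodedata
from collections import Counter

# General Category codes left out of the count, following the reply above:
# Cc = control, Cs = surrogate, Co = private use, Cn = unassigned
# (Cn covers both noncharacters and reserved code points).
EXCLUDED = {"Cc", "Cs", "Co", "Cn"}


def category_counts():
    """Count every code point U+0000..U+10FFFF by General Category."""
    return Counter(unicodedata.category(chr(cp)) for cp in range(0x110000))


def character_total(counts):
    """Sum the categories that the reply above treats as characters."""
    return sum(n for cat, n in counts.items() if cat not in EXCLUDED)


counts = category_counts()
print("Unicode data version:", unicodedata.unidata_version)
print("Format (Cf):", counts["Cf"])
print("Graphic + format:", character_total(counts))

# The 6.0 figure quoted in the thread is plain arithmetic:
print(109_242 + 142)  # 109384
```

Whether controls and private-use code points should also be counted is exactly the definitional question raised in the thread; moving "Cc" or "Co" out of `EXCLUDED` shows how much the answer shifts.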