RE: What's in a wchar_t string ...

2004-03-04 Thread Winkler, Arnold F
Folks,

Since "ISO/IEC 9899 - Programming Language C" was quoted, I wonder if
you are aware of the efforts of SC22/WG14 to develop a Technical Report
that deals with the problems discussed in this thread.  

The document is ISO/IEC DTR 19769 - Extensions for the programming
language C to support new character data types

The project is currently in DTR ballot and will, when approved,
certainly take some time to be implemented in C compilers and in
operating systems.  But it gives a good indication of the direction in
which formal standardization of data types in the C language is going.

Here are some excerpts from the DTR 19769:

Quote:
3 The new typedefs
This Technical Report introduces the following two new typedefs,
char16_t and char32_t:
typedef T1 char16_t;
typedef T2 char32_t;
where T1 has the same type as uint_least16_t and T2 has the same type as
uint_least32_t.
The new typedefs guarantee certain widths for the data types, whereas
the width of wchar_t is implementation defined. The data values are
unsigned, while char and wchar_t could take signed values.
This Technical Report also introduces the new header <uchar.h>.

The new typedefs, char16_t and char32_t, are defined in <uchar.h>.

4 Encoding
C99 subclause 6.10.8 specifies that the value of the macro
__STDC_ISO_10646__ shall be "an integer constant of the form yyyymmL
(for example, 199712L), intended to indicate that values of type wchar_t
are the coded representations of the characters defined by ISO/IEC
10646, along with all amendments and technical corrigenda as of the
specified year and month." C99 subclause 6.4.5p5 specifies that wide
string literals are initialized with a sequence of wide characters as
defined by the mbstowcs function with an implementation-defined current
locale. Analogous to this macro, this Technical Report introduces two
new macros.

If the header <uchar.h> defines the macro __STDC_UTF_16__, values of
type char16_t shall have UTF-16 encoding. This allows the use of UTF-16
in char16_t even when wchar_t uses a non-Unicode encoding. In certain
cases the compile-time conversion to UTF-16 may be restricted to members
of the basic character set and universal character names (\U and \u)
because for these the conversion to UTF-16 is defined unambiguously.

If the header <uchar.h> defines the macro __STDC_UTF_32__, values of
type char32_t shall have UTF-32 encoding.

If the header <uchar.h> does not define the macro __STDC_UTF_16__, the
encoding of char16_t is implementation defined. Similarly, if the header
<uchar.h> does not define the macro __STDC_UTF_32__, the encoding of
char32_t is implementation defined.

An implementation may define other macros to indicate a different
encoding. 
Unquote
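
To make the excerpt a bit more concrete, here is a minimal C sketch.  It
assumes a C11-style toolchain whose library already ships <uchar.h>; the
helper names utf16_guaranteed, utf32_guaranteed and encode_utf16 are
invented for illustration and are not taken from the DTR.

```c
#include <assert.h>   /* static_assert */
#include <limits.h>   /* CHAR_BIT */
#include <stddef.h>   /* size_t */
#include <uchar.h>    /* char16_t, char32_t */

/* Unlike wchar_t, the new typedefs promise minimum widths. */
static_assert(sizeof(char16_t) * CHAR_BIT >= 16, "char16_t has >= 16 bits");
static_assert(sizeof(char32_t) * CHAR_BIT >= 32, "char32_t has >= 32 bits");

/* Both types are unsigned, so converting -1 gives a positive value. */
static_assert((char16_t)-1 > 0, "char16_t is unsigned");
static_assert((char32_t)-1 > 0, "char32_t is unsigned");

/* Hypothetical helper: 1 if the implementation promises UTF-16 in
   char16_t, 0 if the encoding is implementation defined. */
int utf16_guaranteed(void)
{
#ifdef __STDC_UTF_16__
    return 1;
#else
    return 0;
#endif
}

/* Hypothetical helper: the same question for char32_t and UTF-32. */
int utf32_guaranteed(void)
{
#ifdef __STDC_UTF_32__
    return 1;
#else
    return 0;
#endif
}

/* Hypothetical helper: encode one code point as UTF-16.
   Returns the number of char16_t units written (1 or 2), or 0 if cp
   is a surrogate or lies beyond U+10FFFF. */
size_t encode_utf16(char32_t cp, char16_t out[2])
{
    if (cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
        return 0;
    if (cp < 0x10000) {
        out[0] = (char16_t)cp;
        return 1;
    }
    cp -= 0x10000;                               /* 20 bits remain */
    out[0] = (char16_t)(0xD800 + (cp >> 10));    /* high surrogate */
    out[1] = (char16_t)(0xDC00 + (cp & 0x3FF));  /* low surrogate */
    return 2;
}
```

The compile-time checks only restate the width and signedness guarantees
quoted above; whether the values really are UTF-16 or UTF-32 is exactly
what the two macros are meant to tell you.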

The document, which of course is copyrighted by ISO, starts with a nice
introduction that defines the problem.  In addition to the excerpts
above, it also addresses the following subjects:
5 String literals and character constants
5.1 String literals and character constants notations
5.2 The string concatenation 
6 Library functions
6.1 The mbrtoc16 function 
6.2 The c16rtomb function 
6.3 The mbrtoc32 function 
6.4 The c32rtomb function 
7 ANNEX A Unicode encoding forms: UTF-16, UTF-32 
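
For a feel of how the library functions in section 6 might be used, here
is a small sketch built on mbrtoc32.  It assumes a library that provides
<uchar.h> with mbrtoc32 as later standardized in C11; the wrapper name
mb_to_c32 and its error convention are my own invention.

```c
#include <assert.h>   /* assert */
#include <stddef.h>   /* size_t */
#include <string.h>   /* strlen, memset */
#include <uchar.h>    /* char32_t, mbrtoc32 */
#include <wchar.h>    /* mbstate_t */

/* Hypothetical wrapper: decode a multibyte string (in the encoding of
   the current locale) into char32_t values.
   Returns the number of values stored, or (size_t)-1 if the input is
   invalid or truncated, or if out is too small. */
size_t mb_to_c32(const char *src, char32_t *out, size_t out_cap)
{
    assert(src != NULL && out != NULL);

    mbstate_t state;
    memset(&state, 0, sizeof state);    /* initial conversion state */

    size_t remaining = strlen(src);
    size_t n = 0;

    while (remaining > 0) {
        char32_t c;
        size_t used = mbrtoc32(&c, src, remaining, &state);

        if (used == (size_t)-1 || used == (size_t)-2)
            return (size_t)-1;          /* invalid or incomplete input */
        if (used == 0)
            break;                      /* decoded a null character */
        if (n == out_cap)
            return (size_t)-1;          /* output buffer is full */

        out[n++] = c;
        if (used != (size_t)-3) {       /* -3: no input bytes consumed */
            src += used;
            remaining -= used;
        }
    }
    return n;
}
```

Which multibyte encoding is read depends on the locale in effect, so a
real program would call setlocale first; in the default "C" locale only
the basic character set can be relied upon.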

Best regards
Arnold




-Original Message-
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On
Behalf Of Nelson H. F. Beebe
Sent: Wednesday, March 03, 2004 1:49 PM
To: [EMAIL PROTECTED]
Cc: [EMAIL PROTECTED]
Subject: Re: What's in a wchar_t string ...

"Frank Yung-Fong Tang" <[EMAIL PROTECTED]> asks on Wed, 3 Mar 2004
12:38:49 -0500:

>>  Does it also mean wchar_t is 4 bytes if __STDC_ISO_10646__ is
>> defined?
>> or does it only mean wchar_t hold the character in ISO_10646
>> (which mean it could be 2 bytes, 4 bytes or more than that?)

Here is the exact text from

INTERNATIONAL ISO/IEC STANDARD 9899
Second edition
1999-12-01
Programming languages -- C

>> ...
>>  __STDC_ISO_10646__ An integer constant of the form yyyymmL (for
>> example, 199712L), intended to indicate
>> that values of type wchar_t are the coded
>> representations of the characters defined
>> by ISO/IEC 10646, along with all amendments
>> and technical corrigenda as of the
>> specified year and month.
>> ...

It says nothing more about the size of wchar_t, or what encodings are
used: note the vague language "coded representations...".  This means
effectively that the implementation, not the Standard, decides.
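
A short sketch of how one might ask a given implementation what it
decided, assuming a hosted C99-or-later compiler (static_assert needs
C11).  The function names wchar_bits, iso10646_version and
wchar_can_hold are made up for this example.

```c
#include <assert.h>   /* static_assert */
#include <limits.h>   /* CHAR_BIT */
#include <wchar.h>    /* wchar_t, WCHAR_MAX */

/* About the only portable promise on range: WCHAR_MAX is at least 127. */
static_assert(WCHAR_MAX >= 127, "WCHAR_MAX is at least 127");

/* Width of wchar_t in bits on this implementation. */
int wchar_bits(void)
{
    return (int)(sizeof(wchar_t) * CHAR_BIT);
}

/* Value of __STDC_ISO_10646__ (form yyyymmL), or 0 if the
   implementation does not define the macro at all. */
long iso10646_version(void)
{
#ifdef __STDC_ISO_10646__
    return __STDC_ISO_10646__;
#else
    return 0L;
#endif
}

/* Hypothetical helper: nonzero if the numeric value cp fits in one
   wchar_t.  This says nothing about which encoding is actually used. */
int wchar_can_hold(unsigned long cp)
{
    return cp <= (unsigned long long)WCHAR_MAX;
}
```

On an implementation with a 16-bit wchar_t, wchar_can_hold(0x10FFFF)
comes out false, which is the 2-byte versus 4-byte situation the
question is about.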

Very few current Unix C or C++ compilers even define the symbol
__STDC_ISO_10646__; the C/C++ feature test package at

ftp://ftp.math.utah.edu/pub/features
http://www.math.utah.edu/pub/features

probes that macro value, and many others.  

My logs of its runs in about 90 build environments show definitions
with values 29 for GNU gcc versions 3.x (all platforms), Intel icc
versions 7.x and 8.0 (Intel IA-32 and

RE: UNICODE & OTHER STANDARDS

2003-12-29 Thread Winkler, Arnold F
And of course:
COBOL, FORTRAN, C, C++, POSIX, 10176 Characters for identifiers in
programming languages, 14651 string ordering, 15897 registry of cultural
elements, the 8859 family, 15924 names of script, 19769 new character types
in C, and more ...
Arnold
 

-Original Message-
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On
Behalf Of Patrick Andries
Sent: Monday, December 29, 2003 2:38 PM
To: Markus Scherer; unicode
Subject: Re: UNICODE & OTHER STANDARDS


- Original Message - 
From: "Markus Scherer" <[EMAIL PROTECTED]>

> It looks to me like Christopher is not after an analysis of what standards
> could somehow be squeezed to use Unicode charsets, but rather a list of
> standards that _specify_ (actively, not potentially) Unicode/10646.
>
> The obvious ones are of course
> HTML (at least since 4.01: http://www.w3.org/TR/html401/charset.html#h-5.1)
> XML
> ECMAScript
>
> I do not have a complete list.

Another one : ISO 14651 (collation), I believe.

Ken Whistler (or Alain Labonté) can confirm (or deny) this.

P. A.








RE: Stability of WG2

2003-12-16 Thread Winkler, Arnold F



Jill,
 
Speaking as an Austrian, I don't care why the UK does not participate in
SC2/WG2.
 
But I DO appreciate the information that I am not going to see an answer
to this question.  Please be kind to Michael.
 
Regards
Arnold

  
  
From: Arcane Jill [mailto:[EMAIL PROTECTED]
Sent: Tuesday, December 16, 2003 8:41 AM
To: [EMAIL PROTECTED]
Subject: RE: Stability of WG2

Speaking as a Brit, I would like to know the answer to this one too.
What's the problem with answering online?

And if you're really not going to answer this online, you could have
just emailed Peter privately, instead of telling the whole list that
you're going to keep the answer secret from all of us except Peter.
What a wind up!

Jill

> -Original Message-
> From: Michael Everson [mailto:[EMAIL PROTECTED]]
> Sent: Tuesday, December 16, 2003 12:49 PM
> To: [EMAIL PROTECTED]
> Subject: Re: Stability of WG2
>
>
> At 04:36 -0800 2003-12-16, Peter Kirk wrote:
>
> >Seriously, can you remind us briefly what the situation is, why
> >there is no current UK representation?
>
> I will answer this off-line.
> -- 
> Michael Everson * * Everson Typography *  * http://www.evertype.com



RE: Backslash n [OT] was Line Separator and Paragraph Separator

2003-10-21 Thread Winkler, Arnold F
Jill,

The standard is available at
http://www.techstreet.com/cgi-bin/detail?product_id=232462

It is a bargain, the PDF file goes for $18.00 (yes, eighteen USD).  The
printed version is somewhat more expensive, $220.00.

Go order it, and your desire for a reference will be satisfied.

Regards
Arnold

==

-Original Message-
From: Jill Ramonsky [mailto:[EMAIL PROTECTED]
Sent: Tuesday, October 21, 2003 8:31 AM
To: [EMAIL PROTECTED]
Subject: RE: Backslash n [OT] was Line Separator and Paragraph Separator



I am very happy to be corrected.
Thank you very much.

I would also greatly appreciate the "chapter and verse" ... not because 
I want to carry on arguing (I don't), but simply because I would very 
much like to have that standard available to me as a reference work.

Thanks again, and my apologies John,
Jill


 > -Original Message-
 > From: John Cowan [mailto:[EMAIL PROTECTED]
 > Sent: Tuesday, October 21, 2003 1:19 PM
 > To: Jill Ramonsky
 > Cc: [EMAIL PROTECTED]
 > Subject: Re: Backslash n [OT] was Line Separator and
 > Paragraph Separator
 >
 >
 > Jill Ramonsky scripsit:
 >
 > > This is axiomatically *THE* definition. Period. Everything else is
 > > merely quoting, rephrasing or reinterpreting this original.
 >
 > Absolutely not.  The *standard* for the C programming language is now
 > ISO/IEC 9899.


 > Anyone have the standard handy to quote chapter and verse?
 >





RE: Aramaic, Samaritan, Phoenician

2003-07-15 Thread Winkler, Arnold F
I grew up in Austria more than 50 years ago, and trust me, cursive script
was already ancient then.  Yes, we had to learn it (1945 - 1948) in primary
school, but even then it was not used any more (except for some VERY old
people with grey or no hair at all).

I might still be able to read it, but I was never able to write it legibly.
Just checked with my children - writing cursive had disappeared from the
schools altogether before the 1960's. 

Arnold

PS.: I blame the fact that I had to learn to write cursive for my lousy
handwriting today - at least it is a good excuse.

-Original Message-
From: Michael Everson [mailto:[EMAIL PROTECTED]
Sent: Tuesday, July 15, 2003 9:54 AM
To: [EMAIL PROTECTED]
Subject: Re: Aramaic, Samaritan, Phoenician


At 08:42 -0400 2003-07-15, Karljürgen Feuerherm wrote:
>  Michael Everson said:
>  > My native script isn't Hebrew but I am certain that no one who was
>  > could easily read a newspaper article written in Phoenician or
>  > Samaritan letters.
>
>Surely that is not an argument for encoding a separate script, is it?

It is sometimes. :-)

>Most German people I know can't read the German 
>cursive script used say 50 years ago. But the 
>characters clearly correspond to the Latin 
>characters in use today.

The handwriting is difficult to read. One would 
think that in German schools it would be at least 
introduced so children would know about it.
-- 
Michael Everson * * Everson Typography *  * http://www.evertype.com



RE: Historians- what is origin of i18n, l10n, etc.?

2002-10-10 Thread Winkler, Arnold F

Hideki, 

You are most likely right that I18N was used much earlier than I was able to
witness.  I entered the standards game in 1989 (X3/L2) and started with the
POSIX activity sometime in 1991.  

Thanks for remembering.

Arnold

-Original Message-
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]]
Sent: Thursday, October 10, 2002 10:18 AM
To: Winkler, Arnold F
Cc: [EMAIL PROTECTED]; [EMAIL PROTECTED]; [EMAIL PROTECTED]
Subject: Re: Historians- what is origin of i18n, l10n, etc.?


> From: "Winkler, Arnold F" <[EMAIL PROTECTED]>
> Sometime around 1991 in an IEEE P1003.1 (POSIX) meeting, Gary Miller (IBM)
> was writing on the blackboard.  After having spelled out
> Internationalization a few times, he first abbreviated it to I--n and a
> bit later (obviously after counting the letters in between) used I18N.
> Sandra might have been at the meeting, and Keld - they might be able to
> confirm my recollection.

The acronym "I18N" appeared before 1991, since I recall already using
I18N in '89 ;-).

As far as the record shows, this kind of acronym began with S12N
(Scherpenhuizen) at DEC, where it was his email address on DEC VMS.

By 1985, I18N had become an acronym for Internationalization in the I18N
team at DEC, following Scherpenhuizen's S12N convention.

Among the standard organizations, the /usr/group (It became UniForum
later) was the first one using I18N as an acronym for
Internationalization, in '88.

--
hiura@{freestandards.org,OpenI18N.org,li18nux.org,unicode.org,sun.com} 
Chair, Li18nux/Linux Internationalization Initiative, http://www.li18nux.org
Board of Directors, Free Standards Group, http://www.freestandards.org
Architect/Sr. Staff Engineer, Sun Microsystems, Inc, USA   eFAX: 509-693-8356





RE: Historians- what is origin of i18n, l10n, etc.?

2002-10-10 Thread Winkler, Arnold F

Tex,

Here is my recollection:

Sometime around 1991 in an IEEE P1003.1 (POSIX) meeting, Gary Miller (IBM)
was writing on the blackboard.  After having spelled out
Internationalization a few times, he first abbreviated it to I--n and a bit
later (obviously after counting the letters in between) used I18N.  Sandra
might have been at the meeting, and Keld - they might be able to confirm my
recollection.

L10N did not show up until quite some time later.  I have no idea who used
it first.

Regards
Arnold

-Original Message-
From: Tex Texin [mailto:[EMAIL PROTECTED]]
Sent: Thursday, October 10, 2002 2:02 AM
To: NE Localization SIG; Unicoders
Subject: Historians- what is origin of i18n, l10n, etc.?


I was asked about the origin of these acronyms. Does anyone know who
created these or where they were first used?
tex
-- 
-
Tex Texin   cell: +1 781 789 1898   mailto:[EMAIL PROTECTED]
Xen Master  http://www.i18nGuy.com
 
XenCraft    http://www.XenCraft.com
Making e-Business Work Around the World
-




RE: OCR characters

2002-08-16 Thread Winkler, Arnold F

Folks, that is my VERY LAST post on this VERY OLD subject:

In the L2 document register I found L2/98-397
http://www.unicode.org/L2/L2/98396.pdf which is a proposal for ISO/IEC TR
15907, a Type 3 TR for the revision of ISO 1073/II:1976.  

On page 18 is a note that says:

NOTE – The glyphs previously defined with reference numbers 120 (CHARACTER
ERASE) and 121 (GROUP ERASE) have been deleted.

That's the end of my digging in older documents.

And have a nice weekend too !

Arnold



> -Original Message-
> From: Otto Stolz [mailto:[EMAIL PROTECTED]]
> Sent: Friday, August 16, 2002 10:30 AM
> To: Winkler, Arnold F
> Cc: Eric Muller; [EMAIL PROTECTED]
> Subject: Re: OCR characters
> 
> 
> Eric Muller had written:
> 
> > In our OCR fonts, we have two glyphs named "erase" [...]
> 
> > and "grouperase" [...] I suspect those are mandated by these 
> > standards. On the other hand, and I can't find traces of those in 
> > Unicode,
> 




RE: OCR characters

2002-08-16 Thread Winkler, Arnold F

Otto,

I am looking at ISO 1073/II-1976:

The two erase characters are the only members of set #5, reference numbers
are 120 and 121.  The "Remarks" column is empty.  6.4 says: Application
advice is given in the column "Remarks", where it is indicated, inter alia,
which characters are included for general purpose use only and should not be
used for OCR purposes.  (I guess an empty column means that the character
can be used for OCR.)

I have not found any more information in ISO 1073/II:1976.  Sorry

Arnold

> -Original Message-
> From: Otto Stolz [mailto:[EMAIL PROTECTED]]
> Sent: Friday, August 16, 2002 10:30 AM
> To: Winkler, Arnold F
> Cc: Eric Muller; [EMAIL PROTECTED]
> Subject: Re: OCR characters
> 
> 
> Eric Muller had written:
> 
> > In our OCR fonts, we have two glyphs named "erase" [...]
> 
> > and "grouperase" [...] I suspect those are mandated by these 
> > standards. On the other hand, and I can't find traces of those in 
> > Unicode,
> 
> 
> Arnold F. Winkler wrote:
>  > I believe, Eric is talking about the characters on the attached
>  > page 8 of the OCR standard.
> 
> I don't have ISO 1073 at hand,  only the German
> - DIN 66 008 (Jan 1978), which is essentially identical with
>   ISO 1073/I-1976, and
> - DIN 66 009 (Sept. 1977), which is based on, but not identical with,
>   ISO 1073/II-1976.
> 
> DIN 66 008 contains the figure reported by Arnold Winkler. This standard
> does not specify the intended usage of these characters -- not beyond
> their expressive names.
> 
> DIN 66 009 says about the equivalent OCR-B characters (my translation):
>  > In case of a typo, a keyboard-driven device will print the Character
>  > Erase on top of an erroneous character. This will cause the OCR
>  > reading device to ignore this position.
>  > The Group Erase may be either drawn by hand, or printed as discussed
>  > in the previous paragraph. It will cause the OCR reading device to
>  > ignore this position.
> 
> So, these characters would never be read by an OCR device. They would be
> printed only in response to a function key (such as Erase Backwards), but
> never sent (encoded as characters) to a device. This means that they will
> not normally be encoded, hence there will probably be no need to assign
> Unicodes to them.
> 
> The only exception could be a text discussing these characters, and
> their usage. I think, this sort of text would use figures rather than
> characters, to show the effect of overprinting in several variants.
> (The Erase, and the erased, character's positions may slightly differ.)
> 
> So I guess, these characters are deliberately left off Unicode.
> 
> Best wishes,
>Otto Stolz
> 




RE: OCR characters

2002-08-16 Thread Winkler, Arnold F

I believe, Eric is talking about the characters on the attached page 8 of
the OCR standard.

Regards
Arnold

> -Original Message-
> From: Eric Muller [mailto:[EMAIL PROTECTED]]
> Sent: Thursday, August 15, 2002 7:44 PM
> To: [EMAIL PROTECTED]
> Subject: OCR characters
> 
> 
> In our OCR fonts, we have two glyphs named "erase" (looks like a black
> square) and "grouperase" (looks like a long dash). I don't have a copy
> of the OCR standards, but I suspect those are mandated by these
> standards. On the other hand, I can't find traces of those in
> Unicode, so I suspect they have been unified. But with which characters?
> More generally, are there other things like that we should be aware of?
> 
> Thanks,
> Eric.
> 
> 
> 




Page-8-OCR-B.pdf
Description: Binary data


RE: UniCharacter (Re: Codes for codes for codes for... (RE: Chromatic font research))

2002-06-27 Thread Winkler, Arnold F

Folks, WAIT A BIT.  

This method, as tempting as it is, would make all text "not accessible" for
people with visual disabilities.  And, as you all know, Section 508 requires
that any electronic information from the government (e.g. web site) must be
accessible to people with disabilities.  

Here goes a great idea unless we find an accessible way to "display" colors
for the blind ! Assistive Technologies companies - here is your challenge
!!! 

Arnold

> -Original Message-
> From: Rick McGowan [mailto:[EMAIL PROTECTED]]
> Sent: Thursday, June 27, 2002 1:12 PM
> To: [EMAIL PROTECTED]
> Subject: Re: UniCharacter (Re: Codes for codes for codes for... (RE:
> Chromatic font research))
> 
> 
> Tex wrote:
> 
> > Lends a whole new meaning to unification! The single character
> > encoding, UniCharacter!. Just color what you need.
> 
> Yeah! I like Tex's suggestion. It would eliminate all kinds of problems.
> We wouldn't have to worry about encoding anything ever again, because
> users would have all the tools they need to express whatever they wanted
> just by coloring in the bits! And nobody would have any problems
> decoding it!
> 
> The only question that remains is, "how much resolution is enough"? I
> think if we have 512x512 bytes for 256x256 resolution at 16-bits/pixel
> for color, that ought to be enough resolution to satisfy anyone. So each
> character would only require 2,097,152 bits. With all the fancy
> compression schemes we could cook up, that shouldn't pose any difficulty
> at all. And it really ought to appeal to the RAM manufacturers...
> 
> Speaking of compression schemes, we could pick a space of say 32 bits
> and allow people to register the characters they like by NUMBER (!), and
> we could keep a whole technical committee engrossed for decades in
> deciding which proposed pictures were really the same and thus have
> "already been registered", and numbering things, then we could transmit
> information compactly by using the catalog numbers instead of the
> pictures. That might be helpful to users, I'm not sure...
> 
>   Rick
> 
> 
> 




Re: [OT] Re: The exact birthday of French: 0842-02-14

2002-03-28 Thread Winkler, Arnold F

In this thread, the name "Illig" has been mentioned a few times.  Here is
some information about his book(s) on the subject:

Heribert Illig :  "Wer hat an der Uhr gedreht ?" (Wie 300 Jahre Mittelalter
erfunden wurden)  ISBN 3-612-26561-X, ECON Verlag
This book is in German language, I have not seen a translation.

Illig has also written an earlier book, called "Das erfundene Mittelalter".
Its very first edition was called "Karl der Fiktive, genannt Karl der
Große".

Historic articles appeared also in publications such as
"Vorzeit-Frühzeit-Gegenwart" and "Zeitensprünge"

Keep up this discussion, I am enjoying it.

Arnold F. Winkler 
Internationalization Evangelist
Tel: 610-648-2055, NET-385-2055
Fax:  610-695-5473
E-mail:  [EMAIL PROTECTED]





FW: Bar codes using unicode

2002-02-06 Thread Winkler, Arnold F

Found that somewhat old e-mail from Clive, but the web site is still there
...
Good luck  
Arnold

-Original Message-
From: Hohberger, Clive [mailto:[EMAIL PROTECTED]]
Sent: Friday, May 11, 2001 5:34 AM
To: '[EMAIL PROTECTED]'
Subject: Bar codes using unicode


Speaking as a member of the AIM bar code standards committee, there are now
two bar codes which support Unicode.

93i (designed by Sprague Ackey of Intermec) is a linear, error-correcting
barcode that has been issued as an AIM International Technical Standard, and
it encodes Unicode 2.0/2.1. For an overview, see:
http://www.aimglobal.org/standards/symbinfo/93i_overview.htm

Ultracode(r) and Color Ultracode (designed by me; Zebra Technologies
Corporation) are 2-dimensional error-correcting symbologies in the AIM
standards process. The Ultracode symbology is a constant-height, variable
length two-dimensional "linear matrix" using 9-cell high x 2-cell wide tiles
containing 283 different values (originally 47).  Ultracode can encode
either 8-bit, multi-byte or the full 21-bit Unicode 3-series character sets.
Because of the unique way in which characters are encoded, there is little
difference in symbol length when either 8-bit or Unicode encoding is used
with either Latin or non-Latin characters such as Chinese, Japanese and
Korean. UTF-8 is the default input/output. Black & white Ultracode is
scheduled for completion this year... Color Ultracode in 2002.

Anyone wishing a copy of the current Ultracode draft spec should contact me
offline ([EMAIL PROTECTED]) 
Clive




Shape of the US Dollar Sign

2001-09-28 Thread Winkler, Arnold F

Friends,

I got a request I can't answer, but I am sure one of you knows all about
it:

> 
> I'm in the middle of a research for my Commercial Laws
> IV subject, and I need to know what's the official US
> dollar sign: the S crossed by one or two vertical lines?
> is there any law that says so?
> 

Thanks for your help

Arnold F. Winkler 
Internationalization Evangelist
Tel: 610-648-2055, NET-385-2055
Fax:  610-695-5473
E-mail:  [EMAIL PROTECTED]





RE: Limbu script proposal

2001-05-22 Thread Winkler, Arnold F

Folks,

Michael Everson has produced a consolidated document N2339 on Limbu during
the WG2 meeting - I don't have an electronic copy of the document, but I am
sure, we can get it.  It is essentially a combination of the 3 L2 documents.


Arnold

-Original Message-
From: Asmus Freytag [mailto:[EMAIL PROTECTED]]
Sent: Tuesday, May 22, 2001 1:27 PM
To: [EMAIL PROTECTED]
Cc: [EMAIL PROTECTED]
Subject: Limbu script proposal


Mike,

Unicode would like to submit the following script proposals with the intent 
for WG2 to consider them at the Singapore meeting for a future amendment to 
ISO/IEC 10646-1.

You will find the softcopy of the documents on the L2 home page. The 
documents are
L2 001-137 Proposal for Encoding of the Limbu Script
L2 001-138 Summary proposal form (Limbu)
L2 001-139 Printed samples of Limbu

Please follow up with Rick or Ken if you have issues with any of the
contents.

A./

Asmus Freytag
Unicode Liaison to WG2





RE: The Unicode Standard, Version 4.0

2001-04-01 Thread Winkler, Arnold F

Made it UTC/L2 document L2/01-140.  Lisa, can you please put it on the
agenda for the next meeting in Pleasanton.  We might even have some
proposals by then.  

Regards
Arnold

-Original Message-
From: Mark D.K. Whistler [mailto:[EMAIL PROTECTED]]
Sent: Sunday, April 01, 2001 8:52 AM
To: [EMAIL PROTECTED]
Subject: The Unicode Standard, Version 4.0


We are pleased to announce the release of The Unicode Standard, Version 4.0.
The character repertoire in Unicode 4.0 is so far identical to that of
Unicode 3.0.1, but it will soon increase thanks to *your* help.

The primary feature of Unicode 4.0 is that the addition of new code points
has now been entirely *liberalized*. The Unicode Technical Committee,
jointly with all involved national bodies, has verified that
all characters that are likely to be needed for use on computers in the next
millennium have already been added with Version 3.0.1.

In a recent e-mail to this mailing list, a respected official of The Unicode
Consortium wrote: "In other words, unless someone manages to wrest the
standard away  from the two committees and puts up a public website with an
'Encode Your Character Here For Free and Enter Our Sweepstakes!' interface,
I'm not going to worry about 'precious codespace' and neither should anybody
else."

Well, now we can publicly announce that the words of that official were just
an anticipation of the decision that the UTC was in the process of making.
Unicode now has so many empty slots that everybody can get one (or more) and
encode whatever he or she wishes!

Open the form attached to this mail and be the first one to take advantage
of the new mechanism to propose Unicode characters.

