RE: Names for UTF-8 with and without BOM

Joseph Boyle Sat, 02 Nov 2002 10:28:29 -0800

The first time I thought of UTF-8Y it sounded too flippant, but actually it
is fairly self-explanatory if UTF-8 is taken as a given, and has the virtue
of being short.


UTF-8S for signature would also make sense, but is the same as the name of
Toby Phipps's proposal which eventually became CESU-8.

UTF-8J will certainly make sense, after UTC changes all the character names
to Esperanto, conducts its meetings in Esperanto, and publishes TUS in
Esperanto.

If we want to be really explicit, there's UTF-8EFBBBF.
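
(For anyone who wants to check those bytes: EF BB BF is what U+FEFF turns into when it is encoded as UTF-8. Here is a minimal Python sketch, not anything from the thread itself. The helper name has_utf8_bom is made up for illustration; codecs.BOM_UTF8 and the "utf-8-sig" codec are from the Python standard library.)

```python
import codecs

# U+FEFF encoded as UTF-8 gives the three-byte signature EF BB BF.
BOM = "\ufeff".encode("utf-8")


def has_utf8_bom(data: bytes) -> bool:
    """Hypothetical helper: True if the bytes start with the UTF-8 signature."""
    return data.startswith(codecs.BOM_UTF8)


# "utf-8-sig" writes the signature on encode; plain "utf-8" does not.
with_bom = "abc".encode("utf-8-sig")
without_bom = "abc".encode("utf-8")

if __name__ == "__main__":
    print(BOM.hex().upper())
    print(has_utf8_bom(with_bom), has_utf8_bom(without_bom))
```

Decoding with "utf-8-sig" strips the signature if it is present, so it reads either flavour.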

-----Original Message-----
From: William Overington [mailto:WOverington@ngo.globalnet.co.uk]
Sent: Friday, November 01, 2002 10:37 PM
To: [EMAIL PROTECTED]
Cc: [EMAIL PROTECTED]
Subject: Re: Names for UTF-8 with and without BOM


As you have UTF-8N, where the N stands for the word "no", one could possibly
have UTF-8Y, where the Y stands for the word "yes".

Thus one could have the name of the format answering, or not answering, the
following question.

Is there a BOM encoded?

However, using the letter Y has three disadvantages for widespread use: the
letter Y could be confused with the word "why"; the word "yes" is English,
so the designation would be anglocentric; and the letter Y sorts
alphabetically after the letter N.

However, if one considers the international language Esperanto, then the N
would mean "ne", the Esperanto word for "no", and one could use the letter J
to stand for "jes", the Esperanto word for "yes", which is in fact
pronounced exactly the same as the English word "yes".

Thus, I suggest that the three formats could be UTF-8, UTF-8J and UTF-8N,
which would solve the problem in a manner which, being based upon a neutral
language, will hopefully be acceptable to all.

William Overington

2 November 2002
