John, if you are saying that some Unicode characters cannot be represented
in UTF-8, that is incorrect. *Any* Unicode character -- pretty much any
character in the world -- may be represented in UTF-8. For external
representations of Unicode the battle is essentially over, and UTF-8 won.
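
To make the point concrete, here is a minimal Python sketch. The sample
characters and the function names (encoded_sizes,
all_scalar_values_round_trip) are my own illustration, not part of any
library. It encodes a few characters from different scripts in both UTF-8
and UTF-16, and then checks that every Unicode scalar value (every code
point except the surrogate range, which is reserved for UTF-16 and does not
denote characters) survives a round trip through UTF-8.

```python
# Illustrative samples only: one character from each of several scripts,
# plus one code point outside the Basic Multilingual Plane.
SAMPLES = {
    "Latin A": "A",
    "Polish l-stroke": "\u0142",
    "Greek omega": "\u03a9",
    "CJK ideograph": "\u4e2d",
    "Emoji (non-BMP)": "\U0001F600",
}


def encoded_sizes(ch):
    """Return (utf8_bytes, utf16_bytes) for a single character."""
    utf8 = ch.encode("utf-8")
    utf16 = ch.encode("utf-16-be")  # big-endian, no byte-order mark
    # Both encodings are lossless for the same character.
    assert utf8.decode("utf-8") == ch
    assert utf16.decode("utf-16-be") == ch
    return len(utf8), len(utf16)


def all_scalar_values_round_trip():
    """Check that every Unicode scalar value round-trips through UTF-8."""
    for cp in range(0x110000):
        if 0xD800 <= cp <= 0xDFFF:
            continue  # surrogates are not characters; skip them
        ch = chr(cp)
        if ch.encode("utf-8").decode("utf-8") != ch:
            return False
    return True


if __name__ == "__main__":
    for name, ch in SAMPLES.items():
        u8, u16 = encoded_sizes(ch)
        print(f"{name:16} U+{ord(ch):04X}  UTF-8: {u8}  UTF-16: {u16}")
    print("all scalar values round-trip:", all_scalar_values_round_trip())
```

The byte counts differ between the two encodings, but the set of
representable characters is the same; that is a question of efficiency,
not of capability.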

Charles

-----Original Message-----
From: IBM Mainframe Discussion List [mailto:IBM-MAIN@LISTSERV.UA.EDU] On
Behalf Of John Gilmore
Sent: Friday, January 10, 2014 6:51 AM
To: IBM-MAIN@LISTSERV.UA.EDU
Subject: Re: Subject Unicode

I have refrained from saying anything about this topic because I judged that
anything I said would be predictable.  I am a well-known offender, a
flagrant Unicode, i.e., minimally UTF-16, advocate.

Now, however, Charles Mills has pushed me into posting something.   He
writes

<begin extract>
That is called UTF-16. Pretty good but still not very efficient.
</end extract>

As usual, it depends.  If one's problems are always with a single pair of
natural languages, one of which is English (ENG or ENU), which makes little
use of orthographically marked letters, a satisfactory
UTF-8 'solution' may be, indeed usually is, possible.

Something can, that is, be done in a UTF-8 framework with such language
pairs as

o English and French.

o English and German, or even

o English and Polish.

As soon, however, as you need to support

o three or more different roman-alphabet natural languages, or

o a roman-alphabet language and a non-alphabetic Asian language

you need UTF-16.

To put the matter more brutally, any new system being built today and in
particular any new system that is likely to interact, at whatever remove,
with web-based systems should use UTF-16.

The notion that the only efficient representation for character data is an
SBCS one is retrograde at best.  Continuing with it will make trouble for
those who do so; worse, it will ensure that the systems they build are
short-lived.  The ASCII vs EBCDIC dispute is no longer of much interest.
They are both obsolescent, usable safely only in what the international
lawyers call municipal contexts.
