John, if you are saying that there are some Unicode characters that cannot be represented in UTF-8 then that is incorrect. *Any* Unicode character -- pretty much any character in the world -- may be represented in UTF-8. For external representations of Unicode the battle is pretty much over and UTF-8 won.
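For anyone who wants to check this rather than take my word for it, here is a rough Python sketch. The sample characters and the helper names (utf8_len, utf16_len, roundtrips_everywhere) are my own choices for illustration, not anything from the thread. It encodes every Unicode code point except the surrogates (which are not characters) to UTF-8 and back, and prints the byte counts for a few characters in both UTF-8 and UTF-16.

```python
# Sample characters: ASCII, accented Latin, Polish, CJK, and one outside the BMP.
SAMPLES = ["A", "\u00e9", "\u0142", "\u4e2d", "\U0001F600"]


def utf8_len(ch: str) -> int:
    """Number of bytes needed to store ch in UTF-8."""
    return len(ch.encode("utf-8"))


def utf16_len(ch: str) -> int:
    """Number of bytes needed to store ch in UTF-16 (big-endian, no BOM)."""
    return len(ch.encode("utf-16-be"))


def roundtrips_everywhere() -> bool:
    """Encode every Unicode scalar value to UTF-8 and decode it again."""
    for cp in range(0x110000):
        if 0xD800 <= cp <= 0xDFFF:
            continue  # surrogate code points are not characters
        ch = chr(cp)
        if ch.encode("utf-8").decode("utf-8") != ch:
            return False
    return True


if __name__ == "__main__":
    for ch in SAMPLES:
        print(f"U+{ord(ch):04X}  UTF-8: {utf8_len(ch)} bytes  UTF-16: {utf16_len(ch)} bytes")
    print("every scalar value round-trips:", roundtrips_everywhere())
```

The samples need 1, 2, 2, 3 and 4 bytes in UTF-8 and 2, 2, 2, 2 and 4 bytes in UTF-16. Both encodings cover the whole repertoire; they differ only in how many bytes a given character costs.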
Charles

-----Original Message-----
From: IBM Mainframe Discussion List [mailto:IBM-MAIN@LISTSERV.UA.EDU] On Behalf Of John Gilmore
Sent: Friday, January 10, 2014 6:51 AM
To: IBM-MAIN@LISTSERV.UA.EDU
Subject: Re: Subject Unicode

I have refrained from saying anything about this topic because I judged that anything I said would be predictable. I am a well-known offender, a flagrant Unicode, i.e., minimally UTF-16, advocate.

Now, however, Charles Mills has pushed me into posting something. He writes

<begin extract>
That is called UTF-16. Pretty good but still not very efficient.
</end extract>

As usual, it depends. If one's problems are always with a single pair of natural languages, one of which is English (ENG or ENU), which makes little use of orthographically marked letters, a satisfactory UTF-8 'solution' may be, indeed usually is, possible. Something can, that is, be done in a UTF-8 framework with such language pairs as

o English and French,
o English and German, or even
o English and Polish.

As soon, however, as you need to support

o three or more different roman-alphabet natural languages, or
o a roman-alphabet language and a non-alphabetic Asian language

you need UTF-16.

To put the matter more brutally, any new system being built today and in particular any new system that is likely to interact, at whatever remove, with web-based systems should use UTF-16.

The notion that the only efficient representation for character data is an SBCS one is retrograde at best. Continuing with it will make trouble for those who do so; worse, it will ensure that the systems they build are short-lived.

The ASCII vs EBCDIC dispute is no longer of much interest. They are both obsolescent, usable safely only in what the international lawyers call municipal contexts.

----------------------------------------------------------------------
For IBM-MAIN subscribe / signoff / archive access instructions,
send email to lists...@listserv.ua.edu with the message: INFO IBM-MAIN