On 10/6/2015 5:24 AM, Sean Leonard wrote:
> And, why did Unicode deem it necessary to replicate the C1 block at 0x80-0x9F, when all of the control characters (codes) were equally reachable via ESC 4/0 - 5/15? I understand why it is desirable to align U+0000 - U+007F with ASCII, and maybe even U+0000 - U+00FF with Latin-1 (ISO-8859-1). But maybe Windows-1252, MacRoman, and all the other non-ISO-standardized 8-bit encodings got this much right: duplicating control codes is basically a waste of very precious character code real estate.

Because Unicode aligns with ISO 8859-1, so that transcoding from it was a simple zero-fill to 16 bits. 8859-1 was the most widely used single-byte (full 8-bit) ISO standard at the time, and making that transition easy was beneficial, both practically and politically.

Vendor standards all disagreed on the upper range, and it would not have been feasible to single out any one of them. Nobody wanted to follow IBM code page 437 (then still the most widely used single-byte vendor standard).

Note that by "then" I mean dates earlier than those of the final drafts, because many of those decisions go back to the earlier periods when the drafts were first developed.

Also, the overloading of 0x80-0xFF by Windows did not happen all at once. Earlier versions had left much of that space open, but then people realized that, as long as you were still limited to 8 bits, throwing away 32 codes was an issue.

Now, for Unicode, 32 out of 64K values (initially) or 1114112 (now) don't matter, so being "clean" didn't cost much. (Note that even for UTF-8, there's no special benefit to a value being inside that second range of 128 codes.)

Finally, even if the range had not been dedicated to C1, the 32 codes would have had to be given space, because the translation into ESC sequences is not universal. So, in transcoding data, you needed a way to retain the difference between the raw code and the ESC sequence, or your round trip would not be lossless.

A./
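
To make the zero-fill and round-trip points concrete, here is a minimal Python sketch. It is only an illustration: the helper name `latin1_to_code_points` and the sample bytes are made up for this example, and NEL (0x85, whose 7-bit form is ESC 4/5) is picked arbitrarily as the C1 control.

```python
def latin1_to_code_points(data):
    """Hypothetical helper: zero-extend each 8859-1 byte to a code point."""
    # No lookup table is needed: byte value N simply becomes U+00NN.
    return list(data)

sample = bytes([0x41, 0x85, 0xE9])  # 'A', C1 control NEL, 'e' with acute

code_points = latin1_to_code_points(sample)
text = "".join(chr(cp) for cp in code_points)

# Python's built-in latin-1 codec performs the same identity mapping.
assert text == sample.decode("latin-1")

# The raw C1 code and its 7-bit ESC form (ESC 4/5, i.e. ESC 'E') stay
# distinct after transcoding, so the round trip keeps whichever form
# the source data actually used.
raw_c1 = b"\x85".decode("latin-1")
esc_form = b"\x1bE".decode("latin-1")
assert raw_c1 != esc_form
assert raw_c1.encode("latin-1") == b"\x85"
assert esc_form.encode("latin-1") == b"\x1bE"

# In UTF-8, every code point in U+0080..U+00FF takes two bytes, so the
# C1 positions are no cheaper or dearer than the rest of that range.
utf8_lengths = {len(chr(cp).encode("utf-8")) for cp in range(0x80, 0x100)}
```

Had the C1 positions been reassigned to printable characters, the raw byte 0x85 would have needed some other home in the code space to survive the trip back to 8859-1.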