Re: Why Nothing Ever Goes Away

Sean Leonard Tue, 06 Oct 2015 05:30:33 -0700

2. The Unicode code charts are (deliberately) vague about U+0080, U+0081,
and U+0099. All other C1 control codes have aliases to the ISO 6429
set of control functions, but in ISO 6429, those three control codes don't
have any assigned functions (or names).


On 10/5/2015 3:57 PM, Philippe Verdy wrote:

Also the aliases for C1 controls were formally registered in 1983 only for the two ranges U+0084..U+0097 and U+009B..U+009F for ISO 6429.


If I may, I would appreciate another history lesson:

In ISO 2022 / 6429 land, it is apparent that the C1 controls are mainly aliases for ESC 4/0 - 5/15 (@ through _). This might vary depending on what is loaded into the C1 register, but overall, it just seems like saving one byte.


Why was C1 invented in the first place?

And, why did Unicode deem it necessary to replicate the C1 block at 0x80-0x9F, when all of the control characters (codes) were equally reachable via ESC 4/0 - 5/15? I understand why it is desirable to align U+0000 - U+007F with ASCII, and maybe even U+0000 - U+00FF with Latin-1 (ISO-8859-1). But maybe Windows-1252, MacRoman, and all the other non-ISO-standardized 8-bit encodings got this much right: duplicating control codes is basically a waste of very precious character code real estate.


Sean

PS I was not able to turn up ISO 6429:1983, but I did find ECMA-48, 4th Ed., December 1986, which has the following text:

***
5.4 Elements of the C1 Set
These control functions are represented:

- In a 7-bit code by 2-character escape sequences of the form ESC Fe, where ESC is represented by bit combination 01/11 and Fe is represented by a bit combination from 04/00 to 05/15.

- In an 8-bit code by bit combinations from 08/00 to 09/15.
***

This text is seemingly repeated in many analogous standards from ca. 1974 to 1992.
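To make the correspondence in the quoted clause concrete, here is a minimal sketch in Python. It is only an illustration of the two representations, not anything taken from ECMA-48 itself; the function and constant names are made up for this example. It assumes the column/row notation "cc/rr" means the byte value cc * 16 + rr, so that the 8-bit form and the Fe byte differ by a fixed offset of 0x40.

```python
# Sketch: the two representations of a C1 control function.
# All names here are hypothetical, chosen for illustration only.

def bit_combination(column: int, row: int) -> int:
    """Convert column/row notation (e.g. 04/00) to a byte value."""
    if not (0 <= column <= 15 and 0 <= row <= 15):
        raise ValueError("column and row must each be in 0..15")
    return column * 16 + row

ESC = bit_combination(1, 11)       # 01/11 = 0x1B
FE_FIRST = bit_combination(4, 0)   # 04/00 = 0x40, '@'
FE_LAST = bit_combination(5, 15)   # 05/15 = 0x5F, '_'
C1_FIRST = bit_combination(8, 0)   # 08/00 = 0x80
C1_LAST = bit_combination(9, 15)   # 09/15 = 0x9F
OFFSET = C1_FIRST - FE_FIRST       # 0x40

def c1_to_esc_fe(code: int) -> bytes:
    """8-bit C1 bit combination -> 7-bit two-character ESC Fe sequence."""
    if not C1_FIRST <= code <= C1_LAST:
        raise ValueError("not a C1 bit combination")
    return bytes([ESC, code - OFFSET])

def esc_fe_to_c1(seq: bytes) -> int:
    """7-bit ESC Fe sequence -> 8-bit C1 bit combination."""
    if len(seq) != 2 or seq[0] != ESC or not FE_FIRST <= seq[1] <= FE_LAST:
        raise ValueError("not an ESC Fe sequence")
    return seq[1] + OFFSET
```

For example, c1_to_esc_fe(0x9B) gives ESC followed by "[", and esc_fe_to_c1(b"\x1b@") gives 0x80. The saving in the 8-bit form is exactly the one byte mentioned above.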

PPS I happen to have a copy of ANSI X3.41-1974 "American National Standard Code Extension Techniques for Use with the 7-Bit Coded Character Set of [ASCII]". The invention/existence of C1 goes back to this time, as does the use of ESC Fe to invoke C1 characters in a 7-bit code, and 0x80-0x9F to invoke C1 characters in an 8-bit code. (See, in particular, Clauses 5.3.3.1 and 5.3.6). In particular, Clause 7.3.1.2 says: "The use of ESC Fe sequence in an 8-bit environment is contrary to the intention of this standard but, should they occur, their meaning is the same as in the 7-bit environment."

I can appreciate why it was desirable to "fold" C1 in an 8-bit environment into a 7-bit environment with ESC Fe. (If, in fact, that was the direction of standardization: invent a new thing and then devise a coding to express the new thing in the old thing.) It is less obvious why Unicode adopted C1, however, when the trend was to jettison the 94-character Tetris block assignments in favor of a wide-open field for character assignment. Except for the trend in Unicode to "avoid assigning characters when explicitly asked, unless someone implements them without asking, and the implementation catches on, and then just assign the whole lot of them, even when they overlap with existing assignments, and then invent composite characters, which further compound the possible overlapping combinations". 😉
