C was not the first language to use [].

________________________________________
From: IBM Mainframe Assembler List <ASSEMBLER-LIST@LISTSERV.UGA.EDU> on behalf of Jonathan Scott <jonathan_sc...@vnet.ibm.com>
Sent: Monday, February 13, 2023 1:35 PM
To: ASSEMBLER-LIST@LISTSERV.UGA.EDU
Subject: Re: HLASM code page support enhancements

Ze'ev Atlas wrote:
> The real question is why, but really why, IBM had to introduce
> this EBCDIC horror, where symbols like [,], ^ and some less
> significant ones moved around like dry leaves in the fall wind.

That's a bit off-topic, but the answer is "Compatibility".

Old physical printers and terminals had a limited number of different characters which they could print. To make it possible to print national characters, they simply provided an alternative physical set of printable characters for the same internal codes (for example print chain or train for a line printer, or golf ball or daisy wheel for a typewriter-style terminal). This occurred both for EBCDIC and ASCII computer systems.

The same code mappings were used for national language characters in subsequent devices such as display terminals and matrix printers. When computers eventually started to be linked up across international boundaries using networks, this obviously ran into problems.

ASCII systems were dominated by personal computers, which outside the USA mostly ended up using IBM's code page 850 and then moved to a mapping originally known as ECMA-94. That mapping eventually evolved into ISO 8859-1, which is similar to the commonest Windows code page and became the first 256 code points of Unicode.

EBCDIC systems for Western languages for the 3278/9 implemented Country Extended Code Pages (CECP), based on the same Latin-1 character set as ISO 8859-1, allowing a 1-to-1 mapping. For compatibility, however, each national EBCDIC variant still uses the same codes as were used for the early physical printers and terminals, so code page conversion is required to map information between terminals with different code pages.

One of these code pages, international code page 500, was specifically designed to map the characters of US ASCII to the code points supported by physical EBCDIC printers, so for example square brackets were included as printable, replacing some less common EBCDIC characters.

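To make the moving code points concrete, here is a minimal Python sketch. It assumes that Python's bundled "cp037" and "cp500" codecs correspond to IBM code pages 37 and 500. The helper names (code_point, is_latin1_permutation, convert) are made up for this illustration and are not part of any IBM tooling.

```python
# Sketch: compare where a few characters sit in two EBCDIC code pages.
# Assumption: Python's built-in "cp037" and "cp500" codecs correspond to
# IBM code pages 37 (US) and 500 (international). Helper names are hypothetical.

LATIN1 = "".join(chr(i) for i in range(256))


def code_point(ch: str, codepage: str) -> int:
    """Return the single-byte code assigned to ch in the given code page."""
    return ch.encode(codepage)[0]


def is_latin1_permutation(codepage: str) -> bool:
    """True if the 256 byte values decode to exactly the Latin-1 repertoire."""
    decoded = bytes(range(256)).decode(codepage)
    return sorted(decoded) == sorted(LATIN1)


def convert(data: bytes, source: str, target: str) -> bytes:
    """Code page conversion: decode with one code page, re-encode with another."""
    return data.decode(source).encode(target)


if __name__ == "__main__":
    # Same characters, different internal codes per code page.
    for ch in "[]^!|":
        print(f"{ch!r}: cp037=0x{code_point(ch, 'cp037'):02X} "
              f"cp500=0x{code_point(ch, 'cp500'):02X}")

    # Both pages cover the same Latin-1 set, so conversion is lossless.
    for cp in ("cp037", "cp500"):
        print(cp, "covers Latin-1 one-to-one:", is_latin1_permutation(cp))

    text_500 = "A[1]".encode("cp500")
    print(text_500.hex(), "->", convert(text_500, "cp500", "cp037").hex())
```

Because both code pages are permutations of the same 256 characters, converting from one to the other and back returns the original bytes. Only the position of characters such as the square brackets changes.
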
However, apart from this special case, square brackets and braces were rarely used on mainframes (as the C language was also rarely used), so they were not normally considered to be printable characters (except when using the TN print train for text documents). Apart from code page 500, their code points were therefore simply assigned to otherwise unused non-printable positions rather than being consistently assigned across code pages.

The C language on the mainframe adopted the same codes for square brackets as the TN text print train, and a mapping which combines most of US EBCDIC 37 with the C characters from Text mode was defined as what is now Open Systems code page 1047, used in the mainframe Unix environment. (Unfortunately that code page swaps the EBCDIC "not" sign with the "caret" symbol, which still causes a lot of confusion.)

The official solution for international applications was to use GDDM to convert text data automatically between the application or data code pages and the terminal code page. This method is still usable, but it is now more usual for mainframe programs to use separately defined text resources which are stored in the right code page for the target system.

Jonathan Scott, HLASM
IBM Hursley, UK