C was not the first language to use [].

________________________________________
From: IBM Mainframe Assembler List <ASSEMBLER-LIST@LISTSERV.UGA.EDU> on behalf 
of Jonathan Scott <jonathan_sc...@vnet.ibm.com>
Sent: Monday, February 13, 2023 1:35 PM
To: ASSEMBLER-LIST@LISTSERV.UGA.EDU
Subject: Re: HLASM code page support enhancements

Ze'ev Atlas wrote:
> The real question is why, but really why, IBM had to introduce
> this EBCDIC horror, where symbols like [,], ^ and some less
> significant ones moved around like dry leaves in the fall wind.

That's a bit off-topic, but the answer is "Compatibility".

Old physical printers and terminals had a limited number of
different characters which they could print.  To make it
possible to print national characters, they simply provided an
alternative physical set of printable characters for the same
internal codes (for example print chain or train for a line
printer, or golf ball or daisy wheel for a typewriter-style
terminal).

This occurred both for EBCDIC and ASCII computer systems.

The same code mappings were used for national language
characters in subsequent devices such as display terminals and
matrix printers.

When computers eventually started to be linked up across
international boundaries using networks, this obviously ran into
problems.

ASCII systems were dominated by personal computers, which
outside the USA mostly ended up using IBM's code page 850, then
moved to a mapping originally known as ECMA-94.  That mapping
eventually evolved into ISO 8859-1, which is similar to the
commonest Windows code page and became the first 256 code points
of Unicode.

EBCDIC systems for Western languages for the 3278/9 implemented
Country Extended Code Pages (CECP), based on the same Latin-1
character set as ISO 8859-1, allowing a 1-to-1 mapping, but for
compatibility each national EBCDIC variant still uses the same
codes as were used for the early physical printers and terminals,
requiring code page conversion to map information for terminals
with different code pages.
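
As a rough illustration of that 1-to-1 mapping (an editorial
sketch, not part of the original message), the Python below
builds a byte translation table between two EBCDIC Latin-1 code
pages.  It assumes Python's standard "cp037" (US) and "cp500"
(international) codecs; the helper name make_table is
hypothetical.

```python
# Sketch: convert bytes between two EBCDIC Latin-1 code pages.
# Assumes Python's standard "cp037" and "cp500" codecs.

def make_table(src: str, dst: str) -> bytes:
    """Return a 256-byte table mapping each src code point to dst."""
    # Decode every possible byte under src, then re-encode under dst.
    # This only works because both pages hold the same 256 characters.
    return bytes(range(256)).decode(src).encode(dst)

TO_500 = make_table("cp037", "cp500")
TO_037 = make_table("cp500", "cp037")

text_037 = "A[1] = B!".encode("cp037")
text_500 = text_037.translate(TO_500)

print(text_037.hex(" "))         # bytes as stored under code page 037
print(text_500.hex(" "))         # same text, different bytes, under 500
print(text_500.decode("cp500"))  # the characters survive the conversion
```

Because the character sets are identical, the conversion is
lossless in both directions; only the byte values change.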

One of these code pages, international code page 500, was
specifically designed to map the characters of US ASCII to the
code points supported by physical EBCDIC printers, so for
example square brackets were included as printable, replacing
some less common EBCDIC characters.  However, apart from this
special case, square brackets and braces were rarely used on
mainframes (as the C language was also rarely used), so they were
not normally considered to be printable characters, except when
using the TN print train for text documents.  Apart from code
page 500, their code points were therefore simply assigned to
otherwise unused non-printable positions rather than being
consistently assigned across code pages.

The C language on the mainframe adopted the same codes for
square brackets as the TN text print train, and a mapping which
combines most of US EBCDIC 37 with the C characters from Text
mode was defined as what is now Open Systems code page 1047,
used in the mainframe Unix environment.  (Unfortunately that
code page swaps the EBCDIC "not" sign with the "caret" symbol,
which still causes a lot of confusion.)
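
To make the differences concrete, here is a small editorial
sketch (not from the original message) that lists where the
square brackets, caret and "not" sign sit in code pages 037, 500
and 1047.  Python ships codecs for 037 and 500 but none for 1047,
so the CP1047_DIFFS table is an assumption based on the commonly
documented positions where 1047 differs from 037; the helper
names are hypothetical.

```python
# Sketch: locate [, ], ^ and the "not" sign in three EBCDIC code pages.

# ASSUMPTION: the positions where code page 1047 differs from 037.
CP1047_DIFFS = {
    0x5F: "^", 0xB0: "\u00ac",       # caret and "not" sign swapped
    0xAD: "[", 0xBD: "]",            # brackets at the TN text positions
    0xBA: "\u00dd", 0xBB: "\u00a8",  # displaced Y-acute and diaeresis
}

def decode_1047(data: bytes) -> str:
    """Decode as cp037, overriding the assumed 1047 differences."""
    return "".join(
        CP1047_DIFFS.get(b, bytes([b]).decode("cp037")) for b in data
    )

def code_points(chars: str, decode) -> dict:
    """Map each character to its code point under the given decoder."""
    table = decode(bytes(range(256)))
    return {c: f"0x{table.index(c):02X}" for c in chars}

DECODERS = {
    "037": lambda d: d.decode("cp037"),
    "500": lambda d: d.decode("cp500"),
    "1047": decode_1047,
}

for name, decode in DECODERS.items():
    print(name, code_points("[]^\u00ac", decode))
```

The same byte therefore prints as a different symbol depending on
which code page the receiving terminal or printer assumes.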

The official solution for international applications was to use
GDDM to convert text data automatically between the application
or data code pages and the terminal code page.  This method is
still usable, but it is now more usual for mainframe programs to
use separately defined text resources which are stored in the
right code page for the target system.

Jonathan Scott, HLASM
IBM Hursley, UK
