On Mon, 4 Sep 2006 17:03:58 +0200, Chris Mason <[EMAIL PROTECTED]> wrote:
> 
> Now, I've just spotted something odd which may be the reason for your post.
> In the "SNA Formats" manual, GA27-3136-20[2], in Appendix A, "SNA Character
> Sets and Symbol-String Types", which you'd expect to be the model of
> clarity, X'15' is identified as Line Feed - confusion worse confounded. Just
> ignore it. It's wrong. I wonder if there is anyone still reading who has
> access to anyone who might know the authors who can justify this nonsense.
> What I described above is what I have taught and shown to work in
> environments such as Unformatted Systems Services in VTAM, with both "pure"
> SNA devices and, more importantly, with devices which require an EBCDIC
> translation of ASCII characters.
> 
Bad History.  UNIX in ASCII environments customarily uses LF as a
single-character abbreviation for the NL function.  And the "C" language
standard presumes a single character, represented by "\n" in character and
string denotations, for the NL function.  There's likely a causal relation;
I don't know which convention claims primacy.

Meanwhile, the EBCDIC 0x15 code point has long been used for serial device
control (3215?), and as a command separator by VM CP.  Pragmatically, when
"C" compilers (from independents, not IBM) first appeared for 370 computers,
they implemented "\n" as 0x15.  IBM's "C" compiler went along with the
crowd.  And when z/OS Unix Services appeared, it followed the trend.
And to support conversion, IBM supplied a translation table, OEMVS311,
which translates ASCII LF (0x0A) to EBCDIC 0x15 and ASCII NEL (0x85) to
EBCDIC 0x25 to preserve revertibility.
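
A minimal sketch of the two mappings just described, to make the
revertibility point concrete.  It models only the two code points named
above, not the real OEMVS311 table; the names OEMVS311_STYLE, translate and
invert are hypothetical, and every other byte is passed through unchanged
purely for illustration.

```python
# Hypothetical two-entry model of the OEMVS311 behavior described above.
# A real table covers all 256 code points; this one does not.
OEMVS311_STYLE = {
    0x0A: 0x15,  # ASCII LF  -> EBCDIC 0x15
    0x85: 0x25,  # ASCII NEL -> EBCDIC 0x25
}


def invert(table):
    """Build the reverse mapping; possible only because the map is one-to-one."""
    return {target: source for source, target in table.items()}


def translate(data, table):
    """Map each byte through the table; unmapped bytes pass through (illustration only)."""
    return bytes(table.get(byte, byte) for byte in data)


ascii_controls = bytes([0x0A, 0x85])
ebcdic_controls = translate(ascii_controls, OEMVS311_STYLE)
round_trip = translate(ebcdic_controls, invert(OEMVS311_STYLE))

print(ebcdic_controls.hex())         # 1525
print(round_trip == ascii_controls)  # True
```

Because LF and NEL land on distinct EBCDIC code points, the inverse table
recovers the original bytes.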

I agree with the behavior implemented by all these decisions; it's pragmatic
and convenient.  I disagree at the point where OEMVS311 is named a conversion
from ISO8859-1 to IBM-1047.  It isn't, because of the exception you observed.
And non-z/OS programmers (e.g. Linux) attempting to implement conversion between
ISO8859-1 and IBM-1047 by the book overlook obscure footnotes and appendices
which describe the exception.  I believe the Right Thing to do is to preserve
the useful behavior of OEMVS311 while making the documentation correct by
defining a new EBCDIC code page with LF at 0x15 and NEL at 0x25, the de facto
convention.  EBCDIC dogmatists disagree, saying that in such a code page
programmers could no longer rely on NL to perform its historic function (in
fact no such character would be defined at all) on hardware devices (are any
such still in use?), or that a code page which differed from all other EBCDIC
pages in the assignment of any code point below 0x40 can't be called "EBCDIC".

I remain unswayed; I see the need for a code page definition which matches
the translation that OEMVS311 applies to ISO8859-1, but with the names
properly corresponding.  Don't even call it EBCDIC, if that violates your
dogma, but make it available, and make it the nominal definition of the
OEMVS311 conversion.

BTW, VM/CMS facilities converting ISO8859-1 to IBM-1047 generally map
ASCII LF to 0x25 and ASCII NEL to 0x15, thereby introducing an incompatibility
with the prevailing z/OS usage of OEMVS311 as that nominal translation.
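
The incompatibility can be sketched as follows.  This models only the two
code points at issue; the table names and helper functions are hypothetical,
and unmapped bytes are passed through unchanged purely for illustration.

```python
# Hypothetical two-entry models of the two conventions described in this post.
OEMVS311_STYLE = {0x0A: 0x15, 0x85: 0x25}  # z/OS: LF -> 0x15, NEL -> 0x25
VM_CMS_STYLE = {0x0A: 0x25, 0x85: 0x15}    # VM/CMS: LF -> 0x25, NEL -> 0x15


def invert(table):
    """Build the reverse mapping of a one-to-one table."""
    return {target: source for source, target in table.items()}


def translate(data, table):
    """Map each byte through the table; unmapped bytes pass through (illustration only)."""
    return bytes(table.get(byte, byte) for byte in data)


ascii_lf = bytes([0x0A])

# Convert to EBCDIC under the VM/CMS convention ...
on_vm = translate(ascii_lf, VM_CMS_STYLE)

# ... then back to ASCII under the z/OS (OEMVS311-style) convention.
back_on_zos = translate(on_vm, invert(OEMVS311_STYLE))

print(on_vm.hex())        # 25
print(back_on_zos.hex())  # 85
```

A line ending that started as ASCII LF comes back as NEL when the two
conventions are mixed, even though each one is reversible on its own.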

-- gil
-- 
StorageTek
INFORMATION made POWERFUL
