On 07/10/2008 12:24:53 AM, Alan Altmark wrote:
> On Wednesday, 07/09/2008 at 07:33 EDT, Douglas Wooster/Raleigh/[EMAIL
> PROTECTED] wrote:
> > Added one more line separator to the list below.  It usually burns me
> > when I use iconv.
> >
> :
> > NEL   - New/Next Line (ASCII x'85').  May see this when EBCDIC
> > data is translated to ASCII, as with iconv.
>
> NEL?  Not in any codepage I've ever heard of.  All ASCII control
> codes will be 0x01-0x1F.

OK, technically, the ASCII control codes are x'00'-x'1F' and x'7F',
because "ASCII", per se, uses only 7 bits.  But most ASCII-like
codepages are 8-bit, and the term "ASCII" gets applied to them in
common usage.  Control code NEL, x'85', is documented in SA22-7209-01
(ESA/390 Ref. Sum.), page 49, column "ISO-8", and in the official
Unicode standard at http://www.unicode.org/charts/PDF/U0080.pdf (code
chart on page two says 0085 is NEL, description on page three says
"<control> = NEXT LINE (NEL)").  The IBM refsum documents control
codes in the "ISO-8" column for most of the range x'7F'-x'9F'.
Unicode.org documents control codes for the entire range of x'80' -
x'9F' and calls them "C1 controls.  Alias names are those for ISO/IEC
6429:1992."  In http://www.unicode.org/charts/PDF/U0000.pdf ,
Unicode.org calls x'7F' "<control> = DELETE".  Unfortunately, iso.org
wants CHF 64,00 (I don't know what that is in dollars) for a PDF of
the actual ISO8859-1 standard.  IBM's AFP Fonts Technical Reference
for Code Pages, S544-3802-02, is an excellent reference for printable
characters, but never documents control codes.  In other searches
for ISO8859-1-with-control-codes, I was referred to ISO6429 for
ISO8859-1 control code definitions, and I found some links (e.g.
http://www.itscj.ipsj.or.jp/ISO-IR/077.pdf ) describing ISO6429 as
defining "C1" control codes in the ranges x'84' - x'97' and x'9B' -
x'9F'.
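
For anyone who wants to check the ranges above without the PDFs, here
is a minimal sketch using Python's standard unicodedata module.  It
relies on the Unicode general category "Cc" (control), not on the ISO
documents themselves, and the helper name is_control is my own.

```python
import unicodedata

def is_control(cp):
    # "Cc" is the Unicode general category for control characters.
    return unicodedata.category(chr(cp)) == "Cc"

# Collect every control code point in the 8-bit range x'00'-x'FF'.
controls = [cp for cp in range(0x100) if is_control(cp)]

c0 = [cp for cp in controls if cp <= 0x1F]            # x'00'-x'1F'
c1 = [cp for cp in controls if 0x80 <= cp <= 0x9F]    # x'80'-x'9F'
others = [cp for cp in controls if cp not in c0 and cp not in c1]

print("C0 count:", len(c0))
print("C1 count:", len(c1))
print("other controls:", [hex(cp) for cp in others])
print("x'85' is a control:", is_control(0x85))
```

The only control outside the two blocks is x'7F' (DELETE), which
matches the chart descriptions quoted above.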

> In order to ensure fidelity on a round trip
> (EBCDIC->ASCII->EBCDIC), what you will find is that EBCDIC values that
> have no ASCII equivalent will be assigned to ASCII codepoints that have no
> EBCDIC equivalent.
>
> For example, 0x85 in Windows is an ellipsis, which doesn't exist in any
> EBCDIC code page.  In UNIX (ISO 8859-1) and its derivatives, it has no
> meaning.

It has no printable character assigned, but since the first 256 code
points of Unicode copy ISO8859-1, I think x'85' does have a defined
meaning in ISO8859-1: the control code NEL.
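
The difference between the two readings of x'85' can be seen with a
small sketch.  It assumes Python's built-in "iso8859-1" and "cp1252"
codecs as stand-ins for the UNIX and Windows codepages; other tools
may carry slightly different tables.

```python
import unicodedata

raw = b"\x85"

# ISO8859-1 maps every byte to the Unicode code point of the same value.
latin1_char = raw.decode("iso8859-1")

# Windows-1252 puts a printable character at the same byte.
cp1252_char = raw.decode("cp1252")

print(hex(ord(latin1_char)), unicodedata.category(latin1_char))
print(hex(ord(cp1252_char)), unicodedata.name(cp1252_char))
```

Under ISO8859-1 the byte comes out as U+0085, a control code; under
Windows-1252 it comes out as U+2026, the ellipsis.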

> And since NL doesn't exist in ASCII, 0x15 and 0x85 can be safely
> transposed.
>
> If you found an EBCDIC codepage that *did* have an ellipsis, then the
> translation for that particular codepage to Windows (1252) would have to
> find some other ASCII code point to contain the NL.
>
> Alan Altmark
> z/VM Development
> IBM Endicott
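
The 0x15/0x85 transposition described above can be sketched with
Python's built-in cp037 codec.  This is only an illustration under the
assumption that cp037 is representative; iconv's tables for a given
pair of codepages may differ.

```python
# EBCDIC NL is x'15' in codepage 037.
ebcdic_nl = b"\x15"

# Decoding gives U+0085 (NEL), not U+000A (LF).
as_unicode = ebcdic_nl.decode("cp037")

# Written out as ISO8859-1, that is the single byte x'85'.
as_latin1 = as_unicode.encode("iso8859-1")

# The round trip back to EBCDIC recovers the original byte.
round_trip = as_latin1.decode("iso8859-1").encode("cp037")

# Python's str.splitlines() treats U+0085 as a line boundary,
# but tools that only look for LF will see one long line.
ebcdic_text = "one".encode("cp037") + ebcdic_nl + "two".encode("cp037")
decoded = ebcdic_text.decode("cp037")
lines = decoded.splitlines()
lf_only = decoded.split("\n")

print(repr(as_unicode), as_latin1, round_trip)
print(lines, lf_only)
```

This is the case that burns me: the data survives the round trip, but
anything on the ASCII side that splits only on LF never sees a line
break.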

Douglas Wooster

----------------------------------------------------------------------
For LINUX-390 subscribe / signoff / archive access instructions,
send email to [EMAIL PROTECTED] with the message: INFO LINUX-390 or visit
http://www.marist.edu/htbin/wlvindex?LINUX-390
