On 07/10/2008 12:24:53 AM, Alan Altmark wrote:
> On Wednesday, 07/09/2008 at 07:33 EDT, Douglas Wooster/Raleigh/[EMAIL
> PROTECTED] wrote:
> > Added one more line separator to the list below.  It usually burns me
> > when I use iconv.
> >
> > :
> > NEL - New/Next Line (ASCII x'85').  May see this when EBCDIC
> > data is translated to ASCII, as with iconv.
>
> NEL?  Not in any codepage I've ever heard of.  All ASCII control
> controls will be 0x01-0x1F.

OK, technically, the ASCII control codes are x'00'-x'1F' and x'7F',
because "ASCII", per se, uses only 7 bits.  But most ASCII-like codepages
are 8-bit, and the term "ASCII" gets applied to them in common usage.

Control code NEL, x'85', is documented in SA22-7209-01 (ESA/390 Ref.
Sum.), page 49, column "ISO-8", and in the official Unicode standard at
http://www.unicode.org/charts/PDF/U0080.pdf (code chart on page two says
0085 is NEL, description on page three says "<control> = NEXT LINE
(NEL)").

The IBM refsum documents control codes in the "ISO-8" column for most of
the range x'7F'-x'9F'.  Unicode.org documents control codes for the
entire range of x'80'-x'9F' and calls them "C1 controls.  Alias names are
those for ISO/IEC 6429:1992."  In
http://www.unicode.org/charts/PDF/U0000.pdf , Unicode.org calls x'7F'
"<control> = DELETE".

Unfortunately, iso.org wants CHF 64,00 (I don't know what that is in
dollars) for a PDF of the actual ISO8859-1 standard.  IBM's AFP Fonts
Technical Reference for Code Pages, S544-3802-02, is an excellent
reference for printable characters, but never documents control codes.

In other searches for ISO8859-1-with-control-codes, I was referred to
ISO6429 for ISO8859-1 control code definitions, and I found some links
(e.g. http://www.itscj.ipsj.or.jp/ISO-IR/077.pdf ) describing ISO6429 as
defining "C1" control codes in the ranges x'84'-x'97' and x'9B'-x'9F'.

> In order to ensure fidelity on a round trip
> (EBCDIC->ASCII->EBCDIC), what you will find is that EBCDIC values that
> have no ASCII equivalent will be assigned to ASCII codepoints that have no
> EBCDIC equivalent.
>
> For example, 0x85 in Windows is an ellipsis, which doesn't exist in any
> EBCDIC code page.  In UNIX (ISO 8859-1) and its derivatives, it has no
> meaning.

It has no printable character assigned, but since the low part of Unicode
copies ISO8859-1, I think x'85' does have a defined meaning in ISO8859-1.
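
To make the x'85' ambiguity concrete, here is a minimal Python sketch.
It assumes Python's bundled cp037, latin-1, and cp1252 codecs as
stand-ins for the EBCDIC, ISO8859-1, and Windows code pages discussed
above; iconv's own tables may differ depending on the code page chosen.
The helper name is_c0_or_c1_control is made up for illustration.

```python
import unicodedata

NEL = "\u0085"  # C1 control NEXT LINE, as listed in the Unicode U0080 chart


def is_c0_or_c1_control(ch: str) -> bool:
    """True for x'00'-x'1F', x'7F' (DELETE), and the C1 range x'80'-x'9F'."""
    cp = ord(ch)
    return cp <= 0x1F or 0x7F <= cp <= 0x9F


# The same byte means different things in different 8-bit code pages.
as_latin1 = b"\x85".decode("latin-1")  # ISO8859-1: a control, no glyph
as_cp1252 = b"\x85".decode("cp1252")   # Windows-1252: a printable ellipsis

# EBCDIC NL (x'15') decoded with Python's cp037 table (an assumption:
# other EBCDIC code pages or iconv builds may map it differently).
ebcdic_nl = b"\x15"
decoded = ebcdic_nl.decode("cp037")

# Round trip EBCDIC -> Latin-1 bytes -> EBCDIC.
latin1_bytes = decoded.encode("latin-1")
round_trip = latin1_bytes.decode("latin-1").encode("cp037")

print(hex(ord(as_latin1)), unicodedata.category(as_latin1))
print(hex(ord(as_cp1252)), unicodedata.name(as_cp1252))
print(latin1_bytes, round_trip == ebcdic_nl)
print("a\x85b".splitlines())  # Python's str.splitlines treats NEL as a line break
```

With these codecs, the EBCDIC NL byte lands on x'85' and comes back
unchanged, which is the round-trip behaviour described in the quoted
text.  Reading the same byte as Windows-1252 instead gives the ellipsis.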

> And since NL doesn't exist in ASCII, 0x15 and 0x85 can be safely
> transposed.
>
> If you found an EBCDIC codepage that *did* have an ellipsis, then the
> translation for that particular codepage to Windows (1252) would have to
> find some other ASCII code point to contain the NL.
>
> Alan Altmark
> z/VM Development
> IBM Endicott

Douglas Wooster

----------------------------------------------------------------------
For LINUX-390 subscribe / signoff / archive access instructions,
send email to [EMAIL PROTECTED] with the message: INFO LINUX-390 or visit
http://www.marist.edu/htbin/wlvindex?LINUX-390