On Tue, 7 Feb 2012 09:42:05 -0600, Paul Gilmartin <paulgboul...@aim.com> wrote:
>And, speaking of standards, this is a conspicuous violation by z/OS.
>You know CMS.  CMS Pipelines correctly translates:
>
>    IBM-1047   ISO8859-1
>
>    NL 0x15    NL 0x85
>    LF 0x25    LF 0x0a
>
>iconv(1) on Ubuntu Linux correctly does likewise.  (What do the
>various Linuxen for z do?)
>
>iconv(1) on z/OS does:
>
>    IBM-1047   ISO8859-1
>
>    NL 0x15    LF 0x0a
>    LF 0x25    NL 0x85

The IBM Globalization Center of Competence possesses the One Ring that rules 
all IBM code pages.  They provide the code pages and the translations.  The 
GCoC says EBCDIC 0x15 in cp1047 should be translated to 0x85 in cp819 (8859-1), 
as you say.   (The translation table 10470819 on z/VM gives you the 
GCoC-defined translation.)   In the traditional System z world, we store text 
files as records.  In that world, the 0x15 has significance only in device 
drivers and then, typically only to SCS printers.  z/VM keeps the history 
alive, but I digress....

The rub is, of course, POSIX.  When in a POSIX frame of mind, the 0x15 again 
has significance, being the IBM-chosen value for the <newline> required by the 
POSIX standard.   Now the POSIX translation rules apply.   It's an inherent 
fugue state.   IMO, if you use iconv in z/OS outside of USS and explicitly tell 
it IBM-1047 and IBM-819, it should convert it as you describe since to do 
otherwise destroys the ability of other platforms to reliably translate the 
file back to code page 1047.
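
To make the round-trip problem concrete, here is a minimal sketch in Python. The two dictionaries hold only the NL and LF entries from the tables quoted above; every other byte is passed through unchanged, which is an assumption made purely to keep the example short and is not a real code page conversion. The names GCOC, ZOS_ICONV, translate and invert are hypothetical and are not part of any IBM or Python API.

```python
# Two-entry excerpts of IBM-1047 -> ISO8859-1, copied from the tables
# quoted earlier in this message. All other bytes are left untouched
# here (an illustrative simplification, not a real translation).
GCOC = {0x15: 0x85, 0x25: 0x0A}       # NL -> NL, LF -> LF
ZOS_ICONV = {0x15: 0x0A, 0x25: 0x85}  # NL -> LF, LF -> NL


def translate(data: bytes, table: dict) -> bytes:
    """Map each byte through the table; unmapped bytes pass through."""
    return bytes(table.get(b, b) for b in data)


def invert(table: dict) -> dict:
    """Build the reverse mapping (ISO8859-1 -> IBM-1047)."""
    return {target: source for source, target in table.items()}


original = bytes([0x15, 0x25])  # EBCDIC NL followed by EBCDIC LF

# Converted on z/OS, then converted back on a platform that uses the
# GCoC-defined table: NL and LF come back swapped.
via_zos = translate(translate(original, ZOS_ICONV), invert(GCOC))

# Converted and converted back with the GCoC table on both sides.
via_gcoc = translate(translate(original, GCOC), invert(GCOC))

print(via_zos.hex(), via_gcoc.hex())
```

The point of the sketch is only that the conversion is reversible when both ends agree on the table, and not reversible when one end swaps NL and LF.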

>I have a similar problem with the misbehavior of LC_COLLATE=En_US
>in z/OS LE.  IBM is trying to tell me it's an ASCII vs. EBCDIC problem.

From a POSIX perspective, collation order should be the same on all platforms.
The characters appear in a defined order, without regard to the 
platform-specific code point assigned to the characters.  For example, both 
En_US and the POSIX locales sort numbers, then upper case, then lower case.   
This is consistent with the byte sort order in ASCII.  They have different 
sort orders for the control characters, and POSIX only deals with 7 bits.
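
As a rough illustration of why raw byte order cannot stand in for collation order on an EBCDIC system, the Python sketch below sorts the same three strings by their ASCII bytes and by their EBCDIC bytes. It uses Python's cp037 codec as a stand-in, on the assumption that no cp1047 codec is available in the standard library; the letters and digits sit at the same code points in both code pages, so the result should carry over. It says nothing about what z/OS LE actually does for LC_COLLATE.

```python
# Same strings, ordered by raw byte value under two encodings.
samples = ["a", "A", "1"]

# ASCII byte order: digits, then upper case, then lower case.
ascii_order = sorted(samples, key=lambda s: s.encode("ascii"))

# EBCDIC byte order (cp037 used as a stand-in for IBM-1047):
# lower case, then upper case, then digits.
ebcdic_order = sorted(samples, key=lambda s: s.encode("cp037"))

print(ascii_order)   # ['1', 'A', 'a']
print(ebcdic_order)  # ['a', 'A', '1']
```

A locale-defined collation has to produce the first ordering on both platforms, which means an EBCDIC implementation cannot simply compare bytes.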

Alan Altmark
IBM

----------------------------------------------------------------------
For IBM-MAIN subscribe / signoff / archive access instructions,
send email to lists...@bama.ua.edu with the message: INFO IBM-MAIN
