Re: Unicode Services translation question

2012-05-23 Thread Charles Mills

I *think* I understand the problem. I am going to have to revisit the
documentation and then the code.

When I *re*-parameterize the service with a different CCSID, it is not
"taking." I have a "handle" or something that is not getting re-initialized.
I just restarted the program with 01047/01252 and I get the expected
results: 6A is going to A6 and B0 is going to AC.

In earlier tests I had been starting with 01047/01208 and then
re-configuring the STC. I am calling CUNLINFO and CUNLCNV with parms that
look right, but the service is using the original values from the first
CUNLINFO call, not those from the current call.

The above is not a very clear exposition. It has been a year or more since I
wrote this code. I am going to have to re-visit the documentation for
Unicode Services.

Thanks everyone for your help, especially Kirk.

Charles

--
For IBM-MAIN subscribe / signoff / archive access instructions,
send email to lists...@bama.ua.edu with the message: INFO IBM-MAIN

Re: Unicode Services translation question

2012-05-23 Thread Charles Mills

Here is what showtrtab shows (pretty much as expected: 6A->A6 and B0 -> AC).

$ ./showtrtab -s 1047 -t 1252  
00:  00 01 02 03   9C 09 86 7F   97 8D 8E 0B   0C 0D 0E 0F 
10:  10 11 12 13   9D 0A 08 87   18 19 92 8F   1C 1D 1E 1F 
20:  80 81 82 83   84 85 17 1B   88 89 8A 8B   8C 05 06 07 
30:  90 91 16 93   94 95 96 04   98 99 9A 9B   14 15 9E 1A 
40:  20 A0 E2 E4   E0 E1 E3 E5   E7 F1 A2 2E   3C 28 2B 7C 
50:  26 E9 EA EB   E8 ED EE EF   EC DF 21 24   2A 29 3B 5E 
60:  2D 2F C2 C4   C0 C1 C3 C5   C7 D1 A6 2C   25 5F 3E 3F 
70:  F8 C9 CA CB   C8 CD CE CF   CC 60 3A 23   40 27 3D 22 
80:  D8 61 62 63   64 65 66 67   68 69 AB BB   F0 FD FE B1 
90:  B0 6A 6B 6C   6D 6E 6F 70   71 72 AA BA   E6 B8 C6 A4 
A0:  B5 7E 73 74   75 76 77 78   79 7A A1 BF   D0 5B DE AE 
B0:  AC A3 A5 B7   A9 A7 B6 BC   BD BE DD A8   AF 5D B4 D7 
C0:  7B 41 42 43   44 45 46 47   48 49 AD F4   F6 F2 F3 F5 
D0:  7D 4A 4B 4C   4D 4E 4F 50   51 52 B9 FB   FC F9 FA FF 
E0:  5C F7 53 54   55 56 57 58   59 5A B2 D4   D6 D2 D3 D5 
F0:  30 31 32 33   34 35 36 37   38 39 B3 DB   DC D9 DA 9F

Charles


Re: Unicode Services translation question

2012-05-23 Thread Charles Mills

Thanks.

> Could there be something wrong with how your Unicode Services tables are
> configured?

Sure, but I try to avoid "blame the compiler" and "blame the operating
system" for as long as possible!

I want to see where Walt was going with the > 7F question.

Charles


Re: Unicode Services translation question

2012-05-23 Thread Kirk Wolf

Charles,

I'm not sure what the problem is, but 1252 is a single-byte charset, so it
seems wrong that you are getting multi-byte results.

I don't see any difference in how you are calling CUNLCNV. I tried our
code with your technique string (we default to LMREC):

> showtrtab -s 1047 -t 1252 -q LMER


00:  00 01 02 03   1A 09 1A 7F   1A 8D 8E 0B   0C 0D 0E 0F
10:  10 11 12 13   9D 0A 08 1A   18 19 1A 8F   1C 1D 1E 1F
20:  80 81 1A 1A   1A 1A 17 1B   1A 1A 1A 1A   1A 05 06 07
30:  90 1A 16 1A   1A 1A 1A 04   1A 1A 1A 1A   14 15 9E 1A
40:  20 A0 E2 E4   E0 E1 E3 E5   E7 F1 A2 2E   3C 28 2B 7C
50:  26 E9 EA EB   E8 ED EE EF   EC DF 21 24   2A 29 3B 5E
60:  2D 2F C2 C4   C0 C1 C3 C5   C7 D1 A6 2C   25 5F 3E 3F
70:  F8 C9 CA CB   C8 CD CE CF   CC 60 3A 23   40 27 3D 22
80:  D8 61 62 63   64 65 66 67   68 69 AB BB   F0 FD FE B1
90:  B0 6A 6B 6C   6D 6E 6F 70   71 72 AA BA   E6 B8 C6 A4
A0:  B5 7E 73 74   75 76 77 78   79 7A A1 BF   D0 5B DE AE
B0:  AC A3 A5 B7   A9 A7 B6 BC   BD BE DD A8   AF 5D B4 D7
C0:  7B 41 42 43   44 45 46 47   48 49 AD F4   F6 F2 F3 F5
D0:  7D 4A 4B 4C   4D 4E 4F 50   51 52 B9 FB   FC F9 FA FF
E0:  5C F7 53 54   55 56 57 58   59 5A B2 D4   D6 D2 D3 D5
F0:  30 31 32 33   34 35 36 37   38 39 B3 DB   DC D9 DA 1A

Could there be something wrong with how your Unicode Services tables are
configured?

Kirk Wolf
Dovetailed Technologies
http://dovetail.com
+1 636.300.0901



Re: Unicode Services translation question

2012-05-23 Thread Charles Mills

> Does it work as you expected for other characters in 1047 whose equivalent
> in 1252 have values above x"7F"?

I just put in a broken vertical bar (EBCDIC 6A) and it translated (allegedly
into 1252) as C2A6 rather than the expected A6.

Where are you going with this? You obviously have something in mind.

FWIW, here is more detail on the coding. Here is more of the setup:

    UniConvParms.Src_CCSID  = Parms::XlateFrom;
    UniConvParms.Targ_CCSID = Parms::XlateTo;

    UniConvParms.Flag1.Sub_Action      = '\x01';  // Substitute and continue
    UniConvParms.Flag1.Inv_Handle      = '\x01';  // If invalid handle, get a new one
    UniConvParms.Flag1.No_Opt_Buf_Fill = '\x01';  // ???
    UniConvParms.Flag1.Mal_Action      = '\x01';  // If malformed, terminate with error
    UniConvParms.Flag1.RL_Sub_Action   = '\x01';  // ???

    CUNLCNV(&UniConvParms);

I am pretty confident of the values of Parms::XlateFrom and To because I
have code that displays those same fields:

    displayChildren("XLATE", "(%05d %05d '%s')", XlateFrom, XlateTo,
                    XlateTechniques);

(it's in the Parms class so the Parms:: is implicit)

and the output is

XLATE  (01047 01252 'LMER')

Charles

Re: Unicode Services translation question

2012-05-23 Thread Walt Farrell

Does it work as you expected for other characters in 1047 whose equivalent in 
1252 have values above x"7F"? Or is the not sign the only one that's 
mis-behaving?

-- 
Walt


Re: Unicode Services translation question

2012-05-23 Thread Charles Mills

> Are you sure that you are translating to 1252?

Mighty sure. Could I be confused? Of course. But it looks rock solid to me.
I have a lot of "display" facilities in the code and everything looks right
(except the output).

Charles


Re: Unicode Services translation question

2012-05-23 Thread McKown, John

> -Original Message-
> From: IBM Mainframe Discussion List 
> [mailto:IBM-MAIN@bama.ua.edu] On Behalf Of Paul Gilmartin
> Sent: Wednesday, May 23, 2012 10:40 AM
> To: IBM-MAIN@bama.ua.edu
> Subject: Re: Unicode Services translation question

> 
> (John M. was lately ranting on another forum about the richness of the
> POSIX shell command structure compared to TSO. 

> 
> -- gil

Not "ranting", just a slight case of logorrhoea.  Certainly not 
Bloviating!

-- 
John McKown 
Systems Engineer IV
IT

Administrative Services Group

HealthMarkets(r)

9151 Boulevard 26 * N. Richland Hills * TX 76010
(817) 255-3225 phone * 
john.mck...@healthmarkets.com * www.HealthMarkets.com


Re: Unicode Services translation question

2012-05-23 Thread Paul Gilmartin

On Wed, 23 May 2012 06:43:53 -0700, Charles Mills wrote:

>I don't understand what I am seeing from Unicode Services translation.
>
>I specify translation from 1047 (Encoding scheme 1100 - EBCDIC, SBCS; Name
>LATIN 1 / OPEN SYSTEM) to 1252 (Encoding scheme 4105 - ASCII, SBCS; Name
>MS-WIN LATIN-1).
>
>As both CCSIDs are SBCS I would expect that any "common" EBCDIC character
>would get translated into a single ASCII byte. But for an input byte of
>X'B0' (logical not in 1047) I am seeing translation to the 2-byte sequence
>C2AC. AC is by my reading correct: it's 1252 logical not. But what the heck
>is that C2 about (C2 is A with a circumflex in 1252).
>
>FWIW technique E, substitution 1A.
> 
That appears to be a variable-length encoding, such as UTF-8.  For example:

387 $ awk 'BEGIN { printf( "%c", 16*11 ) }' |
iconv -f IBM-1047 -t UTF-8 | od -x
000 acc2

(John M. was lately ranting on another forum about the richness of the
POSIX shell command structure compared to TSO.  I can only agree.
What would be required to accomplish the same from the TSO "READY"
prompt?)

-- gil


Re: Unicode Services translation question

2012-05-23 Thread Kirk Wolf

Charles,

x'C2AC' is the logical not symbol in UTF-8. Are you sure that you are
translating to 1252?

When I display the translate table for 1047->1252 using Unicode Services,
it appears to be single byte -> single byte.

Here is a dump using the "showtrtab" command (part of the free Co:Z
Toolkit):

> showtrtab -s 1047 -t 1252


00:  00 01 02 03   1A 09 1A 7F   1A 8D 8E 0B   0C 0D 0E 0F
10:  10 11 12 13   9D 0A 08 1A   18 19 1A 8F   1C 1D 1E 1F
20:  80 81 1A 1A   1A 1A 17 1B   1A 1A 1A 1A   1A 05 06 07
30:  90 1A 16 1A   1A 1A 1A 04   1A 1A 1A 1A   14 15 9E 1A
40:  20 A0 E2 E4   E0 E1 E3 E5   E7 F1 A2 2E   3C 28 2B 7C
50:  26 E9 EA EB   E8 ED EE EF   EC DF 21 24   2A 29 3B 5E
60:  2D 2F C2 C4   C0 C1 C3 C5   C7 D1 A6 2C   25 5F 3E 3F
70:  F8 C9 CA CB   C8 CD CE CF   CC 60 3A 23   40 27 3D 22
80:  D8 61 62 63   64 65 66 67   68 69 AB BB   F0 FD FE B1
90:  B0 6A 6B 6C   6D 6E 6F 70   71 72 AA BA   E6 B8 C6 A4
A0:  B5 7E 73 74   75 76 77 78   79 7A A1 BF   D0 5B DE AE
B0:  AC A3 A5 B7   A9 A7 B6 BC   BD BE DD A8   AF 5D B4 D7
C0:  7B 41 42 43   44 45 46 47   48 49 AD F4   F6 F2 F3 F5
D0:  7D 4A 4B 4C   4D 4E 4F 50   51 52 B9 FB   FC F9 FA FF
E0:  5C F7 53 54   55 56 57 58   59 5A B2 D4   D6 D2 D3 D5
F0:  30 31 32 33   34 35 36 37   38 39 B3 DB   DC D9 DA 1A

Kirk Wolf
Dovetailed Technologies
http://dovetail.com

FWIW: here is how showtrtab displays a one->many table:

> showtrtab -s 1047 -t utf-8
00:  00
01:  01
02:  02
03:  03
04:  C29C
05:  09
06:  C286
07:  7F
08:  C297
09:  C28D
0A:  C28E
0B:  0B
0C:  0C
...
AA:  C2A1
AB:  C2BF
AC:  C390
AD:  5B
AE:  C39E
AF:  C2AE
B0:  C2AC
B1:  C2A3
B2:  C2A5
B3:  C2B7
B4:  C2A9
B5:  C2A7
B6:  C2B6
B7:  C2BC
B8:  C2BD
B9:  C2BE
BA:  C39D
BB:  C2A8
BC:  C2AF
BD:  5D
BE:  C2B4
BF:  C397
...

Notice that B0 in 1047 translates to C2AC in UTF-8


On Wed, May 23, 2012 at 8:43 AM, Charles Mills  wrote:
> I don't understand what I am seeing from Unicode Services translation.
>
> I specify translation from 1047 (Encoding scheme 1100 - EBCDIC, SBCS; Name
> LATIN 1 / OPEN SYSTEM) to 1252 (Encoding scheme 4105 - ASCII, SBCS; Name
> MS-WIN LATIN-1).
>
> As both CCSIDs are SBCS I would expect that any "common" EBCDIC character
> would get translated into a single ASCII byte. But for an input byte of
> X'B0' (logical not in 1047) I am seeing translation to the 2-byte sequence
> C2AC. AC is by my reading correct: it's 1252 logical not. But what the heck
> is that C2 about (C2 is A with a circumflex in 1252).
>
> FWIW technique E, substitution 1A.
>
> Where am I confused?
>
> Charles
