Jim Heifetz wrote:
(Caveat: I am using z/OS v1.7, so the files in question may not be the same.)
I see two problems with what you are trying to do.

1. The UTF-8 file does not use letter case to tell characters apart. Where IBM-1047 has "<a>" and "<A>", UTF-8 has "<LATIN_SMALL_LETTER_A>" and "<LATIN_CAPITAL_LETTER_A>". Unless your regular expressions account for that, you would get no matches at all. You may also need to pay attention to case in one file and not the other.

2. You mentioned characters such as thorn, where it appears that IBM-1047 has only one character and UTF-8 has two - both small and capital letters. I don't know if that means that the IBM-1047 thorn is lower case only, or if an implementor is free to choose what he wants. It appears that PComm has chosen to treat them as lower case.


David Bond at Tachyon Software has some terrific pages
about code pages on the company's website. A lot of my current
work uses this page as a starting point:

  http://www.tachyonsoft.com/uc0000.htm#U00F0

Based on this page I see:


UCS-4: 000000DE LATIN CAPITAL LETTER THORN
Glyph: Þ
Lowercase: 00FE
UTF-8: C3 9E
GB-18030: 8130 8937
ASCII 8D: 00861
ASCII DE: 01252 ISO-8859-1 ISO-8859-10 ISO-8859-15
ASCII E8: 00850 00858
EBCDIC 4A: 00871 01149
EBCDIC AE: 00037 00273 00274 00275 00277 00278 00280 00281 00284 00285 00297 00500 00924 01005 01047 01140 01141 01142 01143 01144 01145 01146 01147 01148


and also:

UCS-4: 000000FE LATIN SMALL LETTER THORN
Glyph: þ
Uppercase: 00DE
UTF-8: C3 BE
GB-18030: 8130 8B36
ASCII 95: 00861
ASCII E7: 00850 00858
ASCII FE: 01252 ISO-8859-1 ISO-8859-10 ISO-8859-15
EBCDIC 8E: 00037 00273 00274 00275 00277 00278 00280 00281 00284 00285 00297 00500 00924 01005 01047 01140 01141 01142 01143 01144 01145 01146 01147 01148
EBCDIC C0: 00871 01149

So it seems IBM-1047 supports both upper case and lower case thorn (X'AE' and X'8E', respectively).
Well, ya' never know! :-)
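
If you want to double-check a few of those table entries yourself, here is a small Python sketch. Two assumptions on my part: I am not aware of a cp1047 codec in Python's standard library, so cp037 and cp500 stand in for it (the table above lists all three at the same thorn code points), and the helper name encodings() is just something I made up for illustration.

```python
import unicodedata

# Code points taken from the two listings above.
CAPITAL_THORN = "\u00de"
SMALL_THORN = "\u00fe"

# Assumption: cp037 and cp500 stand in for IBM-1047, which is not
# used here; the table above shows the same thorn positions for all three.
CODECS = ["utf-8", "latin-1", "cp037", "cp500"]


def encodings(ch):
    """Return {codec name: upper-case hex of the encoded bytes} for one character."""
    return {codec: ch.encode(codec).hex().upper() for codec in CODECS}


for ch in (CAPITAL_THORN, SMALL_THORN):
    # Print the code point, its Unicode name, and its bytes in each codec.
    print(f"U+{ord(ch):04X} {unicodedata.name(ch)}: {encodings(ch)}")

# Unicode case mapping pairs the two letters, matching the
# Lowercase: / Uppercase: lines in the listings.
print(CAPITAL_THORN.lower() == SMALL_THORN)
```

This only confirms the Unicode, ISO-8859-1 and EBCDIC 037/500 rows; it says nothing about what PComm does with them.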



--

Kind regards,

-Steve Comstock
The Trainer's Friend, Inc.

303-393-8716
http://www.trainersfriend.com

* z/OS application programmer training
  + Instructor-led on-site classroom based classes
  + Course materials licensing
  + Remote contact training
  + Roadshows
  + Course development
