I did not "miss" the word defined. It's not in IBM's definition. I hear everyone who is saying "the term 'code point' *really* means a bit value with a glyph assigned to it," but that's not what the definitions out there say. Wikipedia:
In character encoding terminology, a code point or code position is any of the numerical values that make up the code space (or code page).[1] For example, ASCII comprises 128 code points in the range 0 hex to 7F hex, Extended ASCII comprises 256 code points in the range 0 hex to FF hex, and Unicode comprises 1,114,112 code points in the range 0 hex to 10FFFF hex. The Unicode code space is divided into seventeen planes (the basic multilingual plane, and 16 supplementary planes), each with 65,536 (= 2^16) code points. Thus the total size of the Unicode code space is 17 × 65,536 = 1,114,112.

"ASCII comprises 128 code points." Not all 128 of those have glyphs.

"Unicode comprises 1,114,112 code points." Not all of those million-plus code points have glyphs.

"every code point in the source CCSID maps to a unique code point in the target CCSID" (note no "defined"). That says every one of the 128 ASCII code points maps to a unique bit combination. But it ain't so.

If one is going to define "round trip conversion" as applying only to corresponding glyphs then the definition loses any meaning. Any rational translation is round trip with regard to corresponding glyphs.

Charles

-----Original Message-----
From: IBM Mainframe Discussion List [mailto:IBM-MAIN@bama.ua.edu] On Behalf Of Mike Schwab
Sent: Wednesday, June 13, 2012 12:02 PM
To: IBM-MAIN@bama.ua.edu
Subject: Re: Anyone a Unicode Services expert? -- roundtrip conversion

On Wed, Jun 13, 2012 at 12:59 PM, Charles Mills <charl...@mcn.org> wrote:
> I got a response to the PMR. Taking the liberty of paraphrasing a long
> reply, the essence of it seemed to be that -- per the CCSID pair lists
> in the manual -- they support round trip conversion from 1027 to 1208
> but not from 1208 to 1027. Here is what I wrote back:
>

What this means is:

Roundtrip example: Every defined character in 1027 (excluding values that do not have a character defined) exists in 1208 and is successfully translated from 1027 to 1208 and back to 1027.
All codepoints that do not have a character defined will be translated to (one?) non-valid value.

Non-roundtrip example: Some defined characters in 1208, and all codepoints that do not have a character defined, do not exist in 1027. If you translate text from 1208 to 1027, the characters not defined in 1027 and all undefined codepoints will be translated to (one?) non-valid codepoint. In any translation, a codepoint that does not encode a character will be translated to (one?) non-valid codepoint. If the text, while in 1208, is changed to add a character not in 1027, that value will be changed to an invalid value in 1027 upon translation.

> It sounds like you are saying (for the CCSIDs in question) "we support
> round trip, but in one direction only." It would be like if I bought a
> round trip ticket on Delta between San Francisco and Atlanta, and
> after I got to Atlanta, they explained that it was a round trip ticket
> only in one direction.
>
> I would kind of question also whether what you are doing conforms to
> your definition of round trip in the Unicode manual glossary: Round
> trip. Encoding that occurs when every code point in the source CCSID
> maps to a unique code point in the target CCSID.

You missed the word defined. If a codepoint does not have a defined character, it is translated to (one?) invalid value.

> Using round trip
> tables ensure the capability of reversing the conversion, and
> recovering the complete original source datastream.
>
> I would question "every code point in the source CCSID maps to a
> unique code point in the target CCSID" when both 3F and 41 map to the
> same code point, and I wonder how I would recover the original source
> datastream.
>

Again, you missed the word defined. If a codepoint does not have a defined character, it is translated to (one?) invalid value.
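
The disagreement over "every code point ... maps to a unique code point" can be checked mechanically: a conversion table is reversible only if no two source values share a target. The following is a minimal Python sketch of that check, not anything taken from Unicode Services. The table is a hypothetical four-entry excerpt; only the fact that 3F and 41 collide comes from the thread, while the target value U+001A and the two letter mappings are assumptions chosen for illustration.

```python
from collections import defaultdict


def find_collisions(table):
    """Return {target: [sources]} for each target reached by more than one source."""
    hits = defaultdict(list)
    for src, dst in sorted(table.items()):
        hits[dst].append(src)
    return {dst: srcs for dst, srcs in hits.items() if len(srcs) > 1}


def is_round_trip(table):
    """A table can be reversed without loss only if it has no collisions."""
    return not find_collisions(table)


# Hypothetical excerpt of a source-to-target table (NOT the real 1027 -> 1208 table).
toy_table = {
    0x3F: 0x1A,  # substitute character (assumed target value)
    0x41: 0x1A,  # byte with no defined character, also substituted (assumed)
    0xC1: 0x41,  # 'A' (assumed)
    0xC2: 0x42,  # 'B' (assumed)
}

# Restricting the table to "defined" characters drops the 0x41 entry.
defined_only = {src: dst for src, dst in toy_table.items() if src != 0x41}

for dst, srcs in find_collisions(toy_table).items():
    print(f"U+{dst:04X} is the target of " + ", ".join(f"{s:02X}" for s in srcs))
print("all code points:", is_round_trip(toy_table))
print("defined characters only:", is_round_trip(defined_only))
```

Under these assumptions the full table fails the check and the defined-only table passes, which is exactly the gap between the two readings of the glossary entry argued above.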