On 13 June 2012 15:02, Mike Schwab <mike.a.sch...@gmail.com> wrote:

> Roundtrip example:  Every defined character in 1027, excluding values
> that do not have a character defined, exist in 1208, is successfully
> translated from 1027 to 1208 and back to 1027.  All codepoints that do
> not have a character defined will be translated to (one?) non-valid
> value.
>
> Non-roundtrip example:  Some defined characters in 1208, and all
> codepoints that do not have a character defined, do not exist in 1027.
> If you translate text from 1208 to 1027, the characters not defined
> in 1027, and all undefined codepoints will be translated to (one?)
> non-valid codepoint.
>
> In any translation, a codepoint that does not encode a character will
> be translated to (one?) non-valid codepoint.

Not in *any* translation. IBM does define this kind of translation,
but it isn't "round trippable" - it's "enforced subset match".

From the CDRA book: "The enforced subset match criterion guarantees
the preservation of the subset of characters that are common to both
the input and output character sets. Any character not in this common
subset will be replaced with a unique character that indicates that a
substitution has occurred." They go on to recommend use of X'1A' in
ASCII-ish environments, and X'3F' in EBCDIC.
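
As a rough illustration of enforced subset match, here is a minimal
Python sketch. It assumes Python's cp037 codec as a stand-in EBCDIC
target (1027 itself is not assumed to be available as a Python codec),
and the helper name is hypothetical. Anything the target cannot
represent is replaced with X'3F', the EBCDIC substitute mentioned in
the passage quoted above.

```python
EBCDIC_SUB = 0x3F  # substitute code point CDRA recommends for EBCDIC


def to_ebcdic_enforced_subset(text: str, codec: str = "cp037") -> bytes:
    """Encode text, replacing characters the target lacks with EBCDIC SUB.

    Hypothetical helper; cp037 is an assumed stand-in for the target page.
    """
    out = bytearray()
    for ch in text:
        try:
            out += ch.encode(codec)
        except UnicodeEncodeError:
            # Character is outside the common subset: signal a substitution.
            out.append(EBCDIC_SUB)
    return bytes(out)


encoded = to_ebcdic_enforced_subset("A€B")
print(encoded.hex())  # c13fc2
```

The euro sign cannot be recovered by converting back, which is why
this criterion is not round trip.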

> You missed the word defined.  If a codepoint does not have a defined
> character, it is translated to (one?) invalid value.

I can't make it parse that way.

"CDRA defines only the graphic character data conversion part of the
overall data conversion process. A limited number of control
characters are addressed as part of handling different string types
(see "Types of Strings") and as part of control character mappings
(see "Pairings of Code Points"). Other control characters are treated
as bytes, and are dealt with according to mismatch management
criteria."

and under "Criteria for Character Set Mismatch Management", describing
"Pairing of Code Points Using Round Trip":

 - An input graphic code point outside the common set is mapped to an
output graphic code point outside the common set
 - An input control code point is mapped to an output control code
point outside the mnemonic-based common set
 - If the graphic encoding space of the source is larger than the
graphic encoding space of the target, some graphic code points will be
mapped to control code points, and vice versa.
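
The round trip idea can be sketched with two toy code pages of four
code points each. Everything here is an assumption for illustration:
the code pages are made up, the function name is hypothetical, the
sketch ignores the graphic/control distinction, and pairing the
leftover code points in ascending order is an arbitrary choice (the
real CDRA tables are assigned, not computed). The point is only that
every input code point gets its own output code point, so the table
can be inverted.

```python
def build_round_trip_table(src: dict[int, str], dst: dict[int, str]) -> dict[int, int]:
    """Build a one-to-one code point mapping between two toy code pages.

    src and dst map code point -> character. Common characters keep
    their identity; the remaining code points are paired off so that
    the result is a bijection (hypothetical pairing rule).
    """
    if set(src) != set(dst):
        raise ValueError("toy code pages must cover the same code points")
    dst_by_char = {ch: cp for cp, ch in dst.items()}
    # Characters in the common set map to the same character in the target.
    table = {cp: dst_by_char[ch] for cp, ch in src.items() if ch in dst_by_char}
    # Code points outside the common set are paired with unused target ones.
    left_src = sorted(cp for cp in src if cp not in table)
    used = set(table.values())
    left_dst = sorted(cp for cp in dst if cp not in used)
    table.update(zip(left_src, left_dst))
    return table


src_page = {0: "A", 1: "B", 2: "¥", 3: "¬"}
dst_page = {0: "B", 1: "A", 2: "\\", 3: "^"}

forward = build_round_trip_table(src_page, dst_page)
backward = {out_cp: in_cp for in_cp, out_cp in forward.items()}

data = [0, 1, 2, 3]
converted = [forward[cp] for cp in data]
restored = [backward[cp] for cp in converted]
```

Here "¥" comes out as "\" in the target, which is not the same
character, but converting back restores the original code point.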

Tony H.

----------------------------------------------------------------------
For IBM-MAIN subscribe / signoff / archive access instructions,
send email to lists...@bama.ua.edu with the message: INFO IBM-MAIN
