Hello,

Dominikus Scherkl (MGW) wrote about Cima's UTF-8 Magic Pocket Encoder:
> Oha?
> Updated without changing version and date?
> ;-)

I had provided a Magic Pocket Encoder for UTF-16 and was afterwards made aware of some spelling and wording errors.

Mike Ayers has contributed the crowning achievement: his
UTF-32 Magic Pocket Encoder. This one is already perfect,
hence it will probably never reach version 1.1 :-)

Attached, you'll find the current versions of all three,
in a somewhat enhanced typography: I have exploited box-drawing
characters, arrows, and proper (typographical) apostrophes.
Though not strictly ASCII, these MPEs use only characters
that were already present in CP 437 (the original PC code page).

I haven't changed the wording, of course, except for the version
number and date, and the reference to arrows (rather than
exclamation points), as appropriate.

I hope this will end the discussion on MPEs, which are toys,
after all (though they could also be used to visualize the
three UTF encodings).

Cheers,
  Otto Stolz

Side 1 (print and cut out):

╔════════════╦═══════╦═══════════════════════╦══════╗
║     U+0000 ║ yy zz ║    Cima’s UTF-8 Magic ║ Hex↔ ║
║     U+007F ║ ↓  ↓  ║        Pocket Encoder ║ B-4  ║
║         YZ ║ _  _  ║                       ║      ║
╟────────────╫───────╚═══════╗     Vers. 1.1 ║ 0↔00 ║
║     U+0080 ║ 3x xy │ 2y zz ║    2004-06-30 ║ 1↔01 ║
║     U+07FF ║ 3_ __ │ 2_ ↓  ║               ║ 2↔02 ║
║        XYZ ║ _  _  │ _  _  ║          M.C. ║ 3↔03 ║
╟────────────╫───────┼───────╚═══════╗       ║ 4↔10 ║
║     U+0800 ║ 32 ww │ 2x xy │ 2y zz ║       ║ 5↔11 ║
║     U+FFFF ║ ↓  ↓  │ 2_ __ │ 2_ ↓  ║       ║ 6↔12 ║
║       WXYZ ║ E  _  │ _  _  │ _  _  ║       ║ 7↔13 ║
╟────────────╫───────┼───────┼───────╚═══════╣ 8↔20 ║
║ U-00010000 ║ 33 0v │ 2v ww │ 2x xy │ 2y zz ║ 9↔21 ║
║ U-000FFFFF ║ ↓  0_ │ 2_ ↓  │ 2_ __ │ 2_ ↓  ║ A↔22 ║
║      VWXYZ ║ F  _  │ _  _  │ _  _  │ _  _  ║ B↔23 ║
╟────────────╫───────┼───────┼───────┼───────╢ C↔30 ║
║ U-00100000 ║ 33 10 │ 20 ww │ 2x xy │ 2y zz ║ D↔31 ║
║ U-0010FFFF ║ ↓  ↓  │ ↓  ↓  │ 2_ __ │ 2_ ↓  ║ E↔32 ║
║       WXYZ ║ F  4  │ 8  _  │ _  _  │ _  _  ║ F↔33 ║
╚════════════╩═══════╧═══════╧═══════╧═══════╩══════╝

Side 2 (print, cut out, and glue on back of side 1):

╔═══════════════════════════════════════════════════╗
║ Cima’s UTF-8 Magic Pocket Encoder - User’s Manual ║
║ (vers. 1.1, 2004-06-30, by Marco Cimarosti)       ║
║                                                   ║
║ - Left column: min and max Unicode scalar values: ║
║   pick the row that applies to the code point you ║
║   want to convert to UTF-8. Letters V..Z mark the ║
║   hexadecimal digits that have to be processed.   ║
║ - Right column: hexadecimal to base-4 table.      ║
║ - Central columns: work area to compute each octet║
║   (1 to 4) that constitute UTF-8 octet sequences. ║
║ Convert each digit marked by V..Z from hex. to    ║
║ b.-4. Write b.-4 digits on the dots placed under  ║
║ letters v..z (two b.-4 digits per hex. digit).    ║
║ Convert 2-digit base-4 number to hex. digits and  ║
║ write them on the dots on the line. That is your  ║
║ UTF-8 sequence in hex. ↓ Downwards arrows show    ║
║ passages that may be skipped, either because the  ║
║ digit is hard-coded, or because it may be copied  ║
║ directly from the scalar value.                   ║
╚═══════════════════════════════════════════════════╝
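
For readers who would rather check the card against a program, here is a
minimal Python sketch of the same base-4 procedure. The function names
to_base4 and utf8_via_base4 are made up for this illustration and are not
part of Cima's card; the row limits and the base-4 prefixes "3", "32", "33"
and "2" are simply read off the work area on side 1.

```python
def to_base4(value, width):
    """Return value as a string of exactly `width` base-4 digits."""
    digits = ""
    for _ in range(width):
        digits = str(value % 4) + digits
        value //= 4
    return digits


def utf8_via_base4(cp):
    """Encode one Unicode scalar value to UTF-8 by way of base-4 digits."""
    if not 0 <= cp <= 0x10FFFF or 0xD800 <= cp <= 0xDFFF:
        raise ValueError("not a Unicode scalar value: %X" % cp)
    if cp <= 0x7F:
        return bytes([cp])            # first row: copy YZ unchanged
    # Pick the row: base-4 prefix of the lead octet, and the number of
    # base-4 digits taken from the scalar value.
    if cp <= 0x7FF:
        prefix, width = "3", 6        # 3x xy | 2y zz
    elif cp <= 0xFFFF:
        prefix, width = "32", 8       # 32 ww | 2x xy | 2y zz
    else:
        prefix, width = "33", 11      # 33 0v | 2v ww | 2x xy | 2y zz
    payload = to_base4(cp, width)
    lead_len = 4 - len(prefix)        # every octet is four base-4 digits
    octets = [prefix + payload[:lead_len]]
    for i in range(lead_len, width, 3):
        octets.append("2" + payload[i:i + 3])   # trailing octets start with 2
    return bytes(int(octet, 4) for octet in octets)


print(utf8_via_base4(0x20AC).hex())
```

Each octet is assembled as four base-4 digits, just as on the card: for
U+20AC the intermediate strings are "3202", "2002" and "2230", that is,
E2 82 AC in hex.
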
Obverse: Print with a fixed-width font, such as Lucida Console,
and cut out.

╔════════════╦═════════════╦═════════════════════════════════╗
║     U+0000 ║ W  X  Y  Z  ║ Otto’s Magic Pocket Encoder     ║
║     U+D7FF ║ ↓  ↓  ↓  ↓  ║ for  UTF-16™╔═══════════════════╣
║       WXYZ ║ _  _  _  _  ║             ║    V→vv │    V→vv ║
╟────────────╫─────────────╢ Version 1.1 ║    U→uu │    U→uu ║
║     U+E000 ║ W  X  Y  Z  ║ ©2004-07-05 ║ tt←T    │ tt←T    ║
║     U+FFFF ║ ↓  ↓  ↓  ↓  ║             ║    _←__ │    _←__ ║
║       WXYZ ║ _  _  _  _  ║             ║ ────────┼──────── ║
╟────────────╫─────────────╚═════════════╣    0↔00 │ 13←8↔20 ║
║ U-00010000 ║ 31 2t tu uv │ 31 3v Y  Z  ║ 00←1↔01 │ 20←9↔21 ║
║ U-000FFFFF ║ ↓  2_ __ __ │ ↓  3_ ↓  ↓  ║ 01←2↔02 │ 21←A↔22 ║
║      TUVYZ ║ D  _  _  _  │ D  _  _  _  ║ 02←3↔03 │ 22←B↔23 ║
╟────────────╫─────────────┼─────────────╢ 03←4↔10 │ 23←C↔30 ║
║ U-00100000 ║ 31 23 3u uv │ 31 3v Y  Z  ║ 10←5↔11 │ 30←D↔31 ║
║ U-0010FFFF ║ ↓  ↓  3_ __ │ ↓  3_ ↓  ↓  ║ 11←6↔12 │ 31←E↔32 ║
║       UVYZ ║ D  B  _  _  │ D  _  _  _  ║ 12←7↔13 │ 32←F↔33 ║
╚════════════╩═════════════╧═════════════╩═══════════════════╝


....:....1....:....2....:....3....:....4....:....5....:....6..


Reverse: Cut out and paste on back of obverse.

╔════════════════════════════════════════════════════════════╗
║     Otto’s Magic Pocket Encoder for UTF-16 Version 1.1     ║
║     User’s Manual      (inspired by Cima’s UTF-8 MPE)      ║
╠════════════════════════════════════════════════════════════╣
║• Left column: min and max Unicode scalar values: pick the  ║
║  row that applies to the code point to be converted.       ║
║  T…Z mark the hexadecadic digits that have to be processed.║
║• Central column: work area to compute UTF-16BE code units. ║
║• Right column: hexadecadic to quaternary conversion tables:║
║  ← for T to tt; ↔ for U/V to uu/vv (step 1) and for step 2.║
║1. Convert each digit marked by T…V from hex to quat. Write ║
║   quat digits on the underscores placed under letters t…v. ║
║2. Convert 2-digit quat numbers to hex digits or copy digits║
║   W…Z, as indicated, and write them on the underscores on  ║
║   the next line. That’s your UTF-16BE sequence in hex.     ║
║↓ Downwards arrows indicate shortcuts.                      ║
╚════════════════════════════════════════════════════════════╝
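
The same computation can be sketched in a few lines of Python, using the
usual bit arithmetic instead of quaternary digits. The names utf16_units
and utf16be_hex are hypothetical and chosen only for this example. As far
as I can tell, the ← table on the obverse folds the subtraction of 0x10000
into the lookup (T is mapped to the quaternary form of T minus 1), which is
the step the code spells out explicitly.

```python
def utf16_units(cp):
    """Return the UTF-16 code units (as integers) for one scalar value."""
    if not 0 <= cp <= 0x10FFFF or 0xD800 <= cp <= 0xDFFF:
        raise ValueError("not a Unicode scalar value: %X" % cp)
    if cp <= 0xFFFF:
        return [cp]                       # first two rows: copy WXYZ
    offset = cp - 0x10000                 # 20 bits remain
    high = 0xD800 | (offset >> 10)        # top 10 bits: 31 2t tu uv
    low = 0xDC00 | (offset & 0x3FF)       # low 10 bits: 31 3v Y  Z
    return [high, low]


def utf16be_hex(cp):
    """Format the code units as big-endian hex, as written on the card."""
    return " ".join("%04X" % unit for unit in utf16_units(cp))


print(utf16be_hex(0x1D11E))
```

For U+1D11E this prints D834 DD1E, which is what the card yields for
T=1, U=D, V=1, Y=1, Z=E.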

Enjoy.

Side 1 (print and cut out):

╔════════════╦═══════════════════════╤═══════════════╗
║ This space ║   Mike’s UTF-32 Magic │     Vers. 1.0 ║
║  for rent  ║        Pocket Encoder │  06 July 2004 ║
║            ║                       │               ║
╠════════════╬═══════╤═══════╤═══════╪═══════╗       ║
║ U-00000000 ║ 0  0  │ U  V  │ W  X  │ Y  Z  ║       ║
║ U-0010FFFF ║ ↓  ↓  │ ↓  ↓  │ ↓  ↓  │ ↓  ↓  ║       ║
║     UVWXYZ ║ 0  0  │ _  _  │ _  _  │ _  _  ║       ║
╚════════════╩═══════╧═══════╧═══════╧═══════╩═══════╝

Side 2 (print, cut out, and glue on back of side 1):

╔════════════════════════════════════════════════════╗
║ Mike’s UTF-32 Magic Pocket Encoder - User’s Manual ║
║ (vers. 1.0, 6 July 2004, by Mike Ayers)            ║
║                                                    ║
║ - Left column: min and max Unicode scalar values.  ║
║   Letters U..Z mark the hexadecimal digits to be   ║
║   processed.  Read the bytes in the bottom row     ║
║   left to right, or right to left for UTF-32LE.    ║
╚════════════════════════════════════════════════════╝
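
For completeness, a Python sketch of the same "algorithm": the scalar value
is padded to four bytes, and the byte order is reversed for UTF-32LE. The
function name utf32_bytes and its little_endian parameter are invented for
this illustration.

```python
def utf32_bytes(cp, little_endian=False):
    """Return the four UTF-32 bytes for one Unicode scalar value."""
    if not 0 <= cp <= 0x10FFFF or 0xD800 <= cp <= 0xDFFF:
        raise ValueError("not a Unicode scalar value: %X" % cp)
    # Big-endian reads the card's bottom row left to right,
    # little-endian reads it right to left.
    return cp.to_bytes(4, "little" if little_endian else "big")


print(utf32_bytes(0x1F600).hex())
print(utf32_bytes(0x1F600, little_endian=True).hex())
```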
