Hello,
Dominikus Scherkl (MGW) wrote about Cima's UTF-8 Magic Pocket Encoder:
Oha? Updated without changing version and date? ;-)
I had provided a Magic Pocket Encoder for UTF-16, and have since been made aware of some spelling and wording errors.
Mike Ayers has contributed the crowning achievement: his UTF-32 Magic Pocket Encoder. This one is already perfect, hence it will probably never reach version 1.1 :-)
Attached, you'll find the current versions of all three, in a somewhat enhanced typography: I have exploited box-drawing characters, arrows, and proper (typographical) apostrophes. Although they are no longer plain ASCII, these MPEs use only characters that were already present in CP 437 (the original PC code page).
I haven't changed the wording, of course, except for the version number and date, and the reference to arrows (rather than exclamation points), as appropriate.
I hope this will end the discussion on MPEs, which are toys, after all (though they could also be used to visualize the three UTF encodings).
Cheers, Otto Stolz
Side 1 (print and cut out):
╔════════════╦═══════╦═══════════════════════╦══════╗
║ U+0000     ║ yy zz ║  Cima’s UTF-8 Magic   ║ Hex↔ ║
║ U+007F     ║ ↓  ↓  ║    Pocket Encoder     ║ B-4  ║
║        YZ  ║ _  _  ║                       ║      ║
╟────────────╫───────╚═══════╗   Vers. 1.1   ║ 0↔00 ║
║ U+0080     ║ 3x xy │ 2y zz ║   2004-06-30  ║ 1↔01 ║
║ U+07FF     ║ 3_ __ │ 2_ ↓  ║               ║ 2↔02 ║
║       XYZ  ║ _  _  │ _  _  ║      M.C.     ║ 3↔03 ║
╟────────────╫───────┼───────╚═══════╗       ║ 4↔10 ║
║ U+0800     ║ 32 ww │ 2x xy │ 2y zz ║       ║ 5↔11 ║
║ U+FFFF     ║ ↓  ↓  │ 2_ __ │ 2_ ↓  ║       ║ 6↔12 ║
║      WXYZ  ║ E  _  │ _  _  │ _  _  ║       ║ 7↔13 ║
╟────────────╫───────┼───────┼───────╚═══════╣ 8↔20 ║
║ U-00010000 ║ 33 0v │ 2v ww │ 2x xy │ 2y zz ║ 9↔21 ║
║ U-000FFFFF ║ ↓  0_ │ 2_ ↓  │ 2_ __ │ 2_ ↓  ║ A↔22 ║
║     VWXYZ  ║ F  _  │ _  _  │ _  _  │ _  _  ║ B↔23 ║
╟────────────╫───────┼───────┼───────┼───────╢ C↔30 ║
║ U-00100000 ║ 33 10 │ 20 ww │ 2x xy │ 2y zz ║ D↔31 ║
║ U-0010FFFF ║ ↓  ↓  │ ↓  ↓  │ 2_ __ │ 2_ ↓  ║ E↔32 ║
║      WXYZ  ║ F  4  │ 8  _  │ _  _  │ _  _  ║ F↔33 ║
╚════════════╩═══════╧═══════╧═══════╧═══════╩══════╝

Side 2 (print, cut out, and glue on back of side 1):
╔═══════════════════════════════════════════════════╗
║ Cima’s UTF-8 Magic Pocket Encoder - User’s Manual ║
║    (vers. 1.1, 2004-06-30, by Marco Cimarosti)    ║
║                                                   ║
║ - Left column: min and max Unicode scalar values: ║
║   pick the row that applies to the code point you ║
║   want to convert to UTF-8. Letters V..Z mark the ║
║   hexadecimal digits that have to be processed.   ║
║ - Right column: hexadecimal to base-4 table.      ║
║ - Central columns: work area to compute each octet║
║   (1 to 4) that constitute UTF-8 octet sequences. ║
║   Convert each digit marked by V..Z from hex. to  ║
║   b.-4. Write b.-4 digits on the dots placed under║
║   letters v..z (two b.-4 digits per hex. digit).  ║
║   Convert 2-digit base-4 number to hex. digits and║
║   write them on the dots on the line. That is your║
║   UTF-8 sequence in hex. ↓ Downwards arrows show  ║
║   passages that may be skipped, either because the║
║   digit is hard-coded, or because it may be copied║
║   directly from the scalar value.                 ║
╚═══════════════════════════════════════════════════╝
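For anyone who would rather check the UTF-8 card by machine than by pencil, here is a minimal Python sketch of the same procedure: split the scalar value into base-4 digits, fill in the row that applies, and read each group of four base-4 digits back as one octet. The names quats and mpe_utf8 are made up for this illustration and are not part of the card; the result is only compared against Python's built-in encoder.

```python
def quats(cp: int) -> list:
    """Return the 12 base-4 digits of cp, most significant first
    (two per hex digit, i.e. uu vv ww xx yy zz)."""
    return [(cp >> shift) & 3 for shift in range(22, -1, -2)]


def mpe_utf8(cp: int) -> bytes:
    """Encode one Unicode scalar value by following the card's rows."""
    if not 0 <= cp <= 0x10FFFF or 0xD800 <= cp <= 0xDFFF:
        raise ValueError("not a Unicode scalar value")
    _u1, _u2, v1, v2, w1, w2, x1, x2, y1, y2, z1, z2 = quats(cp)
    if cp <= 0x7F:          # row 1: yy zz
        rows = [(y1, y2, z1, z2)]
    elif cp <= 0x7FF:       # row 2: 3x xy | 2y zz
        rows = [(3, x1, x2, y1), (2, y2, z1, z2)]
    elif cp <= 0xFFFF:      # row 3: 32 ww | 2x xy | 2y zz
        rows = [(3, 2, w1, w2), (2, x1, x2, y1), (2, y2, z1, z2)]
    elif cp <= 0xFFFFF:     # row 4: 33 0v | 2v ww | 2x xy | 2y zz
        rows = [(3, 3, 0, v1), (2, v2, w1, w2),
                (2, x1, x2, y1), (2, y2, z1, z2)]
    else:                   # row 5: 33 10 | 20 ww | 2x xy | 2y zz
        rows = [(3, 3, 1, 0), (2, 0, w1, w2),
                (2, x1, x2, y1), (2, y2, z1, z2)]
    # Four base-4 digits make one octet (two hex digits).
    return bytes(a * 64 + b * 16 + c * 4 + d for a, b, c, d in rows)


# Usage: U+20AC (euro sign) falls in row 3.
print(mpe_utf8(0x20AC).hex())  # e282ac
```

Rows 1 and 2 rely on the leading base-4 digit of Y (row 1) or X (row 2) being 0 or 1 within their ranges, which is why the sketch can reuse the same digit list for every row.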
Obverse: Print with a fixed-width font, such as Lucida Console, and cut out.
╔════════════╦═════════════╦═════════════════════════════════╗
║ U+0000     ║ W  X  Y  Z  ║   Otto’s Magic Pocket Encoder   ║
║ U+D7FF     ║ ↓  ↓  ↓  ↓  ║ for UTF-16™ ╔═══════════════════╣
║      WXYZ  ║ _  _  _  _  ║             ║    V→vv │    V→vv ║
╟────────────╫─────────────╢ Version 1.1 ║    U→uu │    U→uu ║
║ U+E000     ║ W  X  Y  Z  ║ ©2004-07-05 ║ tt←T    │ tt←T    ║
║ U+FFFF     ║ ↓  ↓  ↓  ↓  ║             ║    _←__ │    _←__ ║
║      WXYZ  ║ _  _  _  _  ║             ║ ────────┼──────── ║
╟────────────╫─────────────╚═════════════╣    0↔00 │ 13←8↔20 ║
║ U-00010000 ║ 31 2t tu uv │ 31 3v Y  Z  ║ 00←1↔01 │ 20←9↔21 ║
║ U-000FFFFF ║ ↓  2_ __ __ │ ↓  3_ ↓  ↓  ║ 01←2↔02 │ 21←A↔22 ║
║     TUVYZ  ║ D  _  _  _  │ D  _  _  _  ║ 02←3↔03 │ 22←B↔23 ║
╟────────────╫─────────────┼─────────────╢ 03←4↔10 │ 23←C↔30 ║
║ U-00100000 ║ 31 23 3u uv │ 31 3v Y  Z  ║ 10←5↔11 │ 30←D↔31 ║
║ U-0010FFFF ║ ↓  ↓  3_ __ │ ↓  3_ ↓  ↓  ║ 11←6↔12 │ 31←E↔32 ║
║      UVYZ  ║ D  B  _  _  │ D  _  _  _  ║ 12←7↔13 │ 32←F↔33 ║
╚════════════╩═════════════╧═════════════╩═══════════════════╝
....:....1....:....2....:....3....:....4....:....5....:....6..

Reverse: Cut out and paste on back of obverse.
╔════════════════════════════════════════════════════════════╗
║ Otto’s Magic Pocket Encoder for UTF-16         Version 1.1 ║
║ User’s Manual             (inspired from Cima’s UTF-8 MPE) ║
╠════════════════════════════════════════════════════════════╣
║• Left column: min and max Unicode scalar values: pick the  ║
║  row that applies to the code point to be converted.       ║
║  T…Z mark the hexadecadic digits that have to be processed.║
║• Central column: work area to compute UTF-16BE code units. ║
║• Right column: hexadecadic to quaternary conversion tables:║
║  ← for T to tt; ↔ for U/V to uu/vv (step 1) and for step 2.║
║1. Convert each digit marked by T…V from hex to quat. Write ║
║   quat digits on the underscores placed under letters t…v. ║
║2. Convert 2-digit quat numbers to hex digits or copy digits║
║   W…Z, as indicated, and write them on the underscores on  ║
║   the next line. That’s your UTF-16BE sequence in hex.     ║
║↓ Downwards arrows indicate shortcuts.                      ║
╚════════════════════════════════════════════════════════════╝

Enjoy.
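The two steps of the UTF-16 card can be sketched in Python as well. This is only an illustration under stated assumptions: the names from_quats and mpe_utf16be are invented here, and the left-hand "←" table is read as "tt is the base-4 form of the plane number minus one", which also reproduces the hard-coded 33 in the last row (plane 16).

```python
def from_quats(*digits: int) -> int:
    """Fold a sequence of base-4 digits (most significant first) into an int."""
    value = 0
    for d in digits:
        value = value * 4 + d
    return value


def mpe_utf16be(cp: int) -> bytes:
    """Encode one Unicode scalar value as UTF-16BE, following the card's rows."""
    if not 0 <= cp <= 0x10FFFF or 0xD800 <= cp <= 0xDFFF:
        raise ValueError("not a Unicode scalar value")
    if cp <= 0xFFFF:
        # Rows 1 and 2: the digits W X Y Z are copied unchanged.
        return cp.to_bytes(2, "big")
    # Step 1: hex to quat. The "←" table maps T to the digits of T - 1.
    t1, t2 = divmod((cp >> 16) - 1, 4)
    u1, u2 = divmod((cp >> 12) & 0xF, 4)
    v1, v2 = divmod((cp >> 8) & 0xF, 4)
    y, z = (cp >> 4) & 0xF, cp & 0xF
    # Step 2: quat back to hex.
    high = from_quats(3, 1, 2, t1, t2, u1, u2, v1)        # 31 2t tu uv
    low = (from_quats(3, 1, 3, v2) << 8) | (y << 4) | z   # 31 3v Y Z
    return high.to_bytes(2, "big") + low.to_bytes(2, "big")


# Usage: U+1F600 falls in the third row of the card.
print(mpe_utf16be(0x1F600).hex())  # d83dde00
```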
Side 1 (print and cut out):
╔════════════╦═══════════════════════╤═══════════════╗
║ This space ║  Mike’s UTF-32 Magic  │   Vers. 1.0   ║
║  for rent  ║    Pocket Encoder     │  06 July 2004 ║
║            ║                       │               ║
╠════════════╬═══════╤═══════╤═══════╪═══════╗       ║
║ U-00000000 ║ 0  0  │ U  V  │ W  X  │ Y  Z  ║       ║
║ U-0010FFFF ║ ↓  ↓  │ ↓  ↓  │ ↓  ↓  │ ↓  ↓  ║       ║
║    UVWXYZ  ║ 0  0  │ _  _  │ _  _  │ _  _  ║       ║
╚════════════╩═══════╧═══════╧═══════╧═══════╩═══════╝

Side 2 (print, cut out, and glue on back of side 1):
╔════════════════════════════════════════════════════╗
║ Mike’s UTF-32 Magic Pocket Encoder - User’s Manual ║
║      (vers. 1.0, 6 July 2004, by Mike Ayers)       ║
║                                                    ║
║ - Left column: min and max Unicode scalar values.  ║
║   Letters U..Z mark the hexadecimal digits to be   ║
║   processed. Read the bytes in the bottom row      ║
║   left to right, or right to left for UTF-32LE.    ║
╚════════════════════════════════════════════════════╝
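For completeness, the UTF-32 card amounts to very little code. The following Python sketch (mpe_utf32 is a made-up name, not anything from the card) pads the scalar value to the six hex digits U..Z, prefixes the fixed 00 octet, and reverses the octets for UTF-32LE, as the manual describes.

```python
def mpe_utf32(cp: int, little_endian: bool = False) -> bytes:
    """Encode one Unicode scalar value as UTF-32BE (or UTF-32LE)."""
    if not 0 <= cp <= 0x10FFFF or 0xD800 <= cp <= 0xDFFF:
        raise ValueError("not a Unicode scalar value")
    digits = f"{cp:06X}"                      # the six hex digits U V W X Y Z
    octets = [0x00] + [int(digits[i:i + 2], 16) for i in (0, 2, 4)]
    if little_endian:
        octets.reverse()                      # read the bottom row right to left
    return bytes(octets)


# Usage: the same code point in both byte orders.
print(mpe_utf32(0x1F600).hex())                      # 0001f600
print(mpe_utf32(0x1F600, little_endian=True).hex())  # 00f60100
```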