Re: UTF to unicode conversion

2004-07-02 Thread Otto Stolz
Hello,
Mike Ayers wrote:
> Who said that Unicode is high-tech?
> Here is a device to generate UTF-8 that employs traditional tools such
> as ASCII art, paper, scissors, glue, brain.
Attached is a similar device for converting Unicode scalar values
to UTF-16 (UTF-16BE, that is, but you could easily add a final step
to compute UTF-16LE, or to add a BOM).
Definitely, the world has longed for this, for years ;-)  Enjoy!
Cheers,
  Otto Stolz
Obverse: Print with a fixed-width font, such as Lucida Console, and cut out.

╔════════════╦═════════════╦═════════════════════════════════╗
║     U+0000 ║ W  X  Y  Z  ║ Otto’s Magic Pocket Encoder     ║
║     U+D7FF ║ ↓  ↓  ↓  ↓  ║ for  UTF-16™╔═══════════════════╣
║       WXYZ ║ _  _  _  _  ║             ║    V→vv │    V→vv ║
╟────────────╫─────────────╢ Version 1.0 ║    U→uu │    U→uu ║
║     U+E000 ║ W  X  Y  Z  ║ ©2004-07-02 ║ tt←T    │ tt←T    ║
║     U+FFFF ║ ↓  ↓  ↓  ↓  ║             ║    _←__ │    _←__ ║
║       WXYZ ║ _  _  _  _  ║             ║ ────────┼──────── ║
╟────────────╫─────────────╚═════════════╣    0↔00 │ 13←8↔20 ║
║ U═00010000 ║ 31 2t tu uv │ 31 3v Y  Z  ║ 00←1↔01 │ 20←9↔21 ║
║ U═000FFFFF ║ ↓  2_ __ __ │ ↓  3_ ↓  ↓  ║ 01←2↔02 │ 21←A↔22 ║
║      TUVYZ ║ D  _  _  _  │ D  _  _  _  ║ 02←3↔03 │ 22←B↔23 ║
╟────────────╫─────────────┼─────────────╢ 03←4↔10 │ 23←C↔30 ║
║ U═00100000 ║ 31 23 3u uv │ 31 3v Y  Z  ║ 10←5↔11 │ 30←D↔31 ║
║ U═0010FFFF ║ ↓  ↓  3_ __ │ ↓  3_ ↓  ↓  ║ 11←6↔12 │ 31←E↔32 ║
║       UVYZ ║ D  B  _  _  │ D  _  _  _  ║ 12←7↔13 │ 32←F↔33 ║
╚════════════╩═════════════╧═════════════╩═══════════════════╝

....:....1....:....2....:....3....:....4....:....5....:....6..

Reverse: Cut out and paste on the back of the obverse.

╔════════════════════════════════════════════════════════════╗
║     Otto’s Magic Pocket Encoder for UTF-16 Version 1.0     ║
║     User’s Manual     (inspired from Cima’s UTF-8 MPE)     ║
╠════════════════════════════════════════════════════════════╣
║• Left column: min and max Unicode scalar values: pick the  ║
║  row that applies to the code point to be converted.       ║
║  T…Z mark the hexadecadic digits that have to be processed.║
║• Central column: work area to compute UTF-16BE code units. ║
║• Right column: hexadecadic to quaternary conversion tables:║
║  ← for T to tt; ↔ for U/V to uu/vv (step 1) and for step 2.║
║1. Convert each digit marked by T…V from hex to quat. Write ║
║   quat digits on the underscores placed under letters t…v. ║
║2. Convert 2-digit quat numbers to hex digits or copy digits║
║   W…Z, as indicated, and write them on the underscores on  ║
║   the next line. That’s your UTF-16BE sequence in hex.     ║
║↓ Downwards arrow
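
For anyone who wants to check the card's output without scissors, here is a minimal Python sketch of the same scalar-value-to-UTF-16 computation. It is not part of the original attachment: the function names `utf16_code_units` and `utf16_bytes` are made up for illustration, and the arithmetic form (subtract 0x10000, split into two 10-bit halves) is the usual textbook formulation rather than the card's digit-by-digit base-4 procedure. The optional `little_endian` and `bom` arguments correspond to the "final step" mentioned in the message.

```python
def utf16_code_units(scalar):
    """Return the UTF-16 code units (as ints) for one Unicode scalar value."""
    if not (0 <= scalar <= 0x10FFFF) or 0xD800 <= scalar <= 0xDFFF:
        raise ValueError("not a Unicode scalar value: %#x" % scalar)
    if scalar <= 0xFFFF:
        # Rows U+0000..U+D7FF and U+E000..U+FFFF: digits W..Z are copied.
        return [scalar]
    # Supplementary rows: subtracting 0x10000 lowers the digit T by one,
    # which is what the "tt<-T" half of the card's table does.
    offset = scalar - 0x10000
    high = 0xD800 | (offset >> 10)    # high surrogate, D800..DBFF
    low = 0xDC00 | (offset & 0x3FF)   # low surrogate, DC00..DFFF
    return [high, low]


def utf16_bytes(scalar, little_endian=False, bom=False):
    """Serialize the code units as UTF-16BE (default) or UTF-16LE bytes."""
    units = utf16_code_units(scalar)
    if bom:
        units = [0xFEFF] + units      # byte order mark, serialized like any unit
    order = "little" if little_endian else "big"
    return b"".join(unit.to_bytes(2, order) for unit in units)


print(utf16_bytes(0x1D11E).hex())     # d834dd1e
```

Working U+1D11E through the card by hand gives the same D834 DD1E, so the two approaches should agree wherever the card applies.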

RE: UTF to unicode conversion

2004-06-30 Thread Mike Ayers
    I still have my original somewhere at home.  I love this newfangled cut and paste technology!



/|/|ike


> -----Original Message-----
> From: Marco Cimarosti [mailto:[EMAIL PROTECTED]]
> Sent: Wednesday, June 30, 2004 7:39 AM
> To: 'Mike Ayers'; 'johncy inbaraj'; [EMAIL PROTECTED]
> Subject: RE: UTF to unicode conversion
> 
> 
> Mike Ayers wrote:
> > Side 1 (print and cut out): 
> > +------------+-------+-----------------------+------+
> > |     U+0000 | yy zz |    Cima's UTF-8 Magic | Hex= |
> > |     U+007F | !  !  |        Pocket Encoder | B-4  |
> > |         YZ | .  .  |                       |      |
> > +------------+-------+-------+     Vers. 1.0 | 0=00 |
> > |     U+0080 | 3x xy | 2y zz | 16 March 2000 | 1=01 |
> > [...]
> 
> Holy Hermes Trismegistos, I had forgotten this one! How's it I had all that
> free time back in March 2000? :-)
> 
> _ Marco
> 





RE: UTF to unicode conversion

2004-06-30 Thread Marco Cimarosti
Mike Ayers wrote:
> Side 1 (print and cut out): 
> +------------+-------+-----------------------+------+
> |     U+0000 | yy zz |    Cima's UTF-8 Magic | Hex= |
> |     U+007F | !  !  |        Pocket Encoder | B-4  |
> |         YZ | .  .  |                       |      |
> +------------+-------+-------+     Vers. 1.0 | 0=00 |
> |     U+0080 | 3x xy | 2y zz | 16 March 2000 | 1=01 |
> [...]

Holy Hermes Trismegistos, I had forgotten this one! How's it I had all that
free time back in March 2000? :-)

_ Marco



Re: RE: UTF to unicode conversion

2004-06-30 Thread Philippe VERDY
"Mike Ayers" wrote to "johncy inbaraj" , [EMAIL PROTECTED]
> Here's my favorite, gleaned from the archives, courtesy of Marco Cimarosti.  
> It doesn't have instructions for working backwards, but once you figure out how to 
> work forwards, reversing the operation is pretty straightforward.  Make sure to use 
> a nonproportional font so everything lines up.
> Who said that Unicode is high-tech? 
> Here is a device to generate UTF-8 that employs traditional tools such as ASCII art, 
> paper, scissors, glue, brain. 
> 
> Side 1 (print and cut out): 
> +------------+-------+-----------------------+------+
> |     U+0000 | yy zz |    Cima's UTF-8 Magic | Hex= |
> |     U+007F | !  !  |        Pocket Encoder | B-4  |
> |         YZ | .  .  |                       |      |
> +------------+-------+-------+     Vers. 1.0 | 0=00 |
> |     U+0080 | 3x xy | 2y zz | 16 March 2000 | 1=01 |
> |     U+07FF | 3. .. | 2. !  |               | 2=02 |
> |        XYZ | .  .  | .  .  |          M.C. | 3=03 |
> +------------+-------+-------+-------+       | 4=10 |
> |     U+0800 | 32 ww | 2x xy | 2y zz |       | 5=11 |
> |     U+FFFF | !  !  | 2. .. | 2. !  |       | 6=12 |
> |       WXYZ | E  .  | .  .  | .  .  |       | 7=13 |
> +------------+-------+-------+-------+-------+ 8=20 |
> | U-00010000 | 33 0v | 2v ww | 2x xy | 2y zz | 9=21 |
> | U-000FFFFF | !  0. | 2. !  | 2. .. | 2. !  | A=22 |
> |      VWXYZ | F  .  | .  .  | .  .  | .  .  | B=23 |
> +------------+-------+-------+-------+-------+ C=30 |
> | U-00100000 | 33 1v | 2v ww | 2x xy | 2y zz | D=31 |
> | U-0010FFFF | !  1. | 2. !  | 2. .. | 2. !  | E=32 |
> |      VWXYZ | F  .  | .  .  | .  .  | .  .  | F=33 |
> +------------+-------+-------+-------+-------+------+

Change the last row like this:
+------------+-------+-------+-------+-------+ C=30 |
| U-00100000 | 33 10 | 20 ww | 2x xy | 2y zz | D=31 |
| U-0010FFFF | !  !  | !  !  | 2. .. | 2. !  | E=32 |
|       WXYZ | F  4  | 8  .  | .  .  | .  .  | F=33 |
+------------+-------+-------+-------+-------+------+
(because the value of V is fixed to zero)
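
A quick way to convince oneself of this is to ask an existing encoder. The sketch below is only an illustration (the helper name `leading_octets` is made up here) and relies on Python's built-in UTF-8 codec: for every code point in U+100000..U+10FFFF the first octet comes out as F4 and the second as 8 followed by the digit W.

```python
def leading_octets(code_point):
    """Return the first two octets of the UTF-8 form of a code point."""
    encoded = chr(code_point).encode("utf-8")
    return encoded[0], encoded[1]


for cp in (0x100000, 0x10ABCD, 0x10FFFF):
    first, second = leading_octets(cp)
    # The digit W is bits 12..15 of the code point.
    w = (cp >> 12) & 0xF
    print("U+%06X -> %02X %02X (W = %X)" % (cp, first, second, w))
```

For U+10ABCD, for instance, this prints F4 8A, matching the "F  4 | 8  ." pattern in the corrected row.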

This idea of going through base 4 is nice... ;-)
Thanks...



RE: UTF to unicode conversion

2004-06-29 Thread Mike Ayers
> From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] On Behalf Of johncy inbaraj
> Sent: Tuesday, June 29, 2004 6:07 AM
>
> I need a conversion logic which converts a UTF character to unicode character. If any, pls tell me.


    Here's my favorite, gleaned from the archives, courtesy of Marco Cimarosti.  It doesn't have instructions for working backwards, but once you figure out how to work forwards, reversing the operation is pretty straightforward.  Make sure to use a nonproportional font so everything lines up.




Who said that Unicode is high-tech?
Here is a device to generate UTF-8 that employs traditional tools such as ASCII art, paper, scissors, glue, brain.



Side 1 (print and cut out):


+------------+-------+-----------------------+------+
|     U+0000 | yy zz |    Cima's UTF-8 Magic | Hex= |
|     U+007F | !  !  |        Pocket Encoder | B-4  |
|         YZ | .  .  |                       |      |
+------------+-------+-------+     Vers. 1.0 | 0=00 |
|     U+0080 | 3x xy | 2y zz | 16 March 2000 | 1=01 |
|     U+07FF | 3. .. | 2. !  |               | 2=02 |
|        XYZ | .  .  | .  .  |          M.C. | 3=03 |
+------------+-------+-------+-------+       | 4=10 |
|     U+0800 | 32 ww | 2x xy | 2y zz |       | 5=11 |
|     U+FFFF | !  !  | 2. .. | 2. !  |       | 6=12 |
|       WXYZ | E  .  | .  .  | .  .  |       | 7=13 |
+------------+-------+-------+-------+-------+ 8=20 |
| U-00010000 | 33 0v | 2v ww | 2x xy | 2y zz | 9=21 |
| U-000FFFFF | !  0. | 2. !  | 2. .. | 2. !  | A=22 |
|      VWXYZ | F  .  | .  .  | .  .  | .  .  | B=23 |
+------------+-------+-------+-------+-------+ C=30 |
| U-00100000 | 33 1v | 2v ww | 2x xy | 2y zz | D=31 |
| U-0010FFFF | !  1. | 2. !  | 2. .. | 2. !  | E=32 |
|      VWXYZ | F  .  | .  .  | .  .  | .  .  | F=33 |
+------------+-------+-------+-------+-------+------+


Side 2 (print, cut out, and glue on back of side 1):


+---------------------------------------------------+
| Cima's UTF-8 Magic Pocket Encoder - User's Manual |
| (vers. 1.0, 16 March 2000, by Marco Cimarosti)    |
|                                                   |
| - Left column: min and max Unicode scalar values: |
|   pick the row that applies to the code point you |
|   want to convert to UTF-8. Letters V..Z mark the |
|   hexadecimal digits that have to be processed.   |
| - Right column: hexadecimal to base-4 table.      |
| - Central columns: work area to compute each octet|
|   (1 to 4) that constitute UTF-8 octet sequences. |
| Convert each digit marked by V..Z from hex. to    |
| b.-4. Write b.-4 digits on the dots placed under  |
| letters v..z (two b.-4 digits per hex. digit).    |
| Convert 2-digit base-4 number to hex. digits and  |
| write them on the dots on the line. That is your  |
| UTF-8 sequence in hex.! Exclamation marks show    |
| passages that may be skipped, either because the  |
| digit is hard-coded, or because it may be copied  |
| directly from the scalar value.                   |
+---------------------------------------------------+
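
The manual's procedure (hex digits to base-4 digits, regroup, back to hex) can be mimicked in a few lines of Python. This is only a sketch of one way to do it, not code from the original post: the names `to_base4` and `utf8_via_base4` are invented for the example, and the per-row templates ("3" + three digits, "32" + two digits, "33" + two digits, and "2" + three digits for each trailing octet) are my reading of the card's rows.

```python
def to_base4(value, width):
    """Return value as a string of exactly `width` base-4 digits."""
    digits = []
    for _ in range(width):
        digits.append(str(value % 4))
        value //= 4
    return "".join(reversed(digits))


def utf8_via_base4(cp):
    """Encode one Unicode scalar value as UTF-8 by regrouping base-4 digits."""
    if not (0 <= cp <= 0x10FFFF) or 0xD800 <= cp <= 0xDFFF:
        raise ValueError("not a Unicode scalar value: %#x" % cp)
    if cp <= 0x7F:                      # row U+0000..U+007F: copied as is
        octets = [to_base4(cp, 4)]
    elif cp <= 0x7FF:                   # row "3x xy | 2y zz"
        q = to_base4(cp, 6)
        octets = ["3" + q[:3], "2" + q[3:]]
    elif cp <= 0xFFFF:                  # row "32 ww | 2x xy | 2y zz"
        q = to_base4(cp, 8)
        octets = ["32" + q[:2], "2" + q[2:5], "2" + q[5:]]
    else:                               # rows "33 .. | 2v ww | 2x xy | 2y zz"
        q = to_base4(cp, 11)
        octets = ["33" + q[:2], "2" + q[2:5], "2" + q[5:8], "2" + q[8:]]
    # Each octet is four base-4 digits; int(..., 4) turns them back into a byte.
    return bytes(int(group, 4) for group in octets)


print(utf8_via_base4(0x20AC).hex())     # e282ac
```

For U+20AC the base-4 form is 02 00 22 30, which regroups to 3202 2002 2230, i.e. E2 82 AC.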


Enjoy!


Marco












Re: UTF to unicode conversion

2004-06-29 Thread Chris Jacobs



Chapter 2, sections 2.5 (Encoding Forms) and 2.6 (Encoding Schemes):
http://www.unicode.org/versions/Unicode4.0.1/
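
If "a UTF character" in the question means a UTF-8 octet sequence, the conversion to a Unicode code point can be sketched as below. This is an assumption about what was being asked, and the function name `decode_utf8_sequence` is made up for the example; it handles exactly one complete sequence and rejects overlong forms, surrogates, and values above U+10FFFF.

```python
def decode_utf8_sequence(octets):
    """Decode one complete UTF-8 sequence (bytes) into a Unicode scalar value."""
    if not octets:
        raise ValueError("empty input")
    first = octets[0]
    # The lead octet determines the length and the smallest legal result.
    if first < 0x80:
        length, cp, minimum = 1, first, 0x0
    elif 0xC2 <= first <= 0xDF:
        length, cp, minimum = 2, first & 0x1F, 0x80
    elif 0xE0 <= first <= 0xEF:
        length, cp, minimum = 3, first & 0x0F, 0x800
    elif 0xF0 <= first <= 0xF4:
        length, cp, minimum = 4, first & 0x07, 0x10000
    else:
        raise ValueError("invalid lead octet: %#x" % first)
    if len(octets) != length:
        raise ValueError("expected %d octets, got %d" % (length, len(octets)))
    for octet in octets[1:]:
        if octet & 0xC0 != 0x80:
            raise ValueError("invalid trailing octet: %#x" % octet)
        cp = (cp << 6) | (octet & 0x3F)   # append six payload bits
    if cp < minimum or cp > 0x10FFFF or 0xD800 <= cp <= 0xDFFF:
        raise ValueError("overlong or out-of-range sequence")
    return cp


print("U+%04X" % decode_utf8_sequence(b"\xe2\x82\xac"))   # U+20AC
```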
 
----- Original Message -----
From: johncy inbaraj
To: [EMAIL PROTECTED]
Sent: Tuesday, June 29, 2004 3:07 PM
Subject: UTF to unicode conversion

Hi,

I need a conversion logic which converts a UTF character to unicode character.
If any, pls tell me.

Regards,
Johncy.
Rejoice in the Lord Always!