Since I get only the digest late in the evening, someone else may have replied to this already; if so, I apologize. UTF-8 encodes every character in the Unicode standard (so far). Code points from 0 to x'7f' are coded as-is, in one byte. Code points from x'80' to x'7ff' are encoded in two bytes, code points from x'800' to x'ffff' require 3 bytes, and code points from x'10000' to x'10ffff' (the current standard limit) require 4 bytes.
So, there is no character in the Unicode repertoire missing from UTF-8.
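
For anyone who wants to check the ranges, here is a minimal sketch in Python. The function name utf8_length is my own invention for illustration, not part of any library; the loop just compares it against Python's built-in encoder at each range boundary.

```python
def utf8_length(code_point: int) -> int:
    """Return how many bytes UTF-8 needs for one Unicode code point.

    Hypothetical helper for illustration; the ranges are the ones
    described in the post above.
    """
    if code_point < 0 or code_point > 0x10FFFF:
        raise ValueError("outside the Unicode code space")
    if code_point <= 0x7F:
        return 1  # coded as-is
    if code_point <= 0x7FF:
        return 2
    if code_point <= 0xFFFF:
        return 3
    return 4  # x'10000' through x'10ffff'


# Cross-check against the built-in encoder at each range boundary.
for cp in (0x00, 0x7F, 0x80, 0x7FF, 0x800, 0xFFFF, 0x10000, 0x10FFFF):
    assert utf8_length(cp) == len(chr(cp).encode("utf-8"))
```

The sketch only counts bytes. It does not build the byte sequence itself, and it does not treat the surrogate range x'd800' to x'dfff' specially, even though those values are not characters in their own right.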

Dale Miller
dalelmil...@comcast.net

----------------------------------------------------------------------
For IBM-MAIN subscribe / signoff / archive access instructions,
send email to lists...@bama.ua.edu with the message: GET IBM-MAIN INFO
Search the archives at http://bama.ua.edu/archives/ibm-main.html
