At 11:12 AM -0400 10/23/07, Michael B Allen wrote:
On 10/23/07, tedd <[EMAIL PROTECTED]> wrote:
 At 7:21 PM -0400 10/21/07, John Campbell wrote:
 >The first thing to understand about character encoding is the overlap
 >between UTF-8 and 8859-1.  Below is a sample
 >a - lower case a (Same in 8859-1 & UTF-8)
 >à - a grave (Available in 8859-1 & UTF-8 but different values..)
 >éí - Chinese character (Not in 8859-1, in UTF-8)

 A small clarification -- it's not really overlap,
 but rather UTF-8 is a super-set containing 8859-1
 just as both contain ASCII.

Well if you want to be pedantic about it, "overlap" is more accurate.
UTF-8 is a multibyte encoding of the Unicode charset. ISO-8859-1 is a
single byte encoding of the ISO-8859-1 charset. So yes, Unicode is a
superset of ISO-8859-1 but the UTF-8 encodings of values above 0x7f are
not the same.

Mike


You are free to call it what you want.

True, the code-points for the ISO-8859-1 charset above 0x7F (the M$ spin) are not the same as UTF-* et al, but the glyphs are still included in UTF-8 regardless of encoding differences -- is that not true?

If this is true, then the term "overlap" would be less correct than "super-set" because the two sets do not overlap with respect to all code-points -- but the larger one still contains all the glyphs that the smaller one does (with the exception of Apple's spin on that set, which included adding their logo).

That's the reason I'm free to call one a super-set of the other.
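
For anyone who wants to check both points, here is a minimal sketch in Python (my choice for illustration only; nothing in this thread depends on it). It assumes Python 3 and its built-in "iso-8859-1" and "utf-8" codecs, and the function name compare_encodings is made up for this example. It walks through all 256 ISO-8859-1 values, confirms each one decodes to a character that Unicode also has, and then sorts the values by whether UTF-8 stores that character as the same single byte.

```python
def compare_encodings():
    """Split the 256 ISO-8859-1 byte values by whether UTF-8 uses the same bytes."""
    same_bytes = []
    different_bytes = []
    for value in range(256):
        # Decode the single byte as ISO-8859-1 to get the character it stands for.
        ch = bytes([value]).decode("iso-8859-1")
        # Every ISO-8859-1 character exists in Unicode, so this encode never fails.
        utf8 = ch.encode("utf-8")
        if utf8 == bytes([value]):
            same_bytes.append(value)
        else:
            different_bytes.append(value)
    return same_bytes, different_bytes


same_bytes, different_bytes = compare_encodings()
print(len(same_bytes), "values are encoded identically")
print(len(different_bytes), "values are encoded differently")

# "\u00e0" is the a-grave from the sample earlier in the thread.
a_grave = "\u00e0"
print(a_grave.encode("iso-8859-1").hex())  # one byte in ISO-8859-1
print(a_grave.encode("utf-8").hex())       # two bytes in UTF-8
```

The values that come out identical are the ASCII range (0x00 through 0x7F); everything from 0x80 up takes two bytes in UTF-8. So at the level of characters the larger set contains the smaller one, while at the level of bytes only the ASCII part matches.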

I believe it's easier to explain char-sets and code-points in terms of current Unicode standards than it is to point out historical differences that are diminishing in importance as more people convert.

Cheers,

tedd
--
-------
http://sperling.com  http://ancientstones.com  http://earthstones.com