Hello again,

On Thursday, March 27, 2008 at 10:16:38 PM Peter [PP] (me) wrote:
[...]

PP> UTF-8 is an *ENCODING* of these characters. In short: UTF-8 says which
PP> position in the Unicode-table the character is at. UTF-8 uses 8 bit
PP> for standard ISO-8859 characters and 16 bit for "special" characters
PP> (like e.g. German umlauts).

I have to correct myself: UTF-8 uses 8 bit only for the plain US-ASCII
characters and *more than* 8 bit for several other classes of
characters: 16 for "Latin letters with diacritics", 24 for "the rest of
the Basic Multilingual Plane (which contains virtually all characters in
common use)" and 32 for "characters in the other planes of Unicode,
which are rarely used in practice."

Source: <url:http://en.wikipedia.org/wiki/UTF-8>

The rest still holds true, so UTF-8 ain't "bad" :-) IMHO it's "good",
because it lets us break through this kind of "Babylonian" chaos of
languages and character tables: all relevant characters can be held in
one table and therefore identified unambiguously. No more guessing: "Is
this an 'ä' or an 'æ' the author wanted to be displayed? Do I have to
change the charset table to get the correct result shown?"

Sure: for e-mail there is a method to declare the character set used,
but not for text files ... And if UTF-8 is commonly used for text
documents, why stick with an old system for e-mail? :-)

So: let's switch and show the world: we're not dinosaurs, we not only
use a modern MUA, but also see the advantages of a modern character
encoding system ;-)

-- 
Regards
Peter Palmreuther

(The Bat! v4.0.18.6 on Windows XP 5.1 Build 2600 Service Pack 2)

A Stoic brings the baby; a Cynic is where you bathe it.
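
P.S. For anyone who wants to check the 8/16/24/32 bit classes
themselves, here is a minimal sketch, assuming Python 3.8 or later. The
four sample characters and the helper names `utf8_bits` and `describe`
are my own picks for illustration; they are not taken from the Wikipedia
article.

```python
# One arbitrary sample character per UTF-8 length class (illustrative picks).
SAMPLES = [
    "A",           # U+0041, US-ASCII letter
    "\u00e4",      # U+00E4, 'a' with diaeresis (German umlaut)
    "\u20ac",      # U+20AC, euro sign (elsewhere in the Basic Multilingual Plane)
    "\U0001D11E",  # U+1D11E, musical G clef (outside the BMP)
]


def utf8_bits(ch: str) -> int:
    """Return how many bits the UTF-8 encoding of one character occupies."""
    return len(ch.encode("utf-8")) * 8


def describe(ch: str) -> str:
    """Format code point, encoded bytes (hex) and bit count for one character."""
    encoded = ch.encode("utf-8")
    return f"U+{ord(ch):04X} -> {encoded.hex(' ')} ({utf8_bits(ch)} bit)"


if __name__ == "__main__":
    for ch in SAMPLES:
        print(describe(ch))
```

Run as a script, this prints one line per sample; the umlaut, for
example, comes out as "U+00E4 -> c3 a4 (16 bit)".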