> Ok! Let me see if I can explain myself - I am not an expert on this so please correct me if I am wrong!
> A UTF-8 representation of one character consists of a combination of characters. Now Java is a Unicode language and this means that one character
...of bytes.
> can represent "any" type of character in the world!
Almost. A Java char is only 16 bits wide, so there is a class of Unicode characters (those outside the Basic Multilingual Plane, above U+FFFF) that need to be represented as a sequence of two Java chars, a so-called surrogate pair.
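To make the surrogate-pair point concrete, here is a minimal sketch. The class name SurrogateDemo and its helper methods are made up for illustration; the calls themselves (Character.toChars, String.codePointCount, Character.isHighSurrogate) are standard java.lang APIs.

```java
public class SurrogateDemo {
    // Number of 16-bit Java chars (UTF-16 code units) in the string.
    static int charCount(String s) {
        return s.length();
    }

    // Number of Unicode code points in the string.
    static int codePointCount(String s) {
        return s.codePointCount(0, s.length());
    }

    public static void main(String[] args) {
        // U+1D11E (MUSICAL SYMBOL G CLEF) lies outside the 16-bit range.
        String clef = new String(Character.toChars(0x1D11E));

        System.out.println(charCount(clef));      // 2
        System.out.println(codePointCount(clef)); // 1
        System.out.println(Character.isHighSurrogate(clef.charAt(0))); // true
        System.out.println(Character.isLowSurrogate(clef.charAt(1)));  // true
    }
}
```

So String.length() counts chars, not characters, which matters as soon as text outside the 16-bit range shows up.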
> Basically UTF-8 only makes sense when working on an "old" 7 bit ASCII system and you need to use characters not available in the given codepage.
UTF-8 always makes sense when you need backward compatibility with ASCII: every pure ASCII text is already valid UTF-8, byte for byte.
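The ASCII compatibility can be checked with a few lines of Java. This is only a sketch; AsciiCompatDemo and sameBytesInAsciiAndUtf8 are hypothetical names, while String.getBytes(Charset) and StandardCharsets are the standard JDK APIs.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class AsciiCompatDemo {
    // True if the string encodes to identical bytes in US-ASCII and UTF-8.
    static boolean sameBytesInAsciiAndUtf8(String s) {
        byte[] ascii = s.getBytes(StandardCharsets.US_ASCII);
        byte[] utf8 = s.getBytes(StandardCharsets.UTF_8);
        return Arrays.equals(ascii, utf8);
    }

    public static void main(String[] args) {
        // Pure ASCII text: both encodings produce the same bytes.
        System.out.println(sameBytesInAsciiAndUtf8("Hello")); // true

        // U+00E9 (e with acute) is not ASCII: US-ASCII substitutes '?',
        // UTF-8 uses two bytes, so the results differ.
        System.out.println(sameBytesInAsciiAndUtf8("\u00e9")); // false
    }
}
```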
> Both UTF-8 and UTF-16 use a varying number of bytes to represent one character, where Unicode always uses 32 bit characters (maybe it is 24 bit).
Unicode itself doesn't "represent" anything as bytes. Unicode is just a definition of code points (U+0000 through U+10FFFF, so 21 bits are enough to number them).
*Encodings* represent Unicode characters as byte sequences, and UTF-8 and UTF-16 are two of the Unicode encodings.
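As a rough illustration of "same code point, different byte sequences", the following sketch encodes a few characters with two of the encodings. EncodingDemo and encodedLength are names invented for this example; only the standard String.getBytes(Charset) call is assumed.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class EncodingDemo {
    // Number of bytes the given encoding needs for the string.
    static int encodedLength(String s, Charset cs) {
        return s.getBytes(cs).length;
    }

    public static void main(String[] args) {
        String a = "A";                                       // U+0041
        String euro = "\u20ac";                               // U+20AC EURO SIGN
        String clef = new String(Character.toChars(0x1D11E)); // U+1D11E

        System.out.println(encodedLength(a, StandardCharsets.UTF_8));       // 1
        System.out.println(encodedLength(a, StandardCharsets.UTF_16BE));    // 2
        System.out.println(encodedLength(euro, StandardCharsets.UTF_8));    // 3
        System.out.println(encodedLength(euro, StandardCharsets.UTF_16BE)); // 2
        System.out.println(encodedLength(clef, StandardCharsets.UTF_8));    // 4
        System.out.println(encodedLength(clef, StandardCharsets.UTF_16BE)); // 4
    }
}
```

The code point stays the same in each case; only the byte sequence chosen by the encoding changes.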
> ...
Julian
-- <green/>bytes GmbH -- http://www.greenbytes.de -- tel:+492512807760