Daniel,
To give every character from every alphabet its own number, Unicode needs more than one byte per character: code points run up to U+10FFFF, which takes 21 bits, and the simplest fixed-width encoding (UTF-32) just uses 4 bytes (32 bits) for each one.
Since Unicode is cool you're probably interested in using it... On the other hand, you're probably not interested in encoding each of your characters using 4 bytes (quite an overkill for English for example). That's where UTF-8 (or UTF-16) comes in.
UTF-8 is a way of encoding the whole Unicode character set as sequences of 8-bit units (1 byte each). In other words, it uses from 1 to 4 single bytes per character instead of one fixed 32-bit value. UTF-16 is similar, but uses sequences of one or two 16-bit units.
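To make the variable lengths described above concrete, here is a small Python sketch that counts how many UTF-8 bytes and UTF-16 units a few characters take. The helper name `encoded_sizes` and the sample characters are just my own picks for illustration.

```python
def encoded_sizes(ch):
    """Return (UTF-8 byte count, UTF-16 16-bit unit count) for one character."""
    utf8 = ch.encode("utf-8")
    # "utf-16-be" is used so that no byte order mark is added to the output.
    utf16 = ch.encode("utf-16-be")
    return len(utf8), len(utf16) // 2


# A plain ASCII letter, an accented Latin letter, the euro sign, and an emoji.
for ch in ["A", "\u00e9", "\u20ac", "\U0001F600"]:
    n8, n16 = encoded_sizes(ch)
    print(f"U+{ord(ch):04X}: {n8} UTF-8 byte(s), {n16} UTF-16 unit(s)")
```

Only the ASCII letter fits in a single UTF-8 byte; the emoji needs 4 bytes in UTF-8 and two 16-bit units in UTF-16.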
The advantage is that plain ASCII (unaccented English letters, digits and basic punctuation) still takes one byte per character, just like in the old days... Of course, because the high bit is reserved to mark multi-byte sequences, the one-byte range is not as large as ISO-8859-1's (for example), so certain Latin characters we are used to, such as accented letters, need a specially crafted sequence of 2 (or more) bytes.
For example, the letter é (0xE9 in ISO-8859-1) is encoded as the two bytes 0xC3 0xA9 in UTF-8, which show up as Ã© when they are read as ISO-8859-1.
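The é example can be checked in a few lines of Python. This is only a sketch; the variable names are mine.

```python
e_acute = "\u00e9"  # the letter é, written as an escape to avoid editor encoding issues

latin1_bytes = e_acute.encode("iso-8859-1")  # one byte: 0xE9
utf8_bytes = e_acute.encode("utf-8")         # two bytes: 0xC3 0xA9

# Reading the UTF-8 bytes with the wrong encoding gives two unrelated characters.
misread = utf8_bytes.decode("iso-8859-1")

print(latin1_bytes.hex(), utf8_bytes.hex(), misread)
```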
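The XML standard specifies that if a file does not declare its encoding (and has no byte order mark), it is treated as UTF-8 by default. So, if I were to save é as the single byte 0xE9 (ISO-8859-1) in a file that the parser reads as UTF-8, it wouldn't be read as é, but rejected as an invalid sequence... That is most likely what your parser was complaining about, and why switching to encoding=iso-8859-1 fixed it.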
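Here is a rough Python sketch of that failure. The XML fragment and the helper `try_decode` are made up for illustration, and the error wording is Python's, not the Java parser's.

```python
def try_decode(raw, encoding):
    """Decode raw bytes, or return a short description of why decoding failed."""
    try:
        return raw.decode(encoding)
    except UnicodeDecodeError as exc:
        return f"cannot decode as {encoding}: {exc.reason} at byte {exc.start}"


# Hypothetical fragment saved as ISO-8859-1: é is stored as the single byte 0xE9.
raw = b"<name>Jos\xe9</name>"

print(try_decode(raw, "iso-8859-1"))  # decodes cleanly
print(try_decode(raw, "utf-8"))       # 0xE9 announces a 3-byte sequence that never arrives
```

In UTF-8, 0xE9 is the lead byte of a 3-byte sequence, so the decoder expects two continuation bytes after it and gives up when it finds ordinary text instead.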
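It's generally good practice to declare the encoding of an XML file explicitly (for example <?xml version="1.0" encoding="ISO-8859-1"?>), and to make sure it matches the encoding the file was actually saved in.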
Hope this helps, Philippe
Daniel H. F. e Silva wrote:
Hi Brice,
I know that UTF-8 is supposed to cover every kind of alphabet character on Earth. But I ran into this issue some time ago, when a parser tried to parse a file with encoding=utf-8 set and simply gave me an error message. After thinking about it, I changed encoding=utf-8 to encoding=iso-8859-1 and that fixed my problem. So, if UTF-8 is really supposed to cover everything, why did I get that error? I'd love to hear your explanation.
Cheers,
Daniel Silva.
--- Brice Ruth <[EMAIL PROTECTED]> wrote:
What special characters aren't supported by UTF-8?! I have never heard of such a thing. My understanding is that UTF-8 represents the full Unicode character set as a multi-byte value. And since Unicode is supposed to encompass all known characters for all known languages (with space for new Chinese characters created daily) - what's not covered?!
There most certainly shouldn't be anything that iso-8859-1 (latin1) or Windows-1252 covers that is not in Unicode.
Brice
On 4/20/05, Daniel H. F. e Silva <[EMAIL PROTECTED]> wrote:
You could also check your XML encoding. If you work with special characters not in utf-8, you will get in trouble. I had this problem, as my native language is Portuguese and we have some special characters not supported by utf-8. So, if this is your case, try iso-8859-1 or one that better fits your needs.
Cheers, Daniel Silva.
--- Larry Meadors <[EMAIL PROTECTED]> wrote:
Make sure that there is no white space and no odd chars at the top of your config file.
Larry
On 4/18/05, KK <[EMAIL PROTECTED]> wrote:
I get the following error when I try to build sqlCOnfigmap. Does it look familiar to anyone?
com.ibatis.sqlmap.client.SqlMapException: There was an error while building the SqlMap instance.
--- The error occurred in the SQL Map Configuration file.
--- Cause: com.ibatis.sqlmap.client.SqlMapException: XML Parser Error. Cause: java.io.UTFDataFormatException: Invalid byte 3 of 3-byte UTF-8 sequence.
Caused by: java.io.UTFDataFormatException: Invalid byte 3 of 3-byte UTF-8 sequence.
Caused by: com.ibatis.sqlmap.client.SqlMapException: XML Parser Error. Cause: java.io.UTFDataFormatException: Invalid byte 3 of 3-byte UTF-8 sequence.
Caused by: java.io.UTFDataFormatException: Invalid byte 3 of 3-byte UTF-8 sequence.
    at com.ibatis.sqlmap.engine.builder.xml.XmlSqlMapClientBuilder.buildSqlMap(XmlSqlMapClientBuilder.java:203)
    at com.ibatis.sqlmap.client.SqlMapClientBuilder.buildSqlMapClient(SqlMapClientBuilder.java:49)
Your help is greatly appreciated.
Thanks, KK
-- Brice Ruth Software Engineer, Madison WI