Daniel,

To represent every character from every alphabet, Unicode needs more than one byte per character. (Code points run up to U+10FFFF, so 21 bits are enough, but most implementations that store one fixed-width value per character simply use 4 bytes, 32 bits.)

Since Unicode is cool you're probably interested in using it... On the other hand, you're probably not interested in encoding each of your characters using 4 bytes (quite an overkill for English for example). That's where UTF-8 (or UTF-16) comes in.

UTF-8 is a way of encoding the whole Unicode character set using sequences of 8-bit units (1 byte each). In other words, it uses between 1 and 4 single-byte units per character instead of one fixed 32-bit value. UTF-16 is similar, but uses sequences of one or two 16-bit units.
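
To make the "1 to 4 bytes" point concrete, here is a small Java sketch that counts how many bytes a few characters take in UTF-8 and in UTF-16. The class and method names (Utf8Lengths, utf8Length, utf16Length) are made up for illustration; only the standard java.nio.charset API is used.

```java
import java.nio.charset.StandardCharsets;

public class Utf8Lengths {
    // Number of bytes the text occupies when encoded as UTF-8.
    static int utf8Length(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    // Number of bytes the text occupies as UTF-16 (big-endian, no byte order mark).
    static int utf16Length(String s) {
        return s.getBytes(StandardCharsets.UTF_16BE).length;
    }

    public static void main(String[] args) {
        // Escapes are used so the encoding of this source file does not matter.
        String[] samples = {
            "A",            // U+0041, plain ASCII letter
            "\u00E9",       // U+00E9, e with acute accent
            "\u20AC",       // U+20AC, euro sign
            "\uD83D\uDE00"  // U+1F600, stored as a surrogate pair in a Java String
        };
        for (String s : samples) {
            System.out.printf("U+%04X: %d byte(s) in UTF-8, %d byte(s) in UTF-16%n",
                    s.codePointAt(0), utf8Length(s), utf16Length(s));
        }
    }
}
```

The four samples take 1, 2, 3 and 4 bytes in UTF-8, and 2, 2, 2 and 4 bytes in UTF-16.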

The advantage is that plain ASCII characters (unaccented Latin letters, digits, punctuation) are still represented using one byte, just like the old days... Of course, because some bit patterns are reserved to mark multi-byte sequences, the single-byte range (0x00 to 0x7F) is smaller than the ISO-8859-1 character set (for example), so certain Latin characters we are used to require a specially crafted sequence of 2 (or more) bytes.

For example, the letter é (0xE9 in ISO-8859-1) is represented as the two bytes 0xC3 0xA9 in UTF-8, which show up as Ã© when they are read back as ISO-8859-1.
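
The byte values for é can be checked with a few lines of Java. This is only a sketch; EAcuteBytes, hex and misreadAsLatin1 are illustrative names, not part of any library.

```java
import java.nio.charset.StandardCharsets;

public class EAcuteBytes {
    // Formats bytes as space-separated hex, e.g. "0xC3 0xA9".
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("0x%02X", b & 0xFF));
        }
        return sb.toString();
    }

    // Encodes text as UTF-8, then decodes those bytes as if they were ISO-8859-1.
    static String misreadAsLatin1(String s) {
        return new String(s.getBytes(StandardCharsets.UTF_8), StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        String eAcute = "\u00E9"; // e with acute accent, written as an escape
        System.out.println("ISO-8859-1: " + hex(eAcute.getBytes(StandardCharsets.ISO_8859_1)));
        System.out.println("UTF-8:      " + hex(eAcute.getBytes(StandardCharsets.UTF_8)));
        // Two characters come back instead of one: U+00C3 followed by U+00A9.
        System.out.println("Misread length: " + misreadAsLatin1(eAcute).length());
    }
}
```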

The XML standard specifies that if the encoding of a file is not declared (and there is no byte order mark), it is read as UTF-8 by default. So, if I were to save é as the single ISO-8859-1 byte 0xE9 in a file that is parsed as UTF-8, it wouldn't be read as é, but as an invalid sequence...
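
Here is a sketch of why a lone 0xE9 is rejected, using a strict UTF-8 decoder from the standard library (StrictUtf8 and isValidUtf8 are illustrative names). In UTF-8, 0xE9 is binary 1110 1001, which is the lead-byte pattern for a 3-byte sequence, so a decoder expects two continuation bytes after it. That is probably why the error in the original report talks about a "3-byte UTF-8 sequence".

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

public class StrictUtf8 {
    // Returns true only if the bytes form well-formed UTF-8.
    static boolean isValidUtf8(byte[] bytes) {
        CharsetDecoder decoder = StandardCharsets.UTF_8.newDecoder()
                .onMalformedInput(CodingErrorAction.REPORT)       // fail instead of substituting U+FFFD
                .onUnmappableCharacter(CodingErrorAction.REPORT);
        try {
            decoder.decode(ByteBuffer.wrap(bytes));
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // "caf" followed by e-acute, saved as ISO-8859-1 (one byte, 0xE9).
        byte[] latin1 = {0x63, 0x61, 0x66, (byte) 0xE9};
        // The same text saved as UTF-8 (two bytes, 0xC3 0xA9).
        byte[] utf8 = {0x63, 0x61, 0x66, (byte) 0xC3, (byte) 0xA9};

        System.out.println("ISO-8859-1 bytes valid as UTF-8? " + isValidUtf8(latin1));
        System.out.println("UTF-8 bytes valid as UTF-8?      " + isValidUtf8(utf8));
    }
}
```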

It's generally good practice to declare the encoding of an XML file, and to make sure the declaration matches the encoding the file is actually saved in.
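
As a rough illustration with the JDK's built-in XML parser (not iBATIS itself), the sketch below parses the same ISO-8859-1 bytes twice, once with an encoding declaration and once without. XmlEncodingDemo, rootText, latin1 and the <name> element are made up for the example.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

public class XmlEncodingDemo {
    // Parses raw bytes as XML and returns the root element's text,
    // or null if the parser rejects the input.
    static String rootText(byte[] xmlBytes) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = builder.parse(new ByteArrayInputStream(xmlBytes));
            return doc.getDocumentElement().getTextContent();
        } catch (Exception e) {
            // Malformed byte sequences surface here as a parser or I/O exception.
            return null;
        }
    }

    // Simulates a file saved by an editor that writes ISO-8859-1.
    static byte[] latin1(String s) {
        return s.getBytes(StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        String body = "<name>Jos\u00E9</name>";
        byte[] declared = latin1("<?xml version=\"1.0\" encoding=\"ISO-8859-1\"?>" + body);
        byte[] undeclared = latin1("<?xml version=\"1.0\"?>" + body);

        System.out.println("declared:   " + rootText(declared));
        // Without a declaration the parser assumes UTF-8 and stops at the 0xE9 byte.
        System.out.println("undeclared: " + rootText(undeclared));
    }
}
```

The parser may also print its own fatal-error line to standard error for the undeclared case. The other way to fix such a file is to keep encoding="UTF-8" and re-save the file as real UTF-8.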

Hope this helps,
Philippe

Daniel H. F. e Silva wrote:
Hi Brice,

 I know that UTF-8 is supposed to cover every kind of alphabet character available on Earth. But I faced this issue some time ago, when a parser tried to parse a file with encoding=utf-8 set and simply gave me an error message. After thinking about that issue, I simply changed encoding=utf-8 to encoding=iso-8859-1 and that fixed my problem.
 So, if UTF-8 is really supposed to cover all kinds of stuff, why did I get that error? I'd love to hear your explanation.

Cheers,
Daniel Silva.


--- Brice Ruth <[EMAIL PROTECTED]> wrote:

What special characters aren't supported by UTF-8?! I have never heard
of such a thing. My understanding is that UTF-8 represents the full
Unicode character set as a multi-byte value. And since Unicode is
supposed to encompass all known characters for all known languages
(with space for new Chinese characters created daily) - what's not
covered?!

There most certainly shouldn't be anything that iso-8859-1 or latin1
(Windows-1252) covers that is not in Unicode.

Brice

On 4/20/05, Daniel H. F. e Silva <[EMAIL PROTECTED]> wrote:

You could also check your XML encoding. If you work with special characters not in utf-8, you will get in trouble.
I had this as my native language is Portuguese and we have some special characters not supported by utf-8.
So, if this is your case, try iso-8859-1 or one that fits better to your needs.

Cheers,
Daniel Silva.


--- Larry Meadors <[EMAIL PROTECTED]> wrote:

Make sure that there is no white space and no odd chars at the top of your
config file.

Larry


On 4/18/05, KK <[EMAIL PROTECTED]> wrote:

I get the following error when I try to build sqlCOnfigmap..does it
look familiar to someone?

com.ibatis.sqlmap.client.SqlMapException: There was an error while building the SqlMap instance.
--- The error occurred in the SQL Map Configuration file.
--- Cause: com.ibatis.sqlmap.client.SqlMapException: XML Parser Error. Cause: java.io.UTFDataFormatException: Invalid byte 3 of 3-byte UTF-8 sequence.
Caused by: java.io.UTFDataFormatException: Invalid byte 3 of 3-byte UTF-8 sequence.
Caused by: com.ibatis.sqlmap.client.SqlMapException: XML Parser Error. Cause: java.io.UTFDataFormatException: Invalid byte 3 of 3-byte UTF-8 sequence.
Caused by: java.io.UTFDataFormatException: Invalid byte 3 of 3-byte UTF-8 sequence.
at com.ibatis.sqlmap.engine.builder.xml.XmlSqlMapClientBuilder.buildSqlMap(XmlSqlMapClientBuilder.java:203)
at com.ibatis.sqlmap.client.SqlMapClientBuilder.buildSqlMapClient(SqlMapClientBuilder.java:49)

Your help is greatly appreciated.

Thanks,
KK





--
Brice Ruth
Software Engineer, Madison WI


