On Dec 5, 2006, at 8:42 AM, Igor Tandetnik wrote:
Da Martian <[EMAIL PROTECTED]> wrote:
So if I look at a name with umlauts in the database via sqlite3.exe
I get:
Städt. Klinikum Neunkirchen gGmbH
  ^ an "a" with two dots on top
"A with umlaut" is represented as two bytes in UTF-8.
This is a huge simplification. At a bare minimum, 'ä' can be
represented as either one or two Unicode code points -- a single code
point representing 'ä' (U+00E4), or one representing 'a' followed by
one representing the '¨' combining mark (U+0308). How *that* is
represented in the UTF-8 encoding of Unicode is another issue, which
depends on the exact values of the code points involved.
The precomposed 'ä' (U+00E4) does take two bytes in UTF-8, but the
decomposed form ('a' followed by U+0308) takes three, so the two-byte
figure is not something that can be generalized.
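
For anyone who wants to check the byte counts rather than take them on
faith, here is a minimal sketch in Python. The helper name utf8_info is
made up for illustration, and the snippet assumes nothing about how
SQLite or sqlite3.exe stores or displays the string; it only looks at
the two Unicode spellings of 'ä' and their UTF-8 encodings.

```python
import unicodedata

# Two spellings of the same visible character.
precomposed = "\u00e4"   # LATIN SMALL LETTER A WITH DIAERESIS
decomposed = "a\u0308"   # 'a' followed by COMBINING DIAERESIS


def utf8_info(text):
    """Return (number of code points, UTF-8 bytes as a hex string).

    Hypothetical helper, for illustration only.
    """
    encoded = text.encode("utf-8")
    return len(text), encoded.hex()


print(utf8_info(precomposed))  # (1, 'c3a4')   -> two bytes
print(utf8_info(decomposed))   # (2, '61cc88') -> three bytes

# Normalization converts between the two spellings.
print(unicodedata.normalize("NFC", decomposed) == precomposed)  # True
print(unicodedata.normalize("NFD", precomposed) == decomposed)  # True
```

The two strings compare unequal unless one of them is normalized
first, which is worth keeping in mind when comparing or searching such
names.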
-- Chris