On 12/7/06, Da Martian <[EMAIL PROTECTED]> wrote:
I am still having issues trying to get my characters standardized. I spent all of yesterday playing with ideas, but I am still in the dark.
Whatever you were doing the first time was fine:
So if I look at a name with umlauts in the database via sqlite3.exe I get: Städt. Klinikum Neunkirchen gGmbH -- | an "a" with two dots on top
That text was properly encoded as UTF-8. The ONLY issue with that line is that the sqlite shell under Windows is incapable of displaying Unicode, so you need to retrieve the data from sqlite using a tool that is. The actual storage of it is perfect.
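To check the storage without trusting any console, you can look at the raw bytes instead of the rendered characters. The following is only a sketch in Python (not the Delphi setup from this thread); the table name `clinics` and column `name` are made up for illustration. It uses SQLite's `hex()` function to dump the stored bytes, then shows how those same UTF-8 bytes turn into two characters when a legacy code page is used to display them.

```python
import sqlite3

# Hypothetical table and column names, used only for this illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE clinics (name TEXT)")
conn.execute(
    "INSERT INTO clinics VALUES (?)",
    ("Städt. Klinikum Neunkirchen gGmbH",),
)

# hex() returns the stored bytes as hex digits, so no code page is involved.
stored_hex = conn.execute("SELECT hex(name) FROM clinics").fetchone()[0]
raw = bytes.fromhex(stored_hex)

# In UTF-8 the "ä" is the two bytes C3 A4, visible right after "St" (53 74).
print(stored_hex[:12])  # 5374C3A46474

# The same bytes interpreted through legacy code pages become two characters.
as_cp437 = raw.decode("cp437")[:6]    # "St├ñdt" (typical OEM console page)
as_cp1252 = raw.decode("cp1252")[:6]  # "StÃ¤dt" (typical Western ANSI page)

# Decoded as what they really are, the bytes give back the original text.
as_utf8 = raw.decode("utf-8")
```

If the hex dump shows `C3A4` where the umlaut belongs, the data in the database is fine and only the display is wrong.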
Part of my problem is I don't have a clue what my source data is encoded as. Does anyone know of a tool which can try to guess the encoding? Basically it's a custom Java bean written by someone else. It takes reports from a third-party system and turns them into XML using a string buffer. They just append everything to a string buffer. The code which actually adds this to the output (the key piece) I can't actually see at this point. So my best guess, based on research, is that Java usually uses UTF-16. But if this is so, it should work.
It sounds as though it is UTF-16 and working fine.
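If you want something more than a guess, a rough heuristic over the raw bytes goes a long way. The sketch below is a hypothetical helper (`guess_encoding` is not from any library) and it is deliberately simple: it checks for a byte order mark, then for the NUL-byte pattern that mostly-Latin UTF-16 text produces, then tries a strict UTF-8 decode. It cannot tell one single-byte code page from another, and it ignores UTF-32.

```python
import codecs
from typing import Optional


def guess_encoding(data: bytes) -> Optional[str]:
    """Heuristic guess at the encoding of `data`; None means 'not sure'."""
    # 1. A byte order mark, if present, is the strongest hint.
    boms = [
        (codecs.BOM_UTF8, "utf-8-sig"),
        (codecs.BOM_UTF16_LE, "utf-16-le"),
        (codecs.BOM_UTF16_BE, "utf-16-be"),
    ]
    for bom, name in boms:
        if data.startswith(bom):
            return name

    # 2. Mostly-Latin text in UTF-16 has a NUL in every other byte.
    if data and data.count(b"\x00") >= len(data) // 4:
        nuls_even = data[0::2].count(0)  # NULs at offsets 0, 2, 4, ...
        nuls_odd = data[1::2].count(0)   # NULs at offsets 1, 3, 5, ...
        return "utf-16-le" if nuls_odd >= nuls_even else "utf-16-be"

    # 3. Non-ASCII text in a legacy code page almost never decodes
    #    cleanly as UTF-8, so a strict decode is a good test.
    try:
        data.decode("utf-8")
        return "utf-8"  # plain ASCII also lands here
    except UnicodeDecodeError:
        return None  # probably some single-byte code page


print(guess_encoding("Städt".encode("utf-16-le")))  # utf-16-le
print(guess_encoding("Städt".encode("utf-8")))      # utf-8
```

Run it on the bytes exactly as they leave the Java side, before any Delphi string conversion has had a chance to touch them.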
If I add the text using the *16 prepare and then retrieve it using the *16 column_text, I still get the two separate characters instead of the umlaut. So I can only assume that somehow my source isn't UTF-16, or I am converting it somewhere in the middle. This is possible since I am using Delphi and it has some implicit conversions, but I think I have got that under control.
AFAIK Delphi has no built-in Unicode support at all; you will need to find third-party support for everything, from processing to display controls. It is likely you are ending up with UTF-8 data at some point in the pipeline, and whatever you're doing to process it does not understand UTF-8.
The problem is that if I copy my source and paste it into Notepad, say, it shows correctly because Notepad then does its own stuff, and if I save from Notepad and read that, it works fine. *sigh*.
Notepad does support Unicode in various encodings, but that doesn't mean anything in this test, since your system codepage may well support the characters you're testing with anyway.
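To see why, here is a small Python sketch. It assumes the system ANSI code page is cp1252 (the usual one on Western European Windows; yours may differ), and `survives_code_page` is a hypothetical helper name. Text that fits in the code page round-trips through it unchanged, so a test that only uses such text cannot tell a Unicode-aware path from a code-page path.

```python
def survives_code_page(text: str, code_page: str = "cp1252") -> bool:
    """Return True if `text` round-trips through a legacy code page."""
    try:
        return text.encode(code_page).decode(code_page) == text
    except UnicodeEncodeError:
        # At least one character has no slot in this code page.
        return False


# German umlauts exist in cp1252, so this text proves nothing about Unicode.
print(survives_code_page("Städt. Klinikum"))  # True

# Characters outside the code page are what actually exercise Unicode support.
print(survives_code_page("日本語"))  # False
```

A more telling test would use characters that your code page cannot represent.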
2) When using the non-16 version of prepare: if I add text which is in UTF-16, what happens?
16 version: if I add UTF-16 text, what happens? If I add UTF-8 text, what happens? If I add ASCII text, what happens?
The answers to these depend on exactly how you're interfacing with it (what programming language, how the sqlite library functions are defined/declared, any use of library tools or auto-conversion semantics in the language, etc). Show code :)
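As a general reference point, the SQLite C API documents sqlite3_prepare and sqlite3_column_text as working with UTF-8, and sqlite3_prepare16 and sqlite3_column_text16 as working with UTF-16 in native byte order. The Python sketch below does not call SQLite at all; it only simulates, under that assumption, what a consumer sees when it is handed bytes in the wrong encoding. The NUL-termination behaviour shown for the UTF-8 side is an assumption about a typical C-string interface, not a statement about your Delphi bindings.

```python
text = "Städt"
utf8 = text.encode("utf-8")       # 6 bytes: the "ä" takes two
utf16 = text.encode("utf-16-le")  # 10 bytes: two per character

# Pure ASCII text is byte-for-byte identical in ASCII and UTF-8,
# so it is always safe to hand to a UTF-8 interface.
ascii_same = "Klinikum".encode("ascii") == "Klinikum".encode("utf-8")

# UTF-16 bytes given to something expecting a NUL-terminated UTF-8 string:
# the high byte of "S" is 0x00, so reading stops after one character.
as_utf8 = utf16.split(b"\x00", 1)[0].decode("utf-8")

# UTF-8 bytes given to something expecting UTF-16: each pair of bytes
# is fused into one unrelated code unit, so 6 bytes become 3 characters.
as_utf16 = utf8.decode("utf-16-le", errors="replace")

print(ascii_same)     # True
print(len(as_utf8))   # 1
print(len(as_utf16))  # 3
```

Neither mismatch produces the "two characters instead of an umlaut" symptom, which is one more reason to suspect UTF-8 bytes being shown through a single-byte code page somewhere.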