Re: [sqlite] Unicode Help
Da Martian [EMAIL PROTECTED] wrote:
> When using the non-16 version of prepare: if I add text which is in
> UTF-16, what happens?
> 16 version: if I add UTF-16 text, what happens? If I add UTF-8 text,
> what happens? If I add ASCII text, what happens?

You seem really confused about the whole encoding issue.

Unless you take specific actions to make it otherwise, SQLite stores all text internally as UTF-8. I assume that is what you are doing, but it really does not matter because the API works exactly the same regardless of how SQLite stores the text internally.

All text inputs to SQLite are expected to be UTF-8, or in the case of the sqlite3...16() routines, UTF-16. No exceptions. SQLite never accepts text encoded using a Microsoft codepage.

If you send UTF-16 or some goofy Microsoft codepage to an SQLite API that expects UTF-8, then you will end up with chaos. If you send UTF-8 or a codepage into one of the sqlite3...16() APIs, then you will end up with chaos. Don't do these things.

Hand most SQLite APIs UTF-8 text. Send the SQLite APIs that end in 16 UTF-16 text. Do any format conversions ahead of time. If you just follow those simple rules, everything will work.

--
D. Richard Hipp [EMAIL PROTECTED]

- To unsubscribe, send email to [EMAIL PROTECTED] -
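The "do any format conversions ahead of time" rule can be sketched roughly as follows. This is only an illustration in Python rather than the C API the thread discusses: it assumes the input arrives as Windows codepage 1252 bytes (a hypothetical case), and it relies on Python's standard `sqlite3` module handing `str` values to SQLite as UTF-8. The table and variable names are made up for the example.

```python
import sqlite3

# Hypothetical input: "Städt." as bytes in Windows codepage 1252.
legacy_bytes = b"St\xe4dt."

# Convert BEFORE the data reaches SQLite. The resulting str is passed
# to SQLite as UTF-8 by Python's sqlite3 module.
text = legacy_bytes.decode("cp1252")

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (name TEXT)")
conn.execute("INSERT INTO t (name) VALUES (?)", (text,))

# hex() shows the bytes SQLite holds; length() counts characters.
stored_hex, char_count = conn.execute(
    "SELECT hex(name), length(name) FROM t"
).fetchone()

print(stored_hex)   # the a-umlaut appears as the two UTF-8 bytes C3 A4
print(char_count)
```

In C the same idea would mean converting the buffer to UTF-8 (or UTF-16) first and only then calling the matching bind or prepare routine.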
Re: [sqlite] Unicode Help
Hi

> You seem really confused about the whole encoding issue.

Yes, definitely confused. I had always hoped Unicode would simplify the world, but my experience has shown no such luck :-) Codepages haunted my past and encodings haunt my future :-)

OK, that does answer one of my questions, I think. If I passed something not in UTF-8 to SQLite, would it return it exactly the same way I passed it in? From your statement about chaos I assume it won't if that data somehow violates UTF-8. So I need to get it to UTF-8 or UTF-16 before I insert.

Thanks for the information.
Re: [sqlite] Unicode Help
On 12/7/06, Da Martian [EMAIL PROTECTED] wrote:
> OK, that does answer one of my questions, I think. If I passed
> something not in UTF-8 to SQLite, would it return it exactly the same
> way I passed it in? From your statement about chaos I assume it won't
> if that data somehow violates UTF-8. So I need to get it to UTF-8 or
> UTF-16 before I insert.

SQLite doesn't care much about what you feed it (remember you can also have BLOBs in fields), so if you feed it invalid UTF-8, it's invalid UTF-8 you get on return. The problem is when you then do things like SELECT length(bad UTF-8 string), or many other text operations. Then you get wrong results.

The biggest problem is when the database generated by your program is then read by UTF-8 aware programs (which should be all of them, but unfortunately they are not). An example could be an SQLite importer/exporter program, or some SQLite replicator program you get on the net, which generates bad data because your data wasn't good in the first place. Also, if you want to hand edit your data with any of the many good SQLite GUIs, you may have problems.

If you want to go the simple way (and only do Windows), then use the UTF-16 functions and forget about all this. As an advantage, Windows NT internals use Unicode, so you may have some performance gains in some places (even if negligible most of the time).

Regards,
~Nuno Lucas
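A small experiment along these lines, again in Python and only as a sketch: it imitates handing codepage data to a UTF-8 API by casting a BLOB to TEXT, which labels the bytes as text without validating them. The sample string "Café®" in codepage 1252 is a made-up input chosen for illustration, and the exact count `length()` reports for invalid input is not something to rely on.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (name TEXT)")

# Hypothetical bad input: "Café®" encoded in codepage 1252, not UTF-8.
bad_bytes = "Caf\u00e9\u00ae".encode("cp1252")

# CAST(... AS TEXT) marks the bytes as text without checking them,
# which stands in for passing codepage data to a UTF-8 API.
conn.execute("INSERT INTO t (name) VALUES (CAST(? AS TEXT))", (bad_bytes,))

# Read the raw bytes back (as a BLOB) along with what length() reports.
returned_bytes, bad_length = conn.execute(
    "SELECT CAST(name AS BLOB), length(name) FROM t"
).fetchone()

# The same string, converted properly before it reaches SQLite.
good_length = conn.execute(
    "SELECT length(?)", (bad_bytes.decode("cp1252"),)
).fetchone()[0]

print(returned_bytes == bad_bytes)  # the bytes come back unchanged
print(bad_length, good_length)      # the two character counts disagree
```

The stored bytes survive the round trip, but text functions such as `length()` interpret them as UTF-8 and so count the wrong number of characters.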
Re: [sqlite] Unicode Help
On 12/7/06, Nuno Lucas [EMAIL PROTECTED] wrote:
> On 12/7/06, Da Martian [EMAIL PROTECTED] wrote:
> > OK, that does answer one of my questions, I think. If I passed
> > something not in UTF-8 to SQLite, would it return it exactly the
> > same way I passed it in? From your statement about chaos I assume
> > it won't if that data somehow violates UTF-8. So I need to get it
> > to UTF-8 or UTF-16 before I insert.
>
> SQLite doesn't care much about what you feed it (remember you can
> also have BLOBs in fields), so if you feed it invalid UTF-8, it's
> invalid UTF-8 you get on return.

This can also be broken if you do things like pass nul-terminated text with a length of -1. Anything which is not valid UTF-8 text should always be stored as a BLOB and accessed in blob fashion. [Note that ASCII is valid UTF-8.]

-scott
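One way to apply the "not valid UTF-8 means BLOB" advice is to test the bytes before binding them. The sketch below is in Python and is only an illustration: the helper `to_sqlite_value` and the table `items` are hypothetical names, and it relies on Python's `sqlite3` module binding `str` as TEXT and `bytes` as BLOB.

```python
import sqlite3

def to_sqlite_value(raw: bytes):
    """Return str for valid UTF-8 (bound as TEXT), else the raw bytes (bound as BLOB)."""
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        return raw

conn = sqlite3.connect(":memory:")
# No declared column type, so each value keeps the type it was bound with.
conn.execute("CREATE TABLE items (payload)")

samples = [
    b"plain ASCII",                 # ASCII is valid UTF-8
    "St\u00e4dt.".encode("utf-8"),  # proper UTF-8
    b"St\xe4dt.",                   # codepage 1252 bytes, invalid as UTF-8
]
for raw in samples:
    conn.execute("INSERT INTO items (payload) VALUES (?)", (to_sqlite_value(raw),))

stored_types = [
    row[0] for row in conn.execute("SELECT typeof(payload) FROM items ORDER BY rowid")
]
print(stored_types)
```

Whether to reject, convert, or store such input as a BLOB is an application decision; the sketch only shows the storage-type split.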
[sqlite] Unicode Help
Hi

I have a system up and working using sqlite3, but I think I am having Unicode issues and I am not sure how I should go about coding the solution. I was hoping someone could share the approach needed. Here is my situation:

I have German characters with umlauts which I would like to get back out of SQLite. An example is an a with two little dots on top. I have been using the non-16 versions. But in my mind that's OK; I just want whatever I put in back out again. The fact that it's Unicode shouldn't make a difference to SQLite. A Unicode character of, say, 2 bytes will just be 2 normal chars to SQLite. At least this was my assumption.

So if I look at a name with umlauts in the database via sqlite3.exe I get:

Städt. Klinikum Neunkirchen gGmbH
--|
  an a with two dots on top

Now I expected that when this was put back into a Unicode field it would be OK, but it doesn't seem to work. So I tried the *16 versions, but now the field size returned by sqlite3_column_bytes16 always seems to be larger than the string I get back, resulting in junk characters on the end. So I get the umlauts in my application but all this other junk as well.

Any ideas?