Hi,

Please follow Igor's advice; he is right.


>[1] Read the actual textual data with sqlite3_column_blob()

Which you can directly convert to TEXT if, as you say, you entered only 
7-bit ASCII or UTF-8 compliant data.

>[2] Assuming the system code page matches the one used when the data was
>originally inserted, convert with mbstowcs()

Forget that.

>[3] (Doubt) The result can be directly written with 
>sqlite3_bind_text() -I
>want store in UTF-8-

Why not convert the column(s) directly from blob to TEXT?

>OR must I write the result with sqlite3_bind_text16()? Them, the data is
>stored as UTF-16? or as UTF-8?

Forget this too.

>[4] Afterward once converted the dBase

I take this to mean "convert your blobs to TEXT": yes, this is the 
_only_ step you need for the whole thing.

>  and in regular use:
>
>[4-1a] Read with sqlite3_column_text()
>
>[4-1b] convert with WideCharToMultiByte(CP_UTF8)

Why?  Read it as text16 directly into your C++ string; there must be 
zillions of [working] wrappers for VC++.

>[4-1c] Use the result with Win32 api -SetTex()-

No particular API.  Try a simple MsgBox to see for yourself, or even a 
basic "Héllô wörld!" in a Windows _console_ (printf).

>OR?
>
>[4-2a] Read with sqlite3_column_text16()
>[4-2b] No convertion needed.
>[4-2c] Use the result ...

YES !!!

The only catch would be if you have (knowingly or not) entered 8-bit 
data (the upper 128 characters of the Windows codepage) encoded in 
__ANSI__ (1 char = 1 byte) and stored it in your blobs.  In that case, 
the bytes representing those characters are not the expected UTF-8.
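
To tell which case you are in, here is a minimal sketch of a check you 
could run over each blob, assuming you already have the bytes and their 
length (for instance from sqlite3_column_blob() and 
sqlite3_column_bytes()).  The names is_ascii() and is_valid_utf8() are 
hypothetical helpers of mine, not part of SQLite.  A blob that is not 
pure ASCII and also fails the UTF-8 check most likely holds upper-ANSI 
bytes.

```c
#include <stddef.h>

/* Returns 1 if every byte is 7-bit ASCII, 0 otherwise. */
int is_ascii(const unsigned char *buf, size_t len)
{
    for (size_t i = 0; i < len; i++)
        if (buf[i] > 0x7F)
            return 0;
    return 1;
}

/* Returns 1 if buf[0..len) is well-formed UTF-8, 0 otherwise.
   Rejects overlong forms, surrogates and values above U+10FFFF. */
int is_valid_utf8(const unsigned char *buf, size_t len)
{
    size_t i = 0;
    while (i < len) {
        unsigned char c = buf[i];
        size_t n;           /* number of continuation bytes expected */
        unsigned long cp;   /* code point being decoded */

        if (c <= 0x7F) { i++; continue; }
        else if (c >= 0xC2 && c <= 0xDF) { n = 1; cp = c & 0x1F; }
        else if (c >= 0xE0 && c <= 0xEF) { n = 2; cp = c & 0x0F; }
        else if (c >= 0xF0 && c <= 0xF4) { n = 3; cp = c & 0x07; }
        else return 0;      /* stray continuation or invalid lead byte */

        if (len - i <= n)
            return 0;       /* sequence truncated at end of buffer */

        for (size_t k = 1; k <= n; k++) {
            if ((buf[i + k] & 0xC0) != 0x80)
                return 0;   /* not a continuation byte */
            cp = (cp << 6) | (buf[i + k] & 0x3F);
        }

        if (n == 2 && cp < 0x800) return 0;                 /* overlong */
        if (n == 3 && (cp < 0x10000 || cp > 0x10FFFF)) return 0;
        if (cp >= 0xD800 && cp <= 0xDFFF) return 0;         /* surrogate */

        i += n + 1;
    }
    return 1;
}
```

For example, "Héllô" typed in a Western European ANSI codepage is the 
bytes 48 E9 6C 6C F4, which fails is_valid_utf8(), while the UTF-8 form 
48 C3 A9 6C 6C C3 B4 passes.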

As far as I understand it, current SQLite _should_ read/write 
non-compliant UTF-8 without problem, but I haven't checked that the 
latest versions are still byte-neutral.  Either way, that is not the 
right road to take.  If you have upper-ANSI strings, convert them to 
UTF-*, where * is your encoding choice for the new database.
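
As an illustration of such a conversion, here is a small sketch that 
turns single-byte text into UTF-8.  It assumes the data is ISO-8859-1 
(Latin-1), where each byte value equals its Unicode code point.  That 
is only an approximation of the Western Windows codepage (1252), which 
differs in the range 0x80-0x9F (the euro sign and curly quotes, for 
instance), and it does not apply to other codepages.  On Windows the 
usual route is MultiByteToWideChar() with the original codepage 
followed by WideCharToMultiByte() with CP_UTF8.  The name 
latin1_to_utf8() is a hypothetical helper, not an SQLite function.

```c
#include <stddef.h>

/* Converts len bytes of ISO-8859-1 text to UTF-8.
   ASSUMPTION: the input really is Latin-1, not CP1252 or another page.
   out must have room for 2 * len + 1 bytes.
   Returns the number of bytes written, not counting the final NUL. */
size_t latin1_to_utf8(const unsigned char *in, size_t len,
                      unsigned char *out)
{
    size_t o = 0;
    for (size_t i = 0; i < len; i++) {
        unsigned char c = in[i];
        if (c < 0x80) {
            out[o++] = c;                                   /* ASCII */
        } else {
            /* U+0080..U+00FF encode as two bytes: 110xxxxx 10xxxxxx */
            out[o++] = (unsigned char)(0xC0 | (c >> 6));
            out[o++] = (unsigned char)(0x80 | (c & 0x3F));
        }
    }
    out[o] = '\0';
    return o;
}
```

The converted bytes can then be written back as TEXT, so that the 
database ends up holding the encoding it expects.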

<horror story>
I've spent an indecent amount of time sorting out all these questions, 
just because someone decided to silently convert native Windows UTF-16 
strings to ANSI for every SQLite UTF-8 interface in the SQLite wrapper 
built into the development tool I use.

It was very hard for me to figure this out, because a call similar to 
printf _also_ converted to ANSI (silently) and a call used to display a 
2D table _also_ converted to ANSI (silently). So sometimes I had French 
and other European diacritics converted (i.e. completely destroyed) 
like this: UTF-16 --> ANSI --> UTF-16 --> ANSI.

It was as if I were trying to read the original plain text by looking 
only at its MD5.  It's been a _real_ nightmare, and I had data from 15 
countries, some using "weird" (to me) scripts.
</horror story>


But you're not even close to this terrible situation.

Just determine if you have upper-ANSI data stored and convert it if needed.

Well, stay tuned, I'll do something for you, just allow me some time...




_______________________________________________
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users