Hi,
´¯¯¯
>Despite that, I'm aware that I have more than pure US-ASCII in the
>blob objects. In fact, I'm close to your situation because I use the
>Spanish language and have 8-bit extended ASCII with some special
>characters (accented characters and so on).
>
>So the answer to the question is yes: I have upper-ANSI data stored, and
>need to convert it to MS VC++ wchar_t strings to rebuild the database.
>At this point I simply use mbstowcs() to do that.
>
>Because there are some users in the field, the new app can detect whether
>the database corresponds to a previous version and rebuild it to the new
>version if needed. So I can be almost sure that the user uses the same
>system code page as the one used when the data was originally inserted.
>
>Is there something weird in my thinking?
`---
Nothing weird, you're on the right track. In fact I'm writing an
SQLite extension to help people in this situation. I intend to make
the following scalar functions available, whose names reflect their usage:
{ "IsColumnPureASCII", 1, SQLITE_ANY, 0, PureASCII },
{ "IsColumnValidUTF8", 1, SQLITE_UTF8, 0, ValidUTF8 },
{ "IsColumnValidUTF16", 1, SQLITE_UTF16, 0, ValidUTF16 },
{ "ANSItoUTF8", 1, SQLITE_ANY, 0, AnsiToUTF8 },
{ "ANSItoUTF16", 1, SQLITE_ANY, 0, AnsiToUTF16 },
{ "OEMtoUTF8", 1, SQLITE_ANY, 0, OemToUTF8 },
{ "OEMtoUTF16", 1, SQLITE_ANY, 0, OemToUTF16 },
The "question" functions IsColumn... return a boolean. You can first
store it in an extra flag column to examine things more closely, or use
it to drive an update (with one of the provided conversions) that
converts the input ANSI blob column into a new UTF text column. You can
then dispose of the old blob column and the flag column altogether.
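To give an idea of what the "question" functions might check, here is a rough sketch in plain C. It is only an illustration, not the extension's actual code: the helper names is_pure_ascii and is_valid_utf8 are hypothetical, and they work on a raw byte buffer rather than on an sqlite3_value.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: returns 1 if every byte is 7-bit US-ASCII, else 0. */
int is_pure_ascii(const unsigned char *buf, size_t len)
{
    assert(buf != NULL || len == 0);
    for (size_t i = 0; i < len; i++) {
        if (buf[i] & 0x80)
            return 0;               /* high bit set: not US-ASCII */
    }
    return 1;
}

/* Hypothetical helper: returns 1 if buf is well-formed UTF-8, else 0.
 * Rejects truncated sequences, overlong forms, surrogates and
 * code points above U+10FFFF. */
int is_valid_utf8(const unsigned char *buf, size_t len)
{
    assert(buf != NULL || len == 0);
    size_t i = 0;
    while (i < len) {
        unsigned char c = buf[i];
        size_t n;                   /* number of continuation bytes */
        unsigned long cp;           /* decoded code point */

        if (c < 0x80) { i++; continue; }
        else if ((c & 0xE0) == 0xC0) { n = 1; cp = c & 0x1F; }
        else if ((c & 0xF0) == 0xE0) { n = 2; cp = c & 0x0F; }
        else if ((c & 0xF8) == 0xF0) { n = 3; cp = c & 0x07; }
        else return 0;              /* stray continuation or invalid lead byte */

        if (len - i <= n)
            return 0;               /* sequence runs past the end of the buffer */

        for (size_t k = 1; k <= n; k++) {
            unsigned char cc = buf[i + k];
            if ((cc & 0xC0) != 0x80)
                return 0;           /* not a continuation byte */
            cp = (cp << 6) | (cc & 0x3F);
        }

        if ((n == 1 && cp < 0x80) ||
            (n == 2 && cp < 0x800) ||
            (n == 3 && cp < 0x10000))
            return 0;               /* overlong encoding */
        if (cp >= 0xD800 && cp <= 0xDFFF)
            return 0;               /* UTF-16 surrogate range */
        if (cp > 0x10FFFF)
            return 0;               /* beyond the Unicode range */

        i += n + 1;
    }
    return 1;
}
```

Note that upper-ANSI text such as Spanish accented characters will usually fail the UTF-8 test, because a lone byte like 0xF1 is not a complete UTF-8 sequence. That is what makes such a flag useful for spotting rows that still need conversion.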
This will use calls to the Windows conversion functions. As you
said, the one important "detail" is to make sure that the machine used
for the conversion uses the same system codepage that was in force when
the upper-ANSI data was created. Only human examination can settle
this, short of having strict rules about the expected contents of
the ANSI text or, as in your case, using the same systems that were
used to insert the data (with fairly good confidence that users don't
switch codepages every week).
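To illustrate the kind of transformation involved, here is a portable sketch in plain C. It is not what the extension will do (that will rely on the Windows functions and the system codepage). It assumes the input bytes are ISO-8859-1 (Latin-1), which agrees with Windows-1252 everywhere except the 0x80-0x9F range, and the function name latin1_to_utf8 is made up for the example.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: converts ISO-8859-1 bytes to UTF-8.
 * Assumption: the input really is Latin-1; bytes 0x80-0x9F would map to
 * C1 control characters, which is NOT what Windows-1252 means by them.
 * Returns the number of bytes written to out, or (size_t)-1 if the
 * output buffer is too small. No terminating NUL is written. */
size_t latin1_to_utf8(const unsigned char *in, size_t inlen,
                      unsigned char *out, size_t outcap)
{
    assert(in != NULL || inlen == 0);
    assert(out != NULL || outcap == 0);

    size_t o = 0;
    for (size_t i = 0; i < inlen; i++) {
        unsigned char c = in[i];
        if (c < 0x80) {
            /* US-ASCII is copied unchanged. */
            if (o + 1 > outcap)
                return (size_t)-1;
            out[o++] = c;
        } else {
            /* U+0080..U+00FF always needs exactly two UTF-8 bytes. */
            if (o + 2 > outcap)
                return (size_t)-1;
            out[o++] = (unsigned char)(0xC0 | (c >> 6));
            out[o++] = (unsigned char)(0x80 | (c & 0x3F));
        }
    }
    return o;
}
```

The codepage question shows up right here: byte 0xF1 becomes "ñ" under this Latin-1 assumption, but the same byte means "±" in OEM codepage 437, so converting with the wrong codepage silently produces wrong text rather than an error.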
Do you think it could help you, or other users?
Is there anything else useful to include in such an extension?