Hi,

´¯¯¯
>Despite that, I'm aware that I have more than pure US-ASCII in the
>blob objects. In fact I'm close to your situation, because I use the
>Spanish language and have 8-bit extended ASCII with some special
>characters (accented characters and so on).
>
>So the answer to the question is yes: I have upper-ANSI data stored, and
>need to convert it to MS VC++ wchar_t strings to rebuild the database.
>At this point I simply use mbstowcs() to do the job.
>
>Because there are some users in the field, the new app can detect whether
>the database corresponds to a previous version and rebuild it to the new
>version if needed. So I can be almost sure that the user uses the same
>system code page as the one used when the data was originally inserted.
>
>Is there anything weird in my thinking?
`---

Nothing weird, you're on the right track.  In fact I'm writing an 
SQLite extension to help people in this situation.  I intend to make 
the following scalar functions available, whose names reflect their usage:

         { "IsColumnPureASCII",  1, SQLITE_ANY,   0, PureASCII   },
         { "IsColumnValidUTF8",  1, SQLITE_UTF8,  0, ValidUTF8   },
         { "IsColumnValidUTF16", 1, SQLITE_UTF16, 0, ValidUTF16  },
         { "ANSItoUTF8",         1, SQLITE_ANY,   0, AnsiToUTF8  },
         { "ANSItoUTF16",        1, SQLITE_ANY,   0, AnsiToUTF16 },
         { "OEMtoUTF8",          1, SQLITE_ANY,   0, OemToUTF8   },
         { "OEMtoUTF16",         1, SQLITE_ANY,   0, OemToUTF16  },

The "question" functions IsColumn... return a boolean.  You can store 
it in an extra flag column first to examine things more closely, or use 
it to decide on an update (using one of the provided conversions) that 
converts the input ANSI blob column into a new UTF text column.  You 
can then dispose of the old blob column and the flag column altogether.
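To give an idea of what the "question" functions have to check, here is a minimal sketch in portable C.  It is not the extension's code: the names is_pure_ascii and is_valid_utf8 are made up for illustration, and the functions work on a plain byte buffer rather than on an SQLite value.  The UTF-8 check rejects overlong forms, surrogates and anything above U+10FFFF.

```c
#include <stddef.h>

/* Hypothetical helper: 1 if every byte is 7-bit US-ASCII, else 0. */
int is_pure_ascii(const unsigned char *s, size_t n)
{
    for (size_t i = 0; i < n; i++)
        if (s[i] & 0x80)
            return 0;
    return 1;
}

/* Hypothetical helper: 1 if s[0..n) is well-formed UTF-8, else 0. */
int is_valid_utf8(const unsigned char *s, size_t n)
{
    size_t i = 0;
    while (i < n) {
        unsigned char c = s[i];
        size_t len;            /* length of the sequence starting at i */
        unsigned long cp, min; /* decoded code point, smallest legal value */

        if (c < 0x80) { i++; continue; }
        else if ((c & 0xE0) == 0xC0) { len = 2; cp = c & 0x1F; min = 0x80; }
        else if ((c & 0xF0) == 0xE0) { len = 3; cp = c & 0x0F; min = 0x800; }
        else if ((c & 0xF8) == 0xF0) { len = 4; cp = c & 0x07; min = 0x10000; }
        else return 0;         /* stray continuation byte or invalid lead */

        if (n - i < len)
            return 0;          /* sequence truncated at end of buffer */
        for (size_t k = 1; k < len; k++) {
            if ((s[i + k] & 0xC0) != 0x80)
                return 0;      /* not a continuation byte */
            cp = (cp << 6) | (s[i + k] & 0x3F);
        }
        if (cp < min || cp > 0x10FFFF)
            return 0;          /* overlong form or out of range */
        if (cp >= 0xD800 && cp <= 0xDFFF)
            return 0;          /* UTF-16 surrogates are not valid in UTF-8 */
        i += len;
    }
    return 1;
}
```

A blob holding "año" in an ANSI codepage (bytes 61 F1 6F) fails both checks, which is exactly the kind of column that needs one of the conversion functions.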

This will use calls to the Windows conversion functions.  As you 
said, the one important "detail" is to make sure that the machine used 
for the conversion uses the same system codepage that was in force when 
the upper-ANSI data was created.  Only human examination can settle 
that, short of having strict rules about the expected contents of 
the ANSI text or, as in your case, using the same systems that were 
used to insert the data (with fairly good confidence that users don't 
switch codepages every week).
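To show why the codepage matters, here is a small portable sketch of the conversion step.  It is only a stand-in: it assumes the input is ISO-8859-1 (where each byte value is its own Unicode code point) instead of asking Windows for the real ANSI codepage, and the name latin1_to_utf8 is made up for this example.  ISO-8859-1 agrees with Windows-1252 only for bytes 0xA0 to 0xFF, so this is not a full Windows-1252 converter.

```c
#include <stddef.h>

/* Hypothetical helper: convert n bytes of ISO-8859-1 text to UTF-8.
 * 'out' must have room for 2*n + 1 bytes.
 * Returns the number of bytes written, not counting the final NUL. */
size_t latin1_to_utf8(const unsigned char *in, size_t n, unsigned char *out)
{
    size_t j = 0;
    for (size_t i = 0; i < n; i++) {
        unsigned char c = in[i];
        if (c < 0x80) {
            out[j++] = c;                                  /* ASCII: copy as-is */
        } else {
            out[j++] = (unsigned char)(0xC0 | (c >> 6));   /* lead byte */
            out[j++] = (unsigned char)(0x80 | (c & 0x3F)); /* continuation byte */
        }
    }
    out[j] = '\0';
    return j;
}
```

With this assumption the byte F1 becomes C3 B1, i.e. "ñ".  The same byte F1 is the Cyrillic letter "с" (U+0441) in Windows-1251, so converting with the wrong codepage silently produces different text, not an error.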

Do you think it could help you, or other users?

Is there anything else useful to include in such an extension?




_______________________________________________
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
