On Saturday, 18 January, 2020 05:21, Rocky Ji <rockyji3...@gmail.com> wrote:

>>  > GLOB supports character classes

>thanks for teaching new keyword and its use.

>My first attempt was very similar to what you suggest, except I used
>sqlite3 and re from inside Python.

>But as you see, I can't reliably separate 'interrogative' question marks
>from question marks that get displayed due to 'encoding faults'.

>Any suggestions?

Ah.  So the real problem is that you stored non-text (text defined as a UTF-8
encoded sequence of unicode codepoints with no zero/null byte except at the
end) in a database text field, and now you are trying to access those text
fields with something that expects them to contain properly formatted text
strings?  Or do you mean that they *are* valid UTF-8 encoded strings and you
are trying to encode them as something else?

If the former you can retrieve the raw bytes in python by retrieving the field 
as cast(x as blob) and then .decode the result from whatever encoding it is in 
into proper unicode.
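
A rough sketch of what I mean, using the sqlite3 module from Python.  The 
table name t, the sample value and the latin-1 encoding are all made up for 
illustration; substitute whatever encoding the data was really written in.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (x TEXT)")  # hypothetical table and column

# Simulate the fault: latin-1 bytes stored in a TEXT field.  SQLite does not
# validate the bytes, so the value is typed as text but is not valid UTF-8.
bad_bytes = "café?".encode("latin-1")
conn.execute("INSERT INTO t VALUES (CAST(? AS TEXT))", (bad_bytes,))


def read_as_text(connection, encoding):
    """Fetch x as raw bytes and decode it from the given (assumed) encoding."""
    rows = connection.execute("SELECT CAST(x AS BLOB) FROM t").fetchall()
    return [None if raw is None else raw.decode(encoding) for (raw,) in rows]


fixed = read_as_text(conn, "latin-1")  # latin-1 is an assumption
print(fixed)
```

Decoded this way, a question mark that was really typed stays an ordinary 
'?' (byte 0x3F), and the other bytes decode to the characters they were 
meant to be, so nothing has to be replaced with a substitute character.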

-- 
The fact that there's a Highway to Hell but only a Stairway to Heaven says a 
lot about anticipated traffic volume.




_______________________________________________
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
