On Saturday, 18 January, 2020 05:21, Rocky Ji <rockyji3...@gmail.com> wrote:
>> GLOB supports character classes

> thanks for teaching new keyword and its use.
> My first attempt was very similar to what you suggest, except I used
> sqlite3 and re from inside Python.
> But as you see, I can't reliably separate 'interrogative' question marks
> from question marks that get displayed due to 'encoding faults'.
> Any suggestions?

Ah. So the real problem is that you stored non-text in a database text field (text being defined as a UTF-8 encoded sequence of Unicode codepoints with no zero/null byte except at the end), and now you are trying to access those text fields with something that expects them to contain properly formatted text strings?

Or do you mean that they *are* valid UTF-8 encoded strings and you are trying to encode them as something else?

If the former, you can retrieve the raw bytes in Python by selecting the field as cast(x as blob) and then calling .decode() on the result, using whatever encoding the bytes are actually in, to get proper Unicode.
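
A minimal sketch of the cast(x as blob) approach, under stated assumptions: the table `notes`, the column `body`, the helper `read_text_as`, and the choice of cp1252 as the "real" encoding are all made up for illustration. Substitute whatever your data actually uses. The insert with CAST(? AS TEXT) is only there to reproduce the fault in a throwaway in-memory database.

```python
import sqlite3


def read_text_as(conn, table, column, encoding):
    """Fetch a text column as raw bytes and decode it in Python.

    Hypothetical helper: table and column names are interpolated into the
    SQL because identifiers cannot be bound as parameters, so they must
    come from trusted code, not from user input.
    """
    sql = f'SELECT CAST("{column}" AS BLOB) FROM "{table}"'
    return [
        None if raw is None else raw.decode(encoding)
        for (raw,) in conn.execute(sql)
    ]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE notes (body TEXT)")

# Reproduce the fault: cp1252 bytes end up in a TEXT field unchanged,
# so the stored value is not valid UTF-8.
bad = "Où est le café ?".encode("cp1252")
conn.execute("INSERT INTO notes (body) VALUES (CAST(? AS TEXT))", (bad,))

# Read the bytes back untouched and decode them with the assumed encoding.
rows = read_text_as(conn, "notes", "body", "cp1252")
print(rows)
```

Once the bytes are decoded with the right encoding, the only question marks left in the string should be the ones that were really typed, which is what makes the GLOB or re matching reliable again.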