Terry J. Reedy added the comment:

Opening a duplicate issue to rant against the developers is not responsible 
behavior. Since you do not seem to understand Martin's 2.x solution, ask for 
help on python-list or elsewhere (and read below). The proper fix for multiple 
Unicode and text coding problems was and is to use Unicode for text, as we did 
and do in 3.x.

Note that while we link to sqlite3 with a Python interface, and choose that as 
the database to link to in the stdlib, we do not control sqlite3 itself. As 
documented and as Martin wrote, sqlite *assumes*, by default, that byte-encoded 
text handed to it is error-free utf-8 encoded. However, the docs and Martin 
both say that you can override that assumption by replacing the connection's 
text_factory. Sqlite should not reject *any* bytes because anything *could* be 
just what the user intended.
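
For readers who want to see the override in action, here is a minimal sketch 
in Python 3 (not Martin's 2.x code). It assumes an in-memory database and a 
hypothetical table `t`; because Python 3 stores bytes as BLOBs, it uses 
CAST(? AS TEXT) to reproduce the "non-utf-8 bytes stored as text" situation 
that a 2.x str would create directly.

```python
import sqlite3

# Hypothetical in-memory database and table, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (name TEXT)")

# Latin-1 bytes for "café"; the trailing 0xE9 byte is not valid utf-8.
raw = "café".encode("latin-1")

# Assumption: CAST(? AS TEXT) stores the bytes as TEXT without validating them.
conn.execute("INSERT INTO t VALUES (CAST(? AS TEXT))", (raw,))

# With the default text_factory (str), reading the row back is expected to
# fail, because sqlite3 tries to decode the stored bytes as utf-8.
try:
    conn.execute("SELECT name FROM t").fetchone()
    default_read_failed = False
except sqlite3.OperationalError:
    default_read_failed = True

# Override the assumption: hand back raw bytes and decode them ourselves.
conn.text_factory = bytes
row = conn.execute("SELECT name FROM t").fetchone()
recovered = row[0].decode("latin-1")
print(default_read_failed, recovered)
```

Setting text_factory to a callable such as `lambda b: b.decode("latin-1")` 
would do the decoding inside the connection instead, if every text column is 
known to use that encoding.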

The problem of multiple byte encodings for text and of encoding info getting 
separated from encoded bytes is a general one. We constantly get questions on 
python-list like "how do I determine the real encoding of a web page if the 
encoding information is missing or wrong?" We are doing our part to solve it by 
using Unicode for text and pushing utf-8 as the one, true encoding that 
everyone should use whenever possible.

If you need more explanation, try python-list, as I said before.

----------
nosy: +terry.reedy
resolution:  -> duplicate
status: open -> closed
title: sqlite3 accepts strings it cannot return -> sqlite3 accepts strings it 
cannot (by default) return

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue16783>
_______________________________________