On Feb 9, 2014, at 5:34 AM, Erich Blume <blume.er...@gmail.com> wrote:

> 
> Then, you have to tell SQLAlchemy to convert these strings to unicode. I did 
> not pursue this approach far enough to find the right set of arguments, but I 
> imagine it would be very simple - setting 'force_unicode' to True, I suspect, 
> would be all you would need.

If the SQLite connection is set up to return bytes ahead of when SQLAlchemy 
does anything with the connection, this will be automatic.  DBAPIs vary so much 
in this regard that we test the connection when the dialect first connects.  
But if you're using connection events to achieve this, the event currently needs 
to be registered in a particular way to make sure you get the connection before 
SQLAlchemy does anything with it; there's a ticket to document that.
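
Roughly, a sketch of that idea with the pysqlite (sqlite3) driver - passing a 
creator callable to create_engine() hands SQLAlchemy an already-configured 
connection, so the text_factory is in place before the dialect's first-connect 
check runs (the database filename here is just an example):

    import sqlite3
    from sqlalchemy import create_engine

    def connect():
        # Build the DBAPI connection ourselves so text_factory is set
        # before SQLAlchemy ever touches the connection.
        conn = sqlite3.connect("legacy.db")  # example filename
        conn.text_factory = bytes            # return raw bytestrings, not unicode
        return conn

    # creator= bypasses the URL-based connect, so the dialect sees a
    # bytes-returning connection from the start.
    engine = create_engine("sqlite://", creator=connect)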

> 
> Finally, for the column with the invalid utf-8 sequences, also set 
> `unicode_error` to your preferred resolution strategy - usually 'ignore' or 
> 'replace'.
> 
> I suppose it is possible that this could incur a performance penalty - the 
> sqlite3 de/encoding process is done in a compiled C module and as such could 
> possibly be faster than using native Python for the task.

SQLAlchemy's C extensions do the encoding/decoding, and in 0.9 this process has 
been enhanced to also take over the job of the expensive and sometimes-necessary 
"check if it's already unicode" step.  I've already observed that SQLA's C 
extensions seem to be faster than MySQLdb's "use_unicode", Postgresql's unicode 
extension (which is unfortunate, as we use that anyway), and a unicode type 
with a cx_oracle outputtypehandler (which we've also stopped using, as users 
complained about performance).
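
For reference, the column-level configuration being referred to above would look 
roughly like this with the 0.9-era String parameters (unicode_error requires 
convert_unicode='force'; the table and column names are just examples):

    from sqlalchemy import Column, Integer, MetaData, String, Table

    metadata = MetaData()

    # Example table; the column with the bad utf-8 gets decoded with
    # errors="replace" semantics via unicode_error.
    notes = Table(
        "notes", metadata,
        Column("id", Integer, primary_key=True),
        Column("body", String(convert_unicode="force", unicode_error="replace")),
    )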
