On Feb 9, 2014, at 5:34 AM, Erich Blume <blume.er...@gmail.com> wrote:
> Then, you have to tell SQLAlchemy to convert these strings to unicode. I did not pursue this approach far enough to find the right set of arguments but I imagine this would be very simple - set 'force_unicode' to True, I suspect, would be all you would need.

If the SQLite connection is set up to return bytes ahead of when SQLAlchemy does anything with the connection, this will be automatic. DBAPIs vary so much in this regard that we test the connection when the dialect first connects. But if you're using connection events to achieve this, the event currently needs to be set up in a particular way to make sure you get the connection before SQLAlchemy does anything with it; there's a ticket to document that.

> Finally, for the column with the invalid utf-8 sequences, just also set `unicode_error` to your preferred resolution strategy - usually 'ignore' or 'replace'.
>
> I suppose it is possible that this could incur a performance penalty - the sqlite3 de/encoding process is done in a compiled C module and as such could possibly be faster than using native Python for the task.

SQLAlchemy's C extensions do the encoding/decoding, and in 0.9 this process has been enhanced to also take on the job of an expensive and sometimes-necessary "check if it's already unicode" step. I've already observed that SQLAlchemy's C extensions seem to be faster than MySQLdb's "use_unicode", PostgreSQL's unicode extension (which is unfortunate; we use that anyway), and using a unicode type with a cx_Oracle outputtypehandler (which we've also stopped using, as users complained about performance).
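As a minimal stdlib-only sketch of the DBAPI side of this (no SQLAlchemy involved; the table name and data here are made up for illustration): setting the sqlite3 connection's `text_factory` to `bytes` is the "return bytes" configuration being discussed, and decoding with an explicit `errors=` strategy is roughly what a `unicode_error` setting of 'ignore' or 'replace' would delegate to:

```python
import sqlite3

# In-memory database; store a value containing an invalid UTF-8 sequence.
conn = sqlite3.connect(":memory:")
conn.text_factory = bytes  # hand back raw bytes instead of decoding to str
conn.execute("CREATE TABLE t (txt TEXT)")
# b"caf\xe9" is latin-1 for "café" - not valid UTF-8; CAST stores it as TEXT.
conn.execute("INSERT INTO t VALUES (CAST(? AS TEXT))", (b"caf\xe9",))

raw = conn.execute("SELECT txt FROM t").fetchone()[0]
# With the default text_factory (str), fetching this row would raise an
# OperationalError at decode time; as bytes, we choose the error strategy.
decoded = raw.decode("utf-8", errors="replace")
print(decoded)  # 'caf' followed by the U+FFFD replacement character
```

The key point from the reply above is ordering: whatever mechanism configures the connection (here, `text_factory`) has to run before SQLAlchemy first inspects the connection, since the dialect tests unicode behavior at first connect.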