On Feb 18, 2014, at 2:06 PM, Christoph Zwerschke <c...@online.de> wrote:

> The docstring for the cx_Oracle dialect says:
> 
> "SQLAlchemy will pass all unicode strings directly to cx_oracle, and 
> additionally uses an output handler so that all string based result values 
> are returned as unicode as well."
> 
> The latter no longer seems to be true; the handler was recently removed 
> with ticket 2911.


Good catch, I’ve rewritten the docs here: 
http://docs.sqlalchemy.org/en/latest/dialects/oracle.html#unicode.  The more 
common approach of using text() is included.


> 
> So now when I have varchar2 columns and do a simple query like this one, I 
> get encoded strings instead of unicode as before (in Python 2):
> 
> engine = create_engine('oracle+cx_oracle://..')
> con = engine.connect()
> for row in con.execute("select username from users"):
>    print row
> 
> Is this really intended? What am I supposed to do when I want to always get 
> unicode back?

It is very unfortunate, but cx_Oracle has an unacceptable performance penalty 
for using this feature.  As you can see here: 
https://github.com/pydata/pandas/issues/2717#issuecomment-29046644 , if I hadn’t 
been alerted to this, users who didn’t have the courtesy to notify me of the 
problem were ready to have the Pandas project entirely drop consideration of 
SQLAlchemy integration, for its supposed “cruft” and “2-4x the CPU cycles on 
top of your database driver”, when in fact this cruft and overhead is entirely 
within cx_Oracle.  All due to two lines of code.

Needless to say I’m a bit miffed that cx_Oracle’s huge performance bug doesn’t 
impact the reputation of cx_Oracle, but instead harms and slanders the 
SQLAlchemy project.   I’ve not had good results alerting cx_Oracle to other 
issues in the past (issues related to the RETURNING feature, two-phase 
transactions, etc) so it’s not really worth trying to get traction on this one.

So until a solution is found to the outputtypehandler unicode issue in 
cx_Oracle on Python 2, the recipe in the new docs is how it has to be for now 
(which is similar to other backends anyway), that is:

from sqlalchemy import text, Unicode
# declare the result column as Unicode so SQLAlchemy decodes it on fetch
result = conn.execute(
    text("select username from users").columns(username=Unicode))

As far as restoring the outputtypehandler, I didn’t have good results trying to 
come up with a recipe to add one in from the outside that “nests” the one we 
already apply there for decimals.  So if you really need this as it was, using 
outputtypehandler, I will accept a pullreq that restores it, but turned off by 
default.  A flag called “coerce_to_unicode”, analogous to “coerce_to_decimal”, 
would default to False, but when set to True would use the unicode 
outputtypehandler.
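
To make the shape of that opt-in concrete, here is a rough sketch, not the dialect's actual code.  It assumes the cx_Oracle convention that an output type handler receives (cursor, name, default type, size, precision, scale) and may return a variable built with cursor.var().  The helper name make_output_type_handler is hypothetical, and the decimal handling the dialect already installs is left out; a real patch would have to combine both in one handler.

```python
# Text type on both Python 2 (unicode) and Python 3 (str).
text_type = type(u"")


def make_output_type_handler(dbapi, coerce_to_unicode=False):
    """Build a cx_Oracle-style output type handler (hypothetical helper).

    `dbapi` is assumed to be the cx_Oracle module; only its STRING and
    FIXED_CHAR type constants are used here.
    """
    def handler(cursor, name, default_type, size, precision, scale):
        if coerce_to_unicode and default_type in (dbapi.STRING,
                                                  dbapi.FIXED_CHAR):
            # Ask the driver to fetch this string column as text.
            return cursor.var(text_type, size, cursor.arraysize)
        # Returning None keeps the driver's default behavior.
        return None
    return handler


# Intended use, assuming a live cx_Oracle connection object:
#   connection.outputtypehandler = make_output_type_handler(
#       cx_Oracle, coerce_to_unicode=True)
```

With the flag left at False the handler returns None for every column, so string columns stay on the driver's default (fast) path and only users who opt in pay the conversion cost.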






