Hi, Having had the same problem as above, I would like to use the "convert_unicode='force'" flag at engine configuration level. It seems that String tests the value of "convert_unicode" at its own level as well as engine level, but it isn't the case for the 'force' value check :
def bind_processor(self, dialect): if self.convert_unicode or dialect.convert_unicode: if dialect.supports_unicode_binds and \ *self.convert_unicode != 'force'*: ... def result_processor(self, dialect, coltype): wants_unicode = self.convert_unicode or dialect.convert_unicode needs_convert = wants_unicode and \ (dialect.returns_unicode_strings is not True or *self.convert_unicode == 'force'*) Thus I have to create a type decorator for all my Strings even if I have set the right flag at engine configuration level... Yann Le jeudi 2 août 2012 23:22:22 UTC+2, Michael Bayer a écrit : > > so the "convert_unicode='force'" flag is not ideal here as that will spend > most of it's time checking for decodes necessary on the result set side, > which is a lot of wasted effort. You can customize how strings are handled > on the bind side, including per-dialect behavior, using a custom type: > > class MyStringType(TypeDecorator): > impl = String > > def process_bind_param(self, value, dialect): > if value is not None and dialect.name == "oracle": > value = value.encode('utf-8') > > then replace usage of the String(length=XYZ) type with > MyStringType(length=XYZ). > > docs: http://docs.sqlalchemy.org/en/rel_0_7/core/types.html#types-custom > > > On Aug 2, 2012, at 5:41 AM, David Moore wrote: > > > Hi, > > > > Further searching seems to indicate this is an Oracle issue, not a > cx_Oracle issue. > > http://www.digipedia.pl/usenet/thread/15912/2814/ - same problem with > cx_Oracle > > http://support.unify.com/supportforum/viewtopic.php?f=40&t=3823 - > happening with perl drivers. > > http://tao.qshine.com/note/note_ora.htm - happening with ADO.NETdrivers. > > > > There's also evidence this problem occurs with MS SQL Server as well: > > > http://www.microsoftfaqs.com/Articles/1148439/Index_not_used_when_UNICODE_%3D_true- > SQL Server 2005 > > > http://blog.brianhartsock.com/2008/12/14/nvarchar-vs-varchar-in-sql-server-beware/ > > > > > Although, it seems in SQL Server at least, it still uses the index, just > in a slower manner, which is probably why we haven't noticed this issue > yet. > > > > It seems to me that the place to fix this is when the type of the column > is known, which is when sqlalchemy binds the parameter. There's some > discussion in one of those links that, at the sql statement preparation > level, you don't know what the column datatype is, so you can't fix it > there. > > > > Would it be possible for String datatypes to detect unicode values and > encode them with the database charset? > > > > As a lower-level temporary solution, you can get cx_Oracle to encode all > unicode parameters to strings, but that's obviously the wrong thing to do > if you have any Unicode columns. This snippet: > > > > def InputTypeHandler(cursor, value, numElements): > > if isinstance(value, unicode): > > return cursor.var(str, arraysize = numElements, > > inconverter = lambda x: > x.encode(cursor.connection.nencoding)) > > > > And then when creating the connection: > > > > connection.inputtypehandler = InputTypeHandler > > > > thanks for the help, > > > > -- > > David Moore > > Senior Software Engineer > > St. James Software > > Email: dav...@sjsoft.com <javascript:> > > > > > > ----- Original Message ----- > >> And for anyone else experiencing this issue, there was a subtle > >> difference in the execution plans that's now apparent. The > >> statement which only selects colid runs a FAST FULL SCAN and > >> successfully converts the unicode parameter using SYS_OP_C2C, and > >> uses the index. When you select both colid and message, it runs a > >> FULL SCAN, and even though it seems to detect the conversion can be > >> done, it does not use the index at all. Perhaps this is actually an > >> Oracle issue? > >> > >> -- > >> David Moore > >> Senior Software Engineer > >> St. James Software > >> Email: dav...@sjsoft.com <javascript:> > >> > >> > >> > >> ----- Original Message ----- > >>> Hi Michael, > >>> > >>> Indeed, you are correct - adding the message column to the > >>> cx_oracle > >>> query shows the same behaviour as the sqlalchemy query. Sorry I > >>> missed that. > >>> > >>> I will take these results to the cx_oracle list on this basis. On > >>> my > >>> second question, though, is there a way to enforce > >>> convert_unicode='force' only on Oracle databases across an > >>> application? I know the hooking syntax changed from sqlalchemy 0.6 > >>> to 0.7, so I'd imagine if there is a way, it would be different for > >>> those two versions. > >>> > >>> thanks, > >>> > >>> -- > >>> David Moore > >>> Senior Software Engineer > >>> St. James Software > >>> Email: dav...@sjsoft.com <javascript:> > >>> > >>> > >>> ----- Original Message ----- > >>>> Hi David - > >>>> > >>>> I've done some experiments with the script you gave me. Correct > >>>> me > >>>> if I'm wrong (which is very possible), but the attached revised > >>>> version appears to demonstrate the difference between the > >>>> cx_oracle > >>>> and SQLAlchemy versions is really just that the SQLAlchemy > >>>> version > >>>> is running a different SQL string: > >>>> > >>>> SELECT test_table.colid, test_table.message > >>>> FROM test_table > >>>> WHERE test_table.colid = :colid_1 > >>>> > >>>> whereas the cx_oracle version, critically, does not ask for the > >>>> "message" column: > >>>> > >>>> SELECT test_table.colid FROM test_table WHERE > test_table.colid = > >>>> :colid > >>>> > >>>> it's not clear to me if the placement of the "message" column > >>>> impacts > >>>> the Oracle planner, or otherwise if the latency is on the > >>>> cx_oracle > >>>> side regarding buffering of columns or something similar. But > >>>> i > >>>> am able to get the SQLAlchemy version to be as fast as the > >>>> cx_oracle > >>>> version, and the cx_oracle version to be as slow as the > >>>> sqlalchemy > >>>> version, by just adding/removing the "message" column from the > >>>> query. > >>>> > >>>> The attached script also tries to eliminate the overhead of > >>>> SQLAlchemy compiling the SQL construct within the timed portion, > >>>> as > >>>> I was testing this against only 40K rows and wanted to try to get > >>>> the results as close as possible. It's only dealing with the > >>>> unicode parameter, and it's true if you encode the parameter > >>>> first, > >>>> both the SQLA and cx_oracle versions get faster, even to the > >>>> point > >>>> that the two-column version minus unicode is faster than the one > >>>> column version with unicode. My guess is that cx_oracle is > >>>> detecting a unicode value in the input and doing something > >>>> different > >>>> with the statement overall as a result of it being present. > >>>> > >>>> Some typical set of results are: > >>>> > >>>> Setting up table > >>>> sa_both_cols 1.20209717751 > >>>> sa_one_col 0.0182020664215 > >>>> sa_literal_sql 0.019690990448 > >>>> cx_oracle_one_col 0.0839619636536 > >>>> cx_oracle_both_cols 1.18438816071 > >>>> > >>>> Setting up table > >>>> sa_both_cols 1.17753505707 > >>>> sa_one_col 0.0785720348358 > >>>> sa_literal_sql 0.016618013382 > >>>> cx_oracle_one_col 0.0194280147552 > >>>> cx_oracle_both_cols 1.15302705765 > >>>> > >>>> while the cx_oracle/SQLA versions seem to compete for the "fast" > >>>> version, which is because we're only running the statement once > >>>> and > >>>> I'm running on a low performing amazon small instance, it's clear > >>>> the "two column" version is where the vast amount of latency > >>>> occurs > >>>> for both systems. > >>>> > >>>> > >>>> let me know if you can confirm similar results on your end. I > >>>> tested > >>>> only with SQLAlchemy 0.8 in trunk but I was able to reproduce > >>>> your > >>>> initial results so I assume the experience with 0.7, 0.6 would be > >>>> similar. > >>>> > >>>> > >>>> > >>>> > >>>> > >>>> > >>>> > >>>> > >>>> > >>>> On Aug 1, 2012, at 6:40 AM, David Moore wrote: > >>>> > >>>>> Hi all, > >>>>> > >>>>> This is a bug in a product with multiple versions in support. > >>>>> The > >>>>> issue can be reproduced with > >>>>> sqlalchemy: 0.6.1, 0.6.6, 0.7.8 > >>>>> cx_Oracle: 5.0.4, 5.1, 5.1.2 > >>>>> Oracle: 10g XE and 10g Enterprise > >>>>> > >>>>> We have tables created with sqlalchemy.String primary keys and > >>>>> other indexes. When querying through sqlalchemy, when we pass > >>>>> in > >>>>> a unicode value for the key, Oracle does not use the index, and > >>>>> instead uses a full table scan. When executing the exact same > >>>>> query through cx_Oracle, we don't see this issue. The script > >>>>> attached is a small test case which demonstrates the issue. > >>>>> Sample output: > >>>>> > >>>>> Running select > >>>>> Unicode params took 3.455 seconds > >>>>> Encoded params took 0 seconds > >>>>> Running cx_Oracle select > >>>>> Unicode params took 0.0619998 seconds > >>>>> Encoded params took 0 seconds > >>>>> > >>>>> And running the cx_Oracle select first to prevent pollution by > >>>>> caching: > >>>>> > >>>>> Running cx_Oracle select > >>>>> Unicode params took 0.0469999 seconds > >>>>> Encoded params took 0 seconds > >>>>> Running select > >>>>> Unicode params took 3.267 seconds > >>>>> Encoded params took 0.0159998 seconds > >>>>> > >>>>> As can be seen, there's a slight slowdown with the unicode > >>>>> parameter on cx_Oracle, but nothing compared to the slowdown on > >>>>> sqlalchemy. The sqlalchemy unicode parameter select is the > >>>>> only > >>>>> one which performs a full table scan, so that makes sense. > >>>>> > >>>>> This issue has been mentioned on this list before - 15 Feb with > >>>>> title "Full Table scan with Oracle due to Charset conversion" > >>>>> and > >>>>> 4 April with title "Problem with Oracle requests" - but in > >>>>> those > >>>>> cases, the issue was reproducible with the cx_Oracle driver as > >>>>> well, which it is not here. > >>>>> > >>>>> Using the convert_unicode='force' argument to the String types > >>>>> does > >>>>> solve the issue, but there are a few reasons I don't want to do > >>>>> that. This product also supports PostgreSQL, SQL Server and > >>>>> MySQL, and I do not want to make changes which will affect > >>>>> running > >>>>> under those databases. This is also happening in an otherwise > >>>>> very stable product under long-term support in a number of > >>>>> places, > >>>>> so whatever change is made needs to be minimal. > >>>>> > >>>>> Adding an additional function-based index on SYS_OP_C2C(column) > >>>>> for > >>>>> each of the affected columns also solves this issue with a > >>>>> slight > >>>>> performance change for the extra indexes which need to be kept > >>>>> up > >>>>> to date. This is the stopgap solution we've gone for in the > >>>>> meantime for those systems on production. It's not ideal, > >>>>> though, > >>>>> and a bit difficult to automate and then verify the results. > >>>>> We'd > >>>>> prefer to know why this was happening, and maybe fix a > >>>>> sqlalchemy > >>>>> issue in the process. > >>>>> > >>>>> So, my questions are: > >>>>> * What is sqlalchemy doing differently to straight cx_Oracle > >>>>> that > >>>>> triggers this behaviour? How can I debug this further to find > >>>>> out? > >>>>> * Is there a way, on sqlalchemy 0.6.x, to set up a hook which > >>>>> will > >>>>> use convert_unicode='force' only on an Oracle database and > >>>>> only > >>>>> for String columns, automatically across an application? > >>>>> There's > >>>>> no create_engine argument for convert_unicode='force', which > >>>>> is > >>>>> a > >>>>> pity. > >>>>> * And a hook for sqlalchemy 0.7.x to achieve the same result? > >>>>> > >>>>> Thanks in advance for any help, > >>>>> regards, > >>>>> -- > >>>>> David Moore > >>>>> Senior Software Engineer > >>>>> St. James Software > >>>>> Email: dav...@sjsoft.com <javascript:> > >>>>> > >>>>> -- > >>>>> You received this message because you are subscribed to the > >>>>> Google > >>>>> Groups "sqlalchemy" group. > >>>>> To post to this group, send email to > >>>>> sqlal...@googlegroups.com <javascript:>. > >>>>> To unsubscribe from this group, send email to > >>>>> sqlalchemy+...@googlegroups.com <javascript:>. > >>>>> For more options, visit this group at > >>>>> http://groups.google.com/group/sqlalchemy?hl=en. > >>>>> > >>>>> <oracle_sqlalchemy_encoding_error.py> > >>>> > >>>> > >>>> -- > >>>> You received this message because you are subscribed to the > >>>> Google > >>>> Groups "sqlalchemy" group. > >>>> To post to this group, send email to > >>>> sqlal...@googlegroups.com<javascript:>. > > >>>> To unsubscribe from this group, send email to > >>>> sqlalchemy+...@googlegroups.com <javascript:>. > >>>> For more options, visit this group at > >>>> http://groups.google.com/group/sqlalchemy?hl=en. > >>>> > >>>> > >>> > >>> -- > >>> You received this message because you are subscribed to the Google > >>> Groups "sqlalchemy" group. > >>> To post to this group, send email to > >>> sqlal...@googlegroups.com<javascript:>. > > >>> To unsubscribe from this group, send email to > >>> sqlalchemy+...@googlegroups.com <javascript:>. > >>> For more options, visit this group at > >>> http://groups.google.com/group/sqlalchemy?hl=en. > >>> > >>> > >> > >> -- > >> You received this message because you are subscribed to the Google > >> Groups "sqlalchemy" group. > >> To post to this group, send email to > >> sqlal...@googlegroups.com<javascript:>. > > >> To unsubscribe from this group, send email to > >> sqlalchemy+...@googlegroups.com <javascript:>. > >> For more options, visit this group at > >> http://groups.google.com/group/sqlalchemy?hl=en. > >> > >> > > > > -- > > You received this message because you are subscribed to the Google > Groups "sqlalchemy" group. > > To post to this group, send email to > > sqlal...@googlegroups.com<javascript:>. > > > To unsubscribe from this group, send email to > sqlalchemy+...@googlegroups.com <javascript:>. > > For more options, visit this group at > http://groups.google.com/group/sqlalchemy?hl=en. > > > > -- You received this message because you are subscribed to the Google Groups "sqlalchemy" group. To view this discussion on the web visit https://groups.google.com/d/msg/sqlalchemy/-/xQHoRrrfOugJ. To post to this group, send email to sqlalchemy@googlegroups.com. To unsubscribe from this group, send email to sqlalchemy+unsubscr...@googlegroups.com. For more options, visit this group at http://groups.google.com/group/sqlalchemy?hl=en.