And for anyone else experiencing this issue, there was a subtle difference in the execution plans that's now apparent. The statement which only selects colid runs a FAST FULL SCAN and successfully converts the unicode parameter using SYS_OP_C2C, and uses the index. When you select both colid and message, it runs a FULL SCAN, and even though it seems to detect the conversion can be done, it does not use the index at all. Perhaps this is actually an Oracle issue?
-- David Moore Senior Software Engineer St. James Software Email: dav...@sjsoft.com ----- Original Message ----- > Hi Michael, > > Indeed, you are correct - adding the message column to the cx_oracle > query shows the same behaviour as the sqlalchemy query. Sorry I > missed that. > > I will take these results to the cx_oracle list on this basis. On my > second question, though, is there a way to enforce > convert_unicode='force' only on Oracle databases across an > application? I know the hooking syntax changed from sqlalchemy 0.6 > to 0.7, so I'd imagine if there is a way, it would be different for > those two versions. > > thanks, > > -- > David Moore > Senior Software Engineer > St. James Software > Email: dav...@sjsoft.com > > > ----- Original Message ----- > > Hi David - > > > > I've done some experiments with the script you gave me. Correct me > > if I'm wrong (which is very possible), but the attached revised > > version appears to demonstrate the difference between the cx_oracle > > and SQLAlchemy versions is really just that the SQLAlchemy version > > is running a different SQL string: > > > > SELECT test_table.colid, test_table.message > > FROM test_table > > WHERE test_table.colid = :colid_1 > > > > whereas the cx_oracle version, critically, does not ask for the > > "message" column: > > > > SELECT test_table.colid FROM test_table WHERE test_table.colid = > > :colid > > > > it's not clear to me if the placement of the "message" column > > impacts > > the Oracle planner, or otherwise if the latency is on the cx_oracle > > side regarding buffering of columns or something similar. But i > > am able to get the SQLAlchemy version to be as fast as the > > cx_oracle > > version, and the cx_oracle version to be as slow as the sqlalchemy > > version, by just adding/removing the "message" column from the > > query. > > > > The attached script also tries to eliminate the overhead of > > SQLAlchemy compiling the SQL construct within the timed portion, as > > I was testing this against only 40K rows and wanted to try to get > > the results as close as possible. It's only dealing with the > > unicode parameter, and it's true if you encode the parameter first, > > both the SQLA and cx_oracle versions get faster, even to the point > > that the two-column version minus unicode is faster than the one > > column version with unicode. My guess is that cx_oracle is > > detecting a unicode value in the input and doing something > > different > > with the statement overall as a result of it being present. > > > > Some typical set of results are: > > > > Setting up table > > sa_both_cols 1.20209717751 > > sa_one_col 0.0182020664215 > > sa_literal_sql 0.019690990448 > > cx_oracle_one_col 0.0839619636536 > > cx_oracle_both_cols 1.18438816071 > > > > Setting up table > > sa_both_cols 1.17753505707 > > sa_one_col 0.0785720348358 > > sa_literal_sql 0.016618013382 > > cx_oracle_one_col 0.0194280147552 > > cx_oracle_both_cols 1.15302705765 > > > > while the cx_oracle/SQLA versions seem to compete for the "fast" > > version, which is because we're only running the statement once and > > I'm running on a low performing amazon small instance, it's clear > > the "two column" version is where the vast amount of latency occurs > > for both systems. > > > > > > let me know if you can confirm similar results on your end. I > > tested > > only with SQLAlchemy 0.8 in trunk but I was able to reproduce your > > initial results so I assume the experience with 0.7, 0.6 would be > > similar. > > > > > > > > > > > > > > > > > > > > On Aug 1, 2012, at 6:40 AM, David Moore wrote: > > > > > Hi all, > > > > > > This is a bug in a product with multiple versions in support. > > > The > > > issue can be reproduced with > > > sqlalchemy: 0.6.1, 0.6.6, 0.7.8 > > > cx_Oracle: 5.0.4, 5.1, 5.1.2 > > > Oracle: 10g XE and 10g Enterprise > > > > > > We have tables created with sqlalchemy.String primary keys and > > > other indexes. When querying through sqlalchemy, when we pass in > > > a unicode value for the key, Oracle does not use the index, and > > > instead uses a full table scan. When executing the exact same > > > query through cx_Oracle, we don't see this issue. The script > > > attached is a small test case which demonstrates the issue. > > > Sample output: > > > > > > Running select > > > Unicode params took 3.455 seconds > > > Encoded params took 0 seconds > > > Running cx_Oracle select > > > Unicode params took 0.0619998 seconds > > > Encoded params took 0 seconds > > > > > > And running the cx_Oracle select first to prevent pollution by > > > caching: > > > > > > Running cx_Oracle select > > > Unicode params took 0.0469999 seconds > > > Encoded params took 0 seconds > > > Running select > > > Unicode params took 3.267 seconds > > > Encoded params took 0.0159998 seconds > > > > > > As can be seen, there's a slight slowdown with the unicode > > > parameter on cx_Oracle, but nothing compared to the slowdown on > > > sqlalchemy. The sqlalchemy unicode parameter select is the only > > > one which performs a full table scan, so that makes sense. > > > > > > This issue has been mentioned on this list before - 15 Feb with > > > title "Full Table scan with Oracle due to Charset conversion" and > > > 4 April with title "Problem with Oracle requests" - but in those > > > cases, the issue was reproducible with the cx_Oracle driver as > > > well, which it is not here. > > > > > > Using the convert_unicode='force' argument to the String types > > > does > > > solve the issue, but there are a few reasons I don't want to do > > > that. This product also supports PostgreSQL, SQL Server and > > > MySQL, and I do not want to make changes which will affect > > > running > > > under those databases. This is also happening in an otherwise > > > very stable product under long-term support in a number of > > > places, > > > so whatever change is made needs to be minimal. > > > > > > Adding an additional function-based index on SYS_OP_C2C(column) > > > for > > > each of the affected columns also solves this issue with a slight > > > performance change for the extra indexes which need to be kept up > > > to date. This is the stopgap solution we've gone for in the > > > meantime for those systems on production. It's not ideal, > > > though, > > > and a bit difficult to automate and then verify the results. > > > We'd > > > prefer to know why this was happening, and maybe fix a sqlalchemy > > > issue in the process. > > > > > > So, my questions are: > > > * What is sqlalchemy doing differently to straight cx_Oracle > > > that > > > triggers this behaviour? How can I debug this further to find > > > out? > > > * Is there a way, on sqlalchemy 0.6.x, to set up a hook which > > > will > > > use convert_unicode='force' only on an Oracle database and only > > > for String columns, automatically across an application? > > > There's > > > no create_engine argument for convert_unicode='force', which is > > > a > > > pity. > > > * And a hook for sqlalchemy 0.7.x to achieve the same result? > > > > > > Thanks in advance for any help, > > > regards, > > > -- > > > David Moore > > > Senior Software Engineer > > > St. James Software > > > Email: dav...@sjsoft.com > > > > > > -- > > > You received this message because you are subscribed to the > > > Google > > > Groups "sqlalchemy" group. > > > To post to this group, send email to sqlalchemy@googlegroups.com. > > > To unsubscribe from this group, send email to > > > sqlalchemy+unsubscr...@googlegroups.com. > > > For more options, visit this group at > > > http://groups.google.com/group/sqlalchemy?hl=en. > > > > > > <oracle_sqlalchemy_encoding_error.py> > > > > > > -- > > You received this message because you are subscribed to the Google > > Groups "sqlalchemy" group. > > To post to this group, send email to sqlalchemy@googlegroups.com. > > To unsubscribe from this group, send email to > > sqlalchemy+unsubscr...@googlegroups.com. > > For more options, visit this group at > > http://groups.google.com/group/sqlalchemy?hl=en. > > > > > > -- > You received this message because you are subscribed to the Google > Groups "sqlalchemy" group. > To post to this group, send email to sqlalchemy@googlegroups.com. > To unsubscribe from this group, send email to > sqlalchemy+unsubscr...@googlegroups.com. > For more options, visit this group at > http://groups.google.com/group/sqlalchemy?hl=en. > > -- You received this message because you are subscribed to the Google Groups "sqlalchemy" group. To post to this group, send email to sqlalchemy@googlegroups.com. To unsubscribe from this group, send email to sqlalchemy+unsubscr...@googlegroups.com. For more options, visit this group at http://groups.google.com/group/sqlalchemy?hl=en.