Hi Michael,

Indeed, you are correct - adding the message column to the cx_Oracle query shows the same behaviour as the SQLAlchemy query. Sorry I missed that.
I will take these results to the cx_Oracle list on this basis.

On my second question, though: is there a way to enforce convert_unicode='force' only on Oracle databases, across an application? I know the hooking syntax changed from SQLAlchemy 0.6 to 0.7, so I'd imagine that if there is a way, it would be different for those two versions.

thanks,
--
David Moore
Senior Software Engineer
St. James Software
Email: dav...@sjsoft.com

----- Original Message -----
> Hi David -
>
> I've done some experiments with the script you gave me. Correct me
> if I'm wrong (which is very possible), but the attached revised
> version appears to demonstrate that the difference between the
> cx_Oracle and SQLAlchemy versions is really just that the SQLAlchemy
> version is running a different SQL string:
>
>     SELECT test_table.colid, test_table.message
>     FROM test_table
>     WHERE test_table.colid = :colid_1
>
> whereas the cx_Oracle version, critically, does not ask for the
> "message" column:
>
>     SELECT test_table.colid
>     FROM test_table
>     WHERE test_table.colid = :colid
>
> It's not clear to me whether the presence of the "message" column
> affects the Oracle planner, or whether the latency is on the
> cx_Oracle side, in the buffering of columns or something similar.
> But I am able to make the SQLAlchemy version as fast as the
> cx_Oracle version, and the cx_Oracle version as slow as the
> SQLAlchemy version, just by adding or removing the "message" column
> from the query.
>
> The attached script also tries to eliminate the overhead of
> SQLAlchemy compiling the SQL construct from the timed portion, as I
> was testing against only 40K rows and wanted to get the results as
> close as possible. The slowdown is entirely about the unicode
> parameter: if you encode the parameter first, both the SQLAlchemy
> and cx_Oracle versions get faster, even to the point that the
> two-column version minus unicode is faster than the one-column
> version with unicode.
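The exact SQL string SQLAlchemy emits can be checked without a live database by compiling the statement against the Oracle dialect. A minimal sketch, written against a recent SQLAlchemy (the 0.x releases discussed in this thread spell the select as `select([...])` instead), with table and column names taken from the test script:

```python
from sqlalchemy import column, select, table
from sqlalchemy.dialects import oracle

# Lightweight stand-in for the test script's table definition.
test_table = table("test_table", column("colid"), column("message"))

# The "both columns" query, compiled the way SQLAlchemy would run it.
stmt = select(test_table.c.colid, test_table.c.message).where(
    test_table.c.colid == "somekey"
)
print(stmt.compile(dialect=oracle.dialect()))
```

This prints the two-column `SELECT ... WHERE test_table.colid = :colid_1` form quoted above; dropping `test_table.c.message` from the select reproduces the one-column variant.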
> My guess is that cx_Oracle is detecting a unicode value in the input
> and doing something different with the statement overall as a result
> of its presence.
>
> A typical set of results:
>
>     Setting up table
>     sa_both_cols         1.20209717751
>     sa_one_col           0.0182020664215
>     sa_literal_sql       0.019690990448
>     cx_oracle_one_col    0.0839619636536
>     cx_oracle_both_cols  1.18438816071
>
>     Setting up table
>     sa_both_cols         1.17753505707
>     sa_one_col           0.0785720348358
>     sa_literal_sql       0.016618013382
>     cx_oracle_one_col    0.0194280147552
>     cx_oracle_both_cols  1.15302705765
>
> The cx_Oracle and SQLAlchemy versions trade places for the "fast"
> slot (we're only running the statement once, and I'm on a
> low-performing Amazon small instance), but it's clear that the
> "two column" version is where the vast majority of the latency
> occurs in both systems.
>
> Let me know if you can confirm similar results on your end. I tested
> only with SQLAlchemy 0.8 in trunk, but I was able to reproduce your
> initial results, so I assume the experience with 0.7 and 0.6 would
> be similar.
>
> On Aug 1, 2012, at 6:40 AM, David Moore wrote:
>
> > Hi all,
> >
> > This is a bug in a product with multiple versions in support. The
> > issue can be reproduced with:
> >
> >     sqlalchemy: 0.6.1, 0.6.6, 0.7.8
> >     cx_Oracle:  5.0.4, 5.1, 5.1.2
> >     Oracle:     10g XE and 10g Enterprise
> >
> > We have tables created with sqlalchemy.String primary keys and
> > other indexes. When we pass in a unicode value for the key in a
> > query through SQLAlchemy, Oracle does not use the index and
> > instead performs a full table scan. When executing the exact same
> > query through cx_Oracle, we don't see this issue. The attached
> > script is a small test case which demonstrates the issue.
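The "encode the parameter first" trick discussed above can be wrapped in a small helper so the fast path is applied uniformly. A sketch of the idea only: the helper name is mine, and on the Python 2 used in this thread the type check would be `isinstance(value, unicode)`:

```python
def encode_params(params, encoding="utf-8"):
    """Encode text bind values to byte strings before execution.

    The point is that cx_Oracle then binds plain VARCHAR2 values, so
    Oracle compares against the String column without the
    character-set conversion that defeats the index.
    """
    return {
        key: value.encode(encoding) if isinstance(value, str) else value
        for key, value in params.items()
    }

# Usage against a cx_Oracle cursor (connection details elided):
# cursor.execute(sql, encode_params({"colid": "somekey"}))
```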
> > Sample output:
> >
> >     Running select
> >     Unicode params took 3.455 seconds
> >     Encoded params took 0 seconds
> >     Running cx_Oracle select
> >     Unicode params took 0.0619998 seconds
> >     Encoded params took 0 seconds
> >
> > And running the cx_Oracle select first, to prevent pollution by
> > caching:
> >
> >     Running cx_Oracle select
> >     Unicode params took 0.0469999 seconds
> >     Encoded params took 0 seconds
> >     Running select
> >     Unicode params took 3.267 seconds
> >     Encoded params took 0.0159998 seconds
> >
> > As can be seen, there's a slight slowdown with the unicode
> > parameter on cx_Oracle, but nothing compared to the slowdown with
> > SQLAlchemy. The SQLAlchemy unicode-parameter select is the only
> > one which performs a full table scan, so that makes sense.
> >
> > This issue has been mentioned on this list before - on 15 Feb
> > under the title "Full Table scan with Oracle due to Charset
> > conversion", and on 4 April under the title "Problem with Oracle
> > requests" - but in those cases the issue was reproducible with
> > the cx_Oracle driver as well, which it is not here.
> >
> > Using the convert_unicode='force' argument to the String types
> > does solve the issue, but there are a few reasons I don't want to
> > do that. This product also supports PostgreSQL, SQL Server and
> > MySQL, and I do not want to make changes which will affect
> > running under those databases. It is also happening in an
> > otherwise very stable product under long-term support, in a
> > number of places, so whatever change is made needs to be minimal.
> >
> > Adding a function-based index on SYS_OP_C2C(column) for each of
> > the affected columns also solves the issue, at a slight
> > performance cost for the extra indexes which need to be kept up
> > to date. This is the stopgap solution we've gone for in the
> > meantime on production systems. It's not ideal, though, and it is
> > a bit difficult to automate and then verify the results.
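The stopgap index described above, spelled out; the table, column and index names are illustrative, not taken from the product. SYS_OP_C2C is the internal function Oracle applies to convert a VARCHAR2 value to the national character set when comparing it against an NVARCHAR2 bind, so indexing that expression gives the converted comparison an index to use:

```python
# Illustrative DDL for the SYS_OP_C2C stopgap; names are made up for
# the example, and in practice this is issued once per affected column.
CREATE_C2C_INDEX = """
CREATE INDEX test_table_colid_c2c
    ON test_table (SYS_OP_C2C(colid))
"""
```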
> > We'd prefer to know why this is happening, and maybe fix a
> > SQLAlchemy issue in the process.
> >
> > So, my questions are:
> >
> > * What is SQLAlchemy doing differently from straight cx_Oracle
> >   that triggers this behaviour? How can I debug this further to
> >   find out?
> > * Is there a way, on SQLAlchemy 0.6.x, to set up a hook which
> >   will use convert_unicode='force' only on an Oracle database,
> >   and only for String columns, automatically across an
> >   application? There's no create_engine argument for
> >   convert_unicode='force', which is a pity.
> > * And is there a hook for SQLAlchemy 0.7.x to achieve the same
> >   result?
> >
> > Thanks in advance for any help,
> > regards,
> > --
> > David Moore
> > Senior Software Engineer
> > St. James Software
> > Email: dav...@sjsoft.com
> >
> > --
> > You received this message because you are subscribed to the Google
> > Groups "sqlalchemy" group.
> > To post to this group, send email to sqlalchemy@googlegroups.com.
> > To unsubscribe from this group, send email to
> > sqlalchemy+unsubscr...@googlegroups.com.
> > For more options, visit this group at
> > http://groups.google.com/group/sqlalchemy?hl=en.
> >
> > <oracle_sqlalchemy_encoding_error.py>
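On the hook question above, one minimal shape - not tied to the 0.6/0.7 convert_unicode machinery; the class name and method bodies here are my sketch, not an established recipe - is a TypeDecorator used in place of String, which keys its bind processing on the dialect name so the other databases are untouched:

```python
from sqlalchemy.types import String, TypeDecorator


class OracleForceString(TypeDecorator):
    """A String that encodes text bind values on Oracle only.

    Substituted for sqlalchemy.String in the table definitions, it
    leaves PostgreSQL, SQL Server and MySQL behaviour unchanged.
    """

    impl = String
    cache_ok = True  # harmless on versions that predate the flag

    def process_bind_param(self, value, dialect):
        if dialect.name == "oracle" and isinstance(value, str):
            # Send a byte string so cx_Oracle binds VARCHAR2 rather
            # than NVARCHAR2, keeping the plain index usable.
            return value.encode("utf-8")
        return value
```

On 0.6/0.7 themselves, convert_unicode='force' on the type is the documented route; this sketch performs the equivalent encoding directly, which also survives on later SQLAlchemy versions where that flag was eventually deprecated and removed.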