Re: [sqlalchemy] Full table scan using Oracle String indexes

David Moore Thu, 02 Aug 2012 00:28:32 -0700

And for anyone else experiencing this issue, there was a subtle difference in 
the execution plans that's now apparent.  The statement which only selects 
colid runs a FAST FULL SCAN and successfully converts the unicode parameter 
using SYS_OP_C2C, and uses the index. When you select both colid and message, 
it runs a FULL SCAN, and even though it seems to detect the conversion can be 
done, it does not use the index at all.  Perhaps this is actually an Oracle 
issue?


-- 
David Moore
Senior Software Engineer
St. James Software
Email: dav...@sjsoft.com



----- Original Message -----
> Hi Michael,
> 
> Indeed, you are correct - adding the message column to the cx_oracle
> query shows the same behaviour as the sqlalchemy query.  Sorry I
> missed that.
> 
> I will take these results to the cx_oracle list on this basis.  On my
> second question, though, is there a way to enforce
> convert_unicode='force' only on Oracle databases across an
> application?  I know the hooking syntax changed from sqlalchemy 0.6
> to 0.7, so I'd imagine if there is a way, it would be different for
> those two versions.
> 
> thanks,
> 
> --
> David Moore
> Senior Software Engineer
> St. James Software
> Email: dav...@sjsoft.com
> 
> 
> ----- Original Message -----
> > Hi David -
> > 
> > I've done some experiments with the script you gave me.  Correct me
> > if I'm wrong (which is very possible), but the attached revised
> > version appears to demonstrate the difference between the cx_oracle
> > and SQLAlchemy versions is really just that the SQLAlchemy version
> > is running a different SQL string:
> > 
> >     SELECT test_table.colid, test_table.message
> >     FROM test_table
> >     WHERE test_table.colid = :colid_1
> > 
> > whereas the cx_oracle version, critically, does not ask for the
> > "message" column:
> > 
> >     SELECT test_table.colid FROM test_table WHERE test_table.colid =
> >     :colid
> > 
> > it's not clear to me if the placement of the "message" column
> > impacts
> > the Oracle planner, or otherwise if the latency is on the cx_oracle
> > side regarding buffering of columns or something similar.     But i
> > am able to get the SQLAlchemy version to be as fast as the
> > cx_oracle
> > version, and the cx_oracle version to be as slow as the sqlalchemy
> > version, by just adding/removing the "message" column from the
> > query.
> > 
> > The attached script also tries to eliminate the overhead of
> > SQLAlchemy compiling the SQL construct within the timed portion, as
> > I was testing this against only 40K rows and wanted to try to get
> > the results as close as possible.  It's only dealing with the
> > unicode parameter, and it's true if you encode the parameter first,
> > both the SQLA and cx_oracle versions get faster, even to the point
> > that the two-column version minus unicode is faster than the one
> > column version with unicode.   My guess is that cx_oracle is
> > detecting a unicode value in the input and doing something
> > different
> > with the statement overall as a result of it being present.
> > 
> > Some typical set of results are:
> > 
> > Setting up table
> > sa_both_cols 1.20209717751
> > sa_one_col 0.0182020664215
> > sa_literal_sql 0.019690990448
> > cx_oracle_one_col 0.0839619636536
> > cx_oracle_both_cols 1.18438816071
> > 
> > Setting up table
> > sa_both_cols 1.17753505707
> > sa_one_col 0.0785720348358
> > sa_literal_sql 0.016618013382
> > cx_oracle_one_col 0.0194280147552
> > cx_oracle_both_cols 1.15302705765
> > 
> > while the cx_oracle/SQLA versions seem to compete for the "fast"
> > version, which is because we're only running the statement once and
> > I'm running on a low performing amazon small instance, it's clear
> > the "two column" version is where the vast amount of latency occurs
> > for both systems.
> > 
> > 
> > let me know if you can confirm similar results on your end.  I
> > tested
> > only with SQLAlchemy 0.8 in trunk but I was able to reproduce your
> > initial results so I assume the experience with 0.7, 0.6 would be
> > similar.
> > 
> > 
> > 
> > 
> > 
> > 
> > 
> > 
> > 
> > On Aug 1, 2012, at 6:40 AM, David Moore wrote:
> > 
> > > Hi all,
> > > 
> > > This is a bug in a product with multiple versions in support.
> > >  The
> > > issue can be reproduced with
> > > sqlalchemy: 0.6.1, 0.6.6, 0.7.8
> > > cx_Oracle: 5.0.4, 5.1, 5.1.2
> > > Oracle: 10g XE and 10g Enterprise
> > > 
> > > We have tables created with sqlalchemy.String primary keys and
> > > other indexes.  When querying through sqlalchemy, when we pass in
> > > a unicode value for the key, Oracle does not use the index, and
> > > instead uses a full table scan.  When executing the exact same
> > > query through cx_Oracle, we don't see this issue.  The script
> > > attached is a small test case which demonstrates the issue.
> > >  Sample output:
> > > 
> > >  Running select
> > >  Unicode params took 3.455 seconds
> > >  Encoded params took 0 seconds
> > >  Running cx_Oracle select
> > >  Unicode params took 0.0619998 seconds
> > >  Encoded params took 0 seconds
> > > 
> > > And running the cx_Oracle select first to prevent pollution by
> > > caching:
> > > 
> > >  Running cx_Oracle select
> > >  Unicode params took 0.0469999 seconds
> > >  Encoded params took 0 seconds
> > >  Running select
> > >  Unicode params took 3.267 seconds
> > >  Encoded params took 0.0159998 seconds
> > > 
> > > As can be seen, there's a slight slowdown with the unicode
> > > parameter on cx_Oracle, but nothing compared to the slowdown on
> > > sqlalchemy.  The sqlalchemy unicode parameter select is the only
> > > one which performs a full table scan, so that makes sense.
> > > 
> > > This issue has been mentioned on this list before - 15 Feb with
> > > title "Full Table scan with Oracle due to Charset conversion" and
> > > 4 April with title "Problem with Oracle requests" - but in those
> > > cases, the issue was reproducible with the cx_Oracle driver as
> > > well, which it is not here.
> > > 
> > > Using the convert_unicode='force' argument to the String types
> > > does
> > > solve the issue, but there are a few reasons I don't want to do
> > > that.  This product also supports PostgreSQL, SQL Server and
> > > MySQL, and I do not want to make changes which will affect
> > > running
> > > under those databases.  This is also happening in an otherwise
> > > very stable product under long-term support in a number of
> > > places,
> > > so whatever change is made needs to be minimal.
> > > 
> > > Adding an additional function-based index on SYS_OP_C2C(column)
> > > for
> > > each of the affected columns also solves this issue with a slight
> > > performance change for the extra indexes which need to be kept up
> > > to date.  This is the stopgap solution we've gone for in the
> > > meantime for those systems on production.  It's not ideal,
> > > though,
> > > and a bit difficult to automate and then verify the results.
> > >  We'd
> > > prefer to know why this was happening, and maybe fix a sqlalchemy
> > > issue in the process.
> > > 
> > > So, my questions are:
> > >  * What is sqlalchemy doing differently to straight cx_Oracle
> > >  that
> > >  triggers this behaviour?  How can I debug this further to find
> > >  out?
> > >  * Is there a way, on sqlalchemy 0.6.x, to set up a hook which
> > >  will
> > >  use convert_unicode='force' only on an Oracle database and only
> > >  for String columns, automatically across an application?
> > >   There's
> > >  no create_engine argument for convert_unicode='force', which is
> > >  a
> > >  pity.
> > >  * And a hook for sqlalchemy 0.7.x to achieve the same result?
> > > 
> > > Thanks in advance for any help,
> > > regards,
> > > --
> > > David Moore
> > > Senior Software Engineer
> > > St. James Software
> > > Email: dav...@sjsoft.com
> > > 
> > > --
> > > You received this message because you are subscribed to the
> > > Google
> > > Groups "sqlalchemy" group.
> > > To post to this group, send email to sqlalchemy@googlegroups.com.
> > > To unsubscribe from this group, send email to
> > > sqlalchemy+unsubscr...@googlegroups.com.
> > > For more options, visit this group at
> > > http://groups.google.com/group/sqlalchemy?hl=en.
> > > 
> > > <oracle_sqlalchemy_encoding_error.py>
> > 
> > 
> > --
> > You received this message because you are subscribed to the Google
> > Groups "sqlalchemy" group.
> > To post to this group, send email to sqlalchemy@googlegroups.com.
> > To unsubscribe from this group, send email to
> > sqlalchemy+unsubscr...@googlegroups.com.
> > For more options, visit this group at
> > http://groups.google.com/group/sqlalchemy?hl=en.
> > 
> > 
> 
> --
> You received this message because you are subscribed to the Google
> Groups "sqlalchemy" group.
> To post to this group, send email to sqlalchemy@googlegroups.com.
> To unsubscribe from this group, send email to
> sqlalchemy+unsubscr...@googlegroups.com.
> For more options, visit this group at
> http://groups.google.com/group/sqlalchemy?hl=en.
> 
> 

-- 
You received this message because you are subscribed to the Google Groups 
"sqlalchemy" group.
To post to this group, send email to sqlalchemy@googlegroups.com.
To unsubscribe from this group, send email to 
sqlalchemy+unsubscr...@googlegroups.com.
For more options, visit this group at 
http://groups.google.com/group/sqlalchemy?hl=en.

Re: [sqlalchemy] Full table scan using Oracle String indexes

Reply via email to