Re: [sqlalchemy] pyodbc and is_disconnect
Fantastic, thanks so much for the help, Mike.

Kind regards,

David Moore
Support Technical Lead
j5 Software South Africa
Skype: davidm.j5int
Phone: +27 21 762 1440
Email: dav...@j5int.com
Web: www.j5int.com

--
This message is confidential. It may also be privileged or otherwise protected by work product immunity or other legal rules. If you have received it by mistake, please let us know by e-mail reply and delete it from your system; you may not copy this message or disclose its contents to anyone. Please send us by fax any message containing deadlines as incoming e-mails are not screened for response deadlines. The integrity and security of this message cannot be guaranteed on the Internet. You should carry out your own virus checks before opening any attachments. Opinions, conclusions and other information that do not relate to the official business of the company are neither given nor endorsed by it.

On Thu, Sep 28, 2017 at 3:46 PM, Mike Bayer wrote:
>
> On Thu, Sep 28, 2017 at 3:58 AM, David Moore wrote:
>> Hi,
>>
>> I've recently had an issue with pyodbc not correctly identifying a
>> disconnect exception when connected to a replicating SQL Server cluster
>> with failover. As far as I've been able to ascertain, what happened is
>> that the database failed over, leaving all connections in a weird state.
>> Since sqlalchemy didn't correctly identify the errors as disconnect
>> exceptions, it kept trying to use the connections, which never became
>> usable again. We recycle connections after an hour, but that left us
>> with an hour of no database functionality. Production SQLAlchemy version
>> is 1.0.6, but there is no relevant change I can see on latest master.
>>
>> So, I went digging into how sqlalchemy classifies these errors. It seems
>> the only disconnect condition that pyodbc takes special note of is
>> [08S01] (in lib/sqlalchemy/connectors/pyodbc.py). Back before we used
>> sqlalchemy, we collected a more comprehensive set of errors which imply
>> a disconnection event, and I'd love to see these in sqlalchemy. These
>> are '01002', '08003', '08007', '08S01', '08S02', '08001', 'HYT00' and
>> 'HY010'.
>>
>> So, two questions (assuming that these additions will be welcome):
>>
>> * I notice 08S01 is handled in lib/sqlalchemy/connectors/pyodbc.py,
>> whereas pymssql looks for disconnect errors in
>> lib/sqlalchemy/dialects/pymssql.py. Where is the correct place to patch
>> this? The former, or lib/sqlalchemy/dialects/pyodbc.py?
>
> I think the presence of 08S01 in connectors/pyodbc.py is a bug - this
> error code is specific to SQL Server, so it should be in
> dialects/mssql/pyodbc. For that and for adding the error codes,
> https://bitbucket.org/zzzeek/sqlalchemy/issues/4095/sql-server-close-connection-codes-for
> has been filed; a fix should be pushed today.
>
>> * Is there a place I can hook or extend or override our current setup
>> to get this detection into production without a full sqlalchemy upgrade
>> testing cycle?
>
> big time, there's an exception catch hook with deep functionality:
>
> http://docs.sqlalchemy.org/en/latest/core/events.html?highlight=handle_error#sqlalchemy.events.ConnectionEvents.handle_error
>
> per the example we can illustrate your codes:
>
> @event.listens_for(Engine, "handle_error")
> def handle_exception(context):
>     if isinstance(context.original_exception, pyodbc.Error):
>         for code in ('08S01', '01002', '08003', '08007',
>                      '08S02', '08001', 'HYT00', 'HY010'):
>             if code in str(context.original_exception):
>                 context.is_disconnect = True
>
>> Kind regards,
>>
>> David Moore
[snip]
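The detection logic in the listener above can be factored into a small standalone helper, which makes it easy to unit-test without a live connection. This is just an illustrative sketch (the helper name is hypothetical; the SQLSTATE codes are the ones listed in the thread):

```python
# SQLSTATE codes from the thread that imply the connection is gone.
DISCONNECT_CODES = ('08S01', '01002', '08003', '08007',
                    '08S02', '08001', 'HYT00', 'HY010')

def is_disconnect_error(exc):
    """Return True if the exception text contains a known disconnect code."""
    message = str(exc)
    return any(code in message for code in DISCONNECT_CODES)

# The handle_error listener then reduces to:
#
# @event.listens_for(Engine, "handle_error")
# def handle_exception(context):
#     if isinstance(context.original_exception, pyodbc.Error):
#         if is_disconnect_error(context.original_exception):
#             context.is_disconnect = True
```

Keeping the code list in one place also means it can be shared between a temporary event-listener patch and a later dialect-level fix.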
[sqlalchemy] pyodbc and is_disconnect
Hi,

I've recently had an issue with pyodbc not correctly identifying a disconnect exception when connected to a replicating SQL Server cluster with failover. As far as I've been able to ascertain, what happened is that the database failed over, leaving all connections in a weird state. Since sqlalchemy didn't correctly identify the errors as disconnect exceptions, it kept trying to use the connections, which never became usable again. We recycle connections after an hour, but that left us with an hour of no database functionality. Production SQLAlchemy version is 1.0.6, but there is no relevant change I can see on latest master.

So, I went digging into how sqlalchemy classifies these errors. It seems the only disconnect condition that pyodbc takes special note of is [08S01] (in lib/sqlalchemy/connectors/pyodbc.py). Back before we used sqlalchemy, we collected a more comprehensive set of errors which imply a disconnection event, and I'd love to see these in sqlalchemy. These are '01002', '08003', '08007', '08S01', '08S02', '08001', 'HYT00' and 'HY010'.

So, two questions (assuming that these additions will be welcome):

* I notice 08S01 is handled in lib/sqlalchemy/connectors/pyodbc.py, whereas pymssql looks for disconnect errors in lib/sqlalchemy/dialects/pymssql.py. Where is the correct place to patch this? The former, or lib/sqlalchemy/dialects/pyodbc.py?

* Is there a place I can hook or extend or override our current setup to get this detection into production without a full sqlalchemy upgrade testing cycle?

Kind regards,

David Moore
Support Technical Lead
j5 Software South Africa
Skype: davidm.j5int
Phone: +27 21 762 1440
Email: dav...@j5int.com
Web: www.j5int.com

--
SQLAlchemy - The Python SQL Toolkit and Object Relational Mapper
http://www.sqlalchemy.org/

To post example code, please provide an MCVE: Minimal, Complete, and Verifiable Example. See http://stackoverflow.com/help/mcve for a full description.
---
You received this message because you are subscribed to the Google Groups "sqlalchemy" group.
To unsubscribe from this group and stop receiving emails from it, send an email to sqlalchemy+unsubscr...@googlegroups.com.
To post to this group, send email to sqlalchemy@googlegroups.com.
Visit this group at https://groups.google.com/group/sqlalchemy.
For more options, visit https://groups.google.com/d/optout.
[sqlalchemy] Sequences in Oracle RAC out of order
Hello,

I just chased down a bug in our application while using SQLAlchemy 1.0.6 in an Oracle RAC environment. It should probably be noted in the SQLAlchemy documentation somewhere, and possibly motivate a change to the options for sequence creation on Oracle.

Basically, our application assumes that a column we attach a sqlalchemy.Sequence to will always increment, and we use that to process a set of changes in strict order. However, if you read the Oracle docs on SEQUENCE creation (https://docs.oracle.com/cd/B12037_01/server.101/b10759/statements_6014.htm), that is not true by default in a RAC environment: each node running your database grabs a cache of sequence numbers to serve out, equal to the CACHE size (the default is 20), and whichever node processes your insert assigns a sequence number from its cache.

I understand this is done for performance reasons, and in the very common case where all you want from your Sequence is to autogenerate a unique id, it's absolutely fine. However, it would be nice to have an option to specify the ORDER keyword when creating your SEQUENCE if you really need a Sequence in sequential order, and a documentation note that this is an oddity of Oracle RAC.

Kind regards,

Dave Moore
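For anyone who needs RAC-wide ordering today, the sequence can be created with the ORDER keyword via plain DDL. A minimal sketch of building that DDL string (the helper name is hypothetical; CACHE 20 is Oracle's documented default, and note that ORDER forces the RAC nodes to coordinate, which costs some performance):

```python
def create_sequence_ddl(name, order=False, cache=20):
    """Build Oracle CREATE SEQUENCE DDL; ORDER guarantees values are
    handed out in request order across RAC nodes."""
    parts = ["CREATE SEQUENCE %s" % name]
    # CACHE n preallocates numbers per node; NOCACHE fetches one at a time.
    parts.append(("CACHE %d" % cache) if cache else "NOCACHE")
    # NOORDER is Oracle's default and allows out-of-order values on RAC.
    parts.append("ORDER" if order else "NOORDER")
    return " ".join(parts)
```

The resulting string could be emitted with SQLAlchemy's `DDL()` construct after table creation, as a workaround until the Sequence object itself grows such an option.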
Re: [sqlalchemy] Support for Oracle 12c auto increment (IDENTITY) columns?
- Original Message -
From: "Piotr Dobrogost"
To: "sqlalchemy"
Sent: Wednesday, April 13, 2016 1:50:19 PM
Subject: Re: [sqlalchemy] Support for Oracle 12c auto increment (IDENTITY) columns?

Mike,

Thanks for your reply!

On Wednesday, April 13, 2016 at 1:15:32 PM UTC+2, Mike Bayer wrote:
> We've not started supporting new oracle 12c features as of yet. In this
> case it might be possible to get it working with some dialect flags,
> since we already use "returning" to get at the newly generated primary
> key, although testing would be needed and other assumptions in the
> dialect might get in the way.

Which flags do you have in mind? Looking at http://docs.sqlalchemy.org/en/latest/dialects/oracle.html I don't see anything which might be useful to make it work.

I was surprised Oracle needs different syntax with an explicit sequence in SA. Having one syntax for an auto increment column (primary key) across all backends seems like a very important feature to have. What is the reason there's no Oracle dialect option to generate and use a suitable sequence for such a column? Is there some recipe solving this problem?

Regards,
Piotr Dobrogost
From the license under which you can use this (http://www.oracle.com/technetwork/licenses/standard-license-152015.html):

"License Rights and Restrictions
Oracle grants You a nonexclusive, nontransferable, limited license to internally use the Programs, subject to the restrictions stated in this Agreement, only for the purpose of developing, testing, prototyping, and demonstrating Your application and only as long as Your application has not been used for any data processing, business, commercial, or production purposes, and not for any other purpose."

I suspect you would not be able to use this for developing and testing sqlalchemy without breaking those terms. Oracle can get really nasty about using Developer Days when they think you've broken this agreement.

regards,
Dave Moore
Re: [sqlalchemy] Sqlalchemy issue on Windows Server 2008 R2 sp1
- Original Message -
> Hi everyone, as the title says I'm trying to run sqlalchemy (pyodbc)
> with SQL Server 2008 on Windows Server 2008 R2 sp1 but I'm failing
> with the database connection. Here is a small script I'm using to
> test the connection:
>
> from sqlalchemy import *
> from sqlalchemy.engine import reflection
> url = "mssql+pyodbc://user:password@my.server.address/server_test"
> e = create_engine(url)
> insp = reflection.Inspector.from_engine(e)
> print insp.get_table_names()

You're probably using a named instance, in which case you need to use the instance name as part of the server name. You can find it in the connection details Management Studio connects to, or in the service name. For instance, the default named instance for SQL Server Express is SQLEXPRESS, so your url above should be:

"mssql+pyodbc://user:password@my.server.address\\SQLEXPRESS/server_test"

HTH,
Dave

[snip]

--
David Moore
Senior Software Engineer
St. James Software
Email: dav...@sjsoft.com
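The instance-name detail above is easy to get wrong because of backslash escaping in Python source, so a tiny helper that assembles the URL can save some head-scratching. A sketch (the helper name is hypothetical, and URL-quoting of special characters in passwords is deliberately ignored here):

```python
def mssql_pyodbc_url(user, password, host, database, instance=None):
    """Build an mssql+pyodbc URL, optionally targeting a named instance.

    A named instance is addressed as host\\instance; in Python source the
    backslash must be escaped, which is the usual source of confusion.
    """
    server = host if instance is None else "%s\\%s" % (host, instance)
    return "mssql+pyodbc://%s:%s@%s/%s" % (user, password, server, database)
```

The resulting string is what you would pass straight to `create_engine()`.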
Re: [sqlalchemy] Full table scan using Oracle String indexes
Hi Yann,

Sorry for the confusion - my email was musing about what I thought sqlalchemy should do, not what I thought you should do.

The way we have solved this is using the cx_Oracle inputtypehandler hook. We've overridden where the connection is created by sqlalchemy, and then used the following snippet:

def OracleInputTypeHandler(cursor, value, numElements):
    if isinstance(value, unicode):
        return cursor.var(str, arraysize=numElements,
                          inconverter=lambda x: x.encode(cursor.connection.nencoding))

connection.inputtypehandler = OracleInputTypeHandler

This converts all unicode bind parameters passed into cx_Oracle into the client encoding.

HTH,
Dave Moore

- Original Message -
> Le jeudi 25 octobre 2012 16:21:49 UTC+2, David Moore a écrit :
> > Hello,
>
> Hi,
>
> > In line with what is required for Python 3, would it not make sense
> > to insist across the board that bind values to sqlalchemy.String
> > should be bytestrings and that bind values to sqlalchemy.Unicode
> > should be unicode strings, converting if necessary?
>
> I am using the Unicode class and not String directly. But the code I
> was referring to is in the String class.
>
> > I don't think I understand why you would not want that ever.
>
> I am using unicode strings, but I'd like sqlalchemy to convert them
> to the dialect encoding in order to avoid a problem with oracle
> indexes not being used when querying with unicode strings.
>
> regards,
> > Dave Moore
[snip]
Re: [sqlalchemy] Full table scan using Oracle String indexes
Hello,

In line with what is required for Python 3, would it not make sense to insist across the board that bind values to sqlalchemy.String should be bytestrings and that bind values to sqlalchemy.Unicode should be unicode strings, converting if necessary? I don't think I understand why you would not want that ever.

regards,
Dave Moore

- Original Message -
> Hi,
>
> Having had the same problem as above, I would like to use the
> "convert_unicode='force'" flag at engine configuration level.
>
> It seems that String tests the value of "convert_unicode" at its own
> level as well as engine level, but it isn't the case for the 'force'
> value check:
>
> def bind_processor(self, dialect):
>     if self.convert_unicode or dialect.convert_unicode:
>         if dialect.supports_unicode_binds and \
>                 self.convert_unicode != 'force':
>             ...
>
> def result_processor(self, dialect, coltype):
>     wants_unicode = self.convert_unicode or dialect.convert_unicode
>     needs_convert = wants_unicode and \
>         (dialect.returns_unicode_strings is not True or
>          self.convert_unicode == 'force')
>
> Thus I have to create a type decorator for all my Strings even if I
> have set the right flag at engine configuration level...
>
> Yann
>
> Le jeudi 2 août 2012 23:22:22 UTC+2, Michael Bayer a écrit :
> > so the "convert_unicode='force'" flag is not ideal here as that
> > will spend most of its time checking for decodes necessary on the
> > result set side, which is a lot of wasted effort. You can customize
> > how strings are handled on the bind side, including per-dialect
> > behavior, using a custom type:
> >
> > class MyStringType(TypeDecorator):
> >     impl = String
> >
> >     def process_bind_param(self, value, dialect):
> >         if value is not None and dialect.name == "oracle":
> >             value = value.encode('utf-8')
> >
> > then replace usage of the String(length=XYZ) type with
> > MyStringType(length=XYZ).
> >
> > docs:
> > http://docs.sqlalchemy.org/en/rel_0_7/core/types.html#types-custom
> >
> > On Aug 2, 2012, at 5:41 AM, David Moore wrote:
[snip]
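The essential transformation in both the TypeDecorator and the inputtypehandler approaches quoted above is the same: encode unicode bind values to the client encoding before they reach the driver, so Oracle compares varchar to varchar and can use the index. In Python 3 terms it reduces to something like this sketch (the function name is hypothetical; the dialect-name check mirrors the MyStringType example):

```python
def encode_bind_param(value, dialect_name, encoding="utf-8"):
    """Encode str bind values to bytes for Oracle so that a varchar
    index is matched directly, avoiding the implicit SYS_OP_C2C
    conversion that defeats the index."""
    if dialect_name == "oracle" and isinstance(value, str):
        return value.encode(encoding)
    # Other dialects (or already-encoded values) pass through untouched.
    return value
```

A TypeDecorator's `process_bind_param` would simply delegate to a function like this, keeping the per-dialect policy in one place.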
Re: [sqlalchemy] Postgres migration issue
- Original Message -
> Hi
>
> I usually use MySQL to develop on, however I need to work with
> Postgres for the first time today so I fired it up. I have a routine
> which converts a non-SQL database into the database of choice,
> converting its schema into a new table in the target database using
> SQA, and then copies all the data in the source database into the
> new SQL table.
>
> That all worked fine into the Postgres+pg8000 database. My problem is
> when I then attempt to open up a table again using auto reflection I
> get an error I've never seen before, and I don't get how this can
> be, given the table was created via sqlalchemy? The data looks fine
> in the table, and all columns are created as I expected (converting
> to the correct Postgres column types etc.)
>
> Error when I issue t = Table('my_table', meta, autoload=True) is:
> (sorry about the screen shot, I'm working in a bad RDP client and
> can't cut/paste into my Mac. :-( )
>
> So it appears to be having some problem in the reflection, but I
> can't see why - I hope there is a setting in the connection or
> something I can do to fix this up? I've never used Postgres before,
> so I'm groping in the dark. From Googling around, it appears that
> there is some issue with determining the schema or some such, but
> it's all assuming I know a lot more about Postgres than I do!
>
> Cheers
> Warwick

Hi Warwick,

You are using pg8000 1.08 and PostgreSQL >= 9.0. Upgrade to pg8000 1.09, which fixes this issue (there are new PostgreSQL types introduced in version 9 which pg8000 didn't know of in 1.08; support was added in 1.09).

regards,

--
David Moore
Senior Software Engineer
St. James Software
Email: dav...@sjsoft.com
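Since the failure mode above only shows up at reflection time, a defensive version check at startup can give a clearer error than a failed `Table(..., autoload=True)`. A sketch in plain Python (the helper names are hypothetical; most packages, pg8000 included, expose their version as a dotted string, but confirm the attribute name for your driver):

```python
def version_tuple(version_string):
    """Turn a dotted version like '1.08' or '1.09' into a comparable tuple."""
    return tuple(int(part) for part in version_string.split("."))

def check_pg8000(version_string, minimum="1.09"):
    """Fail fast if the installed pg8000 predates PostgreSQL 9.x type support."""
    if version_tuple(version_string) < version_tuple(minimum):
        raise RuntimeError(
            "pg8000 %s is too old for PostgreSQL >= 9.0; "
            "upgrade to %s or later" % (version_string, minimum))
```

Calling this once before creating the engine turns a cryptic reflection traceback into an actionable message.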
Re: [sqlalchemy] Full table scan using Oracle String indexes
Hi,

Further searching seems to indicate this is an Oracle issue, not a cx_Oracle issue.

http://www.digipedia.pl/usenet/thread/15912/2814/ - same problem with cx_Oracle
http://support.unify.com/supportforum/viewtopic.php?f=40&t=3823 - happening with perl drivers.
http://tao.qshine.com/note/note_ora.htm - happening with ADO.NET drivers.

There's also evidence this problem occurs with MS SQL Server as well:

http://www.microsoftfaqs.com/Articles/1148439/Index_not_used_when_UNICODE_%3D_true - SQL Server 2005
http://blog.brianhartsock.com/2008/12/14/nvarchar-vs-varchar-in-sql-server-beware/

Although, it seems in SQL Server at least, it still uses the index, just in a slower manner, which is probably why we haven't noticed this issue yet.

It seems to me that the place to fix this is when the type of the column is known, which is when sqlalchemy binds the parameter. There's some discussion in one of those links that, at the sql statement preparation level, you don't know what the column datatype is, so you can't fix it there.

Would it be possible for String datatypes to detect unicode values and encode them with the database charset?

As a lower-level temporary solution, you can get cx_Oracle to encode all unicode parameters to strings, but that's obviously the wrong thing to do if you have any Unicode columns. This snippet:

def InputTypeHandler(cursor, value, numElements):
    if isinstance(value, unicode):
        return cursor.var(str, arraysize=numElements,
                          inconverter=lambda x: x.encode(cursor.connection.nencoding))

And then when creating the connection:

connection.inputtypehandler = InputTypeHandler

thanks for the help,

--
David Moore
Senior Software Engineer
St. James Software
Email: dav...@sjsoft.com

- Original Message -
> And for anyone else experiencing this issue, there was a subtle
> difference in the execution plans that's now apparent.
[snip]
Re: [sqlalchemy] Full table scan using Oracle String indexes
And for anyone else experiencing this issue, there was a subtle difference in the execution plans that's now apparent. The statement which only selects colid runs a FAST FULL SCAN and successfully converts the unicode parameter using SYS_OP_C2C, and uses the index. When you select both colid and message, it runs a FULL SCAN, and even though it seems to detect the conversion can be done, it does not use the index at all. Perhaps this is actually an Oracle issue?

--
David Moore
Senior Software Engineer
St. James Software
Email: dav...@sjsoft.com

- Original Message -
> Hi Michael,
>
> Indeed, you are correct - adding the message column to the cx_oracle
> query shows the same behaviour as the sqlalchemy query. Sorry I
> missed that.
>
> I will take these results to the cx_oracle list on this basis. On my
> second question, though, is there a way to enforce
> convert_unicode='force' only on Oracle databases across an
> application? I know the hooking syntax changed from sqlalchemy 0.6
> to 0.7, so I'd imagine if there is a way, it would be different for
> those two versions.
>
> thanks,
>
> --
> David Moore
> Senior Software Engineer
> St. James Software
> Email: dav...@sjsoft.com
>
> - Original Message -
> > Hi David -
> >
> > I've done some experiments with the script you gave me. Correct me
> > if I'm wrong (which is very possible), but the attached revised
> > version appears to demonstrate that the difference between the
> > cx_oracle and SQLAlchemy versions is really just that the SQLAlchemy
> > version is running a different SQL string:
> >
> > SELECT test_table.colid, test_table.message
> > FROM test_table
> > WHERE test_table.colid = :colid_1
> >
> > whereas the cx_oracle version, critically, does not ask for the
> > "message" column:
> >
> > SELECT test_table.colid FROM test_table WHERE test_table.colid = :colid
> >
> > it's not clear to me if the placement of the "message" column
> > impacts the Oracle planner, or otherwise if the latency is on the
> > cx_oracle side regarding buffering of columns or something similar.
> > But I am able to get the SQLAlchemy version to be as fast as the
> > cx_oracle version, and the cx_oracle version to be as slow as the
> > sqlalchemy version, by just adding/removing the "message" column
> > from the query.
> >
> > The attached script also tries to eliminate the overhead of
> > SQLAlchemy compiling the SQL construct within the timed portion, as
> > I was testing this against only 40K rows and wanted to try to get
> > the results as close as possible. It's only dealing with the
> > unicode parameter, and it's true that if you encode the parameter
> > first, both the SQLA and cx_oracle versions get faster, even to the
> > point that the two-column version minus unicode is faster than the
> > one-column version with unicode. My guess is that cx_oracle is
> > detecting a unicode value in the input and doing something
> > different with the statement overall as a result of it being present.
> >
> > Some typical sets of results are:
> >
> > Setting up table
> > sa_both_cols 1.20209717751
> > sa_one_col 0.0182020664215
> > sa_literal_sql 0.019690990448
> > cx_oracle_one_col 0.0839619636536
> > cx_oracle_both_cols 1.18438816071
> >
> > Setting up table
> > sa_both_cols 1.17753505707
> > sa_one_col 0.0785720348358
> > sa_literal_sql 0.016618013382
> > cx_oracle_one_col 0.0194280147552
> > cx_oracle_both_cols 1.15302705765
> >
> > while the cx_oracle/SQLA versions seem to compete for the "fast"
> > version, which is because we're only running the statement once and
> > I'm running on a low-performing amazon small instance, it's clear
> > the "two column" version is where the vast amount of latency occurs
> > for both systems.
> >
> > let me know if you can confirm similar results on your end. I
> > tested only with SQLAlchemy 0.8 in trunk, but I was able to
> > reproduce your initial results so I assume the experience with 0.7,
> > 0.6 would be similar.
> >
> > On Aug 1, 2012, at 6:40 AM, David Moore wrote:
> >
> > > Hi all,
> > >
> > > This is a bug in a product with multiple versions in support. The
> > > issue can be reproduced with
[snip]
Re: [sqlalchemy] Full table scan using Oracle String indexes
Hi Michael,

Indeed, you are correct - adding the message column to the cx_oracle query shows the same behaviour as the sqlalchemy query. Sorry I missed that. I will take these results to the cx_oracle list on this basis.

On my second question, though: is there a way to enforce convert_unicode='force' only on Oracle databases across an application? I know the hooking syntax changed from sqlalchemy 0.6 to 0.7, so I'd imagine if there is a way, it would be different for those two versions.

thanks,
--
David Moore
Senior Software Engineer
St. James Software
Email: dav...@sjsoft.com

----- Original Message -----
> Hi David -
>
> I've done some experiments with the script you gave me. Correct me
> if I'm wrong (which is very possible), but the attached revised
> version appears to demonstrate that the difference between the
> cx_oracle and SQLAlchemy versions is really just that the SQLAlchemy
> version is running a different SQL string:
>
> SELECT test_table.colid, test_table.message
> FROM test_table
> WHERE test_table.colid = :colid_1
>
> whereas the cx_oracle version, critically, does not ask for the
> "message" column:
>
> SELECT test_table.colid FROM test_table WHERE test_table.colid = :colid
>
> It's not clear to me if the presence of the "message" column impacts
> the Oracle planner, or otherwise if the latency is on the cx_oracle
> side regarding buffering of columns or something similar. But I am
> able to get the SQLAlchemy version to be as fast as the cx_oracle
> version, and the cx_oracle version to be as slow as the SQLAlchemy
> version, by just adding/removing the "message" column from the query.
>
> The attached script also tries to eliminate the overhead of
> SQLAlchemy compiling the SQL construct within the timed portion, as
> I was testing this against only 40K rows and wanted to get the
> results as close as possible. It's only dealing with the unicode
> parameter, and it's true that if you encode the parameter first,
> both the SQLA and cx_oracle versions get faster, even to the point
> that the two-column version minus unicode is faster than the
> one-column version with unicode. My guess is that cx_oracle is
> detecting a unicode value in the input and doing something different
> with the statement overall as a result of it being present.
>
> Some typical sets of results are:
>
> Setting up table
> sa_both_cols 1.20209717751
> sa_one_col 0.0182020664215
> sa_literal_sql 0.019690990448
> cx_oracle_one_col 0.0839619636536
> cx_oracle_both_cols 1.18438816071
>
> Setting up table
> sa_both_cols 1.17753505707
> sa_one_col 0.0785720348358
> sa_literal_sql 0.016618013382
> cx_oracle_one_col 0.0194280147552
> cx_oracle_both_cols 1.15302705765
>
> While the cx_oracle/SQLA versions seem to compete for the "fast"
> spot - which is because we're only running the statement once and
> I'm running on a low-performing Amazon small instance - it's clear
> the "two column" version is where the vast amount of latency occurs
> for both systems.
>
> Let me know if you can confirm similar results on your end. I tested
> only with SQLAlchemy 0.8 in trunk, but I was able to reproduce your
> initial results, so I assume the experience with 0.7 and 0.6 would
> be similar.
>
> On Aug 1, 2012, at 6:40 AM, David Moore wrote:
>
> > Hi all,
> >
> > This is a bug in a product with multiple versions in support. The
> > issue can be reproduced with
> > sqlalchemy: 0.6.1, 0.6.6, 0.7.8
> > cx_Oracle: 5.0.4, 5.1, 5.1.2
> > Oracle: 10g XE and 10g Enterprise
> >
> > We have tables created with sqlalchemy.String primary keys and
> > other indexes. When querying through sqlalchemy, when we pass in
> > a unicode value for the key, Oracle does not use the index, and
> > instead uses a full table scan. When executing the exact same
> > query through cx_Oracle, we don't see this issue. The script
> > attached is a small test case which demonstrates the issue.
> > Sample output:
> >
> > Running select
> > Unicode params took 3.455 seconds
> > Encoded params took 0 seconds
> > Running cx_Oracle select
> > Unicode params took 0.0619998 seconds
> > Encoded params took 0 seconds
> >
> > And running the cx_Oracle select first to prevent pollution by
> > caching:
> >
> > Running cx_Oracle select
> > Unicode params took 0.046 seconds
> > Encoded params took 0 seconds
> > Running select
> > Unicode params took 3.267
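The per-case timing methodology in the message above - running each labelled variant the same way and printing label/seconds pairs - can be sketched with a stdlib-only harness. This is an illustration, not the attached script: the labels and workloads here are placeholders, and in the real test each callable would execute a pre-compiled statement against Oracle.

```python
import timeit


def time_case(label, fn, number=1):
    """Run fn() `number` times and return (label, total seconds)."""
    return label, timeit.timeit(fn, number=number)


# Placeholder workloads standing in for the one-column and
# two-column SELECTs used in the thread.
cases = [
    ("sa_one_col", lambda: sum(range(1000))),
    ("sa_both_cols", lambda: sum(range(100000))),
]

for label, seconds in (time_case(lbl, fn) for lbl, fn in cases):
    print("%s %.6f" % (label, seconds))
```

Compiling the statement outside the timed callable, as Mike's revised script does, keeps SQLAlchemy's compile overhead out of the comparison.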
[sqlalchemy] Full table scan using Oracle String indexes
Hi all,

This is a bug in a product with multiple versions in support. The issue can be reproduced with:

sqlalchemy: 0.6.1, 0.6.6, 0.7.8
cx_Oracle: 5.0.4, 5.1, 5.1.2
Oracle: 10g XE and 10g Enterprise

We have tables created with sqlalchemy.String primary keys and other indexes. When we query through sqlalchemy and pass in a unicode value for the key, Oracle does not use the index and instead performs a full table scan. When executing the exact same query through cx_Oracle directly, we don't see this issue. The attached script is a small test case which demonstrates the problem. Sample output:

Running select
Unicode params took 3.455 seconds
Encoded params took 0 seconds
Running cx_Oracle select
Unicode params took 0.0619998 seconds
Encoded params took 0 seconds

And running the cx_Oracle select first to prevent pollution by caching:

Running cx_Oracle select
Unicode params took 0.046 seconds
Encoded params took 0 seconds
Running select
Unicode params took 3.267 seconds
Encoded params took 0.0159998 seconds

As can be seen, there's a slight slowdown with the unicode parameter on cx_Oracle, but nothing compared to the slowdown through sqlalchemy. The sqlalchemy unicode-parameter select is the only one which performs a full table scan, so that makes sense.

This issue has been mentioned on this list before - on 15 Feb under the title "Full Table scan with Oracle due to Charset conversion" and on 4 April under "Problem with Oracle requests" - but in those cases the issue was reproducible with the cx_Oracle driver as well, which it is not here.

Using the convert_unicode='force' argument to the String types does solve the issue, but there are a few reasons I don't want to do that. This product also supports PostgreSQL, SQL Server and MySQL, and I do not want to make changes which will affect running under those databases. This is also happening in a number of places in an otherwise very stable product under long-term support, so whatever change is made needs to be minimal.
Adding an additional function-based index on SYS_OP_C2C(column) for each of the affected columns also solves the issue, at a slight performance cost since the extra indexes need to be kept up to date. This is the stopgap we've gone for in the meantime on production systems. It's not ideal, though, and a bit difficult to automate and then verify. We'd prefer to know why this is happening, and maybe fix a sqlalchemy issue in the process.

So, my questions are:

* What is sqlalchemy doing differently to straight cx_Oracle that triggers this behaviour? How can I debug this further to find out?
* Is there a way, on sqlalchemy 0.6.x, to set up a hook which will use convert_unicode='force' only on an Oracle database and only for String columns, automatically across an application? There's no create_engine argument for convert_unicode='force', which is a pity.
* And a hook for sqlalchemy 0.7.x to achieve the same result?

Thanks in advance for any help,

regards,
--
David Moore
Senior Software Engineer
St. James Software
Email: dav...@sjsoft.com
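For the second and third questions, one possible sketch - hedged: `force_convert_unicode_for_oracle` is a hypothetical helper written for illustration, not a SQLAlchemy API, and it has not been tested against 0.6 or 0.7 - is to skip the event hooks entirely and walk the application's MetaData once at startup, flipping the flag only when the target dialect is Oracle:

```python
import sqlalchemy


def force_convert_unicode_for_oracle(metadata, dialect_name):
    """Hypothetical helper: set convert_unicode='force' on every String
    column, but only for Oracle, leaving other backends untouched."""
    if dialect_name != "oracle":
        return
    for table in metadata.tables.values():
        for column in table.c:
            if isinstance(column.type, sqlalchemy.String):
                column.type.convert_unicode = "force"


# Usage sketch: call once after the tables are defined, e.g.
#   force_convert_unicode_for_oracle(metadata, engine.dialect.name)
metadata = sqlalchemy.MetaData()
sqlalchemy.Table(
    "test_table", metadata,
    sqlalchemy.Column("colid", sqlalchemy.String(255), primary_key=True),
    sqlalchemy.Column("message", sqlalchemy.String(4000)))
force_convert_unicode_for_oracle(metadata, "oracle")
```

Because this only mutates type objects rather than relying on any hook API, it should behave the same on 0.6 and 0.7, sidestepping the hooking-syntax differences between those versions.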
# -*- coding: utf-8 -*-
import sqlalchemy
import sys
import time
import random
import cx_Oracle

user = "j5"
dsn = "XE"


def create_id():
    return unicode(int(time.time())) + unicode(random.randint(10, 99))


def setup_database(metadata):
    # Create table
    table = sqlalchemy.Table(
        "test_table", metadata,
        sqlalchemy.Column("colid", sqlalchemy.String(255), primary_key=True),
        sqlalchemy.Column("message", sqlalchemy.String(4000)))
    if not table.exists():
        table.create(checkfirst=True)
        # Add rows
        for i in range(10):
            latest_id = create_id()
            try:
                table.insert({"colid": latest_id,
                              "message": "This shouldn't matter " * 180}).execute()
            except Exception as e:
                print "duplicate error", e
    else:
        latest_id = table.select().limit(1).execute().fetchone()['colid']
    if isinstance(latest_id, str):
        latest_id = latest_id.decode('utf-8')
    return table, latest_id


def run_oracle_unicode_params(table, id_):
    assert table.select().where(table.c.colid == id_).execute().fetchone()[0] == id_


def run_oracle_encoded_params(table, id_):
    assert table.select().where(table.c.colid == id_.encode("utf-8")).execute().fetchone()[0] == id_


def main()