Re: [sqlalchemy] Is Textual SQL DB neutral
On a side note, the literal_column change worked fine performance-wise, allowing me to remove the literal_binds/compile step. I guess I am 50/50 on which is cleaner. Is either one more advantageous from a SQL injection perspective? SQL injection is not a primary concern at the moment for this process, as what we are dealing with is a data replication service pulling from a U2 data store to SQL Server. The delete is part of batch cleanup testing.

--
SQLAlchemy -
The Python SQL Toolkit and Object Relational Mapper

http://www.sqlalchemy.org/

To post example code, please provide an MCVE: Minimal, Complete, and Verifiable Example. See http://stackoverflow.com/help/mcve for a full description.
--- You received this message because you are subscribed to the Google Groups "sqlalchemy" group. To unsubscribe from this group and stop receiving emails from it, send an email to sqlalchemy+unsubscr...@googlegroups.com. To post to this group, send email to sqlalchemy@googlegroups.com. Visit this group at https://groups.google.com/group/sqlalchemy. For more options, visit https://groups.google.com/d/optout.
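[Editor's note on the SQL-injection question above: with a bound parameter the value never becomes part of the SQL text, while literal_column splices it in verbatim. A minimal sketch, assuming an invented "person" table that is not the original schema:]

```python
# Hedged sketch: contrast literal_column interpolation with a bound
# parameter. The "person" table and "name" column are invented.
import sqlalchemy as sa

metadata = sa.MetaData()
person = sa.Table("person", metadata,
                  sa.Column("name", sa.String, primary_key=True))

payload = "x'; DROP TABLE person; --"

# literal_column pastes the (possibly attacker-controlled) string
# directly into the statement text:
unsafe = person.delete().where(
    person.c.name == sa.literal_column("'%s'" % payload))

# A bound parameter keeps the value out of the SQL text entirely;
# the driver sends it separately as data:
safe = person.delete().where(person.c.name == sa.bindparam("name"))

print(str(unsafe))  # the payload appears verbatim in the SQL text
print(str(safe))    # the SQL contains only a :name placeholder
```

So for a replication job where the values come from a trusted extract, either form works, but the bound-parameter form is the one that stays safe if the inputs ever stop being trusted.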
Re: [sqlalchemy] Is Textual SQL DB neutral
Tried doing various conversions on the pk values as they enter the statement:

1. to bytes
2. to ascii
3. to latin1 (technically the same encoding as the extract source before entering the db)

None of which yielded a performance improvement for the non-compiled version.

I have read that this can be an issue with pyodbc and that there are engine settings related to it. Also, perhaps I should try using the pymssql driver to see if that changes anything.
Re: [sqlalchemy] Is Textual SQL DB neutral
So inspecting the elements of the tuple, they are both str, and hence unicode.

Are you saying that if I convert those values to bytes it could improve performance?
Re: [sqlalchemy] Is Textual SQL DB neutral
On Thu, Aug 31, 2017 at 2:19 PM, Ken MacKenzie wrote:
> So here is my solution as of now. As far as I can tell it hits the marks of:
>
> 1. DB Neutral
> 2. Performance (9k records deleted in 5 seconds)
> 3. Generic
>
>             stmt = cls.__table__.delete(or_(*or_cond))
>             stmt_comp = str(stmt.compile(dialect=s.bind.dialect,
>                             compile_kwargs={"literal_binds": True}))

I'd not bother with the literal_binds and just use a literal value:

    pkf = [(col == literal_column("'%s'" % v)) for (col, v) in zip(cols, x)]

but also I'd look to see what the nature of "v" is; if it's like a Unicode object or something, you might be getting bogged down on the decode/encode or something like that. Sending as bytes() perhaps might change that.
Re: [sqlalchemy] Is Textual SQL DB neutral
So here is my solution as of now. As far as I can tell it hits the marks of:

1. DB Neutral
2. Performance (9k records deleted in 5 seconds)
3. Generic

(excuse the print statements still there for testing)

Would welcome comments on this proposed solution. I have learned a lot from the advice in this thread and can always stand to learn more for the sake of my code.

    @classmethod
    def dict_delete(cls, dict_in):
        """
        Delete records from the database
        based on a dictionary keyed by
        PK tuple
        """
        s = dbc.get_session()

        d_start = datetime.datetime.now()
        keys_in = [k for k in dict_in.keys()]
        batch_size = 1000
        cols = [getattr(cls, colname) for colname in cls.SQL_PK]
        while len(keys_in):
            id_batch = keys_in[:batch_size]
            del keys_in[:batch_size]

            or_cond = []
            for x in id_batch:
                pkf = [(col == v) for (col, v) in zip(cols, x)]
                and_cond = and_(*pkf)
                or_cond.append(and_cond)

            stmt = cls.__table__.delete(or_(*or_cond))
            stmt_comp = str(stmt.compile(dialect=s.bind.dialect,
                            compile_kwargs={"literal_binds": True}))

            print(stmt_comp)
            print(len(keys_in), "-", datetime.datetime.now() - d_start)

            s.execute(stmt_comp)

        s.commit()

        s.close()
Re: [sqlalchemy] Is Textual SQL DB neutral
Ending output from batch size of 10:

    501 - 0:24:35.558103
    491 - 0:24:37.197976
    481 - 0:24:38.909318
    471 - 0:24:40.645801
    461 - 0:24:41.954807
    451 - 0:24:43.628955
    441 - 0:24:45.246469
    431 - 0:24:46.662228
    421 - 0:24:48.192343
    411 - 0:24:49.839395
    401 - 0:24:51.513340
    391 - 0:24:53.153668
    381 - 0:24:54.836501
    371 - 0:24:56.372331
    361 - 0:24:58.034510
    351 - 0:24:59.714655
    341 - 0:25:01.401695
    331 - 0:25:03.026082
    321 - 0:25:04.657724
    311 - 0:25:06.324783
    301 - 0:25:07.946133
    291 - 0:25:09.347907
    281 - 0:25:11.000936
    271 - 0:25:12.640046
    261 - 0:25:14.281258
    251 - 0:25:15.939942
    241 - 0:25:17.651838
    231 - 0:25:19.307522
    221 - 0:25:20.931354
    211 - 0:25:22.589152
    201 - 0:25:24.223949
    191 - 0:25:25.868513
    181 - 0:25:27.540003
    171 - 0:25:29.198259
    161 - 0:25:30.492486
    151 - 0:25:31.778539
    141 - 0:25:33.292928
    131 - 0:25:34.966223
    121 - 0:25:36.623493
    111 - 0:25:38.276205
    101 - 0:25:39.996527
    91 - 0:25:41.374338
    81 - 0:25:43.011192
    71 - 0:25:44.639656
    61 - 0:25:46.293672
    51 - 0:25:47.954129
    41 - 0:25:49.663463
    31 - 0:25:51.333620
    21 - 0:25:52.842479
    11 - 0:25:54.489094
    1 - 0:25:56.109371
    0 - 0:25:56.431443

Basically the only conclusion I can draw is that it is a case of the bind params.

So I know this is not recommended, but to retain DB neutrality can I use the output from

    query.statement.compile(dialect=s.bind.dialect, compile_kwargs={"literal_binds": True})

and feed that output back in directly to execute?
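[Editor's note: the compile-then-execute round trip being asked about can be exercised like this against an in-memory SQLite table. The schema here is invented to mirror the described two-column varchar key, not the original tables:]

```python
# Sketch of compiling a statement with literal_binds and feeding the
# resulting plain-SQL string back to execute(). Schema is illustrative.
import sqlalchemy as sa

engine = sa.create_engine("sqlite://")
md = sa.MetaData()
item = sa.Table("item", md,
                sa.Column("college", sa.String(3), primary_key=True),
                sa.Column("check_no", sa.String(10), primary_key=True))
md.create_all(engine)

stmt = item.delete().where(
    sa.and_(item.c.college == "888", item.c.check_no == "1234567"))

# Render the statement with the values inlined instead of bind params:
compiled = str(stmt.compile(dialect=engine.dialect,
                            compile_kwargs={"literal_binds": True}))
print(compiled)  # the values appear as quoted literals, no placeholders

# The result is plain SQL text, so it can be executed directly:
with engine.begin() as conn:
    conn.execute(sa.text(compiled))
```

The caveats from later in the thread apply: literal_binds string-interpolates values, so it trades bind-parameter overhead for quoting/injection risk.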
Re: [sqlalchemy] Is Textual SQL DB neutral
I tried a batch size as small as ten and each step took about 2 seconds. I set it up to show me the time between each batch for that run. Lowering the batch size actually seems to worsen performance, which is not what I expected.

Sent from my iPad

> On Aug 31, 2017, at 5:42 AM, Simon King wrote:
>
> I'm pretty sure the bad performance is due to pyodbc (or maybe SQL
> Server) not handling *huge* numbers of bind parameters gracefully. You
> are generating a query with (batch_size * n_primary_key_columns) bind
> parameters, which even in your original version meant 2000 bind
> parameters.
>
> Try with batch sizes of 10, 50 and 100. You'll run more queries but
> I suspect each query will perform better.
>
> Simon
Re: [sqlalchemy] Is Textual SQL DB neutral
Yes, the double equals was a typo, sorry about that. Your point about non-varchar PKs is why I will double-check other DBs. SQL Server lets it fly, but I think you are right about Postgres.

Sent from my iPad
Re: [sqlalchemy] Is Textual SQL DB neutral
I'm pretty sure the bad performance is due to pyodbc (or maybe SQL Server) not handling *huge* numbers of bind parameters gracefully. You are generating a query with (batch_size * n_primary_key_columns) bind parameters, which even in your original version meant 2000 bind parameters.

Try with batch sizes of 10, 50 and 100. You'll run more queries but I suspect each query will perform better.

Simon

On Wed, Aug 30, 2017 at 10:02 PM, Ken MacKenzie wrote:
> With a batch size of 1 based on the original selection, no improvement.
>
> I am going to see what sql this is creating.
>
>     2328928 function calls (2085124 primitive calls) in 1084.925 seconds
>
>     Ordered by: internal time
>
>     ncalls         tottime  percall  cumtime  percall filename:lineno(function)
>     26            1074.733   41.336 1074.733   41.336 {method 'execute' of 'pyodbc.Cursor' objects}
>     2                0.871    0.436    0.872    0.436 {built-in method connect}
>     349128/348523    0.556    0.000    0.573    0.000 {built-in method isinstance}
>     26               0.502    0.019    0.502    0.019 {method 'close' of 'pyodbc.Cursor' objects}
>     20               0.485    0.024    0.485    0.024 {method 'commit' of 'pyodbc.Connection' objects}
>     161062/161060    0.317    0.000    0.318    0.000 {built-in method getattr}
>     19123            0.259    0.000    0.789    0.000 elements.py:861(__init__)
>     9567             0.258    0.000    0.258    0.000 {method 'fetchone' of 'pyodbc.Cursor' objects}
>     76492/19123      0.248    0.000    2.842    0.000 operators.py:297(__eq__)
>     67012/24         0.236    0.000    2.601    0.108 visitors.py:75(_compiler_dispatch)
>     19123            0.234    0.000    0.424    0.000 elements.py:2927(__init__)
>     9582             0.213    0.000    1.104    0.000 elements.py:1860(_construct)
>     19132            0.205    0.000    0.435    0.000 compiler.py:676(visit_column)
>     19123            0.189    0.000    1.806    0.000 compiler.py:1040(_generate_generic_binary)
>     19123            0.183    0.000    1.263    0.000 default_comparator.py:290(_check_literal)
>     57389/19143      0.173    0.000    2.732    0.000 {built-in method eq}
>     19123            0.171    0.000    1.912    0.000 default_comparator.py:22(_boolean_compare)
>     28684            0.169    0.000    0.346    0.000 operators.py:1007(is_precedent)
>     19126            0.143    0.000    0.352    0.000 elements.py:3962(__new__)
>     19123            0.129    0.000    0.795    0.000 compiler.py:1164(visit_bindparam)
>     19123            0.128    0.000    0.918    0.000 elements.py:3818(_bind_param)
>     19153            0.126    0.000    0.581    0.000 base.py:1128(decorate)
>     1                0.125    0.125 1084.925 1084.925 del_test.py:1()
>     116845           0.121    0.000    0.121    0.000 {method 'get' of 'dict' objects}
>     38264/9581       0.115    0.000    2.567    0.000 compiler.py:806()
>     19123            0.113    0.000    2.048    0.000 compiler.py:1006(visit_binary)
>     19125            0.113    0.000    0.174    0.000 compiler.py:1265(_process_anon)
>     19125            0.109    0.000    0.451    0.000 compiler.py:1246(_truncated_identifier)
>     19123            0.108    0.000    2.457    0.000 annotation.py:100(__eq__)
>     19123            0.107    0.000    2.190    0.000 base.py:1306(visit_binary)
>     19123            0.104    0.000    2.015    0.000 type_api.py:60(operate)
>     38283            0.101    0.000    0.167    0.000 compiler.py:2901(quote)
>     19123            0.095    0.000    0.577    0.000 compiler.py:1233(_truncate_bindparam)
>     19206            0.094    0.000    0.163    0.000 elements.py:3918(__new__)
>     19125            0.093    0.000    0.323    0.000 elements.py:4073(apply_map)
>     38246            0.091    0.000    0.091    0.000 elements.py:640(self_group)
>     88460            0.088    0.000    0.088    0.000 {method 'append' of 'list' objects}
>     19123            0.084    0.000    2.621    0.000 properties.py:269(operate)
>     38264/9581       0.084    0.000    2.586    0.000 compiler.py:804()
>     19123            0.082    0.000    0.732    0.000 annotation.py:78(_compiler_dispatch)
>     12703/3142       0.076    0.000    2.605    0.001 {method 'join' of 'str' objects}
>     19123            0.075    0.000    0.129    0.000 compiler.py:959(_get_operator_dispatch)
>     1                0.071    0.071 1082.250 1082.250 db_base.py:172(dict_delete)
>     19123            0.070    0.000    0.089    0.000 compiler.py:1271(bindparam_string)
>     19123            0.070    0.000    2.085    0.000 :1()
>     9581             0.067    0.000    0.520    0.000 elements.py:1883()
>     9561             0.064    0.000    2.905    0.000 db_base.py:208()
>     23611/23323      0.062    0.000    0.067    0.000 {built-in method hasattr}
>     28705            0.061    0.000    0.159    0.000 elements.py:4216(_expression_literal_as_text)
>     28746            0.058    0.000    0.098    0.000 elements.py:4220(_literal_as_text)
>     19123            0.058    0.000    0.130    0.000 type_api.py:504(coerce_compared_value)
>     9562             0.056    0.000    0.416    0.000 result.py:1156(fetchone)
>     19125            0.056    0.000    0.230
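[Editor's note: Simon's arithmetic can be spelled out. The batched OR-of-ANDs delete carries one bind parameter per primary-key column per row, so the original batch of 1000 rows against the described two-column key means 2000 parameters in a single statement:]

```python
# Bind-parameter count for the batched delete: one parameter per
# primary-key column per row in the batch.
n_pk_columns = 2  # composite key, e.g. (college, check_no)

for batch_size in (10, 50, 100, 1000):
    n_params = batch_size * n_pk_columns
    print(f"batch_size={batch_size:>4} -> {n_params} bind parameters")
```

This is why shrinking the batch multiplies round trips without necessarily helping: the per-statement parameter count drops, but pyodbc still binds the same total number of parameters across the run.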
Re: [sqlalchemy] Is Textual SQL DB neutral
The generic version would look something like this:

    table = cls.__table__
    pkconditions = []
    for pk in cls.SQL_PK:
        pkconditions.append(table.c[pk] == sa.bindparam(pk))
    condition = sa.and_(*pkconditions)
    statement = sa.delete(table, whereclause=condition)
    batchparams = [dict(zip(cls.SQL_PK, v)) for v in id_batch]
    session.execute(statement, batchparams)

As for the portability of the textual version: your original code produced fragments like this:

    field == 'value'

a) Is the double-equals a typo, or is it syntax that SQL Server expects? I've never used double-equals in MySQL.
b) Are the primary key fields always strings? You later said that they were, but if you ever had an integer primary key column, comparing it against a quoted value would be an error on Postgres (I believe).
c) Can the values that you are comparing against ever contain single quotes? That would break your query and potentially leave you open to SQL injection.
d) Are your values unicode strings or byte strings? If unicode, does the encoding match what the database expects?

If none of those are issues for you, the textual version is probably pretty safe.

Simon

On Wed, Aug 30, 2017 at 5:30 PM, Ken MacKenzie wrote:
> Not sure how I would iterate through a non-predetermined number of primary
> keys.
>
> I guess part of me is wondering: although textual sql is not inherently
> db neutral, how different between the db targets is the where field = 'value'
> syntax?
>
>> On Wednesday, August 30, 2017 at 12:07:52 PM UTC-4, Simon King wrote:
>>
>> You could also try using executemany:
>>
>> http://docs.sqlalchemy.org/en/latest/core/tutorial.html#executing-multiple-statements
>>
>> I think it would look something like this:
>>
>>     table = cls.__table__
>>     condition = sa.and_(
>>         table.c.pk1 == sa.bindparam('pk1'),
>>         table.c.pk2 == sa.bindparam('pk2'),
>>     )
>>     statement = sa.delete(table, whereclause=condition)
>>     batchparams = [{'pk1': v[0], 'pk2': v[1]} for v in id_batch]
>>     session.execute(statement, batchparams)
>>
>> Simon
>>
>>> On Wed, Aug 30, 2017 at 4:28 PM, Ken MacKenzie wrote:
>>> After the current sorted profile finishes I will revert to the textual
>>> version and run a profile on that. I expect another 10-15 minutes for this
>>> to finish right now.
>>>
>>> At present the batch size is set to 1000; total record count is just over
>>> 9000 in these tests.
>>>
>>> The reason for 1000 was that at first I was looking at doing this as a
>>> tuple_(fld, fld).in_((val, val), (val, val)) format. The 1000 should keep
>>> me under most DB restrictions on the in statement.
>>>
>>> However, since SQL Server does not seem to support the tuple_ usage, I
>>> reverted to this method.
>>>
>>> I technically have one more method and that is a concat_ in_ where I
>>> concat the fields.
>>>
>>> Other specifics: the table in question has 2 fields for the PK, both
>>> varchar, one length 3, the other length 10. There are 5 non-key fields:
>>> 3 short varchars, one decimal at 14,2 precision, and one varchar(800)
>>> which contains description text.
>>>
>>> Total record count of the table before any deletion is about 1.05 million.
>>>
>>> Python version is 3.4.5, running on a modest CentOS desktop, and to be
>>> fair the SQL Server instance is sub-optimal for development.
>>>
>>>> On Wednesday, August 30, 2017 at 11:18:13 AM UTC-4, Simon King wrote:
>>>>
>>>> It would be interesting to see the profile of the textual SQL version.
>>>> It looks like most of the time is being spent inside pyodbc, rather
>>>> than SQLAlchemy, so I guess it must be something to do with the
>>>> processing of bind parameters. How many parameters are being sent in
>>>> per query? ie. what is len(id_batch) * len(cls.SQL_PK)?
>>>>
>>>> You could try playing with your batch sizes to see what sort of effect
>>>> that has.
>>>>
>>>> Simon
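[Editor's note: Simon's generic version can be exercised end-to-end. The sketch below assumes an invented two-column key and an in-memory SQLite database, purely to show the executemany call shape:]

```python
# Runnable sketch of the generic executemany delete. Table name,
# column names, and sample rows are all invented for illustration.
import sqlalchemy as sa

engine = sa.create_engine("sqlite://")
md = sa.MetaData()
item = sa.Table("item", md,
                sa.Column("college", sa.String(3), primary_key=True),
                sa.Column("check_no", sa.String(10), primary_key=True))
md.create_all(engine)

SQL_PK = ("college", "check_no")
condition = sa.and_(*[item.c[pk] == sa.bindparam(pk) for pk in SQL_PK])
statement = sa.delete(item).where(condition)

rows = [("001", "A1"), ("002", "B2"), ("003", "C3")]
id_batch = rows[:2]  # delete the first two rows

with engine.begin() as conn:
    conn.execute(item.insert(), [dict(zip(SQL_PK, r)) for r in rows])
    # One executemany round-trip covers the whole batch:
    batchparams = [dict(zip(SQL_PK, v)) for v in id_batch]
    conn.execute(statement, batchparams)
    left = conn.execute(sa.text("SELECT COUNT(*) FROM item")).scalar()
    print(left)
```

The key point is that the statement itself carries only len(SQL_PK) placeholders; the batch travels as a parameter list, not as SQL text.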
Re: [sqlalchemy] Is Textual SQL DB neutral
When I get in tomorrow I will take a look at some of these options. I also intend to spin up a quick MySQL or PostgreSQL db to test whether my textual solution can work against an alternate target.

Sent from my iPad
Re: [sqlalchemy] Is Textual SQL DB neutral
Looking at your code and considering the size of your database, I've run into performance issues under PostgreSQL a few times that may be related... they're all usually because very slight text changes to a query can drastically alter the server's query planner.

The causes/fixes were:

1. Indexes often got ignored, ending up with lots of row scans. To fix, I would wrap the "deletion" work with some sql to set up & tear down a new index with all the columns present in the query. That usually tricked it into using the new index.
2. Too many fkey checks onto the table for deleted items. Deferring constraint checks often helped.
3. The position of the 'commit' ended up clearing the db's internal cache each loop, so a reread of rows/indexes would happen each iteration. Expanding the memory allocation for caching on the server helped handle that.
4. Sometimes there was a mix of all the above and/or an issue with the size of the query (so many values). What often helped was replacing the DELETE with INSERTs of ids to delete into a new temporary table, then deleting from that table.
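A minimal sketch of the temp-table approach from point 4 above: stage the doomed ids in a scratch table, then issue one DELETE joined against it. Table and column names are hypothetical, SQLite stands in for the real engine, and an EXISTS subquery is used for the join since SQL Server lacks tuple IN support (as noted elsewhere in this thread).

```python
import sqlalchemy as sa

# Stand-in engine; the thread actually targets SQL Server via mssql+pyodbc.
engine = sa.create_engine("sqlite://")
meta = sa.MetaData()

# Hypothetical table with the 2-column varchar PK described in the thread.
items = sa.Table(
    "items", meta,
    sa.Column("college", sa.String(3), primary_key=True),
    sa.Column("check_no", sa.String(10), primary_key=True),
)
# Scratch table holding the PKs scheduled for deletion.
doomed = sa.Table(
    "doomed_ids", meta,
    sa.Column("college", sa.String(3)),
    sa.Column("check_no", sa.String(10)),
)
meta.create_all(engine)

with engine.begin() as conn:
    conn.execute(items.insert(),
                 [{"college": "888", "check_no": str(i)} for i in range(5)])

    # Step 1: INSERT the ids to delete into the scratch table.
    conn.execute(doomed.insert(),
                 [{"college": "888", "check_no": "1"},
                  {"college": "888", "check_no": "3"}])

    # Step 2: one DELETE joined against the scratch table via EXISTS,
    # instead of a WHERE clause carrying thousands of literal values.
    match = sa.exists().where(sa.and_(
        doomed.c.college == items.c.college,
        doomed.c.check_no == items.c.check_no,
    ))
    conn.execute(items.delete().where(match))
```

On a real server the scratch table would typically be a true temporary table (`#doomed_ids` on SQL Server) so it is session-local and dropped automatically.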
Re: [sqlalchemy] Is Textual SQL DB neutral
When you call create_engine:

    e = create_engine(url, echo=True)

include the "echo=True" part. Then look at the SQL being emitted. That's the important part here.

On Wed, Aug 30, 2017 at 5:09 PM, Ken MacKenzie wrote:
> So the big difference between the methods when seen with print(q):
>
> Textual SQL is building a statement with the values inline, such as
>
>     where college=888 and check=1234567
>
> The methods using the column object are building with parameters:
>
>     where college = (?) and check = (?)
>
> That is all I can see that is different.
>
> On Wednesday, August 30, 2017 at 4:41:38 PM UTC-4, Ken MacKenzie wrote:
>>
>> I am noticing the 16 execute calls. There should only be 10 with a batch
>> size of 1000 and a record count of 9,000 and some change.
>>
>> Are you just wanting to see the sql via a print(q)?
>>
>> On Wednesday, August 30, 2017 at 2:01:31 PM UTC-4, Mike Bayer wrote:
>>>
>>> it looks like you have 16 SQL statements that take a very different
>>> amount of time depending on format. Can you please paste the output
>>> with echo=True on create_engine() so that the difference in SQL
>>> statements can be seen?
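The echo flag suggested above can be tried in isolation; a minimal sketch (SQLite stands in for the real connection URL):

```python
import sqlalchemy as sa

# echo=True attaches a logger that prints each statement and its bound
# parameters to stdout before the driver executes it, so the difference
# between inline-literal SQL and parameterized SQL becomes visible.
engine = sa.create_engine("sqlite://", echo=True)

with engine.connect() as conn:
    conn.execute(sa.text("SELECT 1"))
```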
Re: [sqlalchemy] Is Textual SQL DB neutral
So the big difference between the methods when seen with print(q):

Textual SQL is building a statement with the values inline, such as

    where college=888 and check=1234567

The methods using the column object are building with parameters:

    where college = (?) and check = (?)

That is all I can see that is different.

On Wednesday, August 30, 2017 at 4:41:38 PM UTC-4, Ken MacKenzie wrote:
>
> I am noticing the 16 execute calls. There should only be 10 with a batch
> size of 1000 and a record count of 9,000 and some change.
>
> Are you just wanting to see the sql via a print(q)?
>
> On Wednesday, August 30, 2017 at 2:01:31 PM UTC-4, Mike Bayer wrote:
>>
>> it looks like you have 16 SQL statements that take a very different
>> amount of time depending on format. Can you please paste the output
>> with echo=True on create_engine() so that the difference in SQL
>> statements can be seen?
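The two renderings described above can be reproduced side by side with Core's compiler. This is a sketch with hypothetical table/column names; `literal_binds` inlines the values the way the textual version does, while the default compilation keeps placeholders:

```python
import sqlalchemy as sa

# Hypothetical table/column names standing in for the real model.
t = sa.table("svc_check", sa.column("college"), sa.column("check_no"))
q = t.select().where(sa.and_(t.c.college == "888", t.c.check_no == "1234567"))

# Default compilation keeps placeholders; values travel separately
# to the driver (what the column-object methods produce).
print(str(q))

# literal_binds inlines the values into the SQL string, which is
# effectively what the textual version sends.
print(q.compile(compile_kwargs={"literal_binds": True}))
```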
Re: [sqlalchemy] Is Textual SQL DB neutral
With a batch size of 1 based on the original selection, no improvement. I am going to see what sql this is creating.

2328928 function calls (2085124 primitive calls) in 1084.925 seconds

Ordered by: internal time

ncalls  tottime  percall  cumtime  percall  filename:lineno(function)
26  1074.733  41.336  1074.733  41.336  {method 'execute' of 'pyodbc.Cursor' objects}
2  0.871  0.436  0.872  0.436  {built-in method connect}
349128/348523  0.556  0.000  0.573  0.000  {built-in method isinstance}
26  0.502  0.019  0.502  0.019  {method 'close' of 'pyodbc.Cursor' objects}
20  0.485  0.024  0.485  0.024  {method 'commit' of 'pyodbc.Connection' objects}
161062/161060  0.317  0.000  0.318  0.000  {built-in method getattr}
19123  0.259  0.000  0.789  0.000  elements.py:861(__init__)
9567  0.258  0.000  0.258  0.000  {method 'fetchone' of 'pyodbc.Cursor' objects}
76492/19123  0.248  0.000  2.842  0.000  operators.py:297(__eq__)
67012/24  0.236  0.000  2.601  0.108  visitors.py:75(_compiler_dispatch)
19123  0.234  0.000  0.424  0.000  elements.py:2927(__init__)
9582  0.213  0.000  1.104  0.000  elements.py:1860(_construct)
19132  0.205  0.000  0.435  0.000  compiler.py:676(visit_column)
19123  0.189  0.000  1.806  0.000  compiler.py:1040(_generate_generic_binary)
19123  0.183  0.000  1.263  0.000  default_comparator.py:290(_check_literal)
57389/19143  0.173  0.000  2.732  0.000  {built-in method eq}
19123  0.171  0.000  1.912  0.000  default_comparator.py:22(_boolean_compare)
28684  0.169  0.000  0.346  0.000  operators.py:1007(is_precedent)
19126  0.143  0.000  0.352  0.000  elements.py:3962(__new__)
19123  0.129  0.000  0.795  0.000  compiler.py:1164(visit_bindparam)
19123  0.128  0.000  0.918  0.000  elements.py:3818(_bind_param)
19153  0.126  0.000  0.581  0.000  base.py:1128(decorate)
1  0.125  0.125  1084.925  1084.925  del_test.py:1()
116845  0.121  0.000  0.121  0.000  {method 'get' of 'dict' objects}
38264/9581  0.115  0.000  2.567  0.000  compiler.py:806()
19123  0.113  0.000  2.048  0.000  compiler.py:1006(visit_binary)
19125  0.113  0.000  0.174  0.000  compiler.py:1265(_process_anon)
19125  0.109  0.000  0.451  0.000  compiler.py:1246(_truncated_identifier)
19123  0.108  0.000  2.457  0.000  annotation.py:100(__eq__)
19123  0.107  0.000  2.190  0.000  base.py:1306(visit_binary)
19123  0.104  0.000  2.015  0.000  type_api.py:60(operate)
38283  0.101  0.000  0.167  0.000  compiler.py:2901(quote)
19123  0.095  0.000  0.577  0.000  compiler.py:1233(_truncate_bindparam)
19206  0.094  0.000  0.163  0.000  elements.py:3918(__new__)
19125  0.093  0.000  0.323  0.000  elements.py:4073(apply_map)
38246  0.091  0.000  0.091  0.000  elements.py:640(self_group)
88460  0.088  0.000  0.088  0.000  {method 'append' of 'list' objects}
19123  0.084  0.000  2.621  0.000  properties.py:269(operate)
38264/9581  0.084  0.000  2.586  0.000  compiler.py:804()
19123  0.082  0.000  0.732  0.000  annotation.py:78(_compiler_dispatch)
12703/3142  0.076  0.000  2.605  0.001  {method 'join' of 'str' objects}
19123  0.075  0.000  0.129  0.000  compiler.py:959(_get_operator_dispatch)
1  0.071  0.071  1082.250  1082.250  db_base.py:172(dict_delete)
19123  0.070  0.000  0.089  0.000  compiler.py:1271(bindparam_string)
19123  0.070  0.000  2.085  0.000  :1()
9581  0.067  0.000  0.520  0.000  elements.py:1883()
9561  0.064  0.000  2.905  0.000  db_base.py:208()
23611/23323  0.062  0.000  0.067  0.000  {built-in method hasattr}
28705  0.061  0.000  0.159  0.000  elements.py:4216(_expression_literal_as_text)
28746  0.058  0.000  0.098  0.000  elements.py:4220(_literal_as_text)
19123  0.058  0.000  0.130  0.000  type_api.py:504(coerce_compared_value)
9562  0.056  0.000  0.416  0.000  result.py:1156(fetchone)
19125  0.056  0.000  0.230  0.000  _collections.py:728(__missing__)
19123  0.054  0.000  0.296  0.000  elements.py:2986(self_group)
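The listings in this thread appear to come from cProfile sorted by internal time. A minimal sketch of producing the same "Ordered by: internal time" output (the profiled function here is a stand-in, not the thread's actual deletion code):

```python
import cProfile
import io
import pstats

def batch_delete():
    # Stand-in for the real deletion loop being profiled.
    total = 0
    for i in range(1000):
        total += i * i
    return total

pr = cProfile.Profile()
pr.enable()
batch_delete()
pr.disable()

buf = io.StringIO()
# sort_stats("tottime") yields the "Ordered by: internal time" listing
# seen in the pasted profiles; print_stats(10) limits it to the top 10 rows.
pstats.Stats(pr, stream=buf).sort_stats("tottime").print_stats(10)
print(buf.getvalue())
```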
Re: [sqlalchemy] Is Textual SQL DB neutral
I am noticing the 16 execute calls. There should only be 10 with a batch size of 1000 and a record count of 9,000 and some change.

Are you just wanting to see the sql via a print(q)?

On Wednesday, August 30, 2017 at 2:01:31 PM UTC-4, Mike Bayer wrote:
>
> it looks like you have 16 SQL statements that take a very different
> amount of time depending on format. Can you please paste the output
> with echo=True on create_engine() so that the difference in SQL
> statements can be seen?
>
> On Wed, Aug 30, 2017 at 12:30 PM, Ken MacKenzie wrote:
> > Not sure how I would iterate through a non-predetermined number of primary
> > keys.
> >
> > I guess part of me is wondering: although textual sql is not inherently
> > db neutral, how different between the db targets is the where field = 'value'
> > syntax?
> >
> > On Wednesday, August 30, 2017 at 12:07:52 PM UTC-4, Simon King wrote:
> >>
> >> You could also try using executemany:
> >>
> >> http://docs.sqlalchemy.org/en/latest/core/tutorial.html#executing-multiple-statements
> >>
> >> I think it would look something like this:
> >>
> >> table = cls.__table__
> >> condition = sa.and_(
> >>     table.c.pk1 == sa.bindparam('pk1'),
> >>     table.c.pk2 == sa.bindparam('pk2'),
> >> )
> >> statement = sa.delete(table, whereclause=condition)
> >> batchparams = [{'pk1': v[0], 'pk2': v[1]} for v in id_batch]
> >> session.execute(statement, batchparams)
> >>
> >> Simon
> >>
> >> On Wed, Aug 30, 2017 at 4:28 PM, Ken MacKenzie wrote:
> >> > After the current sorted profile finishes I will revert to the textual
> >> > version and run a profile on that. I expect another 10-15 minutes for
> >> > this to finish right now.
> >> >
> >> > At present the batch size is set to 1000; total record count is just
> >> > over 9000 in these tests.
> >> >
> >> > The reason for 1000 was at first I was looking at doing this as a
> >> > tuple_(fld, fld).in_((val, val),(val,val)) format. The 1000 should keep
> >> > me under most DB restrictions on the in statement.
> >> >
> >> > However since SQL Server does not seem to support the tuple_ usage I
> >> > reverted to this method.
> >> >
> >> > I technically have one more method and that is a concat_ in_ where I
> >> > concat the fields.
> >> >
> >> > Other specifics: the table in question has 2 fields for the PK, both are
> >> > varchar, one length 3, the other length 10. There are 5 non-key fields:
> >> > 3 short varchars, one decimal at 14,2 precision and one varchar(800)
> >> > which contains description text.
> >> >
> >> > Total record count of the table before any deletion is about 1.05
> >> > million.
> >> >
> >> > Python version is 3.4.5, running on a modest CentOS desktop, and to be
> >> > fair the SQL Server instance is sub-optimal for development.
> >> >
> >> > On Wednesday, August 30, 2017 at 11:18:13 AM UTC-4, Simon King wrote:
> >> >>
> >> >> It would be interesting to see the profile of the textual SQL version.
> >> >> It looks like most of the time is being spent inside pyodbc, rather
> >> >> than SQLAlchemy, so I guess it must be something to do with the
> >> >> processing of bind parameters. How many parameters are being sent in
> >> >> per query? ie. what is len(id_batch) * len(cls.SQL_PK)?
> >> >>
> >> >> You could try playing with your batch sizes to see what sort of effect
> >> >> that has.
> >> >>
> >> >> Simon
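The batch arithmetic discussed above (10 expected executes for ~9,000 rows at a batch size of 1000) can be sketched with a plain slicing helper; the helper name and the synthetic PK list are illustrative, not from the thread's code:

```python
def batches(seq, size):
    """Yield successive slices of at most `size` items from seq."""
    for start in range(0, len(seq), size):
        yield seq[start:start + size]

# ~9,000 two-column PK tuples, matching the record count in the thread.
pks = [("888", str(n)) for n in range(9005)]
chunks = list(batches(pks, 1000))
print(len(chunks), len(chunks[-1]))  # 10 batches, 5 rows in the final partial batch
```

With 9,005 keys and a batch size of 1000 there should be exactly 10 DELETE executions, which is why 16 execute calls in the profile was surprising (the extra calls presumably come from other statements in the script).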
Re: [sqlalchemy] Is Textual SQL DB neutral
I tried an alternate version:

    @classmethod
    def dict_delete2(cls, dict_in):
        """
        Delete records from the database
        based on a dictionary keyed by PK tuple
        """
        s = dbc.get_session()
        keys_in = ["-".join(k) for k in dict_in.keys()]
        batch_size = 1000
        cols = [getattr(cls, colname) for colname in cls.SQL_PK]
        cols_dash = []
        for col in cols:
            cols_dash.append(col)
            cols_dash.append('-')
        cols_dash = cols_dash[:-1]
        print(cols_dash)
        while len(keys_in):
            id_batch = keys_in[:batch_size]
            # print(id_batch)
            del keys_in[:batch_size]
            q = s.query(cls)
            q = q.filter(func.concat(*cols_dash).in_(id_batch)).delete(synchronize_session=False)
            s.commit()
        s.close()

Performance was even worse:

673960 function calls (659542 primitive calls) in 2315.534 seconds

Ordered by: internal time

ncalls  tottime  percall  cumtime  percall  filename:lineno(function)
16  2312.810  144.551  2312.810  144.551  {method 'execute' of 'pyodbc.Cursor' objects}
9567  0.186  0.000  0.186  0.000  {method 'fetchone' of 'pyodbc.Cursor' objects}
9572  0.166  0.000  0.433  0.000  elements.py:861(__init__)
1  0.119  0.119  2315.536  2315.536  del_test.py:1()
62442/61837  0.101  0.000  0.118  0.000  {built-in method isinstance}
9572  0.068  0.000  0.407  0.000  compiler.py:1164(visit_bindparam)
9575  0.062  0.000  0.171  0.000  elements.py:3962(__new__)
9561  0.060  0.000  0.492  0.000  functions.py:436(_bind_param)
9574  0.059  0.000  0.090  0.000  compiler.py:1265(_process_anon)
9574  0.054  0.000  0.227  0.000  compiler.py:1246(_truncated_identifier)
10  0.052  0.005  0.738  0.074  default_comparator.py:110(_in_impl)
9562  0.051  0.000  0.319  0.000  result.py:1156(fetchone)
9655  0.049  0.000  0.084  0.000  elements.py:3918(__new__)
9572  0.047  0.000  0.291  0.000  compiler.py:1233(_truncate_bindparam)
2  0.047  0.023  0.048  0.024  {built-in method connect}
9574  0.047  0.000  0.163  0.000  elements.py:4073(apply_map)
10  0.042  0.004  0.042  0.004  {method 'commit' of 'pyodbc.Connection' objects}
40649  0.040  0.000  0.040  0.000  {method 'append' of 'list' objects}
9572  0.039  0.000  0.049  0.000  compiler.py:1271(bindparam_string)
9696/14  0.039  0.000  0.521  0.037  visitors.py:75(_compiler_dispatch)
9611  0.039  0.000  0.485  0.000  compiler.py:806()
14090/13802  0.037  0.000  0.042  0.000  {built-in method hasattr}
17697/17695  0.037  0.000  0.038  0.000  {built-in method getattr}
9562  0.037  0.000  0.058  0.000  result.py:1085(process_rows)
9561  0.036  0.000  0.080  0.000  elements.py:4254(_is_literal)
28683  0.036  0.000  0.036  0.000  del_test.py:14()
9562  0.030  0.000  0.068  0.000  type_api.py:504(coerce_compared_value)
22  0.029  0.001  0.103  0.005  elements.py:1784()
12713  0.027  0.000  0.536  0.000  {method 'join' of 'str' objects}
9572  0.026  0.000  0.038  0.000  sqltypes.py:2608(_resolve_value_to_type)
16  0.026  0.002  0.026  0.002  {method 'close' of 'pyodbc.Cursor' objects}
9574  0.026  0.000  0.116  0.000  _collections.py:728(__missing__)
9584  0.026  0.000  0.067  0.000  compiler.py:494()
21325  0.025  0.000  0.025  0.000  {method 'get' of 'dict' objects}
125  0.025  0.000  0.025  0.000  {built-in method loads}
750/724  0.025  0.000  0.338  0.000  {built-in method __build_class__}
9572  0.025  0.000  0.041  0.000  type_api.py:452(_cached_bind_processor)
9562  0.024  0.000  0.343  0.000  result.py:868(__iter__)
9564  0.022  0.000  0.209  0.000  result.py:1053(_fetchone_impl)
9603  0.022  0.000  0.022  0.000  elements.py:640(self_group)
10206  0.021  0.000  0.021  0.000  {method 'split' of 'str' objects}
9562  0.021  0.000  0.021  0.000  result.py:1098()
9611  0.021  0.000  0.506  0.000  compiler.py:804()
10421  0.020  0.000  0.020  0.000  {built-in method __new__ of type object at 0x7fc6ee607e40}
9603  0.020  0.000  0.053  0.000  elements.py:4216(_expression_literal_as_text)
9624  0.019  0.000  0.033  0.000  elements.py:4220(_literal_as_text)
18953/18838  0.019  0.000  0.019  0.000  {built-in method len}
10209  0.016  0.000  0.016  0.000  weakref.py:364(__getitem__)
290/1  0.016  0.000  2315.538  2315.538  {built-in method exec}
68  0.016  0.000  0.042  0.001  inspect.py:264(getmembers)
361  0.014  0.000
Re: [sqlalchemy] Is Textual SQL DB neutral
it looks like you have 16 SQL statements that take a very different amount of time depending on format. Can you please paste the output with echo=True on create_engine() so that the difference in SQL statements can be seen?

On Wed, Aug 30, 2017 at 12:30 PM, Ken MacKenzie wrote:
> Not sure how I would iterate through a non-predetermined number of primary
> keys.
>
> I guess part of me is wondering: although textual sql is not inherently
> db neutral, how different between the db targets is the where field = 'value'
> syntax?
>
> On Wednesday, August 30, 2017 at 12:07:52 PM UTC-4, Simon King wrote:
>>
>> You could also try using executemany:
>>
>> http://docs.sqlalchemy.org/en/latest/core/tutorial.html#executing-multiple-statements
>>
>> I think it would look something like this:
>>
>> table = cls.__table__
>> condition = sa.and_(
>>     table.c.pk1 == sa.bindparam('pk1'),
>>     table.c.pk2 == sa.bindparam('pk2'),
>> )
>> statement = sa.delete(table, whereclause=condition)
>> batchparams = [{'pk1': v[0], 'pk2': v[1]} for v in id_batch]
>> session.execute(statement, batchparams)
>>
>> Simon
>>
>> On Wed, Aug 30, 2017 at 4:28 PM, Ken MacKenzie wrote:
>> > After the current sorted profile finishes I will revert to the textual
>> > version and run a profile on that. I expect another 10-15 minutes for
>> > this to finish right now.
>> >
>> > At present the batch size is set to 1000; total record count is just
>> > over 9000 in these tests.
>> >
>> > The reason for 1000 was at first I was looking at doing this as a
>> > tuple_(fld, fld).in_((val, val),(val,val)) format. The 1000 should keep
>> > me under most DB restrictions on the in statement.
>> >
>> > However since SQL Server does not seem to support the tuple_ usage I
>> > reverted to this method.
>> >
>> > I technically have one more method and that is a concat_ in_ where I
>> > concat the fields.
>> >
>> > Other specifics: the table in question has 2 fields for the PK, both are
>> > varchar, one length 3, the other length 10. There are 5 non-key fields:
>> > 3 short varchars, one decimal at 14,2 precision and one varchar(800)
>> > which contains description text.
>> >
>> > Total record count of the table before any deletion is about 1.05
>> > million.
>> >
>> > Python version is 3.4.5, running on a modest CentOS desktop, and to be
>> > fair the SQL Server instance is sub-optimal for development.
>> >
>> > On Wednesday, August 30, 2017 at 11:18:13 AM UTC-4, Simon King wrote:
>> >>
>> >> It would be interesting to see the profile of the textual SQL version.
>> >> It looks like most of the time is being spent inside pyodbc, rather
>> >> than SQLAlchemy, so I guess it must be something to do with the
>> >> processing of bind parameters. How many parameters are being sent in
>> >> per query? ie. what is len(id_batch) * len(cls.SQL_PK)?
>> >>
>> >> You could try playing with your batch sizes to see what sort of effect
>> >> that has.
>> >>
>> >> Simon
Re: [sqlalchemy] Is Textual SQL DB neutral
Not sure how I would iterate through a non-predetermined number of primary keys.

I guess part of me is wondering: although textual sql is not inherently db neutral, how different between the db targets is the where field = 'value' syntax?

On Wednesday, August 30, 2017 at 12:07:52 PM UTC-4, Simon King wrote:
>
> You could also try using executemany:
>
> http://docs.sqlalchemy.org/en/latest/core/tutorial.html#executing-multiple-statements
>
> I think it would look something like this:
>
> table = cls.__table__
> condition = sa.and_(
>     table.c.pk1 == sa.bindparam('pk1'),
>     table.c.pk2 == sa.bindparam('pk2'),
> )
> statement = sa.delete(table, whereclause=condition)
> batchparams = [{'pk1': v[0], 'pk2': v[1]} for v in id_batch]
> session.execute(statement, batchparams)
>
> Simon
>
> On Wed, Aug 30, 2017 at 4:28 PM, Ken MacKenzie wrote:
> > After the current sorted profile finishes I will revert to the textual
> > version and run a profile on that. I expect another 10-15 minutes for
> > this to finish right now.
> >
> > At present the batch size is set to 1000; total record count is just
> > over 9000 in these tests.
> >
> > The reason for 1000 was at first I was looking at doing this as a
> > tuple_(fld, fld).in_((val, val),(val,val)) format. The 1000 should keep
> > me under most DB restrictions on the in statement.
> >
> > However since SQL Server does not seem to support the tuple_ usage I
> > reverted to this method.
> >
> > I technically have one more method and that is a concat_ in_ where I
> > concat the fields.
> >
> > Other specifics: the table in question has 2 fields for the PK, both are
> > varchar, one length 3, the other length 10. There are 5 non-key fields:
> > 3 short varchars, one decimal at 14,2 precision and one varchar(800)
> > which contains description text.
> >
> > Total record count of the table before any deletion is about 1.05
> > million.
> >
> > Python version is 3.4.5, running on a modest CentOS desktop, and to be
> > fair the SQL Server instance is sub-optimal for development.
> >
> > On Wednesday, August 30, 2017 at 11:18:13 AM UTC-4, Simon King wrote:
> >>
> >> It would be interesting to see the profile of the textual SQL version.
> >> It looks like most of the time is being spent inside pyodbc, rather
> >> than SQLAlchemy, so I guess it must be something to do with the
> >> processing of bind parameters. How many parameters are being sent in
> >> per query? ie. what is len(id_batch) * len(cls.SQL_PK)?
> >>
> >> You could try playing with your batch sizes to see what sort of effect
> >> that has.
> >>
> >> Simon
Re: [sqlalchemy] Is Textual SQL DB neutral
You could also try using executemany:

http://docs.sqlalchemy.org/en/latest/core/tutorial.html#executing-multiple-statements

I think it would look something like this:

    table = cls.__table__
    condition = sa.and_(
        table.c.pk1 == sa.bindparam('pk1'),
        table.c.pk2 == sa.bindparam('pk2'),
    )
    statement = sa.delete(table, whereclause=condition)
    batchparams = [{'pk1': v[0], 'pk2': v[1]} for v in id_batch]
    session.execute(statement, batchparams)

Simon

On Wed, Aug 30, 2017 at 4:28 PM, Ken MacKenzie wrote:
> After the current sorted profile finishes I will revert to the textual
> version and run a profile on that. I expect another 10-15 minutes for this
> to finish right now.
>
> At present the batch size is set to 1000; total record count is just over
> 9000 in these tests.
>
> The reason for 1000 was at first I was looking at doing this as a
> tuple_(fld, fld).in_((val, val),(val,val)) format. The 1000 should keep me
> under most DB restrictions on the in statement.
>
> However since SQL Server does not seem to support the tuple_ usage I
> reverted to this method.
>
> I technically have one more method and that is a concat_ in_ where I
> concat the fields.
>
> Other specifics: the table in question has 2 fields for the PK, both are
> varchar, one length 3, the other length 10. There are 5 non-key fields:
> 3 short varchars, one decimal at 14,2 precision and one varchar(800) which
> contains description text.
>
> Total record count of the table before any deletion is about 1.05 million.
>
> Python version is 3.4.5, running on a modest CentOS desktop, and to be
> fair the SQL Server instance is sub-optimal for development.
>
> On Wednesday, August 30, 2017 at 11:18:13 AM UTC-4, Simon King wrote:
>>
>> It would be interesting to see the profile of the textual SQL version.
>> It looks like most of the time is being spent inside pyodbc, rather
>> than SQLAlchemy, so I guess it must be something to do with the
>> processing of bind parameters. How many parameters are being sent in
>> per query? ie. what is len(id_batch) * len(cls.SQL_PK)?
>>
>> You could try playing with your batch sizes to see what sort of effect
>> that has.
>>
>> Simon
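A runnable sketch of the executemany suggestion above, using SQLite as a stand-in engine and hypothetical table/bind names. It uses `tbl.delete().where(...)` rather than the older `sa.delete(table, whereclause=...)` call shown in the quoted snippet, and distinct bind names to avoid clashing with the column names:

```python
import sqlalchemy as sa

# sqlite stands in for the mssql+pyodbc engine used in the thread.
engine = sa.create_engine("sqlite://")
meta = sa.MetaData()
tbl = sa.Table(
    "t", meta,
    sa.Column("pk1", sa.String(3), primary_key=True),
    sa.Column("pk2", sa.String(10), primary_key=True),
)
meta.create_all(engine)

with engine.begin() as conn:
    conn.execute(tbl.insert(),
                 [{"pk1": "888", "pk2": str(i)} for i in range(4)])

    # One DELETE statement with named bind parameters...
    condition = sa.and_(
        tbl.c.pk1 == sa.bindparam("b_pk1"),
        tbl.c.pk2 == sa.bindparam("b_pk2"),
    )
    statement = tbl.delete().where(condition)

    # ...executed once with a list of parameter dicts, which lets the
    # DBAPI use its executemany fast path instead of one round trip
    # per compiled statement.
    batchparams = [{"b_pk1": "888", "b_pk2": "1"},
                   {"b_pk1": "888", "b_pk2": "2"}]
    conn.execute(statement, batchparams)
```

The statement is compiled once, so the per-batch SQLAlchemy overhead seen in the profiles (thousands of `visit_bindparam`/`_compiler_dispatch` calls) largely disappears.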
Re: [sqlalchemy] Is Textual SQL DB neutral
Sorted Textual SQL profile:

    ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        16    1.840    0.115    1.840    0.115 {method 'execute' of 'pyodbc.Cursor' objects}
      9572    0.196    0.000    0.788    0.000 elements.py:1860(_construct)
     57700    0.189    0.000    0.189    0.000 {method 'sub' of '_sre.SRE_Pattern' objects}
      9567    0.185    0.000    0.185    0.000 {method 'fetchone' of 'pyodbc.Cursor' objects}
100329/99724  0.125    0.000    0.142    0.000 {built-in method isinstance}
     19123    0.120    0.000    0.200    0.000 elements.py:1225(__init__)
         1    0.117    0.117    5.042    5.042 del_test.py:1(<module>)
     19123    0.100    0.000    0.222    0.000 compiler.py:737(visit_textclause)
38254/9571    0.099    0.000    0.566    0.000 compiler.py:806()
  28738/14    0.096    0.000    0.598    0.043 visitors.py:75(_compiler_dispatch)
     19123    0.086    0.000    0.286    0.000 elements.py:1240(_create_text)
38254/9571    0.079    0.000    0.585    0.000 compiler.py:804()
12693/3132    0.068    0.000    0.603    0.000 {method 'join' of 'str' objects}
         1    0.065    0.065    3.811    3.811 db_base.py:138(dict_delete)
      9561    0.062    0.000    0.398    0.000 db_base.py:178()
      9571    0.060    0.000    0.240    0.000 elements.py:1883()
     28695    0.060    0.000    0.156    0.000 elements.py:4216(_expression_literal_as_text)
     28716    0.058    0.000    0.096    0.000 elements.py:4220(_literal_as_text)
      9562    0.052    0.000    0.317    0.000 result.py:1156(fetchone)
      9562    0.051    0.000    0.105    0.000 operators.py:1007(is_precedent)
     50154    0.050    0.000    0.050    0.000 {method 'append' of 'list' objects}
     19123    0.050    0.000    0.336    0.000 :1(text)
      9572    0.048    0.000    0.203    0.000 elements.py:1865()
27063/27061   0.047    0.000    0.048    0.000 {built-in method getattr}
         2    0.044    0.022    0.045    0.022 {built-in method connect}
   9571/10    0.037    0.000    0.595    0.059 compiler.py:797(visit_clauselist)
      9562    0.037    0.000    0.534    0.000 elements.py:1894(and_)
     28683    0.036    0.000    0.036    0.000 del_test.py:14()
      9562    0.036    0.000    0.057    0.000 result.py:1085(process_rows)
      9561    0.029    0.000    0.159    0.000 elements.py:1971(self_group)
28504/28389   0.028    0.000    0.028    0.000 {built-in method len}
       125    0.026    0.000    0.026    0.000 {built-in method loads}
   750/724    0.026    0.000    0.340    0.000 {built-in method __build_class__}
      9561    0.025    0.000    0.560    0.000 :1(and_)
      9561    0.024    0.000    0.130    0.000 elements.py:1818(self_group)
      9562    0.024    0.000    0.341    0.000 result.py:868(__iter__)
      9564    0.022    0.000    0.207    0.000 result.py:1053(_fetchone_impl)
      9572    0.022    0.000    0.032    0.000 _collections.py:788(coerce_generator_arg)
      9562    0.021    0.000    0.021    0.000 result.py:1098()
     19122    0.021    0.000    0.021    0.000 elements.py:1601(self_group)
     21232    0.021    0.000    0.021    0.000 {method 'get' of 'dict' objects}
        10    0.019    0.002    0.019    0.002 {method 'commit' of 'pyodbc.Connection' objects}
     19123    0.019    0.000    0.019    0.000 compiler.py:734(post_process_text)
     290/1    0.016    0.000    5.044    5.044 {built-in method exec}
        68    0.015    0.000    0.042    0.001 inspect.py:264(getmembers)
       361    0.014    0.000    0.103    0.000 inspect.py:943(getfullargspec)
      2884    0.014    0.000    0.020    0.000 sre_parse.py:197(__next)
     10420    0.013    0.000    0.013    0.000 {built-in method __new__ of type object at 0x7f9abdca4e40}
      9561    0.012    0.000    0.012    0.000 result.py:168(keys)
       361    0.012    0.000    0.065    0.000 inspect.py:2383(from_function)
       714    0.010    0.000    0.018    0.000 inspect.py:1056(formatargspec)
       296    0.010    0.000    0.041    0.000 :2016(find_spec)
 4486/4198    0.010    0.000    0.015    0.000 {built-in method hasattr}
    130/38    0.010    0.000    0.044    0.001 sre_parse.py:448(_parse)

--
SQLAlchemy - The Python SQL Toolkit and Object Relational Mapper

http://www.sqlalchemy.org/

To post example code, please provide an MCVE: Minimal, Complete, and Verifiable Example. See http://stackoverflow.com/help/mcve for a full description.
---
You received this message because you are subscribed to the Google Groups "sqlalchemy" group.
To unsubscribe from this group and stop receiving emails from it, send an email to sqlalchemy+unsubscr...@googlegroups.com.
To post to this group, send email to sqlalchemy@googlegroups.com.
Visit this group at https://groups.google.com/group/sqlalchemy.
For more options, visit https://groups.google.com/d/optout.
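For anyone wanting to reproduce a table like the one above, here is a minimal sketch using the standard library's cProfile and pstats; the workload function is a stand-in for importing and running the actual del_test module, which is not shown in the thread:

```python
import cProfile
import io
import pstats


def run_batch_delete():
    # placeholder workload; replace with the real del_test import/run
    return sum(i * i for i in range(100000))


profiler = cProfile.Profile()
profiler.enable()
run_batch_delete()
profiler.disable()

# Sort by internal time ("tottime"), the same ordering used in the
# profiles posted in this thread, and keep the top 10 rows.
stream = io.StringIO()
stats = pstats.Stats(profiler, stream=stream)
stats.sort_stats("tottime").print_stats(10)
report = stream.getvalue()
print(report)
```

The `%prun import db_api_lib.del_test` shown later in the thread is the IPython shortcut for the same thing.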
Re: [sqlalchemy] Is Textual SQL DB neutral
Sorted:

    ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        16 1053.794   65.862 1053.794   65.862 {method 'execute' of 'pyodbc.Cursor' objects}
348916/348311 0.539    0.000    0.556    0.000 {built-in method isinstance}
160920/160918 0.313    0.000    0.313    0.000 {built-in method getattr}
     19123    0.282    0.000    0.471    0.000 elements.py:2927(__init__)
  66982/14    0.234    0.000    2.561    0.183 visitors.py:75(_compiler_dispatch)
     19123    0.231    0.000    0.702    0.000 elements.py:861(__init__)
76492/19123   0.228    0.000    2.740    0.000 operators.py:297(__eq__)
      9572    0.213    0.000    1.092    0.000 elements.py:1860(_construct)
     19123    0.201    0.000    1.878    0.000 default_comparator.py:22(_boolean_compare)
     19132    0.194    0.000    0.424    0.000 compiler.py:676(visit_column)
      9567    0.190    0.000    0.190    0.000 {method 'fetchone' of 'pyodbc.Cursor' objects}
     19123    0.183    0.000    1.761    0.000 compiler.py:1040(_generate_generic_binary)
     19123    0.180    0.000    1.155    0.000 default_comparator.py:290(_check_literal)
     28684    0.171    0.000    0.347    0.000 operators.py:1007(is_precedent)
57379/19133   0.157    0.000    2.634    0.000 {built-in method eq}
     19123    0.133    0.000    0.786    0.000 compiler.py:1164(visit_bindparam)
     19143    0.123    0.000    0.566    0.000 base.py:1128(decorate)
    116835    0.120    0.000    0.120    0.000 {method 'get' of 'dict' objects}
         1    0.117    0.117 1061.906 1061.906 del_test.py:1(<module>)
     19123    0.116    0.000    2.005    0.000 compiler.py:1006(visit_binary)
     19123    0.114    0.000    0.816    0.000 elements.py:3818(_bind_param)
     19123    0.113    0.000    2.154    0.000 base.py:1306(visit_binary)
     19125    0.111    0.000    0.173    0.000 compiler.py:1265(_process_anon)
38254/9571    0.111    0.000    2.528    0.000 compiler.py:806()
     38273    0.102    0.000    0.167    0.000 compiler.py:2901(quote)
     19123    0.099    0.000    1.976    0.000 type_api.py:60(operate)
     19123    0.098    0.000    0.565    0.000 compiler.py:1233(_truncate_bindparam)
     19125    0.097    0.000    0.438    0.000 compiler.py:1246(_truncated_identifier)
     19126    0.097    0.000    0.295    0.000 elements.py:3962(__new__)
     38246    0.093    0.000    0.093    0.000 elements.py:640(self_group)
     19123    0.092    0.000    2.376    0.000 annotation.py:100(__eq__)
     19125    0.092    0.000    0.321    0.000 elements.py:4073(apply_map)
     19206    0.089    0.000    0.155    0.000 elements.py:3918(__new__)
     88398    0.087    0.000    0.087    0.000 {method 'append' of 'list' objects}
38254/9571    0.084    0.000    2.547    0.000 compiler.py:804()
     19123    0.082    0.000    2.526    0.000 properties.py:269(operate)
     19123    0.075    0.000    0.128    0.000 compiler.py:959(_get_operator_dispatch)
12693/3132    0.073    0.000    2.566    0.001 {method 'join' of 'str' objects}
     19123    0.069    0.000    0.704    0.000 annotation.py:78(_compiler_dispatch)
     19123    0.068    0.000    0.087    0.000 compiler.py:1271(bindparam_string)
         1    0.068    0.068 1060.665 1060.665 db_base.py:138(dict_delete)
      9571    0.065    0.000    0.520    0.000 elements.py:1883()
     19123    0.065    0.000    2.041    0.000 :1()
      9561    0.062    0.000    2.802    0.000 db_base.py:174()
23610/23322   0.062    0.000    0.068    0.000 {built-in method hasattr}
     28695    0.061    0.000    0.156    0.000 elements.py:4216(_expression_literal_as_text)
     19123    0.059    0.000    0.129    0.000 type_api.py:504(coerce_compared_value)
     28716    0.057    0.000    0.095    0.000 elements.py:4220(_literal_as_text)
     19125    0.057    0.000    0.229    0.000 _collections.py:728(__missing__)
     19123    0.053    0.000    0.294    0.000 elements.py:2986(self_group)
      9562    0.053    0.000    0.325    0.000 result.py:1156(fetchone)
     19135    0.053    0.000    0.134    0.000 compiler.py:494()
     19123    0.050    0.000    0.070    0.000 sqltypes.py:2608(_resolve_value_to_type)
     19123    0.050    0.000    0.082    0.000 type_api.py:452(_cached_bind_processor)
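A quick arithmetic check on the two profiles above confirms Simon's reading: both runs issue 16 executes, and essentially the entire gap between them sits inside pyodbc's execute() call, not in SQLAlchemy:

```python
# Totals taken from the two profiles above (seconds, 16 calls each).
textual_execute = 1.840       # textual SQL run
expression_execute = 1053.794  # expression-based run

per_call_fast = textual_execute / 16
per_call_slow = expression_execute / 16
print(round(per_call_fast, 3), round(per_call_slow, 3))  # 0.115 65.862
```

Those per-call figures match the "percall" column in each profile, so the comparison is apples-to-apples.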
Re: [sqlalchemy] Is Textual SQL DB neutral
After the current sorted profile finishes I will revert to the textual version and run a profile on that. I expect another 10-15 minutes for this to finish.

At present the batch size is set to 1000; the total record count is just over 9000 in these tests. The reason for 1000 is that at first I was looking at doing this in a tuple_(fld, fld).in_((val, val), (val, val)) format, and 1000 should keep me under most DB restrictions on the IN statement. However, since SQL Server does not seem to support the tuple_ usage, I reverted to this method. I technically have one more method to try, a concat_/in_ where I concatenate the fields.

Other specifics: the table in question has 2 fields for the PK, both varchar, one length 3, the other length 10. There are 5 non-key fields: 3 short varchars, one decimal at 14,2 precision, and one varchar(800) which contains description text. Total record count of the table before any deletion is about 1.05 million.

Python version is 3.4.5, running on a modest CentOS desktop, and to be fair the SQL Server instance is suboptimal for development.

On Wednesday, August 30, 2017 at 11:18:13 AM UTC-4, Simon King wrote:
>
> It would be interesting to see the profile of the textual SQL version.
> It looks like most of the time is being spent inside pyodbc, rather
> than SQLAlchemy, so I guess it must be something to do with the
> processing of bind parameters. How many parameters are being sent in
> per query? ie. what is len(id_batch) * len(cls.SQL_PK)?
>
> You could try playing with your batch sizes to see what sort of effect
> that has.
>
> Simon
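The batching idea described above (splitting the full id list into chunks of 1000 so each DELETE stays under per-statement limits) can be sketched with the standard library alone; the names here are illustrative, not from the original code:

```python
from itertools import islice


def batches(items, size=1000):
    """Yield successive lists of at most `size` items from any iterable."""
    it = iter(items)
    while True:
        chunk = list(islice(it, size))
        if not chunk:
            return
        yield chunk


# "just over 9000" records at batch size 1000 -> 9 full batches plus a tail
id_batches = list(batches(range(9001), size=1000))
print(len(id_batches), len(id_batches[-1]))  # 10 1
```

Each chunk would then become one `id_batch` in the delete loop shown elsewhere in the thread.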
Re: [sqlalchemy] Is Textual SQL DB neutral
After I finish the current profile run I will revert and profile that one for comparison. I am expecting another 10-15 minutes for this to finish.

On Wednesday, August 30, 2017 at 11:18:13 AM UTC-4, Simon King wrote:
>
> It would be interesting to see the profile of the textual SQL version.
> It looks like most of the time is being spent inside pyodbc, rather
> than SQLAlchemy, so I guess it must be something to do with the
> processing of bind parameters. How many parameters are being sent in
> per query? ie. what is len(id_batch) * len(cls.SQL_PK)?
>
> You could try playing with your batch sizes to see what sort of effect
> that has.
>
> Simon
>
> On Wed, Aug 30, 2017 at 4:01 PM, Ken MacKenzie wrote:
> > So I implemented this version with one change. I moved this line:
> >
> > cols = [getattr(cls, colname) for colname in cls.SQL_PK]
> >
> > To the beginning before the loop, actually before the outer loop that
> > parses the batch.
> >
> > However, the new version is MUCH slower than the original, so I profiled
> > it and this is what I get:
> >
> > %prun import db_api_lib.del_test
> > 1421296 function calls (1277536 primitive calls) in 620.434 seconds
> >
> > [profile output snipped]
Re: [sqlalchemy] Is Textual SQL DB neutral
It would be interesting to see the profile of the textual SQL version. It looks like most of the time is being spent inside pyodbc, rather than SQLAlchemy, so I guess it must be something to do with the processing of bind parameters. How many parameters are being sent in per query? ie. what is len(id_batch) * len(cls.SQL_PK)?

You could try playing with your batch sizes to see what sort of effect that has.

Simon

On Wed, Aug 30, 2017 at 4:01 PM, Ken MacKenzie wrote:
> So I implemented this version with one change. I moved this line:
>
> cols = [getattr(cls, colname) for colname in cls.SQL_PK]
>
> To the beginning before the loop, actually before the outer loop that
> parses the batch.
>
> However, the new version is MUCH slower than the original, so I profiled
> it and this is what I get:
>
> %prun import db_api_lib.del_test
> 1421296 function calls (1277536 primitive calls) in 620.434 seconds
>
> [profile output snipped]
Re: [sqlalchemy] Is Textual SQL DB neutral
So I implemented this version with one change. I moved this line:

cols = [getattr(cls, colname) for colname in cls.SQL_PK]

to the beginning, before the loop - actually before the outer loop that parses the batch.

However, the new version is MUCH slower than the original, so I profiled it and this is what I get:

%prun import db_api_lib.del_test

1421296 function calls (1277536 primitive calls) in 620.434 seconds

   Ordered by: internal time

    ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        12  615.512   51.293  615.512   51.293 {method 'execute' of 'pyodbc.Cursor' objects}
208876/208271 0.316    0.000    0.332    0.000 {built-in method isinstance}
 96876/96874  0.184    0.000    0.185    0.000 {built-in method getattr}
     11123    0.137    0.000    0.242    0.000 elements.py:2927(__init__)
  38970/10    0.136    0.000    1.502    0.150 visitors.py:75(_compiler_dispatch)
     11123    0.128    0.000    0.397    0.000 elements.py:861(__init__)
44492/11123   0.127    0.000    1.504    0.000 operators.py:297(__eq__)
      5568    0.116    0.000    0.617    0.000 elements.py:1860(_construct)
     11132    0.113    0.000    0.246    0.000 compiler.py:676(visit_column)
     11123    0.111    0.000    1.039    0.000 compiler.py:1040(_generate_generic_binary)
      5567    0.108    0.000    0.108    0.000 {method 'fetchone' of 'pyodbc.Cursor' objects}
     11123    0.099    0.000    0.646    0.000 default_comparator.py:290(_check_literal)
     16684    0.095    0.000    0.195    0.000 operators.py:1007(is_precedent)
     11123    0.094    0.000    1.010    0.000 default_comparator.py:22(_boolean_compare)
33375/11129   0.090    0.000    1.445    0.000 {built-in method eq}
     11123    0.075    0.000    0.465    0.000 compiler.py:1164(visit_bindparam)
         1    0.075    0.075  619.467  619.467 db_base.py:138(dict_delete)
     11139    0.073    0.000    0.331    0.000 base.py:1128(decorate)
     68831    0.070    0.000    0.070    0.000 {method 'get' of 'dict' objects}
         1    0.068    0.068  620.471  620.471 del_test.py:1(<module>)
 22250/5567   0.067    0.000    1.482    0.000 compiler.py:806()
     11125    0.066    0.000    0.102    0.000 compiler.py:1265(_process_anon)
     11123    0.065    0.000    1.179    0.000 compiler.py:1006(visit_binary)
     11123    0.063    0.000    1.263    0.000 base.py:1306(visit_binary)
     11123    0.061    0.000    0.458    0.000 elements.py:3818(_bind_param)
     22269    0.058    0.000    0.097    0.000 compiler.py:2901(quote)
     11125    0.058    0.000    0.261    0.000 compiler.py:1246(_truncated_identifier)
     11123    0.058    0.000    1.299    0.000 annotation.py:100(__eq__)
     11123    0.056    0.000    1.066    0.000 type_api.py:60(operate)
     56386    0.055    0.000    0.055    0.000 {method 'append' of 'list' objects}
     11126    0.055    0.000    0.168    0.000 elements.py:3962(__new__)
     11123    0.055    0.000    0.334    0.000 compiler.py:1233(_truncate_bindparam)
     11125    0.054    0.000    0.192    0.000 elements.py:4073(apply_map)
     11206    0.051    0.000    0.089    0.000 elements.py:3918(__new__)
     22246    0.051    0.000    0.051    0.000 elements.py:640(self_group)
 22250/5567   0.049    0.000    1.493    0.000 compiler.py:804()
         2    0.048    0.024    0.049    0.024 {built-in method connect}
     11123    0.047    0.000    1.386    0.000 properties.py:269(operate)
 8689/3128    0.045    0.000    1.508    0.000 {method 'join' of 'str' objects}
     11123    0.045    0.000    0.076    0.000 compiler.py:959(_get_operator_dispatch)
     11123    0.045    0.000    0.056    0.000 compiler.py:1271(bindparam_string)
     11123    0.040    0.000    0.411    0.000 annotation.py:78(_compiler_dispatch)
15610/15322   0.038    0.000    0.043    0.000 {built-in method hasattr}
      5567    0.038    0.000    0.294    0.000 elements.py:1883()
     11125    0.036    0.000    0.138    0.000 _collections.py:728(__missing__)
     16691    0.035    0.000    0.090    0.000 elements.py:4216(_expression_literal_as_text)
      5561    0.035    0.000    1.539    0.000 db_base.py:174()
     16704    0.033    0.000    0.055    0.000 elements.py:4220(_literal_as_text)
     11123    0.033    0.000    0.073    0.000 type_api.py:504(coerce_compared_value)
31584/31469   0.031    0.000    0.032    0.000 {built-in method len}
     11123    0.030    0.000    0.165    0.000 elements.py:2986(self_group)
      5562    0.030    0.000    0.187    0.000 result.py:1156(fetchone)
     11131    0.030    0.000    0.077    0.000 compiler.py:494()
     11123    0.029    0.000    0.047    0.000 type_api.py:452(_cached_bind_processor)

Any guidance to how I can improve the performance of this solution and remain DB neutral?

On Wednesday, August 30, 2017 at 4:43:01 AM UTC-4, Simon King wrote:
>
> On Tue, Aug 29, 2017 at 9:49 PM, Ken MacKenzie wrote:
> > I have a query I have constructed and I had to deal with a
Re: [sqlalchemy] Is Textual SQL DB neutral
On Tue, Aug 29, 2017 at 9:49 PM, Ken MacKenzie wrote:
> I have a query I have constructed and I had to deal with a composite primary
> key to select items:
>
> q = s.query(cls)
> or_cond = []
> for x in id_batch:
>     pkf = [text(f + "=='" + v + "'") for f, v in zip(cls.SQL_PK, x)]
>     and_cond = (and_(*pkf))
>     or_cond.append(and_cond)
>
> q = q.filter(or_(*or_cond)).delete(synchronize_session=False)
>
> cls.SQL_PK is a tuple of the primary key fields for the model described by
> the class. This is a class method that is part of an inherited class to the
> model.
>
> The current target is SQL Server. My concern is using text("field = 'value'"):
> is that going to work for other DB targets, like say postgres?
>
> The first round of doing this I tried using a
> tuple_(*cls.SQL_PK).in_(id_batch), but that did not work and the resulting
> SQL did not work in SSMS, leading me to believe that SQL Server (or at least
> the version we are using) does not support tuples.

Textual SQL is not DB-neutral in general. Luckily, in this case you shouldn't even need it. Try something like this:

for x in id_batch:
    cols = [getattr(cls, colname) for colname in cls.SQL_PK]
    pkf = [(col == v) for (col, v) in zip(cols, x)]
    and_cond = and_(*pkf)
    or_cond.append(and_cond)

ie. use "getattr" to retrieve the actual column property from the class.

Hope that helps,

Simon
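The getattr lookup Simon suggests can be demonstrated without a database. This stdlib-only sketch uses a dummy class in place of the mapped model, and the class name, field names, and key values are all invented for illustration (in the real model the attributes would be SQLAlchemy Column properties, and `col == v` would build a bind-parameter expression):

```python
# Dummy stand-in for the mapped model; SQL_PK names the composite key fields.
class Part:
    SQL_PK = ("site_code", "part_id")
    # plain strings stand in for Column properties so the sketch runs anywhere
    site_code = "site_code"
    part_id = "part_id"


def pk_pairs(cls, key_tuple):
    """Pair each PK column attribute (looked up by name) with its value."""
    cols = [getattr(cls, name) for name in cls.SQL_PK]
    return list(zip(cols, key_tuple))


# one composite key: a length-3 code and a length-10 id, per the thread
pairs = pk_pairs(Part, ("001", "AB-1234567"))
print(pairs)  # [('site_code', '001'), ('part_id', 'AB-1234567')]
```

In the real code, each pair would feed `col == v` and the results would be combined with and_() / or_() exactly as in Simon's snippet.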
[sqlalchemy] Is Textual SQL DB neutral
I have a query I have constructed and I had to deal with a composite primary key to select items:

q = s.query(cls)
or_cond = []
for x in id_batch:
    pkf = [text(f + "=='" + v + "'") for f, v in zip(cls.SQL_PK, x)]
    and_cond = (and_(*pkf))
    or_cond.append(and_cond)

q = q.filter(or_(*or_cond)).delete(synchronize_session=False)

cls.SQL_PK is a tuple of the primary key fields for the model described by the class. This is a class method that is part of an inherited class to the model.

The current target is SQL Server. My concern is using text("field = 'value'"): is that going to work for other DB targets, like say postgres?

The first round of doing this I tried using a tuple_(*cls.SQL_PK).in_(id_batch), but that did not work and the resulting SQL did not work in SSMS, leading me to believe that SQL Server (or at least the version we are using) does not support tuples.

Ken
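A plain-Python illustration of why the string-spliced text() condition above is fragile, and why the thread's SQL-injection question matters: any value containing a quote corrupts the generated SQL, whereas a named placeholder (the `:name` style SQLAlchemy's text() accepts) leaves quoting to the driver. Both helper functions here are invented for the illustration:

```python
def inline_condition(field, value):
    # the approach from the post: value spliced directly into the SQL text
    return field + "=='" + value + "'"


def bound_condition(field, value, params, i):
    # alternative: emit a named placeholder and collect the value separately
    name = "%s_%d" % (field, i)
    params[name] = value
    return "%s = :%s" % (field, name)


params = {}
broken = inline_condition("descr", "3' valve")  # embedded quote breaks the SQL
safe = bound_condition("descr", "3' valve", params, 0)
print(broken)         # descr=='3' valve'
print(safe, params)   # descr = :descr_0 {'descr_0': "3' valve"}
```

The varchar(800) description field mentioned in the thread is exactly the kind of column where a stray quote would surface, so the bound form is safer even when injection is not a primary concern.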