It would be interesting to see the profile of the textual SQL version.
It looks like most of the time is being spent inside pyodbc, rather
than SQLAlchemy, so I guess it must be something to do with the
processing of bind parameters. How many parameters are being sent in
per query? ie. what is len(id_batch) * len(cls.SQL_PK)?

You could try playing with your batch sizes to see what sort of effect that has.

Simon

On Wed, Aug 30, 2017 at 4:01 PM, Ken MacKenzie <deviloc...@gmail.com> wrote:
> So I implemented this version with one change.  I moved this line:
>
> cols = [getattr(cls, colname) for colname in cls.SQL_PK]
>
> To the beginning before the loop, actually before the outer loop that parses
> the batch.
>
> However, the new version is MUCH slower than the original, so I profiled it
> and this is what I get:
>
>  %prun import db_api_lib.del_test
>          1421296 function calls (1277536 primitive calls) in 620.434 seconds
>
>    Ordered by: internal time
>
>    ncalls  tottime  percall  cumtime  percall filename:lineno(function)
>        12  615.512   51.293  615.512   51.293 {method 'execute' of
> 'pyodbc.Cursor' objects}
> 208876/208271    0.316    0.000    0.332    0.000 {built-in method
> isinstance}
> 96876/96874    0.184    0.000    0.185    0.000 {built-in method getattr}
>     11123    0.137    0.000    0.242    0.000 elements.py:2927(__init__)
>  38970/10    0.136    0.000    1.502    0.150
> visitors.py:75(_compiler_dispatch)
>     11123    0.128    0.000    0.397    0.000 elements.py:861(__init__)
> 44492/11123    0.127    0.000    1.504    0.000 operators.py:297(__eq__)
>      5568    0.116    0.000    0.617    0.000 elements.py:1860(_construct)
>     11132    0.113    0.000    0.246    0.000 compiler.py:676(visit_column)
>     11123    0.111    0.000    1.039    0.000
> compiler.py:1040(_generate_generic_binary)
>      5567    0.108    0.000    0.108    0.000 {method 'fetchone' of
> 'pyodbc.Cursor' objects}
>     11123    0.099    0.000    0.646    0.000
> default_comparator.py:290(_check_literal)
>     16684    0.095    0.000    0.195    0.000
> operators.py:1007(is_precedent)
>     11123    0.094    0.000    1.010    0.000
> default_comparator.py:22(_boolean_compare)
> 33375/11129    0.090    0.000    1.445    0.000 {built-in method eq}
>     11123    0.075    0.000    0.465    0.000
> compiler.py:1164(visit_bindparam)
>         1    0.075    0.075  619.467  619.467 db_base.py:138(dict_delete)
>     11139    0.073    0.000    0.331    0.000 base.py:1128(decorate)
>     68831    0.070    0.000    0.070    0.000 {method 'get' of 'dict'
> objects}
>         1    0.068    0.068  620.471  620.471 del_test.py:1(<module>)
> 22250/5567    0.067    0.000    1.482    0.000 compiler.py:806(<genexpr>)
>     11125    0.066    0.000    0.102    0.000
> compiler.py:1265(_process_anon)
>     11123    0.065    0.000    1.179    0.000 compiler.py:1006(visit_binary)
>     11123    0.063    0.000    1.263    0.000 base.py:1306(visit_binary)
>     11123    0.061    0.000    0.458    0.000 elements.py:3818(_bind_param)
>     22269    0.058    0.000    0.097    0.000 compiler.py:2901(quote)
>     11125    0.058    0.000    0.261    0.000
> compiler.py:1246(_truncated_identifier)
>     11123    0.058    0.000    1.299    0.000 annotation.py:100(__eq__)
>     11123    0.056    0.000    1.066    0.000 type_api.py:60(operate)
>     56386    0.055    0.000    0.055    0.000 {method 'append' of 'list'
> objects}
>     11126    0.055    0.000    0.168    0.000 elements.py:3962(__new__)
>     11123    0.055    0.000    0.334    0.000
> compiler.py:1233(_truncate_bindparam)
>     11125    0.054    0.000    0.192    0.000 elements.py:4073(apply_map)
>     11206    0.051    0.000    0.089    0.000 elements.py:3918(__new__)
>     22246    0.051    0.000    0.051    0.000 elements.py:640(self_group)
> 22250/5567    0.049    0.000    1.493    0.000 compiler.py:804(<genexpr>)
>         2    0.048    0.024    0.049    0.024 {built-in method connect}
>     11123    0.047    0.000    1.386    0.000 properties.py:269(operate)
> 8689/3128    0.045    0.000    1.508    0.000 {method 'join' of 'str'
> objects}
>     11123    0.045    0.000    0.076    0.000
> compiler.py:959(_get_operator_dispatch)
>     11123    0.045    0.000    0.056    0.000
> compiler.py:1271(bindparam_string)
>     11123    0.040    0.000    0.411    0.000
> annotation.py:78(_compiler_dispatch)
> 15610/15322    0.038    0.000    0.043    0.000 {built-in method hasattr}
>      5567    0.038    0.000    0.294    0.000 elements.py:1883(<listcomp>)
>     11125    0.036    0.000    0.138    0.000
> _collections.py:728(__missing__)
>     16691    0.035    0.000    0.090    0.000
> elements.py:4216(_expression_literal_as_text)
>      5561    0.035    0.000    1.539    0.000 db_base.py:174(<listcomp>)
>     16704    0.033    0.000    0.055    0.000
> elements.py:4220(_literal_as_text)
>     11123    0.033    0.000    0.073    0.000
> type_api.py:504(coerce_compared_value)
> 31584/31469    0.031    0.000    0.032    0.000 {built-in method len}
>     11123    0.030    0.000    0.165    0.000 elements.py:2986(self_group)
>      5562    0.030    0.000    0.187    0.000 result.py:1156(fetchone)
>     11131    0.030    0.000    0.077    0.000 compiler.py:494(<genexpr>)
>     11123    0.029    0.000    0.047    0.000
> type_api.py:452(_cached_bind_processor)
>
> Any guidance to how I can improve the performance of this solution and
> remain DB neutral?
>
> On Wednesday, August 30, 2017 at 4:43:01 AM UTC-4, Simon King wrote:
>>
>> On Tue, Aug 29, 2017 at 9:49 PM, Ken MacKenzie <devil...@gmail.com> wrote:
>> > I have a query I have constructed and I had to deal with a composite
>> > primary
>> > key to select items;
>> >
>> >             q = s.query(cls)
>> >             or_cond = []
>> >             for x in id_batch:
>> >
>> >                 pkf = [text(f + "=='" + v + "'") for f,v in
>> > zip(cls.SQL_PK,
>> > x)]
>> >                 and_cond = (and_(*pkf))
>> >                 or_cond.append(and_cond)
>> >
>> >
>> >             q =
>> > q.filter(or_(*or_cond)).delete(synchronize_session=False)
>> >
>> > cls.SQL_PK is a tuple of the primary key fields for the model described
>> > by
>> > class.  This is a class method that is part of a inherited class to the
>> > model
>> >
>> > The current target is SQL Server.  My concern is using text('field =
>> > 'value'), is that going to work for other DB targets like say postgres?
>> >
>> > The first round of doing this I tried using a
>> > tuple_(*cls.SQL_PK).in_(id_batch), but that did not work and the
>> > resulting
>> > SQL id not work in SSMS leading me to believe that SQL Server (or at
>> > least
>> > the version we are using) does not support tuples.
>> >
>>
>> Textual SQL is not DB-neutral in general. Luckily, in this case you
>> shouldn't even need it. Try something like this:
>>
>> for x in id_batch:
>>     cols = [getattr(cls, colname) for colname in cls.SQL_PK]
>>     pkf = [(col == v) for (col, v) in zip(cols, x)]
>>     and_cond = and_(*pkf)
>>     or_cond.append(and_cond)
>>
>> ie. use "getattr" to retrieve the actual column property from the class.
>>
>> Hope that helps,
>>
>> Simon
>
> --
> SQLAlchemy -
> The Python SQL Toolkit and Object Relational Mapper
>
> http://www.sqlalchemy.org/
>
> To post example code, please provide an MCVE: Minimal, Complete, and
> Verifiable Example. See http://stackoverflow.com/help/mcve for a full
> description.
> ---
> You received this message because you are subscribed to the Google Groups
> "sqlalchemy" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to sqlalchemy+unsubscr...@googlegroups.com.
> To post to this group, send email to sqlalchemy@googlegroups.com.
> Visit this group at https://groups.google.com/group/sqlalchemy.
> For more options, visit https://groups.google.com/d/optout.

-- 
SQLAlchemy - 
The Python SQL Toolkit and Object Relational Mapper

http://www.sqlalchemy.org/

To post example code, please provide an MCVE: Minimal, Complete, and Verifiable 
Example.  See  http://stackoverflow.com/help/mcve for a full description.
--- 
You received this message because you are subscribed to the Google Groups 
"sqlalchemy" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to sqlalchemy+unsubscr...@googlegroups.com.
To post to this group, send email to sqlalchemy@googlegroups.com.
Visit this group at https://groups.google.com/group/sqlalchemy.
For more options, visit https://groups.google.com/d/optout.

Reply via email to