Re: [sqlalchemy] Very slow inserts when using sqlalchemy core

2018-10-04 Thread adam . botbyl
Mike,

Er, yeah, not TSQL; my old MSSQL days are showing :D I meant a str() of SQL.

I'm using "bindparams" to mean the required input format of the params that 
get re-bound into the SQL in the optimal way; for example, in 
execute(__table__.insert(), bindparams), bindparams would be a list of dicts.

I did a packet capture of execute(__table__.insert(), [dict(), ...]) and it 
was doing the executemany-style call (insert ... (a, b) values (c, d), (e, f)), 
so there was no hangup there.
I also re-coded this to call MySQLdb directly and run executemany(str, 
[tuple(), ...]); it takes about the same amount of time (~8s) as 
execute(__table__.insert(), [dict(), ...]), so there must be something 
suboptimal going on in the DBAPI, as I can generate the equivalent SQL and 
insert it in ~3 seconds with a normal execute(str).

I tried the test case with PyMySQL with just the execute(str) case and it 
took about 2 seconds, so I didn't give it much more attention; I'll try it 
again with executemany, though.
2018-10-04 17:47:02,373 - INFO - Executing for 10,550 inserts, 89 columns, 
approx 6,252,460 bytes of raw data...
2018-10-04 17:47:02,373 - INFO - Generating datasets
2018-10-04 17:47:02,550 - INFO - Executing with 
raw_connection().executemany(str, [tuple(a, b, ...)]
2018-10-04 17:47:10,061 - INFO - Committing

log.info('Generating datasets')
columns = list(data[0].keys())
sql_str = 'INSERT IGNORE INTO table ({}) VALUES ({})'.format(
    ', '.join(columns), ', '.join(['%s'] * len(columns)))
sql_data = [tuple(part.values()) for part in data]
log.info('Executing with raw_connection().executemany(str, [tuple(a, b, ...)])')
conn = sql.engine.raw_connection()
cur = conn.cursor()
cur.executemany(sql_str, sql_data)
log.info('Committing')
conn.commit()


However, I still don't understand why 
execute(__table__.insert().values(dict())) would be so much slower than 
execute(__table__.insert(), dict()), nor why execute(str, dict()) is that 
slow as well.



On Thursday, October 4, 2018 at 4:47:31 PM UTC-5, Mike Bayer wrote:
>
> On Thu, Oct 4, 2018 at 4:04 PM > wrote: 
> > 
> > Mike, 
> > 
> > No, I'm not calling execute 10k times, but I am doing what they do in 
> your example with test_core_insert 
> > 
> > My code for that is: 
> > log.info('Executing sqlalchemy insert() with bindparams as 
> execute param - session.execute(__table__.insert(), bindparams)') 
> > 
> sql.session.execute(models.target.__table__.insert().prefix_with('IGNORE'), 
> data) 
> > log.info('Committing') 
> > sql.session.commit() 
> > 
> > where data = [OrderedDict(), OrderedDict(), ...] (tried converting to a 
> regular [dict(), dict(),...] too with no gains; this method is the fastest 
> outside of direct or near direct DBAPI calls at about 8 seconds. 
> > 
> > However I read somewhere (can't find the source again to cite, but I 
> thought it was the sqlalchemy docs, I may be just not remembering 
> correctly) this query chains together individual INSERT statements instead 
> of doing a multi-insert 
> > i.e. insert into target (a, b, ...) values (c, d, ...); insert into 
> target (a, b, ...) values (e, f, ...); ... instead of a insert into target 
> (a, b, ...) values (c, d, ...), (e, f, ...) 
> > where the latter is the most optimal TSQL 
>
> I'm not sure what "TSQL" means in this context. 
>
> SQLAlchemy uses cursor.executemany() to invoke the statement just 
> once.  the DBAPI decides how it wants to batch it. Regardless, it 
> should move very fast.   On the Postgresql side, there's more of a big 
> speed difference between the way psycopg2 executes a statement 
> normally, and some special modes it provides that allow it to chain 
> together multiple statements in one batch.   But that's still behind 
> the executemany() call. 
>
>
> > 
> > If I'm wrong there and it does do a multi-insert, what is the difference 
> between execute(__table__.insert().values(bindparams)) and 
> execute(__table__.insert(), bindparams), and why would the former be ~26 
> seconds slower in this use case? 
>
> What is "bindparams" here, that is a list of many tuples ?If so 
> then yes, the DBAPI is not batching efficiently in the latter case. 
> Try PyMySQL or mysqlclient. 
>
>
>
>
>
> > 
> > 
> > 
> > 
> > On Thursday, October 4, 2018 at 2:05:59 PM UTC-5, Mike Bayer wrote: 
> >> 
> >> On Thu, Oct 4, 2018 at 2:45 PM  wrote: 
> >> > 
> >> > Hi all, 
> >> > 
> >> > Having some very and major performance issues using sqlalchemy to 
> insert data, been trying many different ways but none seem to hold even a 
> close candle to raw MySQLdb connections 
> >> > 
> >> > Versions: 
> >> > sqlalchemy: 1.2.9 
> >> > MySQLdb: 1.3.13 (bound with mysql+mysqldb connector) 
> >> > MySQL: 14.14 
> >> > Python: 3.6.5 
> >> > Ubuntu 18.04 x86_64 
> >> > 
> >> > Server: 
> >> > 8GB ram 
> >> > 2 

Re: [sqlalchemy] Very slow inserts when using sqlalchemy core

2018-10-04 Thread Mike Bayer
On Thu, Oct 4, 2018 at 4:04 PM  wrote:
>
> Mike,
>
> No, I'm not calling execute 10k times, but I am doing what they do in your 
> example with test_core_insert
>
> My code for that is:
> log.info('Executing sqlalchemy insert() with bindparams as 
> execute param - session.execute(__table__.insert(), bindparams)')
> 
> sql.session.execute(models.target.__table__.insert().prefix_with('IGNORE'), 
> data)
> log.info('Committing')
> sql.session.commit()
>
> where data = [OrderedDict(), OrderedDict(), ...] (tried converting to a 
> regular [dict(), dict(),...] too with no gains; this method is the fastest 
> outside of direct or near direct DBAPI calls at about 8 seconds.
>
> However I read somewhere (can't find the source again to cite, but I thought 
> it was the sqlalchemy docs, I may be just not remembering correctly) this 
> query chains together individual INSERT statements instead of doing a 
> multi-insert
> i.e. insert into target (a, b, ...) values (c, d, ...); insert into target 
> (a, b, ...) values (e, f, ...); ... instead of a insert into target (a, b, 
> ...) values (c, d, ...), (e, f, ...)
> where the latter is the most optimal TSQL

I'm not sure what "TSQL" means in this context.

SQLAlchemy uses cursor.executemany() to invoke the statement just
once; the DBAPI decides how it wants to batch it. Regardless, it
should move very fast. On the PostgreSQL side, there's a bigger
speed difference between the way psycopg2 executes a statement
normally and some special modes it provides that allow it to chain
together multiple statements in one batch. But that's still behind
the executemany() call.
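
The executemany() shape described above can be sketched with the stdlib
sqlite3 driver standing in for MySQLdb (the table and rows are made up for
illustration):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE target (a INTEGER, b INTEGER)")

rows = [(1, 2), (3, 4), (5, 6)]
# One call into the DBAPI; the driver decides how to batch the rows
# (individual INSERTs, a multi-VALUES INSERT, a prepared statement, ...).
conn.executemany("INSERT INTO target (a, b) VALUES (?, ?)", rows)
conn.commit()

print(conn.execute("SELECT COUNT(*) FROM target").fetchone()[0])  # 3
```

The application still makes exactly one call regardless of how the driver
chooses to batch the three rows.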


>
> If I'm wrong there and it does do a multi-insert, what is the difference 
> between execute(__table__.insert().values(bindparams)) and 
> execute(__table__.insert(), bindparams), and why would the former be ~26 
> seconds slower in this use case?

What is "bindparams" here? Is that a list of many tuples? If so,
then yes, the DBAPI is not batching efficiently in the latter case.
Try PyMySQL or mysqlclient.
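
For reference, switching drivers is only a change to the engine URL; the
credentials and database name below are placeholders, and each form requires
the corresponding driver package to be installed:

```python
from sqlalchemy import create_engine

# mysqlclient (the maintained MySQLdb fork), C implementation:
engine = create_engine("mysql+mysqldb://user:password@localhost/dbname")

# PyMySQL, pure Python:
engine = create_engine("mysql+pymysql://user:password@localhost/dbname")
```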





>
>
>
>
> On Thursday, October 4, 2018 at 2:05:59 PM UTC-5, Mike Bayer wrote:
>>
>> On Thu, Oct 4, 2018 at 2:45 PM  wrote:
>> >
>> > Hi all,
>> >
>> > Having some very and major performance issues using sqlalchemy to insert 
>> > data, been trying many different ways but none seem to hold even a close 
>> > candle to raw MySQLdb connections
>> >
>> > Versions:
>> > sqlalchemy: 1.2.9
>> > MySQLdb: 1.3.13 (bound with mysql+mysqldb connector)
>> > MySQL: 14.14
>> > Python: 3.6.5
>> > Ubuntu 18.04 x86_64
>> >
>> > Server:
>> > 8GB ram
>> > 2 x Intel(R) Xeon(R) CPU X5650  @ 2.67GHz threads running in VMWare ESXi
>> >
>> > Connected to localhost
>> >
>> > Base data info:
>> > 2018-10-04 13:19:56,519 - INFO - Executing for 10,550 inserts, 89 columns, 
>> > approx 6,252,460 bytes of raw data...
>> >
>> > Pre-formatted session execute ~2 seconds:
>> > 2018-10-04 13:19:56,520 - INFO - Building raw TSQL including all escaped 
>> > value()s - insert into x (a,b, ...) values (escape_string(c), 
>> > escape_string(d), ..) (escape_string(e), escape_string(f), ...)
>> > 2018-10-04 13:19:57,640 - INFO - Executing raw TSQL (9,218,872 bytes)  - 
>> > session.execute(tsql)
>> > 2018-10-04 13:19:59,435 - INFO - Committing
>> >
>> > Raw MySQLdb API ~1 second:
>> > 2018-10-04 13:19:59,436 - INFO - Executing same raw TSQL directly via 
>> > MySQLdb API - engine.raw_connection(); cursor.execute(tsql)
>> > 2018-10-04 13:20:00,326 - INFO - Committing
>> >
>> > Bindparam formatted TSQL converted to text() executed with bindparams as a 
>> > dict ~12 seconds to build text() and ~19 seconds to execute, ~31 seconds 
>> > total:
>> > 2018-10-04 13:20:00,326 - INFO - Building raw TSQL - insert into x (a, b, 
>> > ...) values (:a1, :b1, ..), (:a2, :b2, ...)
>> > 2018-10-04 13:20:01,017 - INFO - Rebuilding Input data for TSQL bindparams 
>> > - {a1: d, b1: e, a2: f, b2: g}
>> > 2018-10-04 13:20:02,234 - INFO - Building text() object - sql.text(tsql)
>> > 2018-10-04 13:20:14,259 - INFO - Executing text() with bindparams - 
>> > session.execute(text, bindparams)
>> > 2018-10-04 13:20:33,552 - INFO - Committing
>> >
>> > Bindparam formatted TSQL NOT converted to text, executed with bindparams 
>> > as a dict: ~33 seconds
>> > 2018-10-04 13:20:33,552 - INFO - Executing raw TSQL with bindparams - 
>> > session.execute(tsql, bindparams)
>> > 2018-10-04 13:21:06,307 - INFO - Committing
>> >
>> > Using a sqlalchemy class table insert() with using .values() ~34 seconds:
>> > 2018-10-04 13:21:06,311 - INFO - Executing sqlalchemy insert w/ bindparams 
>> > as values - session.execute(__table__.insert().values(bindparams))
>> > 2018-10-04 13:21:40,808 - INFO - Committing
>> >
>> > Using a sqlalchemy class table insert() with passing bindparams to 
>> > session.execute() ~8 seconds:
>> > 2018-10-04 13:21:40,809 - INFO - Executing sqlalchemy insert() with 
>> > bindparams as 

Re: [sqlalchemy] Very slow inserts when using sqlalchemy core

2018-10-04 Thread adam . botbyl
Mike,

No, I'm not calling execute() 10k times; I am doing what your 
test_core_insert example does.

My code for that is:
log.info('Executing sqlalchemy insert() with bindparams as 
execute param - session.execute(__table__.insert(), bindparams)')

sql.session.execute(models.target.__table__.insert().prefix_with('IGNORE'), 
data)
log.info('Committing')
sql.session.commit()

where data = [OrderedDict(), OrderedDict(), ...] (I tried converting to a 
regular [dict(), dict(), ...] too with no gains); this method is the fastest 
outside of direct or near-direct DBAPI calls, at about 8 seconds.

However, I read somewhere (I can't find the source again to cite; I thought 
it was the sqlalchemy docs, but I may just be misremembering) that this 
query chains together individual INSERT statements instead of doing a 
multi-row insert,
i.e. insert into target (a, b, ...) values (c, d, ...); insert into target 
(a, b, ...) values (e, f, ...); ... instead of insert into target (a, b, 
...) values (c, d, ...), (e, f, ...),
where the latter is the most optimal TSQL

If I'm wrong there and it does do a multi-insert, what is the difference 
between execute(__table__.insert().values(bindparams)) and 
execute(__table__.insert(), bindparams), and why would the former be ~26 
seconds slower in this use case?
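
For what it's worth, the two call styles compile to different statements; a
quick way to see the difference, using a made-up two-column table:

```python
from sqlalchemy import Column, Integer, MetaData, Table

metadata = MetaData()
target = Table("target", metadata,
               Column("a", Integer), Column("b", Integer))
rows = [{"a": 1, "b": 2}, {"a": 3, "b": 4}]

# insert().values(list) inlines every row into ONE statement: each value
# becomes its own bound parameter, so a 10k-row list means a huge statement
# that must be compiled as a whole.
print(target.insert().values(rows))

# insert() passed to execute() with the list compiles once with a single
# parameter set; the list of dicts is handed to cursor.executemany().
print(target.insert())
```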




On Thursday, October 4, 2018 at 2:05:59 PM UTC-5, Mike Bayer wrote:
>
> On Thu, Oct 4, 2018 at 2:45 PM > wrote: 
> > 
> > Hi all, 
> > 
> > Having some very and major performance issues using sqlalchemy to insert 
> data, been trying many different ways but none seem to hold even a close 
> candle to raw MySQLdb connections 
> > 
> > Versions: 
> > sqlalchemy: 1.2.9 
> > MySQLdb: 1.3.13 (bound with mysql+mysqldb connector) 
> > MySQL: 14.14 
> > Python: 3.6.5 
> > Ubuntu 18.04 x86_64 
> > 
> > Server: 
> > 8GB ram 
> > 2 x Intel(R) Xeon(R) CPU X5650  @ 2.67GHz threads running in VMWare ESXi 
> > 
> > Connected to localhost 
> > 
> > Base data info: 
> > 2018-10-04 13:19:56,519 - INFO - Executing for 10,550 inserts, 89 
> columns, approx 6,252,460 bytes of raw data... 
> > 
> > Pre-formatted session execute ~2 seconds: 
> > 2018-10-04 13:19:56,520 - INFO - Building raw TSQL including all escaped 
> value()s - insert into x (a,b, ...) values (escape_string(c), 
> escape_string(d), ..) (escape_string(e), escape_string(f), ...) 
> > 2018-10-04 13:19:57,640 - INFO - Executing raw TSQL (9,218,872 bytes)  - 
> session.execute(tsql) 
> > 2018-10-04 13:19:59,435 - INFO - Committing 
> > 
> > Raw MySQLdb API ~1 second: 
> > 2018-10-04 13:19:59,436 - INFO - Executing same raw TSQL directly via 
> MySQLdb API - engine.raw_connection(); cursor.execute(tsql) 
> > 2018-10-04 13:20:00,326 - INFO - Committing 
> > 
> > Bindparam formatted TSQL converted to text() executed with bindparams as 
> a dict ~12 seconds to build text() and ~19 seconds to execute, ~31 seconds 
> total: 
> > 2018-10-04 13:20:00,326 - INFO - Building raw TSQL - insert into x (a, 
> b, ...) values (:a1, :b1, ..), (:a2, :b2, ...) 
> > 2018-10-04 13:20:01,017 - INFO - Rebuilding Input data for TSQL 
> bindparams - {a1: d, b1: e, a2: f, b2: g} 
> > 2018-10-04 13:20:02,234 - INFO - Building text() object - sql.text(tsql) 
> > 2018-10-04 13:20:14,259 - INFO - Executing text() with bindparams - 
> session.execute(text, bindparams) 
> > 2018-10-04 13:20:33,552 - INFO - Committing 
> > 
> > Bindparam formatted TSQL NOT converted to text, executed with bindparams 
> as a dict: ~33 seconds 
> > 2018-10-04 13:20:33,552 - INFO - Executing raw TSQL with bindparams - 
> session.execute(tsql, bindparams) 
> > 2018-10-04 13:21:06,307 - INFO - Committing 
> > 
> > Using a sqlalchemy class table insert() with using .values() ~34 
> seconds: 
> > 2018-10-04 13:21:06,311 - INFO - Executing sqlalchemy insert w/ 
> bindparams as values - 
> session.execute(__table__.insert().values(bindparams)) 
> > 2018-10-04 13:21:40,808 - INFO - Committing 
> > 
> > Using a sqlalchemy class table insert() with passing bindparams to 
> session.execute() ~8 seconds: 
> > 2018-10-04 13:21:40,809 - INFO - Executing sqlalchemy insert() with 
> bindparams as execute param - session.execute(__table__.insert(), 
> bindparams) 
> > 2018-10-04 13:21:48,084 - INFO - Committing 
> > 
> > Obviously the raw db api will be the fastest as there's little to no 
> processing or overhead, I would think session.execute(str) would come 
> closer to matching that, but its not too far off. 
>
> A Core insert should be only  slightly slower than the raw DBAPI, 
> however, if you're inserting lots of rows, you need to be using 
> executemany style, that is, a single execute() call with a list of 
> parameters.   There's no code examples here so I don't know what 
> you're actually doing, if you are running session.execute() itself 10K 
> times, that's the wrong way to do it.  Check out 
>
> 

Re: [sqlalchemy] Very slow inserts when using sqlalchemy core

2018-10-04 Thread Mike Bayer
On Thu, Oct 4, 2018 at 2:45 PM  wrote:
>
> Hi all,
>
> Having some very and major performance issues using sqlalchemy to insert 
> data, been trying many different ways but none seem to hold even a close 
> candle to raw MySQLdb connections
>
> Versions:
> sqlalchemy: 1.2.9
> MySQLdb: 1.3.13 (bound with mysql+mysqldb connector)
> MySQL: 14.14
> Python: 3.6.5
> Ubuntu 18.04 x86_64
>
> Server:
> 8GB ram
> 2 x Intel(R) Xeon(R) CPU X5650  @ 2.67GHz threads running in VMWare ESXi
>
> Connected to localhost
>
> Base data info:
> 2018-10-04 13:19:56,519 - INFO - Executing for 10,550 inserts, 89 columns, 
> approx 6,252,460 bytes of raw data...
>
> Pre-formatted session execute ~2 seconds:
> 2018-10-04 13:19:56,520 - INFO - Building raw TSQL including all escaped 
> value()s - insert into x (a,b, ...) values (escape_string(c), 
> escape_string(d), ..) (escape_string(e), escape_string(f), ...)
> 2018-10-04 13:19:57,640 - INFO - Executing raw TSQL (9,218,872 bytes)  - 
> session.execute(tsql)
> 2018-10-04 13:19:59,435 - INFO - Committing
>
> Raw MySQLdb API ~1 second:
> 2018-10-04 13:19:59,436 - INFO - Executing same raw TSQL directly via MySQLdb 
> API - engine.raw_connection(); cursor.execute(tsql)
> 2018-10-04 13:20:00,326 - INFO - Committing
>
> Bindparam formatted TSQL converted to text() executed with bindparams as a 
> dict ~12 seconds to build text() and ~19 seconds to execute, ~31 seconds 
> total:
> 2018-10-04 13:20:00,326 - INFO - Building raw TSQL - insert into x (a, b, 
> ...) values (:a1, :b1, ..), (:a2, :b2, ...)
> 2018-10-04 13:20:01,017 - INFO - Rebuilding Input data for TSQL bindparams - 
> {a1: d, b1: e, a2: f, b2: g}
> 2018-10-04 13:20:02,234 - INFO - Building text() object - sql.text(tsql)
> 2018-10-04 13:20:14,259 - INFO - Executing text() with bindparams - 
> session.execute(text, bindparams)
> 2018-10-04 13:20:33,552 - INFO - Committing
>
> Bindparam formatted TSQL NOT converted to text, executed with bindparams as a 
> dict: ~33 seconds
> 2018-10-04 13:20:33,552 - INFO - Executing raw TSQL with bindparams - 
> session.execute(tsql, bindparams)
> 2018-10-04 13:21:06,307 - INFO - Committing
>
> Using a sqlalchemy class table insert() with using .values() ~34 seconds:
> 2018-10-04 13:21:06,311 - INFO - Executing sqlalchemy insert w/ bindparams as 
> values - session.execute(__table__.insert().values(bindparams))
> 2018-10-04 13:21:40,808 - INFO - Committing
>
> Using a sqlalchemy class table insert() with passing bindparams to 
> session.execute() ~8 seconds:
> 2018-10-04 13:21:40,809 - INFO - Executing sqlalchemy insert() with 
> bindparams as execute param - session.execute(__table__.insert(), bindparams)
> 2018-10-04 13:21:48,084 - INFO - Committing
>
> Obviously the raw db api will be the fastest as there's little to no 
> processing or overhead, I would think session.execute(str) would come closer 
> to matching that, but its not too far off.

A Core insert should be only slightly slower than the raw DBAPI;
however, if you're inserting lots of rows, you need to be using
executemany style, that is, a single execute() call with a list of
parameters. There are no code examples here, so I don't know what
you're actually doing; if you are running session.execute() itself 10K
times, that's the wrong way to do it. Check out
https://docs.sqlalchemy.org/en/latest/_modules/examples/performance/bulk_inserts.html
for usage examples of how to bulk-insert with the different APIs,
including Core (test_core_insert specifically). Additionally, I don't
know whether you are using bound parameters with your raw MySQL code,
and if so, whether there is some unicode processing going on in the
SQLAlchemy version that might be slowing things down.

A complete code example like those in
https://docs.sqlalchemy.org/en/latest/_modules/examples/performance/bulk_inserts.html
will let me see what you are doing.
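
A minimal, runnable sketch of the test_core_insert pattern referenced above,
using an in-memory SQLite database instead of MySQL for illustration:

```python
from sqlalchemy import Column, Integer, MetaData, Table, create_engine

engine = create_engine("sqlite://")
metadata = MetaData()
target = Table("target", metadata,
               Column("a", Integer), Column("b", Integer))
metadata.create_all(engine)

rows = [{"a": i, "b": i * 2} for i in range(10000)]

# One execute() call with a list of parameter dicts -> one
# cursor.executemany() on the DBAPI, inside a single transaction.
with engine.begin() as conn:
    conn.execute(target.insert(), rows)

with engine.connect() as conn:
    print(len(conn.execute(target.select()).fetchall()))
```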


>
> Having sqlalchemy.text(str) take 12 seconds to execute seems rather 
> excessive, but assuming that's the hang with the other two 
> execute(__table__.insert().values(bindparams)) statement too, however why 
> would excute(str, bindparams) do this? Does it shadow text() as well if arg0 
> is a str?
>
> Now the oddest part of all, doing 10,550 individual inserts in the last 
> example is roughly 26 seconds faster than a single transaction with 10,550 
> sets of values. This just does not make transactional sense.
>
> Is there something I can be doing better while utilizing core, or am I 
> over-expecting how sqlalchemy to perform with datasets like this?
>
>
> Hopefully this is enough detail, but I can provide more upon request.
>
> Thank you!
>
> --
> SQLAlchemy -
> The Python SQL Toolkit and Object Relational Mapper
>
> http://www.sqlalchemy.org/
>
> To post example code, please provide an MCVE: Minimal, Complete, and 
> Verifiable Example. See http://stackoverflow.com/help/mcve for a full 
> description.
> ---
> You received this message because you are subscribed to the Google Groups 
> "sqlalchemy" 

[sqlalchemy] Very slow inserts when using sqlalchemy core

2018-10-04 Thread adam . botbyl
Hi all, 

Having some major performance issues using sqlalchemy to insert data; I've 
been trying many different ways, but none hold even a close candle to raw 
MySQLdb connections.

Versions:
sqlalchemy: 1.2.9
MySQLdb: 1.3.13 (bound with mysql+mysqldb connector)
MySQL: 14.14
Python: 3.6.5
Ubuntu 18.04 x86_64

Server:
8GB ram
2 x Intel(R) Xeon(R) CPU X5650  @ 2.67GHz threads running in VMWare ESXi

Connected to localhost

Base data info:
2018-10-04 13:19:56,519 - INFO - Executing for 10,550 inserts, 89 columns, 
approx 6,252,460 bytes of raw data...

Pre-formatted session execute ~2 seconds:
2018-10-04 13:19:56,520 - INFO - Building raw TSQL including all escaped 
value()s - insert into x (a,b, ...) values (escape_string(c), 
escape_string(d), ..) (escape_string(e), escape_string(f), ...)
2018-10-04 13:19:57,640 - INFO - Executing raw TSQL (9,218,872 bytes)  - 
session.execute(tsql)
2018-10-04 13:19:59,435 - INFO - Committing

Raw MySQLdb API ~1 second:
2018-10-04 13:19:59,436 - INFO - Executing same raw TSQL directly via 
MySQLdb API - engine.raw_connection(); cursor.execute(tsql)
2018-10-04 13:20:00,326 - INFO - Committing

Bindparam formatted TSQL converted to text() executed with bindparams as a 
dict ~12 seconds to build text() and ~19 seconds to execute, ~31 seconds 
total:
2018-10-04 13:20:00,326 - INFO - Building raw TSQL - insert into x (a, b, 
...) values (:a1, :b1, ..), (:a2, :b2, ...)
2018-10-04 13:20:01,017 - INFO - Rebuilding Input data for TSQL bindparams 
- {a1: d, b1: e, a2: f, b2: g}
2018-10-04 13:20:02,234 - INFO - Building text() object - sql.text(tsql)
2018-10-04 13:20:14,259 - INFO - Executing text() with bindparams - 
session.execute(text, bindparams)
2018-10-04 13:20:33,552 - INFO - Committing

Bindparam formatted TSQL NOT converted to text, executed with bindparams as 
a dict: ~33 seconds
2018-10-04 13:20:33,552 - INFO - Executing raw TSQL with bindparams - 
session.execute(tsql, bindparams)
2018-10-04 13:21:06,307 - INFO - Committing

Using a sqlalchemy class table insert() with using .values() ~34 seconds:
2018-10-04 13:21:06,311 - INFO - Executing sqlalchemy insert w/ bindparams 
as values - session.execute(__table__.insert().values(bindparams))
2018-10-04 13:21:40,808 - INFO - Committing

Using a sqlalchemy class table insert() with passing bindparams to 
session.execute() ~8 seconds:
2018-10-04 13:21:40,809 - INFO - Executing sqlalchemy insert() with 
bindparams as execute param - session.execute(__table__.insert(), 
bindparams)
2018-10-04 13:21:48,084 - INFO - Committing

Obviously the raw DB API will be the fastest, as there's little to no 
processing or overhead; I would have expected session.execute(str) to come 
closer to matching it, but it's not too far off.

Having sqlalchemy.text(str) take 12 seconds seems rather excessive, but I 
assume that's also the hang in the 
execute(__table__.insert().values(bindparams)) statement; why would 
execute(str, bindparams) do this, though? Does it go through text() as well 
if arg0 is a str?

Now the oddest part of all: doing 10,550 individual inserts in the last 
example is roughly 26 seconds faster than a single transaction with 10,550 
sets of values. This just does not make transactional sense.

Is there something I can do better while utilizing Core, or am I 
over-estimating how sqlalchemy should perform with datasets like this?


Hopefully this is enough detail, but I can provide more upon request.

Thank you!

-- 
SQLAlchemy - 
The Python SQL Toolkit and Object Relational Mapper

http://www.sqlalchemy.org/

To post example code, please provide an MCVE: Minimal, Complete, and Verifiable 
Example.  See  http://stackoverflow.com/help/mcve for a full description.
--- 
You received this message because you are subscribed to the Google Groups 
"sqlalchemy" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to sqlalchemy+unsubscr...@googlegroups.com.
To post to this group, send email to sqlalchemy@googlegroups.com.
Visit this group at https://groups.google.com/group/sqlalchemy.
For more options, visit https://groups.google.com/d/optout.


Re: [sqlalchemy] Issue on inserts statements for pg explain recipe

2018-10-04 Thread Mike Bayer
On Thu, Oct 4, 2018 at 5:10 AM  wrote:
>
> That's great ! Thanks a lot !
>
> Btw it seems that the
>
> self.inline = getattr(stmt, 'inline', None)
>
> is no longer necessary, I tryed my insert exemple with it commented and it 
> still runned fine.

Well, there's also an "inline" argument for Insert(), so maybe it was
trying to preserve that; I don't remember.

>
>
> Le jeudi 4 octobre 2018 05:23:02 UTC+2, Mike Bayer a écrit :
>>
>> The insert is by itself not a RETURNING and doesn't return any rows.
>> The actual exception raised is:
>>
>> sqlalchemy.exc.ResourceClosedError: This result object does not return
>> rows. It has been closed automatically.
>>
>> and you'll get this just if you run the insert by itself:
>>
>> returning_query = Foo.__table__.insert()
>> print(sess.execute(returning_query, {'id': 1}).fetchall())
>>
>> Python 3 is just also showing you the exception we catch internally
>> which is the "'NoneType' object has no attribute 'fetchall'"  one, but
>> that is normal, we are catching that to detect that this result
>> doesn't return any rows.
>>
>> the cursor is getting closed because for any DML that doesn't return
>> rows, it closes the cursor because there's no other point at which it
>> could naturally be closed automatically from the point of view of the
>> resultproxy.
>>
>> If you tell the insert statement to return rows, the cursor doesnt get
>> closed and the explain gets the rows back:
>>
>> returning_query = Foo.__table__.insert().returning(Foo.id)
>>
>> print(sess.execute(explain(returning_query), {"id": 1}).fetchall())
>>
>> EXPLAIN INSERT INTO foo (id) VALUES (%(id)s) RETURNING foo.id
>> 2018-10-03 23:15:38,272 INFO sqlalchemy.engine.base.Engine {'id': 1}
>> [('Insert on foo  (cost=0.00..0.01 rows=1 width=4)',), ('  ->  Result
>> (cost=0.00..0.01 rows=1 width=4)',)]
>>
>> to get the insert() to return the rows without returning being
>> present, we have to flip off the flags that are telling the execution
>> context that this is an INSERT, or more generally crud, that are
>> making it autoclose the cursor:
>>
>> @compiles(explain, 'postgresql')
>> def pg_explain(element, compiler, **kw):
>> text = "EXPLAIN "
>> if element.analyze:
>> text += "ANALYZE "
>> text += compiler.process(element.statement, **kw)
>> compiler.isinsert = compiler.isupdate = compiler.isdelete = False
>> return text
>>
>> that seems to fix the codepath that is looking for _returning also, so
>> I'll add that.
>>
>>
>>
>> On Wed, Oct 3, 2018 at 5:04 AM  wrote:
>> >
>> > Hello,
>> >
>> > I use the recipe for explain as described 
>> > (https://bitbucket.org/zzzeek/sqlalchemy/wiki/UsageRecipes/Explain), which 
>> > does the job completely for select statements (I have no idea how it works 
>> > however :p) in version 1.2.11
>> > The recipe at the moment is
>> >
>> > from sqlalchemy import *
>> > from sqlalchemy.ext.compiler import compiles
>> > from sqlalchemy.sql.expression import Executable, ClauseElement, 
>> > _literal_as_text
>> >
>> >
>> > class explain(Executable, ClauseElement):
>> > def __init__(self, stmt, analyze=False):
>> > self.statement = _literal_as_text(stmt)
>> > self.analyze = analyze
>> > # helps with INSERT statements
>> > self.inline = getattr(stmt, 'inline', None)
>> >
>> >
>> > @compiles(explain, 'postgresql')
>> > def pg_explain(element, compiler, **kw):
>> > text = "EXPLAIN "
>> > if element.analyze:
>> > text += "ANALYZE "
>> > text += compiler.process(element.statement, **kw)
>> > return text
>> >
>> >
>> > The recipe is not up to date, as it doesn't work on inserts, however by 
>> > replacing
>> >
>> > self.inline = getattr(stmt, 'inline', None)
>> > by
>> > self._returning = getattr(stmt, '_returning', None)
>> > self.table = getattr(stmt, 'table', None)
>> >
>> > It works on inserts with returning, however it crashes when I try to 
>> > fetchall() on explain of simple inserts with a "AttributeError: 'NoneType' 
>> > object has no attribute 'fetchall'"
>> > Below a code which have an error
>> >
>> >
>> >
>> > from sqlalchemy import *
>> > from sqlalchemy.ext.compiler import compiles
>> > from sqlalchemy.ext.declarative import declarative_base
>> > from sqlalchemy.orm import sessionmaker, aliased
>> >
>> >
>> >
>> >
>> > class explain(Executable, ClauseElement):
>> > def __init__(self, stmt, analyze=False):
>> > self.statement = _literal_as_text(stmt)
>> > self.analyze = analyze
>> > # helps with INSERT statements
>> > self._returning = getattr(stmt, '_returning', None)
>> >
>> >
>> > self.table = getattr(stmt, 'table', None)
>> >
>> >
>> >
>> >
>> > @compiles(explain, 'postgresql')
>> > def pg_explain(element, compiler, **kw):
>> > text = "EXPLAIN "
>> > if element.analyze:
>> > text += "ANALYZE "
>> > text += compiler.process(element.statement, **kw)
>> > return text
>> >
>> >
>> >
>> >
>> > Base = 

Re: [sqlalchemy] Re: Hybrid Property vs Hybrid Method Expression Names

2018-10-04 Thread Mike Bayer
A PR would at least remind me to look into it even if I don't use your
suggested verbiage as-is. Thanks!
On Thu, Oct 4, 2018 at 4:07 AM  wrote:
>
> To follow this up - what would be the best way to get these extra dragons in? 
> I would be happy to submit a PR or something if that is easier.
>
> On Friday, September 14, 2018 at 10:32:52 AM UTC+2, ja...@cryptosense.com 
> wrote:
>>
>> Thanks for the help - I had missed the "copy vs modifying in place" 
>> difference between hybrid_method and hybrid_property.
>>
>> I think adding another dragon would be helpful here, probably located in 
>> http://docs.sqlalchemy.org/en/latest/orm/extensions/hybrid.html#defining-expression-behavior-distinct-from-attribute-behavior.
>>  I'm happy to move this thread into a docs issue if that would be helpful.
>>
>> On Thursday, September 13, 2018 at 5:45:05 PM UTC+2, Mike Bayer wrote:
>>>
>>> On Thu, Sep 13, 2018 at 7:55 AM,   wrote:
>>> > Update: I have just found
>>> > http://docs.sqlalchemy.org/en/latest/changelog/migration_12.html#hybrid-attributes-support-reuse-among-subclasses-redefinition-of-getter
>>> > which documents that getters and setters must have the same name as the
>>> > original expression.
>>> >
>>> > Can I just check that it is expected for expressions to have this
>>> > requirement? If so, is it worth opening a docs issue to add this to the 
>>> > main
>>> > hybrid property docs?
>>>
>>> this is the mechanics of Python, when you say:
>>>
>>> @mything.foobar
>>> def _myotherthing(...)
>>>
>>>
>>> you are assigning to the name "_myotherthing".   Since
>>> @hybrid_property now creates a copy when any modifier is called, the
>>> original hybrid you have at "mything" was not changed.
>>>
>>> All the documentation examples at
>>> http://docs.sqlalchemy.org/en/latest/orm/extensions/hybrid.html
>>> indicate using the same method name for each modification.I can
>>> add more dragons to the hybrid docs as well clarifying that this
>>> naming scheme is intentional and required, if that helps.
>>>
>>>
>>>
>>> >
>>> > --
>>> > SQLAlchemy -
>>> > The Python SQL Toolkit and Object Relational Mapper
>>> >
>>> > http://www.sqlalchemy.org/
>>> >
>>> > To post example code, please provide an MCVE: Minimal, Complete, and
>>> > Verifiable Example. See http://stackoverflow.com/help/mcve for a full
>>> > description.
>>> > ---
>>> > You received this message because you are subscribed to the Google Groups
>>> > "sqlalchemy" group.
>>> > To unsubscribe from this group and stop receiving emails from it, send an
>>> > email to sqlalchemy+...@googlegroups.com.
>>> > To post to this group, send email to sqlal...@googlegroups.com.
>>> > Visit this group at https://groups.google.com/group/sqlalchemy.
>>> > For more options, visit https://groups.google.com/d/optout.
>



Re: [sqlalchemy] Issue on inserts statements for pg explain recipe

2018-10-04 Thread nicolas . rolin
That's great! Thanks a lot!

By the way, it seems that the

self.inline = getattr(stmt, 'inline', None)

line is no longer necessary; I tried my insert example with it commented out and it
still ran fine.


On Thursday, October 4, 2018 at 5:23:02 AM UTC+2, Mike Bayer wrote:
>
> The insert is by itself not a RETURNING and doesn't return any rows. 
> The actual exception raised is: 
>
> sqlalchemy.exc.ResourceClosedError: This result object does not return 
> rows. It has been closed automatically. 
>
> and you'll get this just if you run the insert by itself: 
>
> returning_query = Foo.__table__.insert() 
> print(sess.execute(returning_query, {'id': 1}).fetchall()) 
>
> Python 3 is just also showing you the exception we catch internally 
> which is the "'NoneType' object has no attribute 'fetchall'"  one, but 
> that is normal, we are catching that to detect that this result 
> doesn't return any rows. 
>
> the cursor is getting closed because for any DML that doesn't return 
> rows, it closes the cursor because there's no other point at which it 
> could naturally be closed automatically from the point of view of the 
> resultproxy. 
>
> If you tell the insert statement to return rows, the cursor doesn't get 
> closed and the explain gets the rows back: 
>
> returning_query = Foo.__table__.insert().returning(Foo.id) 
>
> print(sess.execute(explain(returning_query), {"id": 1}).fetchall()) 
>
> EXPLAIN INSERT INTO foo (id) VALUES (%(id)s) RETURNING foo.id 
> 2018-10-03 23:15:38,272 INFO sqlalchemy.engine.base.Engine {'id': 1} 
> [('Insert on foo  (cost=0.00..0.01 rows=1 width=4)',), ('  ->  Result 
> (cost=0.00..0.01 rows=1 width=4)',)] 
>
> to get the insert() to return the rows without returning being 
> present, we have to flip off the flags that are telling the execution 
> context that this is an INSERT, or more generally crud, that are 
> making it autoclose the cursor: 
>
> @compiles(explain, 'postgresql') 
> def pg_explain(element, compiler, **kw): 
> text = "EXPLAIN " 
> if element.analyze: 
> text += "ANALYZE " 
> text += compiler.process(element.statement, **kw) 
> compiler.isinsert = compiler.isupdate = compiler.isdelete = False 
> return text 
>
> that seems to fix the codepath that is looking for _returning also, so 
> I'll add that. 
>
>
>
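[Editor's note: a rough stdlib illustration of the distinction described above, using SQLite's EXPLAIN rather than PostgreSQL's (the plan output format differs, and the table name here is invented): a plain INSERT yields no result rows, while an EXPLAIN of the same statement does.]

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE foo (id INTEGER PRIMARY KEY)")

# The INSERT by itself is DML and produces no result rows,
# which is why a result wrapper would close its cursor early.
rows = conn.execute("INSERT INTO foo (id) VALUES (1)").fetchall()

# Wrapped in EXPLAIN, the same statement becomes row-returning:
# SQLite emits one row per VDBE opcode of the compiled program.
plan = conn.execute("EXPLAIN INSERT INTO foo (id) VALUES (2)").fetchall()

print(len(rows), len(plan))  # rows is empty; plan has at least one row
conn.close()
```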
> On Wed, Oct 3, 2018 at 5:04 AM > wrote: 
> > 
> > Hello, 
> > 
> > I use the recipe for explain as described
> > (https://bitbucket.org/zzzeek/sqlalchemy/wiki/UsageRecipes/Explain), which
> > does the job completely for select statements (I have no idea how it works,
> > however :p) in version 1.2.11.
> > The recipe at the moment is 
> > 
> > from sqlalchemy import *
> > from sqlalchemy.ext.compiler import compiles
> > from sqlalchemy.sql.expression import Executable, ClauseElement, _literal_as_text
> > 
> > 
> > class explain(Executable, ClauseElement):
> >     def __init__(self, stmt, analyze=False):
> >         self.statement = _literal_as_text(stmt)
> >         self.analyze = analyze
> >         # helps with INSERT statements
> >         self.inline = getattr(stmt, 'inline', None)
> > 
> > 
> > @compiles(explain, 'postgresql')
> > def pg_explain(element, compiler, **kw):
> >     text = "EXPLAIN "
> >     if element.analyze:
> >         text += "ANALYZE "
> >     text += compiler.process(element.statement, **kw)
> >     return text
> > 
> > 
> > The recipe is not up to date, as it doesn't work on inserts; however, by
> > replacing
> > 
> > self.inline = getattr(stmt, 'inline', None)
> > 
> > with
> > 
> > self._returning = getattr(stmt, '_returning', None)
> > self.table = getattr(stmt, 'table', None)
> > 
> > it works on inserts with RETURNING. However, it crashes when I try to
> > fetchall() on an explain of a simple insert, with "AttributeError: 'NoneType'
> > object has no attribute 'fetchall'".
> > Below is code which produces the error:
> > 
> > 
> > 
> > from sqlalchemy import *
> > from sqlalchemy.ext.compiler import compiles
> > from sqlalchemy.ext.declarative import declarative_base
> > from sqlalchemy.orm import sessionmaker, aliased
> > from sqlalchemy.sql.expression import Executable, ClauseElement, _literal_as_text
> > 
> > 
> > class explain(Executable, ClauseElement):
> >     def __init__(self, stmt, analyze=False):
> >         self.statement = _literal_as_text(stmt)
> >         self.analyze = analyze
> >         # helps with INSERT statements
> >         self._returning = getattr(stmt, '_returning', None)
> >         self.table = getattr(stmt, 'table', None)
> > 
> > 
> > @compiles(explain, 'postgresql')
> > def pg_explain(element, compiler, **kw):
> >     text = "EXPLAIN "
> >     if element.analyze:
> >         text += "ANALYZE "
> >     text += compiler.process(element.statement, **kw)
> >     return text
> > 
> > 
> > Base = declarative_base()
> > 
> > 
> > class Foo(Base):
> >     __tablename__ = 'foo'
> >     id = Column(Integer, primary_key=True)
> > 
> > 
> > engine = create_engine(postgres_uri, echo=False)
> > 
> > 
> > 

Re: [sqlalchemy] Re: Hybrid Property vs Hybrid Method Expression Names

2018-10-04 Thread james
To follow this up - what would be the best way to get these extra dragons 
in? I would be happy to submit a PR or something if that is easier.

On Friday, September 14, 2018 at 10:32:52 AM UTC+2, ja...@cryptosense.com 
wrote:
>
> Thanks for the help - I had missed the "copy vs modifying in place" 
> difference between hybrid_method and hybrid_property.
>
> I think adding another dragon would be helpful here, probably located in 
> http://docs.sqlalchemy.org/en/latest/orm/extensions/hybrid.html#defining-expression-behavior-distinct-from-attribute-behavior.
>  
> I'm happy to move this thread into a docs issue if that would be helpful.
>
> On Thursday, September 13, 2018 at 5:45:05 PM UTC+2, Mike Bayer wrote:
>>
>> On Thu, Sep 13, 2018 at 7:55 AM,   wrote: 
>> > Update: I have just found 
>> > 
>> http://docs.sqlalchemy.org/en/latest/changelog/migration_12.html#hybrid-attributes-support-reuse-among-subclasses-redefinition-of-getter
>>  
>> > which documents that getters and setters must have the same name as the 
>> > original expression. 
>> > 
>> > Can I just check that it is expected for expressions to have this 
>> > requirement? If so, is it worth opening a docs issue to add this to the 
>> main 
>> > hybrid property docs? 
>>
>> this is the mechanics of Python, when you say: 
>>
>> @mything.foobar 
>> def _myotherthing(...) 
>>
>>
>> you are assigning to the name "_myotherthing".   Since 
>> @hybrid_property now creates a copy when any modifier is called, the 
>> original hybrid you have at "mything" was not changed. 
>>
>> All the documentation examples at 
>> http://docs.sqlalchemy.org/en/latest/orm/extensions/hybrid.html 
>> indicate using the same method name for each modification.  I can 
>> add more dragons to the hybrid docs as well clarifying that this 
>> naming scheme is intentional and required, if that helps. 
>>
>>
>>
>> > 
>>
>
