Mike,

No, I'm not calling execute() 10k times, but I am doing what your 
example does in test_core_insert.

My code for that is:
            log.info('Executing sqlalchemy insert() with bindparams as execute param - session.execute(__table__.insert(), bindparams)')
            sql.session.execute(models.target.__table__.insert().prefix_with('IGNORE'), data)
            log.info('Committing')
            sql.session.commit()

where data = [OrderedDict(), OrderedDict(), ...] (I tried converting to a 
regular [dict(), dict(), ...] too, with no gains). This method is the 
fastest outside of direct or near-direct DBAPI calls, at about 8 seconds.
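For reference, here's a minimal, self-contained sketch of that executemany-style Core insert. It runs against an in-memory SQLite engine with a made-up two-column table standing in for models.target and MySQL, so the names are illustrative, not my actual schema (and I've dropped the MySQL-specific .prefix_with('IGNORE')):

```python
# Hedged sketch: executemany-style Core insert against in-memory SQLite.
# Table/column names are placeholders, not the real schema.
from sqlalchemy import (create_engine, MetaData, Table, Column,
                        Integer, String, text)

engine = create_engine("sqlite://")
meta = MetaData()
target = Table(
    "target", meta,
    Column("id", Integer, primary_key=True),
    Column("a", String(32)),
    Column("b", String(32)),
)
meta.create_all(engine)

# A list of dicts passed as the second argument to execute() takes the
# DBAPI executemany path: one compiled statement, many parameter sets.
data = [{"a": "x%d" % i, "b": "y%d" % i} for i in range(1000)]
with engine.begin() as conn:
    conn.execute(target.insert(), data)
    inserted = conn.execute(text("SELECT COUNT(*) FROM target")).scalar()
```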

However, I read somewhere (I can't find the source again to cite; I 
thought it was the sqlalchemy docs, but I may be misremembering) that 
this call chains together individual INSERT statements instead of doing a 
multi-row insert, i.e.

    insert into target (a, b, ...) values (c, d, ...); insert into target (a, b, ...) values (e, f, ...); ...

instead of

    insert into target (a, b, ...) values (c, d, ...), (e, f, ...)

where the latter is the more optimal TSQL.
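As a sanity check (a sketch, not anything authoritative), one can compile both constructs against the MySQL dialect and inspect the generated SQL: insert().values([...]) renders a single multi-row VALUES clause, while a bare insert() renders one row's worth of placeholders and leaves the rows to executemany at execution time:

```python
# Sketch: compare the SQL each construct generates under the MySQL dialect.
from sqlalchemy import MetaData, Table, Column, String
from sqlalchemy.dialects import mysql

meta = MetaData()
target = Table("target", meta,
               Column("a", String(10)), Column("b", String(10)))

# .values() with a list bakes every row into one multi-row VALUES clause:
multi = target.insert().values([{"a": "c", "b": "d"},
                                {"a": "e", "b": "f"}])
multi_sql = str(multi.compile(dialect=mysql.dialect()))

# A bare insert() has a single set of placeholders; the parameter sets
# are handed to cursor.executemany() when the statement is executed:
plain_sql = str(target.insert().compile(dialect=mysql.dialect()))

print(multi_sql)
print(plain_sql)
```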

If I'm wrong there and it does do a multi-row insert, what is the 
difference between execute(__table__.insert().values(bindparams)) and 
execute(__table__.insert(), bindparams), and why would the former be ~26 
seconds slower in this use case?
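In case it helps, this is the kind of side-by-side timing I mean. It's a sketch against in-memory SQLite with a throwaway table, not my real MySQL setup, so the absolute numbers won't match mine (and the row count is kept small so older SQLite parameter limits aren't hit):

```python
# Sketch: time insert().values(rows) vs. execute(insert(), rows).
import time
from sqlalchemy import (create_engine, MetaData, Table, Column,
                        String, text)

engine = create_engine("sqlite://")
meta = MetaData()
t = Table("t", meta, Column("a", String(16)), Column("b", String(16)))
meta.create_all(engine)

rows = [{"a": str(i), "b": str(i)} for i in range(400)]

with engine.begin() as conn:
    start = time.perf_counter()
    conn.execute(t.insert().values(rows))   # every row baked into the construct
    values_secs = time.perf_counter() - start

    start = time.perf_counter()
    conn.execute(t.insert(), rows)          # executemany path
    executemany_secs = time.perf_counter() - start

    total = conn.execute(text("SELECT COUNT(*) FROM t")).scalar()

print(values_secs, executemany_secs, total)
```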




On Thursday, October 4, 2018 at 2:05:59 PM UTC-5, Mike Bayer wrote:
>
> On Thu, Oct 4, 2018 at 2:45 PM <adam....@gmail.com> wrote: 
> > 
> > Hi all, 
> > 
> > Having some major performance issues using sqlalchemy to insert 
> data; I've been trying many different ways, but none hold even a close 
> candle to raw MySQLdb connections 
> > 
> > Versions: 
> > sqlalchemy: 1.2.9 
> > MySQLdb: 1.3.13 (bound with mysql+mysqldb connector) 
> > MySQL: 14.14 
> > Python: 3.6.5 
> > Ubuntu 18.04 x86_64 
> > 
> > Server: 
> > 8GB ram 
> > 2 x Intel(R) Xeon(R) CPU X5650  @ 2.67GHz threads running in VMWare ESXi 
> > 
> > Connected to localhost 
> > 
> > Base data info: 
> > 2018-10-04 13:19:56,519 - INFO - Executing for 10,550 inserts, 89 
> columns, approx 6,252,460 bytes of raw data... 
> > 
> > Pre-formatted session execute ~2 seconds: 
> > 2018-10-04 13:19:56,520 - INFO - Building raw TSQL including all escaped 
> value()s - insert into x (a,b, ...) values (escape_string(c), 
> escape_string(d), ..) (escape_string(e), escape_string(f), ...) 
> > 2018-10-04 13:19:57,640 - INFO - Executing raw TSQL (9,218,872 bytes)  - 
> session.execute(tsql) 
> > 2018-10-04 13:19:59,435 - INFO - Committing 
> > 
> > Raw MySQLdb API ~1 second: 
> > 2018-10-04 13:19:59,436 - INFO - Executing same raw TSQL directly via 
> MySQLdb API - engine.raw_connection(); cursor.execute(tsql) 
> > 2018-10-04 13:20:00,326 - INFO - Committing 
> > 
> > Bindparam formatted TSQL converted to text() executed with bindparams as 
> a dict ~12 seconds to build text() and ~19 seconds to execute, ~31 seconds 
> total: 
> > 2018-10-04 13:20:00,326 - INFO - Building raw TSQL - insert into x (a, 
> b, ...) values (:a1, :b1, ..), (:a2, :b2, ...) 
> > 2018-10-04 13:20:01,017 - INFO - Rebuilding Input data for TSQL 
> bindparams - {a1: d, b1: e, a2: f, b2: g} 
> > 2018-10-04 13:20:02,234 - INFO - Building text() object - sql.text(tsql) 
> > 2018-10-04 13:20:14,259 - INFO - Executing text() with bindparams - 
> session.execute(text, bindparams) 
> > 2018-10-04 13:20:33,552 - INFO - Committing 
> > 
> > Bindparam formatted TSQL NOT converted to text, executed with bindparams 
> as a dict: ~33 seconds 
> > 2018-10-04 13:20:33,552 - INFO - Executing raw TSQL with bindparams - 
> session.execute(tsql, bindparams) 
> > 2018-10-04 13:21:06,307 - INFO - Committing 
> > 
> > Using a sqlalchemy class table insert() with using .values() ~34 
> seconds: 
> > 2018-10-04 13:21:06,311 - INFO - Executing sqlalchemy insert w/ 
> bindparams as values - 
> session.execute(__table__.insert().values(bindparams)) 
> > 2018-10-04 13:21:40,808 - INFO - Committing 
> > 
> > Using a sqlalchemy class table insert() with passing bindparams to 
> session.execute() ~8 seconds: 
> > 2018-10-04 13:21:40,809 - INFO - Executing sqlalchemy insert() with 
> bindparams as execute param - session.execute(__table__.insert(), 
> bindparams) 
> > 2018-10-04 13:21:48,084 - INFO - Committing 
> > 
> > Obviously the raw db api will be the fastest as there's little to no 
> processing or overhead; I would think session.execute(str) would come 
> closer to matching that, though it's not too far off. 
>
> A Core insert should be only  slightly slower than the raw DBAPI, 
> however, if you're inserting lots of rows, you need to be using 
> executemany style, that is, a single execute() call with a list of 
> parameters.   There are no code examples here so I don't know what 
> you're actually doing, if you are running session.execute() itself 10K 
> times, that's the wrong way to do it.  Check out 
>
> https://docs.sqlalchemy.org/en/latest/_modules/examples/performance/bulk_inserts.html
>  
> for usage examples how to bulk-insert for different APIs including 
> Core (test_core_insert specifically).     Additionally, I don't know 
> if you are using bound parameters with your raw MySQL code or not, and 
> if so, is there some unicode processing going on in the SQLAlchemy 
> version that might be slowing things down. 
>
> A complete code example like those you see in 
>
> https://docs.sqlalchemy.org/en/latest/_modules/examples/performance/bulk_inserts.html
>  
> will allow me to see what you are doing. 
>
>
> > 
> > Having sqlalchemy.text(str) take 12 seconds to build seems rather 
> excessive, though I assume that's also the hang in the 
> execute(__table__.insert().values(bindparams)) case. However, why would 
> execute(str, bindparams) do this? Does it go through text() as well if 
> arg0 is a str? 
> > 
> > Now the oddest part of all: doing 10,550 individual inserts in the last 
> example is roughly 26 seconds faster than a single statement with 10,550 
> sets of values. This just does not make transactional sense. 
> > 
> > Is there something I can be doing better while utilizing core, or am I 
> over-estimating how sqlalchemy will perform with datasets like this? 
> > 
> > 
> > Hopefully this is enough detail, but I can provide more upon request. 
> > 
> > Thank you! 
> > 
>

-- 
SQLAlchemy - 
The Python SQL Toolkit and Object Relational Mapper

http://www.sqlalchemy.org/

To post example code, please provide an MCVE: Minimal, Complete, and Verifiable 
Example.  See  http://stackoverflow.com/help/mcve for a full description.
--- 
You received this message because you are subscribed to the Google Groups 
"sqlalchemy" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to sqlalchemy+unsubscr...@googlegroups.com.
To post to this group, send email to sqlalchemy@googlegroups.com.
Visit this group at https://groups.google.com/group/sqlalchemy.
For more options, visit https://groups.google.com/d/optout.
