Hi all, Having some very and major performance issues using sqlalchemy to insert data, been trying many different ways but none seem to hold even a close candle to raw MySQLdb connections
Versions: sqlalchemy: 1.2.9 MySQLdb: 1.3.13 (bound with mysql+mysqldb connector) MySQL: 14.14 Python: 3.6.5 Ubuntu 18.04 x86_64 Server: 8GB ram 2 x Intel(R) Xeon(R) CPU X5650 @ 2.67GHz threads running in VMWare ESXi Connected to localhost Base data info: 2018-10-04 13:19:56,519 - INFO - Executing for 10,550 inserts, 89 columns, approx 6,252,460 bytes of raw data... Pre-formatted session execute ~2 seconds: 2018-10-04 13:19:56,520 - INFO - Building raw TSQL including all escaped value()s - insert into x (a,b, ...) values (escape_string(c), escape_string(d), ..) (escape_string(e), escape_string(f), ...) 2018-10-04 13:19:57,640 - INFO - Executing raw TSQL (9,218,872 bytes) - session.execute(tsql) 2018-10-04 13:19:59,435 - INFO - Committing Raw MySQLdb API ~1 second: 2018-10-04 13:19:59,436 - INFO - Executing same raw TSQL directly via MySQLdb API - engine.raw_connection(); cursor.execute(tsql) 2018-10-04 13:20:00,326 - INFO - Committing Bindparam formatted TSQL converted to text() executed with bindparams as a dict ~12 seconds to build text() and ~19 seconds to execute, ~31 seconds total: 2018-10-04 13:20:00,326 - INFO - Building raw TSQL - insert into x (a, b, ...) values (:a1, :b1, ..), (:a2, :b2, ...) 2018-10-04 13:20:01,017 - INFO - Rebuilding Input data for TSQL bindparams - {a1: d, b1: e, a2: f, b2: g} 2018-10-04 13:20:02,234 - INFO - Building text() object - sql.text(tsql) 2018-10-04 13:20:14,259 - INFO - Executing text() with bindparams - session.execute(text, bindparams) 2018-10-04 13:20:33,552 - INFO - Committing Bindparam formatted TSQL NOT converted to text, executed with bindparams as a dict: ~33 seconds 2018-10-04 13:20:33,552 - INFO - Executing raw TSQL with bindparams - session.execute(tsql, bindparams) 2018-10-04 13:21:06,307 - INFO - Committing Using a sqlalchemy class table insert() with using .values() ~34 seconds: 2018-10-04 13:21:06,311 - INFO - Executing sqlalchemy insert w/ bindparams as values - session.execute(__table__.insert().values(bindparams)) 2018-10-04 13:21:40,808 - INFO - Committing Using a sqlalchemy class table insert() with passing bindparams to session.execute() ~8 seconds: 2018-10-04 13:21:40,809 - INFO - Executing sqlalchemy insert() with bindparams as execute param - session.execute(__table__.insert(), bindparams) 2018-10-04 13:21:48,084 - INFO - Committing Obviously the raw db api will be the fastest as there's little to no processing or overhead, I would think session.execute(str) would come closer to matching that, but its not too far off. Having sqlalchemy.text(str) take 12 seconds to execute seems rather excessive, but assuming that's the hang with the other two execute(__table__.insert().values(bindparams)) statement too, however why would excute(str, bindparams) do this? Does it shadow text() as well if arg0 is a str? Now the oddest part of all, doing 10,550 individual inserts in the last example is roughly 26 seconds faster than a single transaction with 10,550 sets of values. This just does not make transactional sense. Is there something I can be doing better while utilizing core, or am I over-expecting how sqlalchemy to perform with datasets like this? Hopefully this is enough detail, but I can provide more upon request. Thank you! -- SQLAlchemy - The Python SQL Toolkit and Object Relational Mapper http://www.sqlalchemy.org/ To post example code, please provide an MCVE: Minimal, Complete, and Verifiable Example. See http://stackoverflow.com/help/mcve for a full description. --- You received this message because you are subscribed to the Google Groups "sqlalchemy" group. To unsubscribe from this group and stop receiving emails from it, send an email to sqlalchemy+unsubscr...@googlegroups.com. To post to this group, send email to sqlalchemy@googlegroups.com. Visit this group at https://groups.google.com/group/sqlalchemy. For more options, visit https://groups.google.com/d/optout.