On Tue, Jan 2, 2018 at 6:46 AM, Jevgenij Kusakovskij <kush...@gmail.com> wrote: > I would like to send a large pandas.DataFrame to a remote server running MS > SQL. I am using pandas-0.20.3, pyODBC-4.0.21 and sqlalchemy-1.1.13. > > My first attempt of tackling this problem can be reduced to following code: > > import sqlalchemy as sa > > > > engine = sa.create_engine("mssql+pyodbc:///?odbc_connect=%s" % cnxn_str) > data_frame.to_sql(table_name, engine, index=False) > > > Simple, but very slow... Took about 2.5 minutes to insert 1000 rows.
that's really weird, 1000 rows is very few. I'm pretty sure if I ran 1000 rows over pyodbc into SQL server here it would take about 300 ms tops. 2.5 minutes is more like you're trying to send 800K rows. that alone is kind of concerning, and if pandas is not sending all the rows to connection.execute() at once and is instead running one row at a time, then the fast_executemany flag will have no effect for you. > > Using the following code, that does not involve SQLAlchemy, the same task is > performed in less than a second: > > import pyodbc as pdb > > list_of_tuples = convert_df(data_frame) > > connection = pdb.connect(cnxn_str) > > cursor = self.connection.cursor() > cursor.fast_executemany = True > cursor.executemany(sql_statement, list_of_tuples) > connection.commit() > > cursor.close() > connection.close() > > > Is there a way to flip the fast_executemany switch on when using SQLAlchemy? easiest would be to use cursor execution events: http://docs.sqlalchemy.org/en/latest/core/events.html?highlight=before_cursor_execute#sqlalchemy.events.ConnectionEvents.before_cursor_execute you get the cursor right there, set the flag. You can set a custom execution_option: conn = conn.execution_options(pyodbc_fast_execute=True) then in your event you can look for it: @event.listens_for(SomeEngine, 'before_cursor_execute') def receive_before_cursor_execute(conn, cursor, statement, parameters, context, executemany): if context.execution_options.get('pyodbc_fast_execute', False): cursor.fast_executemany = True but....2.5 minutes for 1000 rows is much more wrong than that, you should figure out what's happening there. > > > -- > SQLAlchemy - > The Python SQL Toolkit and Object Relational Mapper > > http://www.sqlalchemy.org/ > > To post example code, please provide an MCVE: Minimal, Complete, and > Verifiable Example. See http://stackoverflow.com/help/mcve for a full > description. > --- > You received this message because you are subscribed to the Google Groups > "sqlalchemy" group. > To unsubscribe from this group and stop receiving emails from it, send an > email to sqlalchemy+unsubscr...@googlegroups.com. > To post to this group, send email to sqlalchemy@googlegroups.com. > Visit this group at https://groups.google.com/group/sqlalchemy. > For more options, visit https://groups.google.com/d/optout. -- SQLAlchemy - The Python SQL Toolkit and Object Relational Mapper http://www.sqlalchemy.org/ To post example code, please provide an MCVE: Minimal, Complete, and Verifiable Example. See http://stackoverflow.com/help/mcve for a full description. --- You received this message because you are subscribed to the Google Groups "sqlalchemy" group. To unsubscribe from this group and stop receiving emails from it, send an email to sqlalchemy+unsubscr...@googlegroups.com. To post to this group, send email to sqlalchemy@googlegroups.com. Visit this group at https://groups.google.com/group/sqlalchemy. For more options, visit https://groups.google.com/d/optout.