[sqlalchemy] Re: relative insert performance

2007-07-11 Thread Ants Aasma

On Jul 11, 9:02 pm, Justus Pendleton [EMAIL PROTECTED] wrote:
 I've noticed some large (10x) performance differences between sqlalchemy (no
 ORM) and plain DB API when using sqlite and was wondering if that is
 something expected even when I'm (trying to) avoid the ORM stuff or if I'm
 just doing something wrong.

Looking at 
http://www.sqlalchemy.org/trac/attachment/wiki/ProfilingResults/sqla.insert.prof.png
this is in the ballpark of whats expected. You can about double your
performance by compiling the insert and using the result instead of
the insert clause. And if that is not enough you can buffer insertable
data into a list and execute many rows at a time, this should
dramatically decrease SQLAlchemy overhead.

Ants


--~--~-~--~~~---~--~~
You received this message because you are subscribed to the Google Groups 
sqlalchemy group.
To post to this group, send email to sqlalchemy@googlegroups.com
To unsubscribe from this group, send email to [EMAIL PROTECTED]
For more options, visit this group at 
http://groups.google.com/group/sqlalchemy?hl=en
-~--~~~~--~~--~--~---



[sqlalchemy] Re: relative insert performance

2007-07-11 Thread Michael Bayer


On Jul 11, 2007, at 2:02 PM, Justus Pendleton wrote:


 I've noticed some large (10x) performance differences between  
 sqlalchemy (no
 ORM) and plain DB API when using sqlite and was wondering if that is
 something expected even when I'm (trying to) avoid the ORM stuff or  
 if I'm
 just doing something wrong.

 We have lots of CSV files and I wanted to turn each file into a  
 sqlite db to
 make another project easier. I wrote an importer using the raw sqlite3
 dbapi and then again using sqlalchemy (no ORM I hope). All the  
 importer
 does is loop over the lines of the file and insert rows into the  
 database.
 The raw dbapi version can insert ~5000-7000 rows per second whereas  
 the
 sqlalchemy version does about ~300-500 rows per second.

 When I turned on metadata.engine.echo I didn't see sqlalchemy doing  
 anything
 beyond what I expected it to; just a bunch of inserts followed by a  
 commit
 when I tell it to.

 Is this performance disparity expected?

you should definitely call it via executemany() semantics, where you  
issue just a single execute() call (or a small set of them), each one  
containing a list of parameter sets to be inserted.  the form is the  
last example here:  http://www.sqlalchemy.org/docs/ 
sqlconstruction.html#sql_insert . using this method, all of the  
overhead of compiling and executing within SQLAlchemy is cut down to  
a negligble amount and the work is placed almost totally within the  
DBAPI.  you also optimize the operation on the database side, as  
various databases can pre-prepare a statement handle that is used  
repeatedly.

the performance overhead by calling individual executes() is that of  
compiling the statement into a string each time, creating an  
execution context,  doing some guesswork on the given bind parameters  
and checking for defaults, and then executing.  a lot of automatic  
stuff is done for you.  like ants said, you can reduce the compile  
overhead by compiling the statement ahead of time:

statement = table.insert().compile()



--~--~-~--~~~---~--~~
You received this message because you are subscribed to the Google Groups 
sqlalchemy group.
To post to this group, send email to sqlalchemy@googlegroups.com
To unsubscribe from this group, send email to [EMAIL PROTECTED]
For more options, visit this group at 
http://groups.google.com/group/sqlalchemy?hl=en
-~--~~~~--~~--~--~---