I ran into an issue today where, even with the inserts batched inside a
single transaction, I was able to achieve no more than about 9,000
inserts/second to a PostgreSQL database (running on the same machine as
the test).

With everything else exactly the same, I was able to achieve over 50,000
inserts/second to SQLite.

Now, I won't argue the relative merits of each database, but this is a
big problem for PostgreSQL. I believe I have determined that the psycopg2
module is to blame, and that the bulk of the time was being spent on
IPC/RPC between client and server. Every insert in this test is identical
except for the values (same table and columns), but psycopg2 (or possibly
SQLAlchemy) was issuing an individual INSERT statement for every single
row. I was *not* using the ORM.
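
To illustrate what I mean at the driver level, here is a rough standalone
sketch using psycopg2 directly (the DSN, table, columns, and row count
are all made up for the example):

import time
import psycopg2

conn = psycopg2.connect("dbname=test")   # made-up DSN
cur = conn.cursor()
cur.execute("CREATE TEMP TABLE t (id integer, val text)")
rows = [(i, "value %d" % i) for i in range(10000)]

# What appears to be happening now: executemany() issues one INSERT
# statement (and one client/server round trip) per row, all inside a
# single transaction.
start = time.time()
cur.executemany("INSERT INTO t (id, val) VALUES (%s, %s)", rows)
conn.commit()
print("executemany:      %.2f s" % (time.time() - start))

# The same rows folded into a single INSERT ... VALUES (...), (...), ...
# statement, so the per-statement overhead is paid only once.
start = time.time()
values = ",".join(cur.mogrify("(%s, %s)", r).decode("utf-8") for r in rows)
cur.execute("INSERT INTO t (id, val) VALUES " + values)
conn.commit()
print("single statement: %.2f s" % (time.time() - start))

The point of the contrast is only that the second form pays the
per-statement overhead once instead of once per row.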

The actual SQLAlchemy code on my end was something like this:


row_values = build_a_bunch_of_dictionaries()   # one dict per row to insert
ins = table.insert()
t = conn.begin()
conn.execute(ins, row_values)   # executes the same INSERT once per dict
t.commit()

where row_values is (of course) a list of dictionaries.
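
Just to be explicit about the shape of the data (column names invented),
it is roughly:

row_values = [
    {"name": "widget-0", "value": 0},
    {"name": "widget-1", "value": 1},
    # ... and so on, one dict per row
]

As far as I can tell, SQLAlchemy hands a list like this straight to the
DBAPI cursor's executemany().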

What can be done here to improve the speed of bulk inserts? For
PostgreSQL to get walloped by a factor of 5 in this area is a big
bummer.

-- 
Jon
