Hi all,
another developer and me are working to use SQLObject into our
project (IMDbPY [1]) to obtain db-independence.
We need to insert _a lot_ of rows (about 30mln and counting, after
a lot of data processing); in some tables we also have to set
the "id" column, while in other tables we don't need to, but that
doesn't really change the problem.
Our problem: poor performances at insert time.
This is especially true with database that supports "multiple data
insert" (i.e.: the executemany() method doesn't need to call
execute() for every single set of data, but it can build a
single SQL statement).
[first try]
statementList = []
for everySetOfData:
insertObj = sqlbuilder.Insert(Table.sqlmeta.table, values={...})
statementList.append(connection.sqlrepr(insertObj))
while statementList:
connection.query(statementList.pop())
Problem: executing one query for every single set of data is really
slow.
[second try]
same as the described above, but the various sets of data are stored
in a list of dictionary, and at the end this list is passed
to the sqlbuilder.Insert() class using the valueList argument.
Problem: connection.sqlrepr(sqlbuilder.Insert(..., valueList=[{...}, {...}]))
will return a SQL statement like:
"INSERT INTO test (col1, col2) VALUES ('a1', 'b1'), ('a2', 'b2')"
even if "connection" represents a PostgreSQL database connection;
this seems wrong, because PostgreSQL doesn't support this SQL syntax.
[third try]
connection = connectionForURI(uri)
dbObject = connection.getConnection()
cursor = dbObject.cursor()
and then use cursor.executemany() over lists of dictionary containing
the data.
Problem: in this way SQLObject is used only to abstract the connection
and tables/columns names; moreover the parameter's style of the
executemany() method is not guaranteed to be db-independent (see
PEP 249 http://www.python.org/peps/pep-0249.html)
Have you any hints about how we can handle such a huge set of data,
with good performances?
+++
[1] http://imdbpy.sourceforge.net/
--
Davide Alberani <[EMAIL PROTECTED]> [PGP KeyID: 0x465BFD47]
http://erlug.linux.it/~da/
-------------------------------------------------------
Using Tomcat but need to do more? Need to support web services, security?
Get stuff done quickly with pre-integrated technology to make your job easier
Download IBM WebSphere Application Server v.1.0.1 based on Apache Geronimo
http://sel.as-us.falkag.net/sel?cmd=lnk&kid=120709&bid=263057&dat=121642
_______________________________________________
sqlobject-discuss mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/sqlobject-discuss