Hi all,
another developer and me are working to use SQLObject into our
project (IMDbPY [1]) to obtain db-independence.

We need to insert _a lot_ of rows (about 30mln and counting, after
a lot of data processing); in some tables we also have to set
the "id" column, while in other tables we don't need to, but that
doesn't really change the problem.

Our problem: poor performances at insert time.
This is especially true with database that supports "multiple data
insert" (i.e.: the executemany() method doesn't need to call
execute() for every single set of data, but it can build a
single SQL statement).

[first try]
  statementList = []
  for everySetOfData:
      insertObj = sqlbuilder.Insert(Table.sqlmeta.table, values={...})
      statementList.append(connection.sqlrepr(insertObj))
  while statementList:
      connection.query(statementList.pop())

Problem: executing one query for every single set of data is really
slow.


[second try]
same as the described above, but the various sets of data are stored
in a list of dictionary, and at the end this list is passed
to the sqlbuilder.Insert() class using the valueList argument.

Problem: connection.sqlrepr(sqlbuilder.Insert(..., valueList=[{...}, {...}]))
will return a SQL statement like:
   "INSERT INTO test (col1, col2) VALUES ('a1', 'b1'), ('a2', 'b2')"

even if "connection" represents a PostgreSQL database connection;
this seems wrong, because PostgreSQL doesn't support this SQL syntax.


[third try]
connection = connectionForURI(uri)
dbObject = connection.getConnection()
cursor = dbObject.cursor()

and then use cursor.executemany() over lists of dictionary containing
the data.

Problem: in this way SQLObject is used only to abstract the connection
and tables/columns names; moreover the parameter's style of the
executemany() method is not guaranteed to be db-independent (see
PEP 249 http://www.python.org/peps/pep-0249.html)


Have you any hints about how we can handle such a huge set of data,
with good performances?



+++
[1] http://imdbpy.sourceforge.net/
-- 
Davide Alberani <[EMAIL PROTECTED]> [PGP KeyID: 0x465BFD47]
http://erlug.linux.it/~da/


-------------------------------------------------------
Using Tomcat but need to do more? Need to support web services, security?
Get stuff done quickly with pre-integrated technology to make your job easier
Download IBM WebSphere Application Server v.1.0.1 based on Apache Geronimo
http://sel.as-us.falkag.net/sel?cmd=lnk&kid=120709&bid=263057&dat=121642
_______________________________________________
sqlobject-discuss mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/sqlobject-discuss

Reply via email to