sqlite3 performance problems only in python
hello, until now I used only small / simple databases in Python with sqlite3. Now I've a large and rather complex database. The most simple query (with just a result of 100 rows), takes about 70 seconds. And all that time is consumed in cursor.fetchall Using the same database in Delphi, using the same query, takes less than 5 seconds (including displaying the full table in a grid). Are there in Python faster ways to get the query results ? Would it be faster if I used an ODBC coupling and PyODBC to interface the database ? thanks, Stef Mientki -- http://mail.python.org/mailman/listinfo/python-list
Re: sqlite3 performance problems only in python
until now I used only small / simple databases in Python with sqlite3. Now I've a large and rather complex database. The most simple query (with just a result of 100 rows), takes about 70 seconds. And all that time is consumed in cursor.fetchall Using the same database in Delphi, using the same query, takes less than 5 seconds (including displaying the full table in a grid). While it may seem obvious, are you doing anything time-consuming with those results? Or have you tested just doing the fetchall() without doing any further processing? I'm curious on the timing of sql = ... start = time() cursor.execute(sql) rows = cursor.fetchall() end = time() print end-start with no other processing. I regularly write sql that's fairly complex and brings back somewhat large datasets (sometimes in sqlite), and have never experienced problems with simple quer[ies] (with just a result of 100 rows taking such extrordinary times The answer from the above code will help determine whether it's the sqlite portion that's crazy (and might need some well-placed index statements; though if your Delphi code is fine, I suspect not), or if it's your application code that goes off into left field with the resulting data. -tkc -- http://mail.python.org/mailman/listinfo/python-list
Re: sqlite3 performance problems only in python
Tim Chase wrote: until now I used only small / simple databases in Python with sqlite3. Now I've a large and rather complex database. The most simple query (with just a result of 100 rows), takes about 70 seconds. And all that time is consumed in cursor.fetchall Using the same database in Delphi, using the same query, takes less than 5 seconds (including displaying the full table in a grid). While it may seem obvious, are you doing anything time-consuming with those results? Or have you tested just doing the fetchall() without doing any further processing? I'm curious on the timing of sql = ... start = time() cursor.execute(sql) rows = cursor.fetchall() end = time() print end-start No this is exactly what I did, I timed the execute and fetchall seperatly: execute: 125 msec fetchall: 71000 msec (returning 100 rows and 25 columns) pysqlite: version 2.3.2 btw, I don't know if it's of any importance, the SQL-statement I perform is select OPNAMEN.*, NAME, NAME_, SCORES.SCORE, PATIENT.* from OPNAMEN inner join POID_VLID on OPNAMEN.POID= POID_VLID.POID inner join VRAAGLST on VRAAGLST.VLID = POID_VLID.VLID inner join VLID_SSID on VRAAGLST.VLID = VLID_SSID.VLID inner join SUBSCHAAL_GEGEVENS on SUBSCHAAL_GEGEVENS.SSID = VLID_SSID.SSID inner join POID_SSID_SCID on ( OPNAMEN.POID= POID_SSID_SCID.POID ) and ( SUBSCHAAL_GEGEVENS.SSID = POID_SSID_SCID.SSID ) inner join SCORES on SCORES.SCID = POID_SSID_SCID.SCID inner join PID_POID on OPNAMEN.POID= PID_POID.POID inner join PATIENTon PATIENT.PID = PID_POID.PID where substr ( lower( NAME) , 1, 6) = 'cis20r' and lower ( NAME_ ) = 'fatigue' and TEST_COUNT in (3,4) and DATETIME 39814.0 and SCORE 30 cheers, Stef with no other processing. I regularly write sql that's fairly complex and brings back somewhat large datasets (sometimes in sqlite), and have never experienced problems with simple quer[ies] (with just a result of 100 rows taking such extrordinary times The answer from the above code will help determine whether it's the sqlite portion that's crazy (and might need some well-placed index statements; though if your Delphi code is fine, I suspect not), or if it's your application code that goes off into left field with the resulting data. -tkc -- http://mail.python.org/mailman/listinfo/python-list
Re: sqlite3 performance problems only in python
On Thu, Jul 23, 2009 at 9:02 AM, Stef Mientkistef.mien...@gmail.com wrote: btw, I don't know if it's of any importance, the SQL-statement I perform is select OPNAMEN.*, NAME, NAME_, SCORES.SCORE, PATIENT.* from OPNAMEN inner join POID_VLID on OPNAMEN.POID = POID_VLID.POID inner join VRAAGLST on VRAAGLST.VLID = POID_VLID.VLID inner join VLID_SSID on VRAAGLST.VLID = VLID_SSID.VLID inner join SUBSCHAAL_GEGEVENS on SUBSCHAAL_GEGEVENS.SSID = VLID_SSID.SSID inner join POID_SSID_SCID on ( OPNAMEN.POID = POID_SSID_SCID.POID ) and ( SUBSCHAAL_GEGEVENS.SSID = POID_SSID_SCID.SSID ) inner join SCORES on SCORES.SCID = POID_SSID_SCID.SCID inner join PID_POID on OPNAMEN.POID = PID_POID.POID inner join PATIENT on PATIENT.PID = PID_POID.PID where substr ( lower( NAME) , 1, 6) = 'cis20r' and lower ( NAME_ ) = 'fatigue' and TEST_COUNT in (3,4) and DATETIME 39814.0 and SCORE 30 Warning: I suck at SQL and hate it with a passion... By using lower() on the left side of the where expressions I believe that you are table scanning. So it is not the size of the data returned, but the size of the data that needs to be scanned. -- David blog: http://www.traceback.org twitter: http://twitter.com/dstanek -- http://mail.python.org/mailman/listinfo/python-list
Re: sqlite3 performance problems only in python
Stef Mientki stef.mien...@gmail.com (SM) wrote: SM btw, I don't know if it's of any importance, the SQL-statement I perform is SM select OPNAMEN.*, NAME, NAME_, SCORES.SCORE, PATIENT.* SM from OPNAMEN SMinner join POID_VLID on OPNAMEN.POID= POID_VLID.POID SMinner join VRAAGLST on VRAAGLST.VLID = POID_VLID.VLID SMinner join VLID_SSID on VRAAGLST.VLID = VLID_SSID.VLID SMinner join SUBSCHAAL_GEGEVENS on SUBSCHAAL_GEGEVENS.SSID = VLID_SSID.SSID SMinner join POID_SSID_SCID on ( OPNAMEN.POID= SM POID_SSID_SCID.POID ) and SM ( SUBSCHAAL_GEGEVENS.SSID = SM POID_SSID_SCID.SSID ) SMinner join SCORES on SCORES.SCID = SM POID_SSID_SCID.SCID SMinner join PID_POID on OPNAMEN.POID= PID_POID.POID SMinner join PATIENTon PATIENT.PID = PID_POID.PID SM where substr ( lower( NAME) , 1, 6) = 'cis20r' SMand lower ( NAME_ ) = 'fatigue' SMand TEST_COUNT in (3,4) SMand DATETIME 39814.0 SMand SCORE 30 1) Do you have indices on the join fields? 2) Look at the ANALYZE command 3) Look at the EXPLAIN command -- Piet van Oostrum p...@cs.uu.nl URL: http://pietvanoostrum.com [PGP 8DAE142BE17999C4] Private email: p...@vanoostrum.org -- http://mail.python.org/mailman/listinfo/python-list
Re: sqlite3 performance problems only in python
Piet van Oostrum wrote: Stef Mientki stef.mien...@gmail.com (SM) wrote: SM btw, I don't know if it's of any importance, the SQL-statement I perform is SM select OPNAMEN.*, NAME, NAME_, SCORES.SCORE, PATIENT.* SM from OPNAMEN SMinner join POID_VLID on OPNAMEN.POID= POID_VLID.POID SMinner join VRAAGLST on VRAAGLST.VLID = POID_VLID.VLID SMinner join VLID_SSID on VRAAGLST.VLID = VLID_SSID.VLID SMinner join SUBSCHAAL_GEGEVENS on SUBSCHAAL_GEGEVENS.SSID = VLID_SSID.SSID SMinner join POID_SSID_SCID on ( OPNAMEN.POID= SM POID_SSID_SCID.POID ) and SM ( SUBSCHAAL_GEGEVENS.SSID = SM POID_SSID_SCID.SSID ) SMinner join SCORES on SCORES.SCID = SM POID_SSID_SCID.SCID SMinner join PID_POID on OPNAMEN.POID= PID_POID.POID SMinner join PATIENTon PATIENT.PID = PID_POID.PID SM where substr ( lower( NAME) , 1, 6) = 'cis20r' SMand lower ( NAME_ ) = 'fatigue' SMand TEST_COUNT in (3,4) SMand DATETIME 39814.0 SMand SCORE 30 1) Do you have indices on the join fields? well I'm happily surprised, you came up with this suggestion - I thought that sqlite created indexes on all primairy key and unique fields - but after explicitly creating the indices, a gained a speed of about a factor 10 After checking the database creation, it seemed I forgot to make these fields the primary key so thanks very much. I gained another factor of 10 speed by updating to version 2.5.5 of pysqlite. cheers, Stef 2) Look at the ANALYZE command 3) Look at the EXPLAIN command -- http://mail.python.org/mailman/listinfo/python-list
Re: sqlite3 performance problems only in python
David Stanek wrote: On Thu, Jul 23, 2009 at 9:02 AM, Stef Mientkistef.mien...@gmail.com wrote: btw, I don't know if it's of any importance, the SQL-statement I perform is select OPNAMEN.*, NAME, NAME_, SCORES.SCORE, PATIENT.* from OPNAMEN inner join POID_VLID on OPNAMEN.POID= POID_VLID.POID inner join VRAAGLST on VRAAGLST.VLID = POID_VLID.VLID inner join VLID_SSID on VRAAGLST.VLID = VLID_SSID.VLID inner join SUBSCHAAL_GEGEVENS on SUBSCHAAL_GEGEVENS.SSID = VLID_SSID.SSID inner join POID_SSID_SCID on ( OPNAMEN.POID= POID_SSID_SCID.POID ) and ( SUBSCHAAL_GEGEVENS.SSID = POID_SSID_SCID.SSID ) inner join SCORES on SCORES.SCID = POID_SSID_SCID.SCID inner join PID_POID on OPNAMEN.POID= PID_POID.POID inner join PATIENTon PATIENT.PID = PID_POID.PID where substr ( lower( NAME) , 1, 6) = 'cis20r' and lower ( NAME_ ) = 'fatigue' and TEST_COUNT in (3,4) and DATETIME 39814.0 and SCORE 30 Warning: I suck at SQL and hate it with a passion... +1, but I couldn't find anything better. I guess you have some better suggestions ? (I looked at Dee, but from my first view, it looks a bit premature) thanks, Stef -- http://mail.python.org/mailman/listinfo/python-list
Re: sqlite3 performance problems only in python
David Stanek dsta...@dstanek.com wrote: On Thu, Jul 23, 2009 at 9:02 AM, Stef Mientkistef.mien...@gmail.com wrote: btw, I don't know if it's of any importance, the SQL-statement I perform is select OPNAMEN.*, NAME, NAME_, SCORES.SCORE, PATIENT.* from OPNAMEN inner join POID_VLID on OPNAMEN.POID = POID_VLID.POID inner join VRAAGLST on VRAAGLST.VLID = POID_VLID.VLID inner join VLID_SSID on VRAAGLST.VLID = VLID_SSID.VLID inner join SUBSCHAAL_GEGEVENS on SUBSCHAAL_GEGEVENS.SSID = VLID_SSID.SSID inner join POID_SSID_SCID on ( OPNAMEN.POID = POID_SSID_SCID.POID ) and ( SUBSCHAAL_GEGEVENS.SSID = POID_SSID_SCID.SSID ) inner join SCORES on SCORES.SCID = POID_SSID_SCID.SCID inner join PID_POID on OPNAMEN.POID = PID_POID.POID inner join PATIENT on PATIENT.PID = PID_POID.PID where substr ( lower( NAME) , 1, 6) = 'cis20r' and lower ( NAME_ ) = 'fatigue' and TEST_COUNT in (3,4) and DATETIME 39814.0 and SCORE 30 Warning: I suck at SQL and hate it with a passion... By using lower() on the left side of the where expressions I believe that you are table scanning. So it is not the size of the data returned, but the size of the data that needs to be scanned. In all the databases I've used, the like operator has been case insensitive, so if that is the problem you could use NAME like '%cis20r%' -- not quite the same, but close! and NAME_ like 'fatigue' instead which might be quicker. Or not ;-) -- Nick Craig-Wood n...@craig-wood.com -- http://www.craig-wood.com/nick -- http://mail.python.org/mailman/listinfo/python-list
Re: sqlite3 performance problems only in python
On Thu, Jul 23, 2009 at 4:29 PM, Nick Craig-Woodn...@craig-wood.com wrote: In all the databases I've used, the like operator has been case insensitive, so if that is the problem you could use This is not true in all databases! Many times, this is something that is configurable when setting up the database server. I'm not sure what the defaults are, but here at work we have DB/2 and Oracle servers that are case sensitive, and at least one MSSQL server that is not. -- Jerry -- http://mail.python.org/mailman/listinfo/python-list
Re: sqlite3 performance problems only in python
On Jul 23, 3:58 pm, Stef Mientki stef.mien...@gmail.com wrote: Piet van Oostrum wrote: Stef Mientki stef.mien...@gmail.com (SM) wrote: SM btw, I don't know if it's of any importance, the SQL-statement I perform is SM select OPNAMEN.*, NAME, NAME_, SCORES.SCORE, PATIENT.* SM from OPNAMEN SM inner join POID_VLID on OPNAMEN.POID = POID_VLID.POID SM inner join VRAAGLST on VRAAGLST.VLID = POID_VLID.VLID SM inner join VLID_SSID on VRAAGLST.VLID = VLID_SSID.VLID SM inner join SUBSCHAAL_GEGEVENS on SUBSCHAAL_GEGEVENS.SSID = VLID_SSID.SSID SM inner join POID_SSID_SCID on ( OPNAMEN.POID = SM POID_SSID_SCID.POID ) and SM ( SUBSCHAAL_GEGEVENS.SSID = SM POID_SSID_SCID.SSID ) SM inner join SCORES on SCORES.SCID = SM POID_SSID_SCID.SCID SM inner join PID_POID on OPNAMEN.POID = PID_POID.POID SM inner join PATIENT on PATIENT.PID = PID_POID.PID SM where substr ( lower( NAME) , 1, 6) = 'cis20r' SM and lower ( NAME_ ) = 'fatigue' SM and TEST_COUNT in (3,4) SM and DATETIME 39814.0 SM and SCORE 30 1) Do you have indices on the join fields? well I'm happily surprised, you came up with this suggestion - I thought that sqlite created indexes on all primairy key and unique fields - but after explicitly creating the indices, a gained a speed of about a factor 10 After checking the database creation, it seemed I forgot to make these fields the primary key so thanks very much. I gained another factor of 10 speed by updating to version 2.5.5 of pysqlite. cheers, Stef 2) Look at the ANALYZE command 3) Look at the EXPLAIN command You might want to consult the SQLite list for questions like this. Why do you use pysqlite? I just import sqlite3 in Python 2.5. What is the advantage of pysqlite? Che -- http://mail.python.org/mailman/listinfo/python-list
Re: sqlite3 performance problems only in python
Che M wrote: On Jul 23, 3:58 pm, Stef Mientki stef.mien...@gmail.com wrote: Piet van Oostrum wrote: Stef Mientki stef.mien...@gmail.com (SM) wrote: SM btw, I don't know if it's of any importance, the SQL-statement I perform is SM select OPNAMEN.*, NAME, NAME_, SCORES.SCORE, PATIENT.* SM from OPNAMEN SMinner join POID_VLID on OPNAMEN.POID= POID_VLID.POID SMinner join VRAAGLST on VRAAGLST.VLID = POID_VLID.VLID SMinner join VLID_SSID on VRAAGLST.VLID = VLID_SSID.VLID SMinner join SUBSCHAAL_GEGEVENS on SUBSCHAAL_GEGEVENS.SSID = VLID_SSID.SSID SMinner join POID_SSID_SCID on ( OPNAMEN.POID= SM POID_SSID_SCID.POID ) and SM ( SUBSCHAAL_GEGEVENS.SSID = SM POID_SSID_SCID.SSID ) SMinner join SCORES on SCORES.SCID = SM POID_SSID_SCID.SCID SMinner join PID_POID on OPNAMEN.POID= PID_POID.POID SMinner join PATIENTon PATIENT.PID = PID_POID.PID SM where substr ( lower( NAME) , 1, 6) = 'cis20r' SMand lower ( NAME_ ) = 'fatigue' SMand TEST_COUNT in (3,4) SMand DATETIME 39814.0 SMand SCORE 30 1) Do you have indices on the join fields? well I'm happily surprised, you came up with this suggestion - I thought that sqlite created indexes on all primairy key and unique fields - but after explicitly creating the indices, a gained a speed of about a factor 10 After checking the database creation, it seemed I forgot to make these fields the primary key so thanks very much. I gained another factor of 10 speed by updating to version 2.5.5 of pysqlite. cheers, Stef 2) Look at the ANALYZE command 3) Look at the EXPLAIN command You might want to consult the SQLite list for questions like this. thanks, but because the same SQL-statement in Delphi performed well, I thought it was a problem with the Python implementation. Why do you use pysqlite? I just import sqlite3 in Python 2.5. What is the advantage of pysqlite? it's 10 .. 15 times faster then sqlite3 delivered with pyton 2.5. AFAIK it's nothing different, just a newer version. cheers, Stef Che -- http://mail.python.org/mailman/listinfo/python-list
Re: sqlite3 performance problems only in python
On Thu, Jul 23, 2009 at 6:29 PM, Stef Mientkistef.mien...@gmail.com wrote: but because the same SQL-statement in Delphi performed well, I thought it was a problem with the Python implementation. Same SQL, but were you also using Sqlite in Delphi? -- David blog: http://www.traceback.org twitter: http://twitter.com/dstanek -- http://mail.python.org/mailman/listinfo/python-list