sqlite3 performance problems only in python

2009-07-23 Thread Stef Mientki

hello,

until now I used only small / simple databases in Python with sqlite3.
Now I've a large and rather complex database.

The most simple query (with just a result of 100 rows),
takes about 70 seconds.
And all that time is consumed in cursor.fetchall

Using the same database in Delphi,
using the same query,
takes less than 5 seconds (including displaying the full table in a grid).

Are there in Python faster ways to get the query results ?
Would it be faster if I used an ODBC coupling and PyODBC to interface 
the database ?


thanks,
Stef Mientki
--
http://mail.python.org/mailman/listinfo/python-list


Re: sqlite3 performance problems only in python

2009-07-23 Thread Tim Chase

until now I used only small / simple databases in Python with sqlite3.
Now I've a large and rather complex database.

The most simple query (with just a result of 100 rows),
takes about 70 seconds.
And all that time is consumed in cursor.fetchall

Using the same database in Delphi,
using the same query,
takes less than 5 seconds (including displaying the full table in a grid).


While it may seem obvious, are you doing anything time-consuming 
with those results?  Or have you tested just doing the fetchall() 
without doing any further processing?  I'm curious on the timing of


  sql = ...
  start = time()
  cursor.execute(sql)
  rows = cursor.fetchall()
  end = time()
  print end-start

with no other processing.  I regularly write sql that's fairly 
complex and brings back somewhat large datasets (sometimes in 
sqlite), and have never experienced problems with simple 
quer[ies] (with just a result of 100 rows taking such 
extrordinary times


The answer from the above code will help determine whether it's 
the sqlite portion that's crazy (and might need some well-placed 
index statements; though if your Delphi code is fine, I suspect 
not), or if it's your application code that goes off into left 
field with the resulting data.


-tkc




--
http://mail.python.org/mailman/listinfo/python-list


Re: sqlite3 performance problems only in python

2009-07-23 Thread Stef Mientki

Tim Chase wrote:

until now I used only small / simple databases in Python with sqlite3.
Now I've a large and rather complex database.

The most simple query (with just a result of 100 rows),
takes about 70 seconds.
And all that time is consumed in cursor.fetchall

Using the same database in Delphi,
using the same query,
takes less than 5 seconds (including displaying the full table in a 
grid).


While it may seem obvious, are you doing anything time-consuming with 
those results?  Or have you tested just doing the fetchall() without 
doing any further processing?  I'm curious on the timing of


  sql = ...
  start = time()
  cursor.execute(sql)
  rows = cursor.fetchall()
  end = time()
  print end-start

No this is exactly what I did,
I timed the execute and fetchall seperatly:
execute: 125 msec
fetchall: 71000 msec  (returning 100 rows and 25 columns)
pysqlite:  version 2.3.2

btw, I don't know if it's of any importance, the SQL-statement I perform is
select OPNAMEN.*, NAME, NAME_, SCORES.SCORE, PATIENT.*
 from OPNAMEN
   inner join POID_VLID  on OPNAMEN.POID= 
POID_VLID.POID
   inner join VRAAGLST   on VRAAGLST.VLID   = 
POID_VLID.VLID
   inner join VLID_SSID  on VRAAGLST.VLID   = 
VLID_SSID.VLID
   inner join SUBSCHAAL_GEGEVENS on SUBSCHAAL_GEGEVENS.SSID = 
VLID_SSID.SSID
   inner join POID_SSID_SCID on ( OPNAMEN.POID= 
POID_SSID_SCID.POID ) and
( SUBSCHAAL_GEGEVENS.SSID = 
POID_SSID_SCID.SSID )
   inner join SCORES on SCORES.SCID = 
POID_SSID_SCID.SCID

   inner join PID_POID   on OPNAMEN.POID= PID_POID.POID
   inner join PATIENTon PATIENT.PID = PID_POID.PID
 where substr ( lower( NAME) , 1, 6)  = 'cis20r'
   and lower ( NAME_ ) = 'fatigue'
   and TEST_COUNT in (3,4)
   and DATETIME  39814.0
   and SCORE  30

cheers,
Stef



with no other processing.  I regularly write sql that's fairly complex 
and brings back somewhat large datasets (sometimes in sqlite), and 
have never experienced problems with simple quer[ies] (with just a 
result of 100 rows taking such extrordinary times


The answer from the above code will help determine whether it's the 
sqlite portion that's crazy (and might need some well-placed index 
statements; though if your Delphi code is fine, I suspect not), or if 
it's your application code that goes off into left field with the 
resulting data.


-tkc






--
http://mail.python.org/mailman/listinfo/python-list


Re: sqlite3 performance problems only in python

2009-07-23 Thread David Stanek
On Thu, Jul 23, 2009 at 9:02 AM, Stef Mientkistef.mien...@gmail.com wrote:

 btw, I don't know if it's of any importance, the SQL-statement I perform is
 select OPNAMEN.*, NAME, NAME_, SCORES.SCORE, PATIENT.*
  from OPNAMEN
   inner join POID_VLID          on OPNAMEN.POID            = POID_VLID.POID
   inner join VRAAGLST           on VRAAGLST.VLID           = POID_VLID.VLID
   inner join VLID_SSID          on VRAAGLST.VLID           = VLID_SSID.VLID
   inner join SUBSCHAAL_GEGEVENS on SUBSCHAAL_GEGEVENS.SSID = VLID_SSID.SSID
   inner join POID_SSID_SCID     on ( OPNAMEN.POID            =
 POID_SSID_SCID.POID ) and
                                    ( SUBSCHAAL_GEGEVENS.SSID =
 POID_SSID_SCID.SSID )
   inner join SCORES             on SCORES.SCID             =
 POID_SSID_SCID.SCID
   inner join PID_POID           on OPNAMEN.POID            = PID_POID.POID
   inner join PATIENT            on PATIENT.PID             = PID_POID.PID
  where substr ( lower( NAME) , 1, 6)  = 'cis20r'
   and lower ( NAME_ ) = 'fatigue'
   and TEST_COUNT in (3,4)
   and DATETIME  39814.0
   and SCORE  30

Warning: I suck at SQL and hate it with a passion...

By using lower() on the left side of the where expressions I believe
that you are table scanning. So it is not the size of the data
returned, but the size of the data that needs to be scanned.

-- 
David
blog: http://www.traceback.org
twitter: http://twitter.com/dstanek
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: sqlite3 performance problems only in python

2009-07-23 Thread Piet van Oostrum
 Stef Mientki stef.mien...@gmail.com (SM) wrote:

SM btw, I don't know if it's of any importance, the SQL-statement I perform is
SM select OPNAMEN.*, NAME, NAME_, SCORES.SCORE, PATIENT.*
SM  from OPNAMEN
SMinner join POID_VLID  on OPNAMEN.POID= 
POID_VLID.POID
SMinner join VRAAGLST   on VRAAGLST.VLID   = 
POID_VLID.VLID
SMinner join VLID_SSID  on VRAAGLST.VLID   = 
VLID_SSID.VLID
SMinner join SUBSCHAAL_GEGEVENS on SUBSCHAAL_GEGEVENS.SSID = 
VLID_SSID.SSID
SMinner join POID_SSID_SCID on ( OPNAMEN.POID=
SM POID_SSID_SCID.POID ) and
SM ( SUBSCHAAL_GEGEVENS.SSID =
SM POID_SSID_SCID.SSID )
SMinner join SCORES on SCORES.SCID =
SM POID_SSID_SCID.SCID
SMinner join PID_POID   on OPNAMEN.POID= PID_POID.POID
SMinner join PATIENTon PATIENT.PID = PID_POID.PID
SM  where substr ( lower( NAME) , 1, 6)  = 'cis20r'
SMand lower ( NAME_ ) = 'fatigue'
SMand TEST_COUNT in (3,4)
SMand DATETIME  39814.0
SMand SCORE  30

1) Do you have indices on the join fields?
2) Look at the ANALYZE command
3) Look at the EXPLAIN command
-- 
Piet van Oostrum p...@cs.uu.nl
URL: http://pietvanoostrum.com [PGP 8DAE142BE17999C4]
Private email: p...@vanoostrum.org
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: sqlite3 performance problems only in python

2009-07-23 Thread Stef Mientki

Piet van Oostrum wrote:

Stef Mientki stef.mien...@gmail.com (SM) wrote:



  

SM btw, I don't know if it's of any importance, the SQL-statement I perform is
SM select OPNAMEN.*, NAME, NAME_, SCORES.SCORE, PATIENT.*
SM  from OPNAMEN
SMinner join POID_VLID  on OPNAMEN.POID= POID_VLID.POID
SMinner join VRAAGLST   on VRAAGLST.VLID   = POID_VLID.VLID
SMinner join VLID_SSID  on VRAAGLST.VLID   = VLID_SSID.VLID
SMinner join SUBSCHAAL_GEGEVENS on SUBSCHAAL_GEGEVENS.SSID = VLID_SSID.SSID
SMinner join POID_SSID_SCID on ( OPNAMEN.POID=
SM POID_SSID_SCID.POID ) and
SM ( SUBSCHAAL_GEGEVENS.SSID =
SM POID_SSID_SCID.SSID )
SMinner join SCORES on SCORES.SCID =
SM POID_SSID_SCID.SCID
SMinner join PID_POID   on OPNAMEN.POID= PID_POID.POID
SMinner join PATIENTon PATIENT.PID = PID_POID.PID
SM  where substr ( lower( NAME) , 1, 6)  = 'cis20r'
SMand lower ( NAME_ ) = 'fatigue'
SMand TEST_COUNT in (3,4)
SMand DATETIME  39814.0
SMand SCORE  30



1) Do you have indices on the join fields?
  

well I'm happily surprised, you came up with this suggestion
- I thought that sqlite created indexes on all primairy key and unique 
fields
- but after explicitly creating the indices, a gained a speed of about a 
factor 10
After checking the database creation, it seemed I forgot to make these 
fields the primary key

so thanks very much.

I gained another factor of 10 speed by updating to version 2.5.5 of 
pysqlite.


cheers,
Stef

2) Look at the ANALYZE command
3) Look at the EXPLAIN command
  


--
http://mail.python.org/mailman/listinfo/python-list


Re: sqlite3 performance problems only in python

2009-07-23 Thread Stef Mientki

David Stanek wrote:

On Thu, Jul 23, 2009 at 9:02 AM, Stef Mientkistef.mien...@gmail.com wrote:
  

btw, I don't know if it's of any importance, the SQL-statement I perform is
select OPNAMEN.*, NAME, NAME_, SCORES.SCORE, PATIENT.*
 from OPNAMEN
  inner join POID_VLID  on OPNAMEN.POID= POID_VLID.POID
  inner join VRAAGLST   on VRAAGLST.VLID   = POID_VLID.VLID
  inner join VLID_SSID  on VRAAGLST.VLID   = VLID_SSID.VLID
  inner join SUBSCHAAL_GEGEVENS on SUBSCHAAL_GEGEVENS.SSID = VLID_SSID.SSID
  inner join POID_SSID_SCID on ( OPNAMEN.POID=
POID_SSID_SCID.POID ) and
   ( SUBSCHAAL_GEGEVENS.SSID =
POID_SSID_SCID.SSID )
  inner join SCORES on SCORES.SCID =
POID_SSID_SCID.SCID
  inner join PID_POID   on OPNAMEN.POID= PID_POID.POID
  inner join PATIENTon PATIENT.PID = PID_POID.PID
 where substr ( lower( NAME) , 1, 6)  = 'cis20r'
  and lower ( NAME_ ) = 'fatigue'
  and TEST_COUNT in (3,4)
  and DATETIME  39814.0
  and SCORE  30



Warning: I suck at SQL and hate it with a passion...
  

+1,
but I couldn't find anything better.
I guess you have some better suggestions ?
(I looked at Dee, but from my first view, it looks a bit premature)

thanks,
Stef

--
http://mail.python.org/mailman/listinfo/python-list


Re: sqlite3 performance problems only in python

2009-07-23 Thread Nick Craig-Wood
David Stanek dsta...@dstanek.com wrote:
  On Thu, Jul 23, 2009 at 9:02 AM, Stef Mientkistef.mien...@gmail.com wrote:
 
  btw, I don't know if it's of any importance, the SQL-statement I perform is
  select OPNAMEN.*, NAME, NAME_, SCORES.SCORE, PATIENT.*
   from OPNAMEN
    inner join POID_VLID          on OPNAMEN.POID            = POID_VLID.POID
    inner join VRAAGLST           on VRAAGLST.VLID           = POID_VLID.VLID
    inner join VLID_SSID          on VRAAGLST.VLID           = VLID_SSID.VLID
    inner join SUBSCHAAL_GEGEVENS on SUBSCHAAL_GEGEVENS.SSID = VLID_SSID.SSID
    inner join POID_SSID_SCID     on ( OPNAMEN.POID            =
  POID_SSID_SCID.POID ) and
                                     ( SUBSCHAAL_GEGEVENS.SSID =
  POID_SSID_SCID.SSID )
    inner join SCORES             on SCORES.SCID             =
  POID_SSID_SCID.SCID
    inner join PID_POID           on OPNAMEN.POID            = PID_POID.POID
    inner join PATIENT            on PATIENT.PID             = PID_POID.PID
   where substr ( lower( NAME) , 1, 6)  = 'cis20r'
    and lower ( NAME_ ) = 'fatigue'
    and TEST_COUNT in (3,4)
    and DATETIME  39814.0
    and SCORE  30
 
  Warning: I suck at SQL and hate it with a passion...
 
  By using lower() on the left side of the where expressions I believe
  that you are table scanning. So it is not the size of the data
  returned, but the size of the data that needs to be scanned.

In all the databases I've used, the like operator has been case
insensitive, so if that is the problem you could use

  NAME like '%cis20r%'  -- not quite the same, but close!
  and NAME_ like 'fatigue'

instead which might be quicker.  Or not ;-)

-- 
Nick Craig-Wood n...@craig-wood.com -- http://www.craig-wood.com/nick
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: sqlite3 performance problems only in python

2009-07-23 Thread Jerry Hill
On Thu, Jul 23, 2009 at 4:29 PM, Nick Craig-Woodn...@craig-wood.com wrote:
 In all the databases I've used, the like operator has been case
 insensitive, so if that is the problem you could use

This is not true in all databases!  Many times, this is something that
is configurable when setting up the database server.  I'm not sure
what the defaults are, but here at work we have DB/2 and Oracle
servers that are case sensitive, and at least one MSSQL server that is
not.

-- 
Jerry
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: sqlite3 performance problems only in python

2009-07-23 Thread Che M
On Jul 23, 3:58 pm, Stef Mientki stef.mien...@gmail.com wrote:
 Piet van Oostrum wrote:
  Stef Mientki stef.mien...@gmail.com (SM) wrote:

  SM btw, I don't know if it's of any importance, the SQL-statement I 
  perform is
  SM select OPNAMEN.*, NAME, NAME_, SCORES.SCORE, PATIENT.*
  SM  from OPNAMEN
  SM    inner join POID_VLID          on OPNAMEN.POID            = 
  POID_VLID.POID
  SM    inner join VRAAGLST           on VRAAGLST.VLID           = 
  POID_VLID.VLID
  SM    inner join VLID_SSID          on VRAAGLST.VLID           = 
  VLID_SSID.VLID
  SM    inner join SUBSCHAAL_GEGEVENS on SUBSCHAAL_GEGEVENS.SSID = 
  VLID_SSID.SSID
  SM    inner join POID_SSID_SCID     on ( OPNAMEN.POID            =
  SM POID_SSID_SCID.POID ) and
  SM                                     ( SUBSCHAAL_GEGEVENS.SSID =
  SM POID_SSID_SCID.SSID )
  SM    inner join SCORES             on SCORES.SCID             =
  SM POID_SSID_SCID.SCID
  SM    inner join PID_POID           on OPNAMEN.POID            = 
  PID_POID.POID
  SM    inner join PATIENT            on PATIENT.PID             = 
  PID_POID.PID
  SM  where substr ( lower( NAME) , 1, 6)  = 'cis20r'
  SM    and lower ( NAME_ ) = 'fatigue'
  SM    and TEST_COUNT in (3,4)
  SM    and DATETIME  39814.0
  SM    and SCORE  30

  1) Do you have indices on the join fields?

 well I'm happily surprised, you came up with this suggestion
 - I thought that sqlite created indexes on all primairy key and unique
 fields
 - but after explicitly creating the indices, a gained a speed of about a
 factor 10
 After checking the database creation, it seemed I forgot to make these
 fields the primary key
 so thanks very much.

 I gained another factor of 10 speed by updating to version 2.5.5 of
 pysqlite.

 cheers,
 Stef

  2) Look at the ANALYZE command
  3) Look at the EXPLAIN command



You might want to consult the SQLite list for questions like this.

Why do you use pysqlite?  I just import sqlite3 in Python 2.5.
What is the advantage of pysqlite?

Che
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: sqlite3 performance problems only in python

2009-07-23 Thread Stef Mientki

Che M wrote:

On Jul 23, 3:58 pm, Stef Mientki stef.mien...@gmail.com wrote:
  

Piet van Oostrum wrote:


Stef Mientki stef.mien...@gmail.com (SM) wrote:


SM btw, I don't know if it's of any importance, the SQL-statement I perform is
SM select OPNAMEN.*, NAME, NAME_, SCORES.SCORE, PATIENT.*
SM  from OPNAMEN
SMinner join POID_VLID  on OPNAMEN.POID= POID_VLID.POID
SMinner join VRAAGLST   on VRAAGLST.VLID   = POID_VLID.VLID
SMinner join VLID_SSID  on VRAAGLST.VLID   = VLID_SSID.VLID
SMinner join SUBSCHAAL_GEGEVENS on SUBSCHAAL_GEGEVENS.SSID = VLID_SSID.SSID
SMinner join POID_SSID_SCID on ( OPNAMEN.POID=
SM POID_SSID_SCID.POID ) and
SM ( SUBSCHAAL_GEGEVENS.SSID =
SM POID_SSID_SCID.SSID )
SMinner join SCORES on SCORES.SCID =
SM POID_SSID_SCID.SCID
SMinner join PID_POID   on OPNAMEN.POID= PID_POID.POID
SMinner join PATIENTon PATIENT.PID = PID_POID.PID
SM  where substr ( lower( NAME) , 1, 6)  = 'cis20r'
SMand lower ( NAME_ ) = 'fatigue'
SMand TEST_COUNT in (3,4)
SMand DATETIME  39814.0
SMand SCORE  30


1) Do you have indices on the join fields?
  

well I'm happily surprised, you came up with this suggestion
- I thought that sqlite created indexes on all primairy key and unique
fields
- but after explicitly creating the indices, a gained a speed of about a
factor 10
After checking the database creation, it seemed I forgot to make these
fields the primary key
so thanks very much.

I gained another factor of 10 speed by updating to version 2.5.5 of
pysqlite.

cheers,
Stef



2) Look at the ANALYZE command
3) Look at the EXPLAIN command
  



You might want to consult the SQLite list for questions like this.
  

thanks,
but because the same SQL-statement in Delphi performed well,
I thought it was a problem with the Python implementation.

Why do you use pysqlite?  I just import sqlite3 in Python 2.5.
What is the advantage of pysqlite?
  

it's 10 .. 15 times faster then sqlite3 delivered with pyton 2.5.
AFAIK it's nothing different, just a newer version.

cheers,
Stef

Che
  


--
http://mail.python.org/mailman/listinfo/python-list


Re: sqlite3 performance problems only in python

2009-07-23 Thread David Stanek
On Thu, Jul 23, 2009 at 6:29 PM, Stef Mientkistef.mien...@gmail.com wrote:

 but because the same SQL-statement in Delphi performed well,
 I thought it was a problem with the Python implementation.

Same SQL, but were you also using Sqlite in Delphi?

-- 
David
blog: http://www.traceback.org
twitter: http://twitter.com/dstanek
-- 
http://mail.python.org/mailman/listinfo/python-list