[web2py] Re: Database performance

2011-08-26 Thread Massimo Di Pierro
Good to know.

The difference between
 {{rows = dataSet.select()}}
and
 {{rows = db.executesql(dataSet._select())}}
is that the former calls the parse function, which loops over the
175K rows and converts them into objects.
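
The cost of that per-row conversion can be sketched in plain Python (a hypothetical stand-in for the DAL's parsing, not web2py's actual parse code; the column names are taken from the table in the thread):

```python
import time

# Hypothetical stand-in for the parse step: select() turns every raw
# tuple from the driver into a dict-like object, while executesql()
# hands the tuples back untouched.
colnames = ['id', 'UTCtime', 'waterTemp', 'waterTempQC',
            'latitude', 'latitudeQC', 'longitude', 'longitudeQC']
raw_rows = [(i, None, 18.5, 1, -42.0, 1, 147.0, 1) for i in range(175000)]

tic = time.time()
parsed = [dict(zip(colnames, row)) for row in raw_rows]  # per-row conversion
toc = time.time()
print('parsed %d rows in %.2fs' % (len(parsed), toc - tic))
```

Even this bare dict construction over 175K rows takes measurable time, which is consistent with the gap Hugh measured.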

On Aug 26, 12:08 am, HughBarker hbar...@gmail.com wrote:
 Thanks for the reply.

 {{rows = db.executesql(dataSet._select())}} executes in 0.16s.

 On Aug 26, 2:09 pm, Massimo Di Pierro massimo.dipie...@gmail.com
wrote:

  Try replacing

   {{rows = dataSet.select()}}

  with

   {{rows = db.executesql(dataSet._select())}}

  and tell me what you get.

  Anyway, your raw SQL select is only fetching a third of the rows
  that web2py is fetching.
  web2py is also doing a lot of extra work converting the list of
  records from tuples to dictionaries.

  On Aug 25, 9:48 pm, HughBarker hbar...@gmail.com wrote:

   Hi,
   I'm new to web2py - first of all, thanks for such an excellent
   framework. It's certainly a breath of fresh air after writing J2EE
   webapps.

   Anyway, I'm building an application, part of which plots large
   datasets (using the JS Highstocks library). I've defined a table like
   this:

   db.define_table('Underway_Data',
       Field('UTCtime', 'datetime'),
       Field('waterTemp', 'double'),
       Field('waterTempQC', 'integer'),
       Field('latitude', 'double'),
       Field('latitudeQC', 'integer'),
       Field('longitude', 'double'),
       Field('longitudeQC', 'integer'))

   and populated it with data, about 175k rows (postgres on ubuntu;
   web2py resides on the same server and was set up using the 'one step
   production deployment' script).

   Running the database query:

         {{resolution=5}}
         {{query = (db.Underway_Data.id%resolution==0)}}
         {{dataSet = db(query)}}
         {{tic = time.clock()}}
         {{rows = dataSet.select()}}
         {{toc = time.clock()}}

   will take about 4.5 seconds (the resolution variable subsamples the
   data for performance reasons).

   While it is a fair amount of data to be pushing around, 4.5 seconds
   seems slow to me. I wrote a simple benchmark in pure python:

   import psycopg2, time
   tic = time.clock()
   conn = psycopg2.connect("host=ubuntu-geospatial-server user=postgres
   password= dbname=geospatial")
   cur = conn.cursor()
   cur.execute("SELECT Underway_Data.id, Underway_Data.UTCtime,
   Underway_Data.waterTemp, Underway_Data.waterTempQC,
   Underway_Data.latitude, Underway_Data.latitudeQC,
   Underway_Data.longitude, Underway_Data.longitudeQC FROM Underway_Data
   WHERE ((Underway_Data.id % 5) = 0);")
   rows = cur.fetchall()
   toc = time.clock()
   print toc-tic
   cur.close()
   conn.close()

   which runs in ~0.13 seconds.

   (the query I use is the output of db(query)._select() )

   Am I missing something here? Is there anything I can do to increase
   performance?

   Regards,
   Hugh.


[web2py] Re: Database performance

2011-08-25 Thread Massimo Di Pierro
Try replacing

 {{rows = dataSet.select()}}

with

 {{rows = db.executesql(dataSet._select())}}

and tell me what you get.

Anyway, your raw SQL select is only fetching a third of the rows
that web2py is fetching.
web2py is also doing a lot of extra work converting the list of
records from tuples to dictionaries.

On Aug 25, 9:48 pm, HughBarker hbar...@gmail.com wrote:
 Hi,
 I'm new to web2py - first of all, thanks for such an excellent
 framework. It's certainly a breath of fresh air after writing J2EE
 webapps.

 Anyway, I'm building an application, part of which plots large
 datasets (using the JS Highstocks library). I've defined a table like
 this:

 db.define_table('Underway_Data',
     Field('UTCtime', 'datetime'),
     Field('waterTemp', 'double'),
     Field('waterTempQC', 'integer'),
     Field('latitude', 'double'),
     Field('latitudeQC', 'integer'),
     Field('longitude', 'double'),
     Field('longitudeQC', 'integer'))

 and populated it with data, about 175k rows (postgres on ubuntu;
 web2py resides on the same server and was set up using the 'one step
 production deployment' script).

 Running the database query:

       {{resolution=5}}
       {{query = (db.Underway_Data.id%resolution==0)}}
       {{dataSet = db(query)}}
       {{tic = time.clock()}}
       {{rows = dataSet.select()}}
       {{toc = time.clock()}}

 will take about 4.5 seconds (the resolution variable subsamples the
 data for performance reasons).

 While it is a fair amount of data to be pushing around, 4.5 seconds
 seems slow to me. I wrote a simple benchmark in pure python:

 import psycopg2, time
 tic = time.clock()
 conn = psycopg2.connect("host=ubuntu-geospatial-server user=postgres
 password= dbname=geospatial")
 cur = conn.cursor()
 cur.execute("SELECT Underway_Data.id, Underway_Data.UTCtime,
 Underway_Data.waterTemp, Underway_Data.waterTempQC,
 Underway_Data.latitude, Underway_Data.latitudeQC,
 Underway_Data.longitude, Underway_Data.longitudeQC FROM Underway_Data
 WHERE ((Underway_Data.id % 5) = 0);")
 rows = cur.fetchall()
 toc = time.clock()
 print toc-tic
 cur.close()
 conn.close()

 which runs in ~0.13 seconds.

 (the query I use is the output of db(query)._select() )

 Am I missing something here? Is there anything I can do to increase
 performance?

 Regards,
 Hugh.


[web2py] Re: Database performance

2011-08-25 Thread HughBarker
Thanks for the reply.

{{rows = db.executesql(dataSet._select())}} executes in 0.16s.
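
For anyone following along: executesql() returns plain tuples in the column order of the SELECT, so if the chart only needs a couple of columns they can be pulled out directly, with no per-row object construction. A minimal sketch with made-up sample rows (the column order matches the SELECT in the thread):

```python
# Hypothetical sample rows, in the thread's SELECT column order:
# id, UTCtime, waterTemp, waterTempQC, latitude, latitudeQC,
# longitude, longitudeQC
raw = [
    (5, '2011-08-25 21:48:00', 18.5, 1, -42.0, 1, 147.0, 1),
    (10, '2011-08-25 21:49:00', 18.6, 1, -42.1, 1, 147.1, 1),
]

# Build (time, value) pairs for the Highstocks series by tuple index.
series = [(row[1], row[2]) for row in raw]
```

This keeps the fast path: the rows stay tuples all the way to the JSON serialization step.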

On Aug 26, 2:09 pm, Massimo Di Pierro massimo.dipie...@gmail.com
wrote:
 Try replacing

  {{rows = dataSet.select()}}

 with

  {{rows = db.executesql(dataSet._select())}}

 and tell me what you get.

 Anyway, your raw SQL select is only fetching a third of the rows
 that web2py is fetching.
 web2py is also doing a lot of extra work converting the list of
 records from tuples to dictionaries.

 On Aug 25, 9:48 pm, HughBarker hbar...@gmail.com wrote:

  Hi,
  I'm new to web2py - first of all, thanks for such an excellent
  framework. It's certainly a breath of fresh air after writing J2EE
  webapps.

  Anyway, I'm building an application, part of which plots large
  datasets (using the JS Highstocks library). I've defined a table like
  this:

  db.define_table('Underway_Data',
      Field('UTCtime', 'datetime'),
      Field('waterTemp', 'double'),
      Field('waterTempQC', 'integer'),
      Field('latitude', 'double'),
      Field('latitudeQC', 'integer'),
      Field('longitude', 'double'),
      Field('longitudeQC', 'integer'))

  and populated it with data, about 175k rows (postgres on ubuntu;
  web2py resides on the same server and was set up using the 'one step
  production deployment' script).

  Running the database query:

        {{resolution=5}}
        {{query = (db.Underway_Data.id%resolution==0)}}
        {{dataSet = db(query)}}
        {{tic = time.clock()}}
        {{rows = dataSet.select()}}
        {{toc = time.clock()}}

  will take about 4.5 seconds (the resolution variable subsamples the
  data for performance reasons).

  While it is a fair amount of data to be pushing around, 4.5 seconds
  seems slow to me. I wrote a simple benchmark in pure python:

  import psycopg2, time
  tic = time.clock()
  conn = psycopg2.connect("host=ubuntu-geospatial-server user=postgres
  password= dbname=geospatial")
  cur = conn.cursor()
  cur.execute("SELECT Underway_Data.id, Underway_Data.UTCtime,
  Underway_Data.waterTemp, Underway_Data.waterTempQC,
  Underway_Data.latitude, Underway_Data.latitudeQC,
  Underway_Data.longitude, Underway_Data.longitudeQC FROM Underway_Data
  WHERE ((Underway_Data.id % 5) = 0);")
  rows = cur.fetchall()
  toc = time.clock()
  print toc-tic
  cur.close()
  conn.close()

  which runs in ~0.13 seconds.

  (the query I use is the output of db(query)._select() )

  Am I missing something here? Is there anything I can do to increase
  performance?

  Regards,
  Hugh.