[web2py] Re: Database performance
Good to know. The difference between {{rows = dataSet.select()}} and {{rows = db.executesql(dataSet._select())}} is that the former calls the parse function, which loops over the 175K rows and converts them into objects.

On Aug 26, 12:08 am, HughBarker hbar...@gmail.com wrote: Thanks for the reply. {{rows = db.executesql(dataSet._select())}} executes in 0.16s.
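The per-row conversion cost described above can be reproduced without web2py at all. The sketch below is a rough stand-in for the parse step, not web2py's actual implementation: a minimal {{Row}} class (hypothetical, for illustration only) plays the role of the DAL's row object, and 175K raw tuples are converted one by one.

```python
import time

class Row(dict):
    """Illustrative stand-in for a DAL-style row object: a dict whose
    keys are also readable as attributes. NOT web2py's real Row class."""
    __getattr__ = dict.__getitem__

fields = ('id', 'UTCtime', 'waterTemp', 'waterTempQC',
          'latitude', 'latitudeQC', 'longitude', 'longitudeQC')

# Fake driver output: 175K plain tuples, as psycopg2's fetchall() returns.
raw = [(i, None, 15.0, 1, -42.0, 1, 147.0, 1) for i in range(175000)]

tic = time.perf_counter()
# The parse-like step: one Python-level object built per fetched row.
rows = [Row(zip(fields, r)) for r in raw]
toc = time.perf_counter()

print('converted %d rows in %.3fs' % (len(rows), toc - tic))
```

The conversion loop alone is typically an order of magnitude slower than fetching the tuples, which is consistent with the 0.16s-vs-4.5s gap reported in this thread.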
[web2py] Re: Database performance
Try replacing {{rows = dataSet.select()}} with {{rows = db.executesql(dataSet._select())}} and tell me what you get. Anyway, your raw SQL select is only fetching a third of the rows that web2py is fetching. web2py is also doing a lot of extra work converting the list of records from tuples to dictionaries.

On Aug 25, 9:48 pm, HughBarker hbar...@gmail.com wrote: Hi, I'm new to web2py - first of all, thanks for such an excellent framework. It's certainly a breath of fresh air after writing J2EE webapps. Anyway, I'm building an application, part of which plots large datasets (using the JS Highstocks library). I've defined a table like this:

{{db.define_table('Underway_Data', Field('UTCtime', 'datetime'), Field('waterTemp', 'double'), Field('waterTempQC', 'integer'), Field('latitude', 'double'), Field('latitudeQC', 'integer'), Field('longitude', 'double'), Field('longitudeQC', 'integer'))}}

and populated it with about 175K rows (Postgres on Ubuntu; web2py resides on the same server, set up using the 'one step production deployment' script). Running the database query:

{{resolution = 5}}
{{query = (db.Underway_Data.id % resolution == 0)}}
{{dataSet = db(query)}}
{{tic = time.clock()}}
{{rows = dataSet.select()}}
{{toc = time.clock()}}

takes about 4.5 seconds (the resolution variable subsamples the data for performance reasons). While it is a fair amount of data to push around, 4.5 seconds seems slow to me.
I wrote a simple benchmark in pure Python:

{{import psycopg2, time}}
{{tic = time.clock()}}
{{conn = psycopg2.connect("host=ubuntu-geospatial-server user=postgres password= dbname=geospatial")}}
{{cur = conn.cursor()}}
{{cur.execute("SELECT Underway_Data.id, Underway_Data.UTCtime, Underway_Data.waterTemp, Underway_Data.waterTempQC, Underway_Data.latitude, Underway_Data.latitudeQC, Underway_Data.longitude, Underway_Data.longitudeQC FROM Underway_Data WHERE ((Underway_Data.id % 5) = 0);")}}
{{rows = cur.fetchall()}}
{{toc = time.clock()}}
{{print toc - tic}}
{{cur.close()}}
{{conn.close()}}

which runs in ~0.13 seconds. (The query I use is the output of {{db(query)._select()}}.) Am I missing something here? Is there anything I can do to increase performance? Regards, Hugh.
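The {{id % resolution == 0}} subsampling trick from the post above can be demonstrated against any SQL backend. A minimal sketch using the stdlib sqlite3 module as a stand-in for Postgres (the tiny table and column subset are made up for the demo):

```python
import sqlite3

# In-memory database standing in for the Postgres server.
conn = sqlite3.connect(':memory:')
cur = conn.cursor()
cur.execute('CREATE TABLE Underway_Data (id INTEGER PRIMARY KEY, waterTemp REAL)')
cur.executemany('INSERT INTO Underway_Data VALUES (?, ?)',
                [(i, 10.0 + i * 0.001) for i in range(1, 101)])

# Keep only every fifth row: the modulo filter runs inside the database,
# so only the subsampled rows cross the wire.
resolution = 5
cur.execute('SELECT id, waterTemp FROM Underway_Data WHERE id % ? = 0',
            (resolution,))
rows = cur.fetchall()

print(len(rows))  # 100 rows / resolution 5 -> 20 rows returned
conn.close()
```

Note the filter reduces transfer and parsing work but still scans the whole table; it is a subsampling device, not an index-friendly predicate.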
[web2py] Re: Database performance
Thanks for the reply. {{rows = db.executesql(dataSet._select())}} executes in 0.16s.

On Aug 26, 2:09 pm, Massimo Di Pierro massimo.dipie...@gmail.com wrote: Try replacing {{rows = dataSet.select()}} with {{rows = db.executesql(dataSet._select())}} and tell me what you get. Anyway, your raw SQL select is only fetching a third of the rows that web2py is fetching. web2py is also doing a lot of extra work converting the list of records from tuples to dictionaries.
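Since the plotting layer ultimately needs plain [timestamp, value] pairs, the raw tuples returned by executesql can be shaped for Highstocks directly, skipping row-object construction for every column. A sketch under assumed conditions: the field order matches the table definition earlier in the thread, and the sample tuples below are made up for illustration.

```python
from datetime import datetime, timezone

# Shaped like executesql() output for the query in this thread:
# (id, UTCtime, waterTemp, ...) -- extra columns omitted for brevity.
raw = [
    (5, datetime(2011, 8, 25, 0, 0, tzinfo=timezone.utc), 14.2),
    (10, datetime(2011, 8, 25, 0, 5, tzinfo=timezone.utc), 14.3),
]

# Highstocks expects [[epoch_milliseconds, value], ...] for a time series;
# index the tuple positions directly instead of building Row objects.
series = [[int(r[1].timestamp() * 1000), r[2]] for r in raw]
print(series)
```

One list comprehension over tuples does the only conversion the chart actually needs, which is why the raw-SQL path stays near the 0.16s driver time.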