Re: [sqlite] How to optimize this simple select query ?
Vander Clock Stephane wrote: > Hello, > > But i not understand how the R* tree based table can help me here ? The BETWEEN queries are specifically optimized by R*Tree. > > can you explain me ? > > thanks you by advance > stephane > > On 12/9/2010 8:24 PM, Gabríel A. Pétursson wrote: >> It seems to me that you may want to consider defining your table as a >> virtual R* tree based table. >> >> See http://www.sqlite.org/rtree.html >> >> On Wed, 2010-12-08 at 19:56 +0300, Vander Clock Stephane wrote: >>> Hello, >>> >>> on the table : >>> >>> CREATE TABLE HASH( >>> ID INTEGER PRIMARY KEY ASC, >>> x1_y1 INTEGER, >>> x1_y2 INTEGER, >>> x1_y3 INTEGER, >>> x1_y4 INTEGER, >>> x1_y5 INTEGER, >>> x2_y1 INTEGER, >>> x2_y2 INTEGER, >>> x2_y3 INTEGER, >>> x2_y4 INTEGER, >>> x2_y5 INTEGER, >>> x3_y1 INTEGER, >>> x3_y2 INTEGER, >>> x3_y3 INTEGER, >>> x3_y4 INTEGER, >>> x3_y5 INTEGER, >>> x4_y1 INTEGER, >>> x4_y2 INTEGER, >>> x4_y3 INTEGER, >>> x4_y4 INTEGER, >>> x4_y5 INTEGER, >>> x5_y1 INTEGER, >>> x5_y2 INTEGER, >>> x5_y3 INTEGER, >>> x5_y4 INTEGER, >>> x5_y5 INTEGER >>> ); >>> CREATE INDEX HASH_X1_Y1_IDX ON HASH (X1_Y1); >>> CREATE INDEX HASH_X2_Y2_IDX ON HASH (X2_Y2); >>> CREATE INDEX HASH_X3_Y3_IDX ON HASH (X3_Y3); >>> CREATE INDEX HASH_X4_Y4_IDX ON HASH (X4_Y4); >>> CREATE INDEX HASH_X5_Y5_IDX ON HASH (X5_Y5); >>> >>> >>> with millions of rows, how to optimize such query : >>> >>> Select >>> ID >>> from >>> HASH >>> where >>> x1_y1>=<#randomnumber1> and >>> x1_y1<=<#randomnumber1>+ 20 and >>> x1_y2>=<#randomnumber4> and >>> x1_y2<=<#randomnumber4> + 20 and >>> x1_y3>=<#randomnumber7> and >>> x1_y3<=<#randomnumber7> + 20 and >>> x1_y4>=<#randomnumber10> and >>> x1_y4<=<#randomnumber10> + 20 and >>> x1_y5>=<#randomnumber13> and >>> x1_y5<=<#randomnumber13> + 20 and >>> x2_y1>=<#randomnumber16> and >>> x2_y1<=<#randomnumber16> + 20 and >>> x2_y2>=<#randomnumber19> and >>> x2_y2<=<#randomnumber19> + 20 and >>> x2_y3>=<#randomnumber22> and >>> x2_y3<=<#randomnumber22> + 20 and >>> x2_y4>=<#randomnumber25> and >>> x2_y4<=<#randomnumber25> + 20 and >>> x2_y5>=<#randomnumber28> and >>> x2_y5<=<#randomnumber28> + 20 and >>> x3_y1>=<#randomnumber31> and >>> x3_y1<=<#randomnumber31> + 20 and >>> x3_y2>=<#randomnumber34> and >>> x3_y2<=<#randomnumber34> + 20 and >>> x3_y3>=<#randomnumber37> and >>> x3_y3<=<#randomnumber37> + 20 and >>> x3_y4>=<#randomnumber40> and >>> x3_y4<=<#randomnumber40> + 20 and >>> x3_y5>=<#randomnumber43> and >>> x3_y5<=<#randomnumber43> + 20 and >>> x4_y1>=<#randomnumber46> and >>> x4_y1<=<#randomnumber46> + 20 and >>> x4_y2>=<#randomnumber49> and >>> x4_y2<=<#randomnumber49> + 20 and >>> x4_y3>=<#randomnumber52> and >>> x4_y3<=<#randomnumber52> + 20 and >>> x4_y4>=<#randomnumber55> and >>> x4_y4<=<#randomnumber55> + 20 and >>> x4_y5>=<#randomnumber58> and >>> x4_y5<=<#randomnumber58> + 20 and >>> x5_y1>=<#randomnumber61> and >>> x5_y1<=<#randomnumber61> + 20 and >>> x5_y2>=<#randomnumber64> and >>> x5_y2<=<#randomnumber64> + 20 and >>> x5_y3>=<#randomnumber67> and >>> x5_y3<=<#randomnumber67> + 20 and >>> x5_y4>=<#randomnumber70> and >>> x5_y4<=<#randomnumber70> + 20 and >>> x5_y5>=<#randomnumber73> and >>> x5_y5<=<#randomnumber73> + 20; >>> >>> because they takes very very lot of time (hourS) to return :( >>> on other SGBD (like Firebird) with same amount of data >>> they return immediatly ... >>> >>> Thanks by advance >>> stéphane >>> ___ >>> sqlite-users mailing list >>> sqlite-users@sqlite.org >>> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users >> ___ >> sqlite-users mailing list >> sqlite-users@sqlite.org >> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users > ___ > sqlite-users mailing list > sqlite-users@sqlite.org > http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users -- Puneet Kishor http://punkish.org Carbon Model http://carbonmodel.org Charter Member, Open Source Geospatial Foundation http://www.osgeo.org Science Fellow http://creativecommons.org/about/people/fellows#puneetkishor Nelson Institute, UW-Madison http://www.nelson.wisc.edu --- Assertions are politics; backing up assertions with evidence is science === ___ sqlite-users mailing list sqlite-users@sqli
Re: [sqlite] How to optimize this simple select query ?
Hello, But i not understand how the R* tree based table can help me here ? can you explain me ? thanks you by advance stephane On 12/9/2010 8:24 PM, Gabríel A. Pétursson wrote: > It seems to me that you may want to consider defining your table as a > virtual R* tree based table. > > See http://www.sqlite.org/rtree.html > > On Wed, 2010-12-08 at 19:56 +0300, Vander Clock Stephane wrote: >> Hello, >> >> on the table : >> >> CREATE TABLE HASH( >> ID INTEGER PRIMARY KEY ASC, >> x1_y1 INTEGER, >> x1_y2 INTEGER, >> x1_y3 INTEGER, >> x1_y4 INTEGER, >> x1_y5 INTEGER, >> x2_y1 INTEGER, >> x2_y2 INTEGER, >> x2_y3 INTEGER, >> x2_y4 INTEGER, >> x2_y5 INTEGER, >> x3_y1 INTEGER, >> x3_y2 INTEGER, >> x3_y3 INTEGER, >> x3_y4 INTEGER, >> x3_y5 INTEGER, >> x4_y1 INTEGER, >> x4_y2 INTEGER, >> x4_y3 INTEGER, >> x4_y4 INTEGER, >> x4_y5 INTEGER, >> x5_y1 INTEGER, >> x5_y2 INTEGER, >> x5_y3 INTEGER, >> x5_y4 INTEGER, >> x5_y5 INTEGER >> ); >> CREATE INDEX HASH_X1_Y1_IDX ON HASH (X1_Y1); >> CREATE INDEX HASH_X2_Y2_IDX ON HASH (X2_Y2); >> CREATE INDEX HASH_X3_Y3_IDX ON HASH (X3_Y3); >> CREATE INDEX HASH_X4_Y4_IDX ON HASH (X4_Y4); >> CREATE INDEX HASH_X5_Y5_IDX ON HASH (X5_Y5); >> >> >> with millions of rows, how to optimize such query : >> >> Select >> ID >> from >> HASH >> where >> x1_y1>=<#randomnumber1> and >> x1_y1<=<#randomnumber1>+ 20 and >> x1_y2>=<#randomnumber4> and >> x1_y2<=<#randomnumber4> + 20 and >> x1_y3>=<#randomnumber7> and >> x1_y3<=<#randomnumber7> + 20 and >> x1_y4>=<#randomnumber10> and >> x1_y4<=<#randomnumber10> + 20 and >> x1_y5>=<#randomnumber13> and >> x1_y5<=<#randomnumber13> + 20 and >> x2_y1>=<#randomnumber16> and >> x2_y1<=<#randomnumber16> + 20 and >> x2_y2>=<#randomnumber19> and >> x2_y2<=<#randomnumber19> + 20 and >> x2_y3>=<#randomnumber22> and >> x2_y3<=<#randomnumber22> + 20 and >> x2_y4>=<#randomnumber25> and >> x2_y4<=<#randomnumber25> + 20 and >> x2_y5>=<#randomnumber28> and >> x2_y5<=<#randomnumber28> + 20 and >> x3_y1>=<#randomnumber31> and >> x3_y1<=<#randomnumber31> + 20 and >> x3_y2>=<#randomnumber34> and >> x3_y2<=<#randomnumber34> + 20 and >> x3_y3>=<#randomnumber37> and >> x3_y3<=<#randomnumber37> + 20 and >> x3_y4>=<#randomnumber40> and >> x3_y4<=<#randomnumber40> + 20 and >> x3_y5>=<#randomnumber43> and >> x3_y5<=<#randomnumber43> + 20 and >> x4_y1>=<#randomnumber46> and >> x4_y1<=<#randomnumber46> + 20 and >> x4_y2>=<#randomnumber49> and >> x4_y2<=<#randomnumber49> + 20 and >> x4_y3>=<#randomnumber52> and >> x4_y3<=<#randomnumber52> + 20 and >> x4_y4>=<#randomnumber55> and >> x4_y4<=<#randomnumber55> + 20 and >> x4_y5>=<#randomnumber58> and >> x4_y5<=<#randomnumber58> + 20 and >> x5_y1>=<#randomnumber61> and >> x5_y1<=<#randomnumber61> + 20 and >> x5_y2>=<#randomnumber64> and >> x5_y2<=<#randomnumber64> + 20 and >> x5_y3>=<#randomnumber67> and >> x5_y3<=<#randomnumber67> + 20 and >> x5_y4>=<#randomnumber70> and >> x5_y4<=<#randomnumber70> + 20 and >> x5_y5>=<#randomnumber73> and >> x5_y5<=<#randomnumber73> + 20; >> >> because they takes very very lot of time (hourS) to return :( >> on other SGBD (like Firebird) with same amount of data >> they return immediatly ... >> >> Thanks by advance >> stéphane >> ___ >> sqlite-users mailing list >> sqlite-users@sqlite.org >> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users > > ___ > sqlite-users mailing list > sqlite-users@sqlite.org > http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] How to optimize this simple select query ?
On 10 Dec 2010, at 8:18pm, Vander Clock Stephane wrote: > it's sad that this simple select is not possible under sqlite3 :( What ? I told you how to do it. In fact I told you two ways. Go back and read my post again. Simon. ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] How to optimize this simple select query ?
thanks for the long explanation pavel ! On 12/10/2010 11:50 PM, Pavel Ivanov wrote: >> it's sad that this simple select is not possible under sqlite3 :( > This query is not so simple as you think, it requires pretty > complicated advanced techniques to be executed differently than SQLite > executes it. And even using those techniques you are not guaranteed to > have good performance - it depends on data distribution in your table. > To get an idea of how complicated technique should be try to think of > your query in terms of phone book. So you have 6 phone books with the > same data (million phones in total). 1st book has all data unsorted, > 2nd has all data sorted by 1st letter (no tie-breaking sorting), 3rd - > all data sorted by 2nd letter (no tie-breaking sorting) and so on. Now > you want to find phones of all people whose name has first letter > between d and i, second letter between t and v, third letter between f > and k and so on. How would you search for these phones? It's pretty > complicated stuff and wasn't included in SQLite because it's Lite. > > > Pavel > > On Fri, Dec 10, 2010 at 3:18 PM, Vander Clock Stephane > wrote: >> not work :( >> >> it's sad that this simple select is not possible under sqlite3 :( >> >> >> >> On 12/10/2010 6:11 PM, Jim Morris wrote: >>> Did you try a compound index? >>> >>> ___ >>> sqlite-users mailing list >>> sqlite-users@sqlite.org >>> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users >>> >> ___ >> sqlite-users mailing list >> sqlite-users@sqlite.org >> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users >> > ___ > sqlite-users mailing list > sqlite-users@sqlite.org > http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users > ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] How to optimize this simple select query ?
> it's sad that this simple select is not possible under sqlite3 :( This query is not so simple as you think, it requires pretty complicated advanced techniques to be executed differently than SQLite executes it. And even using those techniques you are not guaranteed to have good performance - it depends on data distribution in your table. To get an idea of how complicated technique should be try to think of your query in terms of phone book. So you have 6 phone books with the same data (million phones in total). 1st book has all data unsorted, 2nd has all data sorted by 1st letter (no tie-breaking sorting), 3rd - all data sorted by 2nd letter (no tie-breaking sorting) and so on. Now you want to find phones of all people whose name has first letter between d and i, second letter between t and v, third letter between f and k and so on. How would you search for these phones? It's pretty complicated stuff and wasn't included in SQLite because it's Lite. Pavel On Fri, Dec 10, 2010 at 3:18 PM, Vander Clock Stephane wrote: > not work :( > > it's sad that this simple select is not possible under sqlite3 :( > > > > On 12/10/2010 6:11 PM, Jim Morris wrote: >> Did you try a compound index? >> >> ___ >> sqlite-users mailing list >> sqlite-users@sqlite.org >> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users >> > ___ > sqlite-users mailing list > sqlite-users@sqlite.org > http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users > ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] How to optimize this simple select query ?
not work :( it's sad that this simple select is not possible under sqlite3 :( On 12/10/2010 6:11 PM, Jim Morris wrote: > Did you try a compound index? > > ___ > sqlite-users mailing list > sqlite-users@sqlite.org > http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users > ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] How to optimize this simple select query ?
> Did you try a compound index? Given WHERE conditions wouldn't have any benefit from compound index - only first column will be used anyway. Pavel On Fri, Dec 10, 2010 at 10:11 AM, Jim Morris wrote: > Did you try a compound index? > > ___ > sqlite-users mailing list > sqlite-users@sqlite.org > http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users > ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] How to optimize this simple select query ?
Did you try a compound index? ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] How to optimize this simple select query ?
> I usually seem to be wrong when I try to help here, but I keep trying. > > My guess is that SQLite uses only one index per query (or per table per > query or something like that), and so has to do a whole bunch of full > table scans (or at least full scans of the remaining rows). > huum if it's true it's a big disaster :( > Maybe you need a query like: > > > select id from (select id from hash where > > x1_y1 BETWEEN #randomnumber1 AND (#randomnumber1 + 20)) where x1_y2 BETWEEN > #randomnumber4 AND (#randomnumber4 + 20) > > > nested many more levels deep. This might allow using all the indices. i will try thanks ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] How to optimize this simple select query ?
On 12/8/2010 9:56 AM, Vander Clock Stephane wrote: > Hello, > > on the table : > > CREATE TABLE HASH( > ID INTEGER PRIMARY KEY ASC, > x1_y1 INTEGER, > x1_y2 INTEGER, > ... > x5_y5 INTEGER > ); > CREATE INDEX HASH_X1_Y1_IDX ON HASH (X1_Y1); > CREATE INDEX HASH_X2_Y2_IDX ON HASH (X2_Y2); > CREATE INDEX HASH_X3_Y3_IDX ON HASH (X3_Y3); > CREATE INDEX HASH_X4_Y4_IDX ON HASH (X4_Y4); > CREATE INDEX HASH_X5_Y5_IDX ON HASH (X5_Y5); > > > with millions of rows, how to optimize such query : > > Select > ID > from > HASH > where > x1_y1>=<#randomnumber1> and > x1_y1<=<#randomnumber1>+ 20 and > ... > x5_y5>=<#randomnumber73> and > x5_y5<=<#randomnumber73> + 20; > > because they takes very very lot of time (hourS) to return :( > on other SGBD (like Firebird) with same amount of data > they return immediatly ... > I usually seem to be wrong when I try to help here, but I keep trying. My guess is that SQLite uses only one index per query (or per table per query or something like that), and so has to do a whole bunch of full table scans (or at least full scans of the remaining rows). Maybe you need a query like: select id from (select id from hash where x1_y1 BETWEEN #randomnumber1 AND (#randomnumber1 + 20)) where x1_y2 BETWEEN #randomnumber4 AND (#randomnumber4 + 20) nested many more levels deep. This might allow using all the indices. HTH, Gerry ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] How to optimize this simple select query ?
It seems to me that you may want to consider defining your table as a virtual R* tree based table. See http://www.sqlite.org/rtree.html On Wed, 2010-12-08 at 19:56 +0300, Vander Clock Stephane wrote: > Hello, > > on the table : > > CREATE TABLE HASH( >ID INTEGER PRIMARY KEY ASC, >x1_y1 INTEGER, >x1_y2 INTEGER, >x1_y3 INTEGER, >x1_y4 INTEGER, >x1_y5 INTEGER, >x2_y1 INTEGER, >x2_y2 INTEGER, >x2_y3 INTEGER, >x2_y4 INTEGER, >x2_y5 INTEGER, >x3_y1 INTEGER, >x3_y2 INTEGER, >x3_y3 INTEGER, >x3_y4 INTEGER, >x3_y5 INTEGER, >x4_y1 INTEGER, >x4_y2 INTEGER, >x4_y3 INTEGER, >x4_y4 INTEGER, >x4_y5 INTEGER, >x5_y1 INTEGER, >x5_y2 INTEGER, >x5_y3 INTEGER, >x5_y4 INTEGER, >x5_y5 INTEGER > ); > CREATE INDEX HASH_X1_Y1_IDX ON HASH (X1_Y1); > CREATE INDEX HASH_X2_Y2_IDX ON HASH (X2_Y2); > CREATE INDEX HASH_X3_Y3_IDX ON HASH (X3_Y3); > CREATE INDEX HASH_X4_Y4_IDX ON HASH (X4_Y4); > CREATE INDEX HASH_X5_Y5_IDX ON HASH (X5_Y5); > > > with millions of rows, how to optimize such query : > > Select >ID > from >HASH > where >x1_y1 >= <#randomnumber1> and >x1_y1 <= <#randomnumber1>+ 20 and >x1_y2 >= <#randomnumber4> and >x1_y2 <= <#randomnumber4> + 20 and >x1_y3 >= <#randomnumber7> and >x1_y3 <= <#randomnumber7> + 20 and >x1_y4 >= <#randomnumber10> and >x1_y4 <= <#randomnumber10> + 20 and >x1_y5 >= <#randomnumber13> and >x1_y5 <= <#randomnumber13> + 20 and >x2_y1 >= <#randomnumber16> and >x2_y1 <= <#randomnumber16> + 20 and >x2_y2 >= <#randomnumber19> and >x2_y2 <= <#randomnumber19> + 20 and >x2_y3 >= <#randomnumber22> and >x2_y3 <= <#randomnumber22> + 20 and >x2_y4 >= <#randomnumber25> and >x2_y4 <= <#randomnumber25> + 20 and >x2_y5 >= <#randomnumber28> and >x2_y5 <= <#randomnumber28> + 20 and >x3_y1 >= <#randomnumber31> and >x3_y1 <= <#randomnumber31> + 20 and >x3_y2 >= <#randomnumber34> and >x3_y2 <= <#randomnumber34> + 20 and >x3_y3 >= <#randomnumber37> and >x3_y3 <= <#randomnumber37> + 20 and >x3_y4 >= <#randomnumber40> and >x3_y4 <= <#randomnumber40> + 20 and >x3_y5 >= <#randomnumber43> and >x3_y5 <= <#randomnumber43> + 20 and >x4_y1 >= <#randomnumber46> and >x4_y1 <= <#randomnumber46> + 20 and >x4_y2 >= <#randomnumber49> and >x4_y2 <= <#randomnumber49> + 20 and >x4_y3 >= <#randomnumber52> and >x4_y3 <= <#randomnumber52> + 20 and >x4_y4 >= <#randomnumber55> and >x4_y4 <= <#randomnumber55> + 20 and >x4_y5 >= <#randomnumber58> and >x4_y5 <= <#randomnumber58> + 20 and >x5_y1 >= <#randomnumber61> and >x5_y1 <= <#randomnumber61> + 20 and >x5_y2 >= <#randomnumber64> and >x5_y2 <= <#randomnumber64> + 20 and >x5_y3 >= <#randomnumber67> and >x5_y3 <= <#randomnumber67> + 20 and >x5_y4 >= <#randomnumber70> and >x5_y4 <= <#randomnumber70> + 20 and >x5_y5 >= <#randomnumber73> and >x5_y5 <= <#randomnumber73> + 20; > > because they takes very very lot of time (hourS) to return :( > on other SGBD (like Firebird) with same amount of data > they return immediatly ... > > Thanks by advance > stéphane > ___ > sqlite-users mailing list > sqlite-users@sqlite.org > http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users