Besides the suggestions from Dennis below, please search the archives for emails by me on doing exactly this. I achieved fairly decent performance on a database of 7.5 million rows doing lookups on 250k rectangles. I was working on a quad-Xeon server with 4 Gb ram and Win XP, using Perl to work on SQLite. The entire task would take about 23 hours... the performance was nearly linear... slowing down slightly as more records were processed.
On 8/24/07, Dennis Cote <[EMAIL PROTECTED]> wrote: > David Thieme wrote: > > Scott, > > Yes, the SELECT is very simple, but slow. I have tens of thousands of > > records and I need the data very fast (embedded realtime system). Some > > databases natively support spatial searches, using KD-trees or R-Trees or > > Quad-trees to improve the search speed. I found an article that explains > > how to implement a custom-spatial search in SQL 2007: > > "Using Table Valued Functions in SQL Server > > 2005 to Implement a Spatial Data Library" > > But the solution is very specific to SQL server. I thought there might be > > other tricks that might be common for implementing a fast spatial search in > > a database that doesn't natively support this feature. > > > David, > > SQLite has no direct support for spatial searches, but you should be > able to get reasonable results for a table with thousands of records > using a couple of indexes on the latitude and longitude of the points, > assuming your range is a reasonably small part of your total search space. > > Given a schema like this: > > create table pts ( > id integer primary key, > lat real, > lng real, > data text > ); > > You can create two indexes that will speed up the searches for points > within a rectangle. > > create index lat_idx on pts(lat); > create index lng_idx on pts(lng); > > Now, to do the search you can use a query like this: > > select * from pts where id in > ( > select id from pts where lat between :min_lat and :max_lat > intersect > select id from pts where lng between :min_lng and :max_lng > ); > > If you use explain query plan you can see how this will be executed: > > sqlite> explain query plan select * from pts where id in > ...> ( > ...> select id from pts where lat between :min_lat and :max_lat > ...> intersect > ...> select id from pts where lng between :min_lng and :max_lng > ...> ); > 0|0|TABLE pts USING PRIMARY KEY > 0|0|TABLE pts WITH INDEX lat_idx > 0|0|TABLE pts WITH INDEX lng_idx > > Or in all its excruciating detail using explain: > > sqlite> explain select * from pts where id in > ...> ( > ...> select id from pts where lat between :min_lat and :max_lat > ...> intersect > ...> select id from pts where lng between :min_lng and :max_lng > ...> ); > addr opcode p1 p2 p3 > ---- -------------- ---------- ---------- > --------------------------------- > 0 Goto 0 78 > 1 Integer 0 0 > 2 OpenRead 0 2 > 3 SetNumColumns 0 4 > 4 MemLoad 0 0 > 5 If 0 63 > 6 MemInt 1 0 > 7 OpenEphemeral 3 0 keyinfo(1,BINARY) > 8 SetNumColumns 3 1 > 9 OpenEphemeral 4 1 keyinfo(1,BINARY) > 10 Integer 0 0 > 11 OpenRead 6 3 keyinfo(1,BINARY) > 12 SetNumColumns 6 2 > 13 Variable 2 0 :max_lat > 14 IsNull -1 29 > 15 MakeRecord 1 0 e > 16 MemStore 2 1 > 17 Variable 1 0 :min_lat > 18 IsNull -1 29 > 19 MakeRecord 1 0 e > 20 MoveGe 6 29 > 21 MemLoad 2 0 > 22 IdxGE 6 29 + > 23 Column 6 0 > 24 IsNull 1 28 > 25 IdxRowid 6 0 > 26 MakeRecord 1 0 > 27 IdxInsert 4 0 > 28 Next 6 21 > 29 Close 6 0 > 30 OpenEphemeral 5 1 keyinfo(1,BINARY) > 31 Integer 0 0 > 32 OpenRead 7 4 keyinfo(1,BINARY) > 33 SetNumColumns 7 2 > 34 Variable 4 0 :max_lng > 35 IsNull -1 50 > 36 MakeRecord 1 0 e > 37 MemStore 4 1 > 38 Variable 3 0 :min_lng > 39 IsNull -1 50 > 40 MakeRecord 1 0 e > 41 MoveGe 7 50 > 42 MemLoad 4 0 > 43 IdxGE 7 50 + > 44 Column 7 0 > 45 IsNull 1 49 > 46 IdxRowid 7 0 > 47 MakeRecord 1 0 > 48 IdxInsert 5 0 > 49 Next 7 42 > 50 Close 7 0 > 51 Rewind 4 61 > 52 RowKey 4 0 > 53 NotFound 5 60 > 54 Column 4 0 > 55 NotNull -1 58 > 56 Pop 1 0 > 57 Goto 0 60 > 58 MakeRecord 1 0 c > 59 IdxInsert 3 0 > 60 Next 4 52 > 61 Close 5 0 > 62 Close 4 0 > 63 Rewind 3 76 > 64 Column 3 0 > 65 IsNull -1 75 > 66 MustBeInt 1 75 > 67 NotExists 0 75 > 68 Rowid 0 0 > 69 Column 0 1 > 70 RealAffinity 0 0 > 71 Column 0 2 > 72 RealAffinity 0 0 > 73 Column 0 3 > 74 Callback 4 0 > 75 Next 3 64 > 76 Close 0 0 > 77 Halt 0 0 > 78 Transaction 0 0 > 79 VerifyCookie 0 3 > 80 Goto 0 1 > 81 Noop 0 0 > > Basically it scans through the section of the lat_idx between the lat > limits and saves the ids of the matching records, then it does the same > with the lng_idx. Next it loops through all the saved records in the > latitude range and checks if the same record was found in the longitude > range. If so the record id is saved in a final temp table that hold the > intersection set. Finally, it scans through that temp table and looks up > each matching row in the main points table using the primary key index > and returns the data from that row. > > Say you have 10,000 points total and your points are uniformly > distributed. If your search rectangle covers 1% of the width and height > of your total search space, then the first two loops will find 1% of > those records each, or 100 records. The the loop that does the > intersection will look at the 100 records in the horizontal strip that > matches your 1% latitude range and locate the 1 record that is common to > 100 records in the vertical strip that matches your longitude range. It > will then return all the data associated with that one record. > > Because all the lookups are done using indexes (the intersection > operation lets you use both the latitude and longitude indexes, one in > each of the subselects), and the indexes contain the id field so it does > not have to lookup any records in the points table until the last step, > this query should run fairly quickly. > > HTH > Dennis Cote > > > > ----------------------------------------------------------------------------- > To unsubscribe, send email to [EMAIL PROTECTED] > ----------------------------------------------------------------------------- > > -- Puneet Kishor http://punkish.eidesis.org/ Nelson Inst. for Env. Studies http://www.nelson.wisc.edu/ Open Source Geospatial Foundation http://www.osgeo.org/ 2007 Summer S&T Policy Fellow, The National Academies http://www.nas.edu/ ========================================================== collaborate, communicate, compete ========================================================== ----------------------------------------------------------------------------- To unsubscribe, send email to [EMAIL PROTECTED] -----------------------------------------------------------------------------

