Here's another way I call a "Keybox search." -John Coryat
Performing a Spatial Search without a Spatially Enabled Database Author: John D. Coryat, USNaviguide (http://www.usnaviguide.com and http://maps.huge.info ) September 22, 2007 Most mapping applications that involve querying a database have a requirement to filter records by either distance to some object or being contained in a polygon. These queries can be cumbersome and slow, especially if using a non-spatially enabled database such as MySQL or MS SQL Server. Even using a database capable of doing these functions internally such as PostgreSQL, PostGIS or Oracle can be quite slow, especially if there are huge number of records involved. A method that reduces the time and complexity involved with filtering out records is to perform a search of the database utilizing a fast and efficient native function to all database systems, the integer key (an integer column in the database with an index attached). The trick to doing this is to convert a coordinate into an integer, and then select out the values that would include all possible candidates while not rejecting any valid ones. I developed an algorithm that performs this integer key search which is fast, efficient, flexible and easy to code. It's particularly well suited for large databases and will work in any language, and virtually any database. I call this method the "Keybox Search." A keybox search computes likely candidates from this key and the "boxes" surrounding it. The "boxes" can be three, five or seven boxes on each side. It can be larger, but instead of increasing the number beyond five squares, it is better to change the key to be less precise (see example). The reason for "odd" numbers of "boxes" is that there should be the same number of "boxes" on each side of the coordinate, so figure one for the value of the key (middle), then two to the left and two to the right adds up to: 2 + 1 + 2 = 5 or 25 (five vertical and five horizontal) boxes, for one on each side, that works out to 1 + 1 + 1 = 3 or 9 (3x3=9) keys. If this seems confusing, take a piece of graph paper and mark a spot in the center, then mark every box around it, the total of your marked boxes is nine, that's a three square box search. Lets include a diagram here of that. I also think this should be moved down, see my final comment. -Pamela 9/23/07 12:43 PM I use this algorithm in a number of applications, from searching a database of over 100,000,000 records in the reverse geocoder to locating a point in a polygon with a database of 100,000 polygons. I use PostgreSQL and PostGIS databases, which both have spatial search functions built in, but limiting the number of records that actually hit the built in functions speeds the search up dramatically. The keybox search method takes a point and transforms it into an integer in this manner: Example: (5 mile bounding box for a polygon, 1 degree ~ 50 miles or 1/10th a degree) 35.12345,-90.54321 1/10th degree accuracy: 35.1,-90.5 Translate into an integer: absolute value(round to zero decimals(35.12345 x 10)) x 10000 = 3510000 absolute value(round to zero decimals(-90.54321 x 10)) = 905 3510000 + 905 = 3510905 : this is the integer key. To construct a list of keybox keys to search for, a loop within a loop is constructed that iterates through the possible values for each latitude and longitude, rounded for precision to meet the requirements of the appropriate box. See the code example at the end of this article for details. In the example above, using a keybox search based on a 3x3 box, the keys to include would be: 3500904 3500905 3500906 3510904 3510905 3510906 3520904 3520905 3520906 Note the key in the middle is the same as the starting key. The reason the surrounding keys are required is due to the position of the point may be near the extremes of the rounding. To catch all possible candidates, the box search is required. This will include any keybox that is within about 5-7 miles of the original point. While this still may include a large number of candidates, it will certainly exclude a huge number, all without ever doing any spatial calculations. This method is extremely fast, as integer keys are the fastest type of index search in a database, regardless of database type. To calculate a keybox key for a polygon, take the bounding box and compute a centroid. This is a good likely point for a polygon. In SQL terms, this calculation will create keybox keys based on the bounding box (PostgreSQL syntax): update mygeometrytable set keybox = (round(abs((center(box(poly)))[1])*10) * 10000) + round(abs((center(box(poly)))[0])*10) ; The keybox search method is not generic. It has to be modified for each intended use by determining the maximum size of polygon or the distance from a point that will be searched. This can be adjusted by altering the precision of the key and by the number of keybox's being searched. A 5x5 grid of 1/10th degree precision will yield about a 12 mile bounding box, depending on latitude. It is better to err on the side of large then small, as the results are still very fast even if too many candidates get through the integer screening. By erring on the large side, the calculation will never fail. International note: This code is optimized for the US, as the latitudes are always positive and the longitudes are always negative. For other situations, the keybox method can easily be modified to include something in the key to indicate the difference, like adding a digit to the front of the key. As long as the database doesn't contain any records that violate local conditions, no changes are required. Here is a perl code to construct a query (3x3 box). #!/usr/bin/perl -w # Sample Perl code to construct a 3x3 keybox... # Author: John D. Coryat, USNaviguide 09/2007... use strict ; my $h = 0 ; my $i = 0 ; my $j = 0 ; my $k = 0 ; my $x = '' ; my $sql = '' ; my @t = ( ) ; my $lat = 35.12345 ; my $lng = -90.54321 ; $j = sprintf( "%0.0f", abs($lat * 10) ) ; # Latitude converted to 3 digits $k = sprintf( "%0.0f", abs($lng * 10) ) ; # Longitude convered to 4 digits # Reduce the starting point... # Use 1: 3x3, 2: 5x5, 3: 7x7... $j = $j - 1 ; # Subtract to the low latitude $k = $k - 1 ; # Subtract to the low longitude # Construct an array of search keys... # Use 3: 9 key, 5: 25 key or 7: 49 key... # Use 1000: 1 Degree, 10000: 1/10th, 100000: 1/100th... for ( $i = 0; $i < 3; $i++ ) # Loop through latitude { for ( $h = 0; $h < 3; $h++ ) # Loop through longitude { push( @t, ($j + $i) * 10000 + ($k + $h) ) ; # Change 10000 based on precision } } # Loop through search key array and construct a string with the "or" SQL statements... # Use 9: 3x3, 25: 5x5 or 49: 7x7... $x = '' ; for ( $i = 0; $i < 9; $i++ ) { $x .= " keybox = $t[$i] or" ; } # Trim off the last 3 characters ( or) from the string... $x = substr($x, 0, length($x) - 3) ; # Trim off extra # Create the select statement for point submitted in the polygon database... $sql = "select * from mygeometrytable where ($x) and point '($lat,$lng)' @ poly" ; print "$sql\n" ; Here's the output from the sample program: select * from mygeometrytable where ( keybox = 3500904 or keybox = 3500905 or keybox = 3500906 or keybox = 3510904 or keybox = 3510905 or keybox = 3510906 or keybox = 3520904 or keybox = 3520905 or keybox = 3520906 ) and point '(35.12345,-90.54321)' @ poly -- You received this message because you are subscribed to the Google Groups "Google Maps API V2" group. To view this discussion on the web visit https://groups.google.com/d/msg/google-maps-api/-/By_55c97hGkJ. To post to this group, send email to [email protected]. To unsubscribe from this group, send email to [email protected]. For more options, visit this group at http://groups.google.com/group/google-maps-api?hl=en.
