On Mar 29, 2012, at 10:10 AM, Josh Doe wrote: > On Thu, Mar 29, 2012 at 9:45 AM, Ian Dees <ian.d...@gmail.com> wrote: >> After loading Cook County TIGER road features and OSM linear features into >> PostGIS, I ran a simple query to find how well the roads matched: >> >> SELECT a.name, b.fullname, ST_HausdorffDistance(a.geom, b.geom) as dist >> FROM cook_tiger a, cook_osm b >> WHERE (a.geom && b.geom) AND ST_HausdorffDistance(a.geom, b.geom) < >> 0.0005 >> LIMIT 50 >> >> This returned results that made sense (the names matched in all 50 results). >> >> I removed the LIMIT clause and let it run before going to work to see how >> many of the TIGER records match existing OSM features. >> >> Next up is building a table of TIGER -> OSM matches and using that to find >> TIGER rows that don't have a corresponding OSM feature. >> >> If anyone has any ideas for speeding this up I'd love to hear it. It took >> well over a couple hours to run one county. There are a lot of counties in >> the US. > > Very cool! To speed this up perhaps try limiting the number of times > ST_HausdorffDistance is executed. First only run it for ways which are > "close", such as falling inside a buffer, or even faster inside a > bounding box. For a trivial speedup generate a table with distances > first, then use the WHERE clause. However I have no idea how to form > such queries!
If you're using one of the prereleases of PostGIS 2 + PostgreSQL 9.1 you can also do a nearest neighbor query (if you've build a GiST index on the coordinates). If you haven't already built the index, however, better off just going with a bounding box function as Josh mentioned.. -Skye _______________________________________________ Talk-us mailing list Talk-us@openstreetmap.org http://lists.openstreetmap.org/listinfo/talk-us