Some thoughts:
- ST_Distance is likely to be slow, since it does not do any optimizations
- ST_Within was recently optimized to include a fast point-in-polygon
test. Are you using a recent version of PostGIS? It's funny that it
isn't showing an effect in your test. Perhaps you could try the same
query using a non-spatial test, to get a baseline for how long the query
takes to simply rip all the data off disk and run the join.
- does ST_Contains produce the same performance?
Also, we're currently working on "Prepared" (Cached) versions of various
spatial predicates, including ST_Within. Would it be possible for us to
get a sample subset of your data to test and see whether the new code
will improve things?
Dan Erikson wrote:
I need to relate two spatial datasets. One dataset is a point dataset
(~200 Million records), the other is a polygon dataset (~500,000
records). The result of these queries is to be a non-spatial table
listing the id's ("tid" in this case) from each table where there is a
spatial join. I have tried both ST_Within and ST_Distance, and both
queries are incredibly slow (> 100 hours). Running on a very capable
server. Any ideas on faster methods to complete this query?
Thanks!
CREATE TABLE public.step2 as
(SELECT mp.pntgrid_10_tid AS pntgrid_10_tid,
vri.tid AS vri_tid
FROM grids.tbl_mtsa_pnts mp, vri.tbl_vri vri
WHERE mp.geom && vri.geom
AND ST_Within(mp.geom, vri.geom));
CREATE TABLE public.step2 as
(SELECT mp.pntgrid_10_tid AS pntgrid_10_tid,
vri.tid AS vri_tid
FROM grids.tbl_mtsa_pnts mp, vri.tbl_vri vri
WHERE mp.geom && vri.geom
AND ST_Distance(mp.geom, vri.geom) = 0);
--
Martin Davis
Senior Technical Architect
Refractions Research, Inc.
(250) 383-3022
_______________________________________________
postgis-users mailing list
[email protected]
http://postgis.refractions.net/mailman/listinfo/postgis-users