On 5/28/2012 10:08 AM, Evan Martin wrote:
Hi all,

I have a table of ~350 multi-polygons and ~185,000 points and I need to
find which points are inside which multi-polygons. Some polygons are
quite large and span the dateline, so I'm using geography ST_DWithin for
this (with a tolerance of 100m). My initial query looks like this
(simplified):

SELECT ...
FROM points, polygons
WHERE ST_DWithin(point, real_area, 100)

This works, but it takes about 90 minutes. I'm trying to make it faster
by using ST_SimplifyPreserveTopology. That worked very nicely for my
"adjacent polygons" problem
[http://postgis.refractions.net/pipermail/postgis-users/2012-January/031992.html],
because all polygons were modified in the same way, but this is
trickier. Since I'm modifying the polygon, but not the point, the
results are different. So I thought maybe I could do this in two phases:
if the point is well within or well outside the polygon then take the
result of the "simplified" check as correct, but if it's close to the
border then check it properly, ie.

SELECT ...
FROM points, polygons
WHERE ST_DWithin(point, simple_area, 20000)
AND (ST_Distance(point, simple_border) > 20000 OR ST_DWithin(point,
real_area, 100))

simple_area = ST_SimplifyPreserveTopology(real_area::geometry,
0.01)::geography and simple_border = the exterior ring of simple_area.

This takes about 18 minutes (a 5x improvement) and gives very similar
results, but not quite the same. It falls down on polygons that have
rhumblines along parallels, because they get turned into great circle
lines. Eg. the original polyon may have a rhumbline approximated as (24
10,25 10,26 10,27 10), ST_SimplifyPreserveTopology does its job and
simplifies it to (24 10,27 10) and then ST_DWithin on geography treats
it as a great circle line, giving an incorrect result. I tried inserting
extra points to "unsimplify" the rhumblines, but that itself is very
slow and also quite a hack, because I can't really be sure which lines
were supposed be rhumblines when looking at the simplified polygon. I
feel like I'm so close and this is a very silly little problem - but it
is a problem.

Could anyone suggest a way to work around this? Or a different approach
to the whole thing?

I think the alternative approach that has been discussed on the list in the past, and should be in the archives, is to cut the multiple polygons into tiles with all the attributes of the original multipolygon and then to match the points to the tiles..

This works much faster for two basic reasons:

1. the number of points in the each tile is much less than the original because it only contains a fraction of the originals complex boundary but maintains all the detail. 2. since each tile is spatially smaller, you get better (faster) interactions between the tile index and the points.

-Steve
_______________________________________________
postgis-users mailing list
postgis-users@postgis.refractions.net
http://postgis.refractions.net/mailman/listinfo/postgis-users

Reply via email to