Re: [postgis-users] Speeding up point-in-polygon search using ST_SimplifyPreserveTopology?

2012-05-30 Thread Paul Ramsey
Evan,
Beyond the cutting-up-of-large-things as Stephen suggests, there is
the extra development work to optimize the edge searching routines
with an internal index on the objects (and caching that index once
built). Something for 2.1 :)
P.

On Mon, May 28, 2012 at 10:08 AM, Evan Martin
postgre...@realityexists.net wrote:
 Hi all,

 I have a table of ~350 multi-polygons and ~185,000 points and I need to find
 which points are inside which multi-polygons. Some polygons are quite large
 and span the dateline, so I'm using geography ST_DWithin for this (with a
 tolerance of 100m). My initial query looks like this (simplified):

 SELECT ...
 FROM points, polygons
 WHERE ST_DWithin(point, real_area, 100)

 This works, but it takes about 90 minutes. I'm trying to make it faster by
 using ST_SimplifyPreserveTopology. That worked very nicely for my adjacent
 polygons problem
 [http://postgis.refractions.net/pipermail/postgis-users/2012-January/031992.html],
 because all polygons were modified in the same way, but this is trickier.
 Since I'm modifying the polygon, but not the point, the results are
 different. So I thought maybe I could do this in two phases: if the point is
 well within or well outside the polygon then take the result of the
 simplified check as correct, but if it's close to the border then check it
 properly, ie.

 SELECT ...
 FROM points, polygons
 WHERE ST_DWithin(point, simple_area, 2)
    AND (ST_Distance(point, simple_border)  2 OR ST_DWithin(point,
 real_area, 100))

 simple_area = ST_SimplifyPreserveTopology(real_area::geometry,
 0.01)::geography and simple_border = the exterior ring of simple_area.

 This takes about 18 minutes (a 5x improvement) and gives very similar
 results, but not quite the same. It falls down on polygons that have
 rhumblines along parallels, because they get turned into great circle lines.
 Eg. the original polyon may have a rhumbline approximated as (24 10,25 10,26
 10,27 10), ST_SimplifyPreserveTopology does its job and simplifies it to (24
 10,27 10) and then ST_DWithin on geography treats it as a great circle line,
 giving an incorrect result. I tried inserting extra points to unsimplify
 the rhumblines, but that itself is very slow and also quite a hack, because
 I can't really be sure which lines were supposed be rhumblines when looking
 at the simplified polygon. I feel like I'm so close and this is a very silly
 little problem - but it is a problem.

 Could anyone suggest a way to work around this? Or a different approach to
 the whole thing?

 Thanks,

 Evan
 ___
 postgis-users mailing list
 postgis-users@postgis.refractions.net
 http://postgis.refractions.net/mailman/listinfo/postgis-users
___
postgis-users mailing list
postgis-users@postgis.refractions.net
http://postgis.refractions.net/mailman/listinfo/postgis-users


Re: [postgis-users] Speeding up point-in-polygon search using ST_SimplifyPreserveTopology?

2012-05-30 Thread Evan Martin
Thanks, Paul, but do you have any suggestions on how I can cut up the 
polygons properly? As I posted, ST_Intersection doesn't work properly on 
the ones spanning the dateline and I don't know how else to do it. Maybe 
there's some obvious trick I'm missing. (I haven't looked at the code, 
but it seems a bit strange to me that ST_Intersects works, but not 
ST_Intersection - so PostGIS can somehow figure out that they intersect, 
but can't figure out where?)


By the development work do you mean ticket 1796 or something else? 
Either way, any estimate on when it might get done? :)


Regards,

Evan

On 30/05/2012 11:35 PM, Paul Ramsey wrote:

Evan,
Beyond the cutting-up-of-large-things as Stephen suggests, there is
the extra development work to optimize the edge searching routines
with an internal index on the objects (and caching that index once
built). Something for 2.1 :)
P.

On Mon, May 28, 2012 at 10:08 AM, Evan Martin
postgre...@realityexists.net  wrote:

Hi all,

I have a table of ~350 multi-polygons and ~185,000 points and I need to find
which points are inside which multi-polygons. Some polygons are quite large
and span the dateline, so I'm using geography ST_DWithin for this (with a
tolerance of 100m). My initial query looks like this (simplified):

SELECT ...
FROM points, polygons
WHERE ST_DWithin(point, real_area, 100)

This works, but it takes about 90 minutes. I'm trying to make it faster by
using ST_SimplifyPreserveTopology. That worked very nicely for my adjacent
polygons problem
[http://postgis.refractions.net/pipermail/postgis-users/2012-January/031992.html],
because all polygons were modified in the same way, but this is trickier.
Since I'm modifying the polygon, but not the point, the results are
different. So I thought maybe I could do this in two phases: if the point is
well within or well outside the polygon then take the result of the
simplified check as correct, but if it's close to the border then check it
properly, ie.

SELECT ...
FROM points, polygons
WHERE ST_DWithin(point, simple_area, 2)
AND (ST_Distance(point, simple_border)  2 OR ST_DWithin(point,
real_area, 100))

simple_area = ST_SimplifyPreserveTopology(real_area::geometry,
0.01)::geography and simple_border = the exterior ring of simple_area.

This takes about 18 minutes (a 5x improvement) and gives very similar
results, but not quite the same. It falls down on polygons that have
rhumblines along parallels, because they get turned into great circle lines.
Eg. the original polyon may have a rhumbline approximated as (24 10,25 10,26
10,27 10), ST_SimplifyPreserveTopology does its job and simplifies it to (24
10,27 10) and then ST_DWithin on geography treats it as a great circle line,
giving an incorrect result. I tried inserting extra points to unsimplify
the rhumblines, but that itself is very slow and also quite a hack, because
I can't really be sure which lines were supposed be rhumblines when looking
at the simplified polygon. I feel like I'm so close and this is a very silly
little problem - but it is a problem.

Could anyone suggest a way to work around this? Or a different approach to
the whole thing?

Thanks,

Evan
___
postgis-users mailing list
postgis-users@postgis.refractions.net
http://postgis.refractions.net/mailman/listinfo/postgis-users

___
postgis-users mailing list
postgis-users@postgis.refractions.net
http://postgis.refractions.net/mailman/listinfo/postgis-users


___
postgis-users mailing list
postgis-users@postgis.refractions.net
http://postgis.refractions.net/mailman/listinfo/postgis-users


Re: [postgis-users] Speeding up point-in-polygon search using ST_SimplifyPreserveTopology?

2012-05-30 Thread Paul Ramsey
On Wed, May 30, 2012 at 9:47 AM, Evan Martin
postgre...@realityexists.net wrote:
 Thanks, Paul, but do you have any suggestions on how I can cut up the
 polygons properly? As I posted, ST_Intersection doesn't work properly on the
 ones spanning the dateline and I don't know how else to do it. Maybe there's
 some obvious trick I'm missing. (I haven't looked at the code, but it seems
 a bit strange to me that ST_Intersects works, but not ST_Intersection - so
 PostGIS can somehow figure out that they intersect, but can't figure out
 where?)

Intersects() is run on the spheroid, Intersection is a hack wrapper on
a reprojection into planar and then back again. Did we talk about this
already? Changing the hack to use gnomic instead of UTM might improve
things.

 By the development work do you mean ticket 1796 or something else? Either
 way, any estimate on when it might get done? :)

Yes, that's the ticket. It might get done for 2.1, or it might not.
Money makes the world go 'round. All things being equal it will get
done, since it's something that personally interests me, but things
might not be equal if I'm paid for some other work instead.

P.

 Regards,

 Evan


 On 30/05/2012 11:35 PM, Paul Ramsey wrote:

 Evan,
 Beyond the cutting-up-of-large-things as Stephen suggests, there is
 the extra development work to optimize the edge searching routines
 with an internal index on the objects (and caching that index once
 built). Something for 2.1 :)
 P.

 On Mon, May 28, 2012 at 10:08 AM, Evan Martin
 postgre...@realityexists.net  wrote:

 Hi all,

 I have a table of ~350 multi-polygons and ~185,000 points and I need to
 find
 which points are inside which multi-polygons. Some polygons are quite
 large
 and span the dateline, so I'm using geography ST_DWithin for this (with a
 tolerance of 100m). My initial query looks like this (simplified):

 SELECT ...
 FROM points, polygons
 WHERE ST_DWithin(point, real_area, 100)

 This works, but it takes about 90 minutes. I'm trying to make it faster
 by
 using ST_SimplifyPreserveTopology. That worked very nicely for my
 adjacent
 polygons problem

 [http://postgis.refractions.net/pipermail/postgis-users/2012-January/031992.html],
 because all polygons were modified in the same way, but this is trickier.
 Since I'm modifying the polygon, but not the point, the results are
 different. So I thought maybe I could do this in two phases: if the point
 is
 well within or well outside the polygon then take the result of the
 simplified check as correct, but if it's close to the border then check
 it
 properly, ie.

 SELECT ...
 FROM points, polygons
 WHERE ST_DWithin(point, simple_area, 2)
    AND (ST_Distance(point, simple_border)  2 OR ST_DWithin(point,
 real_area, 100))

 simple_area = ST_SimplifyPreserveTopology(real_area::geometry,
 0.01)::geography and simple_border = the exterior ring of simple_area.

 This takes about 18 minutes (a 5x improvement) and gives very similar
 results, but not quite the same. It falls down on polygons that have
 rhumblines along parallels, because they get turned into great circle
 lines.
 Eg. the original polyon may have a rhumbline approximated as (24 10,25
 10,26
 10,27 10), ST_SimplifyPreserveTopology does its job and simplifies it to
 (24
 10,27 10) and then ST_DWithin on geography treats it as a great circle
 line,
 giving an incorrect result. I tried inserting extra points to
 unsimplify
 the rhumblines, but that itself is very slow and also quite a hack,
 because
 I can't really be sure which lines were supposed be rhumblines when
 looking
 at the simplified polygon. I feel like I'm so close and this is a very
 silly
 little problem - but it is a problem.

 Could anyone suggest a way to work around this? Or a different approach
 to
 the whole thing?

 Thanks,

 Evan
 ___
 postgis-users mailing list
 postgis-users@postgis.refractions.net
 http://postgis.refractions.net/mailman/listinfo/postgis-users

 ___
 postgis-users mailing list
 postgis-users@postgis.refractions.net
 http://postgis.refractions.net/mailman/listinfo/postgis-users


 ___
 postgis-users mailing list
 postgis-users@postgis.refractions.net
 http://postgis.refractions.net/mailman/listinfo/postgis-users
___
postgis-users mailing list
postgis-users@postgis.refractions.net
http://postgis.refractions.net/mailman/listinfo/postgis-users


Re: [postgis-users] Speeding up point-in-polygon search using ST_SimplifyPreserveTopology?

2012-05-29 Thread Evan Martin
Thanks, Stephen, but I haven't been able to figure out how to get this 
to work on geography. ST_Intersection on geography doesn't work properly 
across the dateline. For example, I tried manually tiling a simple 
polygon that goes across the dateline, like this:


WITH data AS
(
SELECT ST_GeomFromEWKT('SRID=4326;POLYGON((163 50,-176 55,-176 
60,163 60,163 50))')::geography AS simple

),
tiles AS
(
SELECT ST_SetSRID('BOX(145 50,150 60)'::box2d, 4326) AS tile
UNION ALL SELECT ST_SetSRID('BOX(150 50,155 60)'::box2d, 4326)
UNION ALL SELECT ST_SetSRID('BOX(155 50,160 60)'::box2d, 4326)
UNION ALL SELECT ST_SetSRID('BOX(160 50,165 60)'::box2d, 4326)
UNION ALL SELECT ST_SetSRID('BOX(165 50,170 60)'::box2d, 4326)
UNION ALL SELECT ST_SetSRID('BOX(170 50,175 60)'::box2d, 4326)
UNION ALL SELECT ST_SetSRID('BOX(175 50,180 60)'::box2d, 4326)
UNION ALL SELECT ST_SetSRID('BOX(-180 50,-175 60)'::box2d, 4326)
UNION ALL SELECT ST_SetSRID('BOX(-175 50,-170 60)'::box2d, 4326)
)
SELECT ST_AsText(ST_Intersection(simple, tile::geography)) AS tiled
FROM data
JOIN tiles ON ST_Intersects(simple, tile::geography)

This returns:

POLYGON((163 49.99887,160 50.0467770263591,160 
59.99987,163 59.99987,163 49.99887))

GEOMETRYCOLLECTION EMPTY
GEOMETRYCOLLECTION EMPTY
GEOMETRYCOLLECTION EMPTY
POLYGON((-175 54.9860851802267,-176 54.99957,-176 
59.99987,-175 59.99987,-175 54.9860851802267))


I think ST_Intersects works OK on geography, but not ST_Intersection.

Regards,

Evan


On 29/05/2012 2:21 AM, Stephen Woodbridge wrote:

On 5/28/2012 10:08 AM, Evan Martin wrote:

Hi all,

I have a table of ~350 multi-polygons and ~185,000 points and I need to
find which points are inside which multi-polygons. Some polygons are
quite large and span the dateline, so I'm using geography ST_DWithin for
this (with a tolerance of 100m). My initial query looks like this
(simplified):

SELECT ...
FROM points, polygons
WHERE ST_DWithin(point, real_area, 100)

This works, but it takes about 90 minutes. I'm trying to make it faster
by using ST_SimplifyPreserveTopology. That worked very nicely for my
adjacent polygons problem
[http://postgis.refractions.net/pipermail/postgis-users/2012-January/031992.html], 


because all polygons were modified in the same way, but this is
trickier. Since I'm modifying the polygon, but not the point, the
results are different. So I thought maybe I could do this in two phases:
if the point is well within or well outside the polygon then take the
result of the simplified check as correct, but if it's close to the
border then check it properly, ie.

SELECT ...
FROM points, polygons
WHERE ST_DWithin(point, simple_area, 2)
AND (ST_Distance(point, simple_border)  2 OR ST_DWithin(point,
real_area, 100))

simple_area = ST_SimplifyPreserveTopology(real_area::geometry,
0.01)::geography and simple_border = the exterior ring of simple_area.

This takes about 18 minutes (a 5x improvement) and gives very similar
results, but not quite the same. It falls down on polygons that have
rhumblines along parallels, because they get turned into great circle
lines. Eg. the original polyon may have a rhumbline approximated as (24
10,25 10,26 10,27 10), ST_SimplifyPreserveTopology does its job and
simplifies it to (24 10,27 10) and then ST_DWithin on geography treats
it as a great circle line, giving an incorrect result. I tried inserting
extra points to unsimplify the rhumblines, but that itself is very
slow and also quite a hack, because I can't really be sure which lines
were supposed be rhumblines when looking at the simplified polygon. I
feel like I'm so close and this is a very silly little problem - but it
is a problem.

Could anyone suggest a way to work around this? Or a different approach
to the whole thing?


I think the alternative approach that has been discussed on the list 
in the past, and should be in the archives, is to cut the multiple 
polygons into tiles with all the attributes of the original 
multipolygon and then to match the points to the tiles..


This works much faster for two basic reasons:

1. the number of points in the each tile is much less than the 
original because it only contains a fraction of the originals complex 
boundary but maintains all the detail.
2. since each tile is spatially smaller, you get better (faster) 
interactions between the tile index and the points.


-Steve
___
postgis-users mailing list
postgis-users@postgis.refractions.net
http://postgis.refractions.net/mailman/listinfo/postgis-users


___
postgis-users mailing list
postgis-users@postgis.refractions.net
http://postgis.refractions.net/mailman/listinfo/postgis-users


[postgis-users] Speeding up point-in-polygon search using ST_SimplifyPreserveTopology?

2012-05-28 Thread Evan Martin

Hi all,

I have a table of ~350 multi-polygons and ~185,000 points and I need to 
find which points are inside which multi-polygons. Some polygons are 
quite large and span the dateline, so I'm using geography ST_DWithin for 
this (with a tolerance of 100m). My initial query looks like this 
(simplified):


SELECT ...
FROM points, polygons
WHERE ST_DWithin(point, real_area, 100)

This works, but it takes about 90 minutes. I'm trying to make it faster 
by using ST_SimplifyPreserveTopology. That worked very nicely for my 
adjacent polygons problem 
[http://postgis.refractions.net/pipermail/postgis-users/2012-January/031992.html], 
because all polygons were modified in the same way, but this is 
trickier. Since I'm modifying the polygon, but not the point, the 
results are different. So I thought maybe I could do this in two phases: 
if the point is well within or well outside the polygon then take the 
result of the simplified check as correct, but if it's close to the 
border then check it properly, ie.


SELECT ...
FROM points, polygons
WHERE ST_DWithin(point, simple_area, 2)
AND (ST_Distance(point, simple_border)  2 OR ST_DWithin(point, 
real_area, 100))


simple_area = ST_SimplifyPreserveTopology(real_area::geometry, 
0.01)::geography and simple_border = the exterior ring of simple_area.


This takes about 18 minutes (a 5x improvement) and gives very similar 
results, but not quite the same. It falls down on polygons that have 
rhumblines along parallels, because they get turned into great circle 
lines. Eg. the original polyon may have a rhumbline approximated as (24 
10,25 10,26 10,27 10), ST_SimplifyPreserveTopology does its job and 
simplifies it to (24 10,27 10) and then ST_DWithin on geography treats 
it as a great circle line, giving an incorrect result. I tried inserting 
extra points to unsimplify the rhumblines, but that itself is very 
slow and also quite a hack, because I can't really be sure which lines 
were supposed be rhumblines when looking at the simplified polygon. I 
feel like I'm so close and this is a very silly little problem - but it 
is a problem.


Could anyone suggest a way to work around this? Or a different approach 
to the whole thing?


Thanks,

Evan
___
postgis-users mailing list
postgis-users@postgis.refractions.net
http://postgis.refractions.net/mailman/listinfo/postgis-users


Re: [postgis-users] Speeding up point-in-polygon search using ST_SimplifyPreserveTopology?

2012-05-28 Thread Stephen Woodbridge

On 5/28/2012 10:08 AM, Evan Martin wrote:

Hi all,

I have a table of ~350 multi-polygons and ~185,000 points and I need to
find which points are inside which multi-polygons. Some polygons are
quite large and span the dateline, so I'm using geography ST_DWithin for
this (with a tolerance of 100m). My initial query looks like this
(simplified):

SELECT ...
FROM points, polygons
WHERE ST_DWithin(point, real_area, 100)

This works, but it takes about 90 minutes. I'm trying to make it faster
by using ST_SimplifyPreserveTopology. That worked very nicely for my
adjacent polygons problem
[http://postgis.refractions.net/pipermail/postgis-users/2012-January/031992.html],
because all polygons were modified in the same way, but this is
trickier. Since I'm modifying the polygon, but not the point, the
results are different. So I thought maybe I could do this in two phases:
if the point is well within or well outside the polygon then take the
result of the simplified check as correct, but if it's close to the
border then check it properly, ie.

SELECT ...
FROM points, polygons
WHERE ST_DWithin(point, simple_area, 2)
AND (ST_Distance(point, simple_border)  2 OR ST_DWithin(point,
real_area, 100))

simple_area = ST_SimplifyPreserveTopology(real_area::geometry,
0.01)::geography and simple_border = the exterior ring of simple_area.

This takes about 18 minutes (a 5x improvement) and gives very similar
results, but not quite the same. It falls down on polygons that have
rhumblines along parallels, because they get turned into great circle
lines. Eg. the original polyon may have a rhumbline approximated as (24
10,25 10,26 10,27 10), ST_SimplifyPreserveTopology does its job and
simplifies it to (24 10,27 10) and then ST_DWithin on geography treats
it as a great circle line, giving an incorrect result. I tried inserting
extra points to unsimplify the rhumblines, but that itself is very
slow and also quite a hack, because I can't really be sure which lines
were supposed be rhumblines when looking at the simplified polygon. I
feel like I'm so close and this is a very silly little problem - but it
is a problem.

Could anyone suggest a way to work around this? Or a different approach
to the whole thing?


I think the alternative approach that has been discussed on the list in 
the past, and should be in the archives, is to cut the multiple polygons 
into tiles with all the attributes of the original multipolygon and then 
to match the points to the tiles..


This works much faster for two basic reasons:

1. the number of points in the each tile is much less than the original 
because it only contains a fraction of the originals complex boundary 
but maintains all the detail.
2. since each tile is spatially smaller, you get better (faster) 
interactions between the tile index and the points.


-Steve
___
postgis-users mailing list
postgis-users@postgis.refractions.net
http://postgis.refractions.net/mailman/listinfo/postgis-users