Re: [postgis-users] Tricks to find polygon/line intersection faster

2014-02-10 Thread Rémi Cura
Hey,
geometry equality can be defined in many ways.


For your duplication problem, it is a simple postgres problem :
you want that for any couple (line, poly), you have at most one result
(given polygon are convex, which they are if they are squares).

so at the end of you computing you just add a filtering select :
SELECT DISTINCT ON (line_id,poly_id)

If you don't want random result, you should order your result to know which
line of the 4 you get (you could order by length, centroid, first point,
whatever.)

WITH (your computing)
SELECT DISTINCT ON (line_id, poly_id) , poly, line
FROM your computing
ORDER BY ST_Length(line) ASC

Now if you want to solve this the PostGIS way it would be harder, as you
would need a distance between shape.
(like http://postgis.net/docs/ST_HausdorffDistance.html)

Of course you could also use implicit distance, for example by snapping
your result line to a given precision and doing a regular test after
(WHERE ST_Equals(ST_SnapToGrid(line1, 0.01),ST_SnapToGrid(line2,
0.01))=TRUE )

Cheers,

Rémi-C




2014-02-10 21:33 GMT+01:00 Evan Martin :

> I've discovered a slight problem with the handy "tiled intersection" trick
> suggested earlier: some of my lines run exactly along a meridian or along a
> parallel and so do the tiles, so those intersections get counted twice! For
> example, LINESTRING(-18 14.5,-18 15.5) results in the following
> intersections with a particular tiled polygon
>
> LINESTRING(-17.148001148 14.7502863979935,-17.9998863986648
> 15.002498516)
> LINESTRING(-18.851998852 14.7502863979933,-18.0001136013352
> 15.002498516)
>
> LINESTRING(-18.0001136013352 15.002498516,-18.852013123
> 15.2502965758817)
> LINESTRING(-17.9998863986648 15.002498516,-17.147986877
> 15.250296575882)
>
> Could someone suggest the best (fastest while still accurate) way to
> filter out such duplicates? As you can see, they're not exactly the same,
> so ST_Equals() returns false on them.
>
> Evan
>
> On 08/07/2013 17:36, Evan Martin wrote:
>
>> Thanks, Steve, that seems to do the trick. Of course the results change a
>> bit, so it's a trade-off of accuracy vs. speed. I presume the change is
>> because I do the tiling on the plane - ST_Intersection(geom, geom). When I
>> tried doing tiling on geography the results changed much more (compared to
>> no tiling). Would be interesting to understand why that is. Am I doing
>> something wrong? I create a grid of 1x1 degree polygons and then do this:
>>
>> SELECT poly_id, ST_Intersection(poly_border::geometry,
>> tile)::geography AS poly_tile
>> FROM my_polygon p
>> JOIN world_tile_1 t ON ST_Intersects(p.border::geometry, t.tile)
>>
>> The intersection with lines is then done on geography, as before. I only
>> do this for polygons that don't span the dateline (which is 99% of them,
>> luckily).
>>
>> Evan
>>
>> On 06.07.2013 21:19, Stephen Woodbridge wrote:
>>
>>> The standard way of dealing this this is to chop you really large
>>> polygons into tiles. Or if the multipolygons can be split into multiple
>>> individual polygons you might get better performance.
>>>
>>> google: postgis tiling large polygons
>>>
>>> if you need the distance that the line intersects the multiple tiles or
>>> multiple split multipolygons you will need to sum() and group on the
>>> original id of the split object.
>>>
>>> -Steve
>>>
>>> On 7/6/2013 1:10 PM, Evan Martin wrote:
>>>
 It's not really "many large things vs many large things". Most lines are
 < 100 km long (but there are some over 1000 km). Here's a percentile
 chart: https://imageshack.us/a/img16/940/w5s.png

 Most of the polygons are also quite small and simple, but there are a
 few really large complex ones. From my testing it looks like a few of
 the "worst" polygons (multi-polygons, actually) take all the time, so
 that 25,000 count was a bit misleading. 96% of them have < 100 points,
 but the worst one has > 23,000. I couldn't get the area, because
 ST_Area(geog) is returning some ridiculously high numbers, but it would
 be millions of sq km.

 On 06.07.2013 5:48, Paul Ramsey wrote:

> Without seeing your data it's quite hard to say. Many large things vs
> many large things yields a problem where indexes and so on don't have
> a lot of leverage on the problem.
>
> On Tue, Jul 2, 2013 at 6:39 AM, Evan Martin
>  wrote:
>
>> Hi,
>>
>> I have tables of ~25,000 polygons and ~80,000 lines and I want to
>> find which
>> lines intersect which polygons using PostGIS 2.1. Both are
>> geographies and
>> can span the dateline. Doing this the simple way using
>> ST_Intersects(geog,
>> geog) takes about 3 hours on my machine and I'd to see if there's a
>> way to
>> speed this up.
>>
>> I already have indexes on the geography columns and one of them is
>> being
>> used (the one on the lines). Each line only has 2 points, but the

Re: [postgis-users] Tricks to find polygon/line intersection faster

2014-02-10 Thread Evan Martin
I've discovered a slight problem with the handy "tiled intersection" 
trick suggested earlier: some of my lines run exactly along a meridian 
or along a parallel and so do the tiles, so those intersections get 
counted twice! For example, LINESTRING(-18 14.5,-18 15.5) results in the 
following intersections with a particular tiled polygon


LINESTRING(-17.148001148 14.7502863979935,-17.9998863986648 
15.002498516)
LINESTRING(-18.851998852 14.7502863979933,-18.0001136013352 
15.002498516)


LINESTRING(-18.0001136013352 15.002498516,-18.852013123 
15.2502965758817)
LINESTRING(-17.9998863986648 15.002498516,-17.147986877 
15.250296575882)


Could someone suggest the best (fastest while still accurate) way to 
filter out such duplicates? As you can see, they're not exactly the 
same, so ST_Equals() returns false on them.


Evan

On 08/07/2013 17:36, Evan Martin wrote:
Thanks, Steve, that seems to do the trick. Of course the results 
change a bit, so it's a trade-off of accuracy vs. speed. I presume the 
change is because I do the tiling on the plane - ST_Intersection(geom, 
geom). When I tried doing tiling on geography the results changed much 
more (compared to no tiling). Would be interesting to understand why 
that is. Am I doing something wrong? I create a grid of 1x1 degree 
polygons and then do this:


SELECT poly_id, ST_Intersection(poly_border::geometry, 
tile)::geography AS poly_tile

FROM my_polygon p
JOIN world_tile_1 t ON ST_Intersects(p.border::geometry, t.tile)

The intersection with lines is then done on geography, as before. I 
only do this for polygons that don't span the dateline (which is 99% 
of them, luckily).


Evan

On 06.07.2013 21:19, Stephen Woodbridge wrote:
The standard way of dealing this this is to chop you really large 
polygons into tiles. Or if the multipolygons can be split into 
multiple individual polygons you might get better performance.


google: postgis tiling large polygons

if you need the distance that the line intersects the multiple tiles 
or multiple split multipolygons you will need to sum() and group on 
the original id of the split object.


-Steve

On 7/6/2013 1:10 PM, Evan Martin wrote:
It's not really "many large things vs many large things". Most lines 
are

< 100 km long (but there are some over 1000 km). Here's a percentile
chart: https://imageshack.us/a/img16/940/w5s.png

Most of the polygons are also quite small and simple, but there are a
few really large complex ones. From my testing it looks like a few of
the "worst" polygons (multi-polygons, actually) take all the time, so
that 25,000 count was a bit misleading. 96% of them have < 100 points,
but the worst one has > 23,000. I couldn't get the area, because
ST_Area(geog) is returning some ridiculously high numbers, but it would
be millions of sq km.

On 06.07.2013 5:48, Paul Ramsey wrote:

Without seeing your data it's quite hard to say. Many large things vs
many large things yields a problem where indexes and so on don't have
a lot of leverage on the problem.

On Tue, Jul 2, 2013 at 6:39 AM, Evan Martin
 wrote:

Hi,

I have tables of ~25,000 polygons and ~80,000 lines and I want to
find which
lines intersect which polygons using PostGIS 2.1. Both are
geographies and
can span the dateline. Doing this the simple way using
ST_Intersects(geog,
geog) takes about 3 hours on my machine and I'd to see if there's a
way to
speed this up.

I already have indexes on the geography columns and one of them is 
being

used (the one on the lines). Each line only has 2 points, but the
polygons
have anywhere from 4 to 20,000 points and some of them are very
large. It
would be OK to miss some of the smaller intersections (ie. where 
the two

only just barely intersect), but I wouldn't want the query to return
false
positives. In fact, ideally, I'd like to find only the lines that
"substantially" intersect a polygon, eg. at least x km or x% of the
line is
in the polygon, but finding any intersections at all would be a 
start.


One trick I tried is ST_SimplifyPreserveTopology. I used that to 
create
simplified version of the polygons (at least those that don't span 
the

dateline) and check those first, then if they intersect then check
the real
polygons. This seems to work, but the performance gains are marginal
compared to the simple approach.

Is there another trick I can use to do this faster? I know
ST_Intersects()
internally calls ST_Distance(), which calculates the distance to a
fraction
of a metre. I don't need that kind of precision, so surely there's 
some

"shorcut" to be found?

Thanks,

Evan


___
postgis-users mailing list
postgis-users@lists.osgeo.org
http://lists.osgeo.org/cgi-bin/mailman/listinfo/postgis-users


Re: [postgis-users] To correct Exterior Rings

2014-02-10 Thread Rémi Cura
simplify
simplify preserve topology
polygonise then filter on area,
opening operator (negativ buffer of X then positiv buffer of X)
...

Cheers,

Rémi-C



2014-02-10 13:43 GMT+01:00 Richard LEHAUT :

> Hi
> I succeeded to correct interior rings, but I've problem wih interior rings.
> Looks this screen (black circle). http://imagik.fr/view-rl/74788
> Do you idea to resolve my problem?
> Thanks
> Richard L
>
> ___
> postgis-users mailing list
> postgis-users@lists.osgeo.org
> http://lists.osgeo.org/cgi-bin/mailman/listinfo/postgis-users
>
___
postgis-users mailing list
postgis-users@lists.osgeo.org
http://lists.osgeo.org/cgi-bin/mailman/listinfo/postgis-users

[postgis-users] To correct Exterior Rings

2014-02-10 Thread Richard LEHAUT

Hi
I succeeded to correct interior rings, but I've problem wih interior rings.
Looks this screen (black circle). http://imagik.fr/view-rl/74788
Do you idea to resolve my problem?
Thanks
Richard L

___
postgis-users mailing list
postgis-users@lists.osgeo.org
http://lists.osgeo.org/cgi-bin/mailman/listinfo/postgis-users


[postgis-users] To correct Exterior Rings

2014-02-10 Thread Richard LEHAUT

Hi
I succeeded to correct interior rings, but I've problem wih interior rings.
Looks this screen (black circle). http://imagik.fr/view-rl/74788
Do you idea to resolve my problem?
Thanks
Richard L

___
postgis-users mailing list
postgis-users@lists.osgeo.org
http://lists.osgeo.org/cgi-bin/mailman/listinfo/postgis-users


Re: [postgis-users] hard upgrade of database fails because of tsvector syntax error

2014-02-10 Thread Scott Pezanowski
Richard and Sandro,
Thank you for your replies and suggestions.
Richard, I think your suggestion is a good one and may have worked. 
Unfortunately, because of various reasons, the original DB does not exist 
anymore.
I may possibly have to let this one go; At least for now. All of the PostGIS 
functionality we use is when data is written to the DB. At this point, we are 
no longer writing to the database, but only reading what has previously been 
processed. Therefore, I think we will be ok with what we have through 
pg_restore. If I have any further updates or resolutions, I will let you know.
Thank you both again for the suggestions.Scott


* Scott Pezanowski 


email: scott...@hotmail.com  
*

> Date: Fri, 31 Jan 2014 09:36:26 +0100
> From: s...@keybit.net
> To: postgis-users@lists.osgeo.org
> Subject: Re: [postgis-users] hard upgrade of database fails because of 
> tsvector syntax error
> 
> On Thu, Jan 30, 2014 at 01:05:45PM -0500, Scott Pezanowski wrote:
> > Sandro,
> > Thank you for your advice and I apologize for the lengthy delay in my 
> > reply. Based on your advice, I seem to have narrowed the problem to one 
> > record the restore script hits and its tsvector value (perhaps there are 
> > other records that may fail after this, but the restore fails to get past 
> > this one). Below is the error I am getting when using postgis_restore.pl.
> > 
> >   Reading list of functions to ignore...  Writing manifest of things to 
> > read from dump file...  Writing ASCII to stdout...ERROR:  syntax error in 
> > tsvector: "'2012':5 'call':8 'elect':17 'gagnam':12 'hurrican':14 'koni':7 
> > 'mayb':10 'olymp':11 'presidenti':16 'rt':1 'sandi':15 'stop':6 'style':13 
> > 'summari':3 'wt"CONTEXT:  COPY twitter_datatable_2012_12, line 46295160, 
> > column textsearchable_index_col_with_stop: "'2012':5 'call':8 'elect':17 
> > 'gagnam':12 'hurrican':14 'koni':7 'mayb':10 'olymp':11 'presidenti':16..."
> > The actual complete tsvector text is shown below.
> > "'2012':5 'call':8 'elect':17 'gagnam':12 'hurrican':14 'koni':7 'mayb':10 
> > 'olymp':11 'presidenti':16 'rt':1 'sandi':15 'stop':6 'style':13 
> > 'summari':3 'wtffact':2"
> > Therefore, there is more text in the tsvector field, which seems to be 
> > being truncated. I do not see anything out of the ordinary with any 
> > characters near where the truncation occurs, that would cause this - Unless 
> > it is perhaps actually failing somewhere other than at the  'wt  part. Do 
> > you think there may be a need for some sort of escape of certain characters 
> > like the apostrophe? Or perhaps a length imposed on the text?
> > I am not quite sure what the problem with the data could be. If I simply 
> > insert the full actual value into the tsvector column, it inserts fine. So 
> > I think it seems like the text being truncated for some reason causes the 
> > "syntax error".
> > Based upon this, can you see what the problem may be or do you have any 
> > other suggestions on things I can look at to try to diagnose the problem?
> 
> First thing I'd do is try to re-create the problem with a new small dump.
> It sounds like postgis_restore.pl is truncating a line, but dunno why
> it would do that (maybe some env variable affecting perl run ?).
> Once you have the small dump you could compare pg_restore output with
> postgis_restore one to see is this is the case (truncation performed
> by postgis_restore..).
> 
> --strk;
> ___
> postgis-users mailing list
> postgis-users@lists.osgeo.org
> http://lists.osgeo.org/cgi-bin/mailman/listinfo/postgis-users
  ___
postgis-users mailing list
postgis-users@lists.osgeo.org
http://lists.osgeo.org/cgi-bin/mailman/listinfo/postgis-users