Re: [postgis-users] Performance issue with ST_INTERSECTS

2011-11-04 Thread Shira Bezalel
Hi Paul, 

Thank you for your suggestion. I gave it a try and now the query is running in 
about 22 seconds. Amazing! If you have time, can you shed some light on why 
this makes such a difference? What is it about the spatial indexes of smaller 
regions that make them more effective? Thank you again, 

Shira 

- Original Message -

From: "Paul Ramsey"  
To: "PostGIS Users Discussion"  
Sent: Thursday, November 3, 2011 3:26:24 PM 
Subject: Re: [postgis-users] Performance issue with ST_INTERSECTS 

Just from your description, my guess is that your few, large, regions 
with many vertices are the problem. Since they are static, I'd 
recommend cutting them up into a much larger number of small regions, 
by for example intersecting them with a regular grid. Then you can get 
much more effect from your indexes. 

P. 

On Thu, Nov 3, 2011 at 2:30 PM, Shira Bezalel  wrote: 
> Hello List, 
> 
> This is my first post so feel free to let me know if I'm missing any critical 
> info or if this is better suited for the pgsql-performance list. 
> 
> Essentially, I'm troubleshooting a performance issue with a spatial 
> intersection query. The query sums the lengths of line features that 
> intersect polygon features. The line layer is very dense. The polygon layer 
> only has 8 features, but each polygon does have a lot of points. 
> 
> It's taking almost 50 minutes to run and I'm wondering if the planner is 
> choosing a less than optimal query plan. One reason I'm thinking this is 
> based on a bad row estimate in the explain analyze output. Can anything be 
> done to resolve this bad estimate? Or is it unavoidable? And more 
> importantly: Is the query plan that the planner is choosing the fastest one 
> that it could use despite the bad row estimate? 
> 
> -Version Info- 
> Production Server: 
> PostgreSQL 8.4.7 on x86_64-pc-linux-gnu, compiled by GCC gcc-4.4.real (Ubuntu 
> 4.4.3-4ubuntu5) 4.4.3, 64-bit 
> POSTGIS="1.5.1" GEOS="3.2.2-CAPI-1.6.2" PROJ="Rel. 4.7.1, 23 September 2009 " 
> LIBXML="2.7.6" USE_STATS 
> 
> We also tested the query against a test server with more recent postgres 
> product versions (the same query plan and bad row estimate occurred): 
> PostgreSQL 9.1.1 on x86_64-pc-linux-gnu, compiled by gcc-4.6.real 
> (Ubuntu/Linaro 4.6.1-9ubuntu3) 4.6.1, 64-bit 
> POSTGIS="1.5.3" GEOS="3.2.2-CAPI-1.6.2" PROJ="Rel. 4.7.1, 23 September 2009 " 
> LIBXML="2.7.8" USE_STATS (procs from 1.5 r5385 need upgrade) 
> 
> -SQL Statement- 
> SELECT r.gid, SUM(ST_LENGTH(ST_INTERSECTION(r.the_geom,n.the_geom))) * 
> 0.000621371192 AS milesum 
> FROM psaregions r join nhd100kstreams n ON ST_INTERSECTS(r.the_geom, 
> n.the_geom) 
> WHERE n.fcode = 46003 
> GROUP BY r.gid 
> 
> -Explain Analyze Output- 
> 
> See here: http://explain.depesz.com/s/Esx 
> 
> HashAggregate (cost=509.88..512.02 rows=8 width=1195362) (actual time=2823674 
> .455..2823674 .458 rows=8 loops=1) 
> -> Nested Loop (cost=0.00..70.36 rows=87904 width=1195362) (actual 
> time=23.893..675119.593 rows=90268 loops=1) 
> Join Filter: _st_intersects(r.the_geom, n.the_geom) 
> -> Seq Scan on psaregions r (cost=0.00..1.08 rows=8 width=1193798) (actual 
> time=0.005..0.019 rows=8 loops=1) 
> -> Index Scan using nhd100kstreams_the_geom_gist on nhd100kstreams n 
> (cost=0.00..8.40 rows=1 width=1564) (actual time=0.043..214.728 rows=41631 
> loops=8) 
> Index Cond: (r.the_geom && n.the_geom) 
> Filter: (n.fcode = 46003) 
> Total runtime: 2823674.517 ms 
> 
> ***Notice actual rows in the first index scan is 41631, but the estimate is 
> 1. 
> 
> -Additional Info- 
> - I've run vacuum analyze on the tables in question. That has not helped. 
> (Autovacuum does run regularly also.) 
> - In terms of history, this query has always been slow. 
> - Various configuration settings: 
> shared_buffers: 3GB (Machine RAM: 12GB) 
> work_mem: 2GB 
> maintenance_work_mem: 1GB 
> effective_cache_size: 2GB (this was originally set to the default; we 
> increased it to 2GB and restarted the server, but it didn't change the query 
> plan.) 
> - Hardware info: Dell PowerEdge R710 running Ubuntu 10.04.2 LTS. 
> 
> -Table and Index Schema- 
> 
> Table "public.psaregions" 
> Column | Type | Modifiers 
> +---+--
>  
> gid | integer | not null default nextval('psaregions_gid_seq'::regclass) 
> psa_region | character varying(50) | 
> np_length | numeric | 
> xxx | numeric | 
> the_geom | geometry | 
> sm_geom | geometry | 
> miles46006 | numeric | 
> miles46003 | numeric | 
> llextent | character varying | 
> Indexes: 
> "psaregions_pkey" PRIMARY KEY, btree (gid) 
> "psaregions_region_idx" btree (psa_region) 
> "psaregions_sm_geom_gist" gist (sm_geom) 
> "psaregions_the_geom_gist" gist (the_geom) 
> 
> 
> Table "public.nhd100kstreams" 
> Column | Type | Modifiers 
> +---+---

Re: [postgis-users] Slow Query Times for Split Tiles

2011-11-04 Thread Mateusz Łoskot
On 4 November 2011 12:50, elliott  wrote:
> Yes, I am certain that the table has been indexed. If I use the
> raster2pgsql.py script with the -I option to load additional data, it
> complains that the table is already indexed.
>
>  Querying for 10,000 lat/lons on locations that have not been imported into
> the database is very fast returning in 2 seconds.   It can very quickly
> determine that there are no intersections.  However, when there are
> intersections for all 10,000 lat/lons, the query is taking over 2 minutes.

Do you store rasters in- or out-database?
Anyway, query itself is fast, I'm sure. But deserialization of 1
rasters to find and
fetch value of single pixel, that sounds expansive on its own.
Try to imagine, or prototype it with GDAL and Python, opening 1
.tif files, one by one
and RasterIO'ing value of pixel.

Best regards,
-- 
Mateusz Loskot, http://mateusz.loskot.net
Charter Member of OSGeo, http://osgeo.org
Member of ACCU, http://accu.org
___
postgis-users mailing list
postgis-users@postgis.refractions.net
http://postgis.refractions.net/mailman/listinfo/postgis-users


Re: [postgis-users] Performance issue with ST_INTERSECTS

2011-11-04 Thread Martin Davis
This should be making use of PreparedGeometry, right?  Meaning that the 
actual intersection computation should be fairly performant?  So is it 
just the fact that the index is not very selective that is causing the 
performance issue?


Just asking in case this is revealing some kind of issue with the 
PreparedGeometry implementation...


Martin

On 11/3/2011 3:26 PM, Paul Ramsey wrote:

Just from your description, my guess is that your few, large, regions
with many vertices are the problem. Since they are static, I'd
recommend cutting them up into a much larger number of small regions,
by for example intersecting them with a regular grid. Then you can get
much more effect from your indexes.

P.

On Thu, Nov 3, 2011 at 2:30 PM, Shira Bezalel  wrote:

Hello List,

This is my first post so feel free to let me know if I'm missing any critical 
info or if this is better suited for the pgsql-performance list.

Essentially, I'm troubleshooting a performance issue with a spatial 
intersection query. The query sums the lengths of line features that intersect 
polygon features. The line layer is very dense. The polygon layer only has 8 
features, but each polygon does have a lot of points.

It's taking almost 50 minutes to run and I'm wondering if the planner is 
choosing a less than optimal query plan. One reason I'm thinking this is based 
on a bad row estimate in the explain analyze output. Can anything be done to 
resolve this bad estimate? Or is it unavoidable? And more importantly: Is the 
query plan that the planner is choosing the fastest one that it could use 
despite the bad row estimate?

-Version Info-
Production Server:
PostgreSQL 8.4.7 on x86_64-pc-linux-gnu, compiled by GCC gcc-4.4.real (Ubuntu 
4.4.3-4ubuntu5) 4.4.3, 64-bit
POSTGIS="1.5.1" GEOS="3.2.2-CAPI-1.6.2" PROJ="Rel. 4.7.1, 23 September 2009 " 
LIBXML="2.7.6" USE_STATS

We also tested the query against a test server with more recent postgres 
product versions (the same query plan and bad row estimate occurred):
PostgreSQL 9.1.1 on x86_64-pc-linux-gnu, compiled by gcc-4.6.real 
(Ubuntu/Linaro 4.6.1-9ubuntu3) 4.6.1, 64-bit
POSTGIS="1.5.3" GEOS="3.2.2-CAPI-1.6.2" PROJ="Rel. 4.7.1, 23 September 2009 " 
LIBXML="2.7.8" USE_STATS (procs from 1.5 r5385 need upgrade)

-SQL Statement-
SELECT r.gid, SUM(ST_LENGTH(ST_INTERSECTION(r.the_geom,n.the_geom))) * 
0.000621371192 AS milesum
FROM psaregions r join nhd100kstreams n ON ST_INTERSECTS(r.the_geom, n.the_geom)
WHERE n.fcode = 46003
GROUP BY r.gid

-Explain Analyze Output-

See here: http://explain.depesz.com/s/Esx

HashAggregate (cost=509.88..512.02 rows=8 width=1195362) (actual time=2823674 
.455..2823674 .458 rows=8 loops=1)
->  Nested Loop (cost=0.00..70.36 rows=87904 width=1195362) (actual 
time=23.893..675119.593 rows=90268 loops=1)
Join Filter: _st_intersects(r.the_geom, n.the_geom)
->  Seq Scan on psaregions r (cost=0.00..1.08 rows=8 width=1193798) (actual 
time=0.005..0.019 rows=8 loops=1)
->  Index Scan using nhd100kstreams_the_geom_gist on nhd100kstreams n 
(cost=0.00..8.40 rows=1 width=1564) (actual time=0.043..214.728 rows=41631 loops=8)
Index Cond: (r.the_geom&&  n.the_geom)
Filter: (n.fcode = 46003)
Total runtime: 2823674.517 ms

***Notice actual rows in the first index scan is 41631, but the estimate is 1.

-Additional Info-
- I've run vacuum analyze on the tables in question. That has not helped. 
(Autovacuum does run regularly also.)
- In terms of history, this query has always been slow.
- Various configuration settings:
shared_buffers: 3GB (Machine RAM: 12GB)
work_mem: 2GB
maintenance_work_mem: 1GB
effective_cache_size: 2GB (this was originally set to the default; we increased 
it to 2GB and restarted the server, but it didn't change the query plan.)
- Hardware info: Dell PowerEdge R710 running Ubuntu 10.04.2 LTS.

-Table and Index Schema-

Table "public.psaregions"
Column | Type | Modifiers
+---+--
gid | integer | not null default nextval('psaregions_gid_seq'::regclass)
psa_region | character varying(50) |
np_length | numeric |
xxx | numeric |
the_geom | geometry |
sm_geom | geometry |
miles46006 | numeric |
miles46003 | numeric |
llextent | character varying |
Indexes:
"psaregions_pkey" PRIMARY KEY, btree (gid)
"psaregions_region_idx" btree (psa_region)
"psaregions_sm_geom_gist" gist (sm_geom)
"psaregions_the_geom_gist" gist (the_geom)


Table "public.nhd100kstreams"
Column | Type | Modifiers
+---+--
gid | integer | not null default nextval('nhd100kstreams_gid_seq'::regclass)
objectid | integer |
comid | integer |
fdate | date |
resolution | integer |
gnis_id | character varying(10) |
gnis_name | character varying(65) |
lengthkm | numeric |
reachcode | character varying(14) |
flowdir | integer |
wbareacomi | integer |
ftype | integer |
fcode | integer |
shape_leng 

[postgis-users] st_linetocurve problem

2011-11-04 Thread Daniele Debernardi

hi,
i have a problem with this square geometry

MULTILINESTRING((
715476.865376701 92357.3121736097,
715476.712298064 92356.8361831361,
715477.188288538 92356.6831044996,
715477.341367174 92357.1590949732,
715476.865376701 92357.3121736097))

when i convert it with st_linetocurve i receive this geometry:

MULTICURVE(CIRCULARSTRING(
715476.865376701 92357.3121736097,
715477.188288538 92356.6831044996,
715476.865376701 92357.3121736097))

with the first point equal to the last point.

is this a bug? how can i fix it?

___
postgis-users mailing list
postgis-users@postgis.refractions.net
http://postgis.refractions.net/mailman/listinfo/postgis-users


Re: [postgis-users] Slow Query Times for Split Tiles

2011-11-04 Thread Pierre Racine
I think 2 minutes for 50x50 tiles is reasonable. If you want faster results 
reload your raster tiling it 10x10 using the -k option.

> -Original Message-
> From: postgis-users-boun...@postgis.refractions.net [mailto:postgis-users-
> boun...@postgis.refractions.net] On Behalf Of elliott
> Sent: Friday, November 04, 2011 8:50 AM
> To: PostGIS Users Discussion
> Subject: Re: [postgis-users] Slow Query Times for Split Tiles
> 
> Yes, I am certain that the table has been indexed. If I use the 
> raster2pgsql.py
> script with the -I option to load additional data, it complains that the 
> table is
> already indexed.
> 
>   Querying for 10,000 lat/lons on locations that have not been imported
> into the database is very fast returning in 2 seconds.   It can very
> quickly determine that there are no intersections.  However, when there are
> intersections for all 10,000 lat/lons, the query is taking over 2 minutes.  
> Would
> you expect the performance to be better than that?
> 
> On 11/3/2011 4:32 PM, Pierre Racine wrote:
> >> Thanks for the suggestion.  From the output, it doesn't look like it
> >> is using the index even though an index was created with the
> >> raster2pgsql.py script.  Is there a specific way to tell the query to use 
> >> the
> index?
> >>
> > Could you make sure you really created an index on the 50x50 pixel table. I
> don't see the -I option in your raster2pgsql.py command...
> >
> > Pierre
> > ___
> > postgis-users mailing list
> > postgis-users@postgis.refractions.net
> > http://postgis.refractions.net/mailman/listinfo/postgis-users
> >
> >
> 
> ___
> postgis-users mailing list
> postgis-users@postgis.refractions.net
> http://postgis.refractions.net/mailman/listinfo/postgis-users
___
postgis-users mailing list
postgis-users@postgis.refractions.net
http://postgis.refractions.net/mailman/listinfo/postgis-users


Re: [postgis-users] Slow Query Times for Split Tiles

2011-11-04 Thread elliott
Yes, I am certain that the table has been indexed. If I use the 
raster2pgsql.py script with the -I option to load additional data, it 
complains that the table is already indexed.


 Querying for 10,000 lat/lons on locations that have not been imported 
into the database is very fast returning in 2 seconds.   It can very 
quickly determine that there are no intersections.  However, when there 
are intersections for all 10,000 lat/lons, the query is taking over 2 
minutes.  Would you expect the performance to be better than that?


On 11/3/2011 4:32 PM, Pierre Racine wrote:

Thanks for the suggestion.  From the output, it doesn't look like it is using 
the
index even though an index was created with the raster2pgsql.py script.  Is 
there
a specific way to tell the query to use the index?
 

Could you make sure you really created an index on the 50x50 pixel table. I 
don't see the -I option in your raster2pgsql.py command...

Pierre
___
postgis-users mailing list
postgis-users@postgis.refractions.net
http://postgis.refractions.net/mailman/listinfo/postgis-users

   


___
postgis-users mailing list
postgis-users@postgis.refractions.net
http://postgis.refractions.net/mailman/listinfo/postgis-users


[postgis-users] ERROR: Error performing intersection.

2011-11-04 Thread Andreas Forø Tollefsen
What could be the reason for this query:

SELECT p.gid, array_to_string(array_agg(distinct(CASE when (g.cowgroupid <>
0) then g.cowgroupid end)), ';'), CAST(SUM(ST_Area(ST_Intersection(p.cell,
g.geom))) AS decimal (6,4)), g.type, g.startyear, g.endyear
FROM priogrid p, "geoepr-20100212-1248" g WHERE ST_Intersects(p.cell,
g.geom) GROUP BY p.gid, g.cowgroupid, g.type, g.startyear, g.endyear;

to produce the following error?

ERROR:  Error performing intersection.

** Error **

ERROR: Error performing intersection.
SQL state: XX000

Andreas
___
postgis-users mailing list
postgis-users@postgis.refractions.net
http://postgis.refractions.net/mailman/listinfo/postgis-users


Re: [postgis-users] how to get point, polygon value in postgis JDBC?

2011-11-04 Thread Maria Arias de Reyna
El Viernes, 4 de Noviembre de 2011, LeeHyung Joo - Kevin escribió:
> Hi all.
> 
> I have a big problem on my work.
> 
> First of all, the version based on my work is PostgreSQL : 8.1.11, PostGIS
> : 1.5.2.
> 
> The problem is I can not get the value of Point and Polygon.
> 
> The code is below :
> 
> blah blah
> --
> -- public Test find(String Id) throws DAOException,
> ClassNotFoundException, SQLException {
> Class.forName("org.postgresql.Driver");
> String url = "jdbc:postgresql://localhost:8080/testdao";
> Connection conn = DriverManager.getConnection(url, admin, apple);
> PreparedStatement ps = conn.prepareStatement("select name, addr,
> ST_AsText(position) from listFriend where id=?"); ps.setString(1, Id);
> 
> ResultSet rs = ps.executeQuery();
> 
> //   
> ((org.postgresql.PGConnection)conn).addDataType("point",Class.forName("org
> .postgis.Point"));
> 
> while(rs.next()) {
> test = new Test();
> p = new Point();
> 
> test.setName(rs.getString(1));
> test.setAddr(rs.getString(2));
> test.setPosition(p);
> //System.out.println(rs.getString(3));
> ---
> ---
> 
> If I run the sentence, "System.out.println(rs.getString(3));, it works. It
> prints right output like "Point(13.123 11.123)".
> 
> But I don't know how to fix test.setPosition(p);
> 
> any helps will be thank.
> 
> 
> P.S. is it possible to be reason why the low or high version of PostgreSQL
> and PostGIS?
> 
> 
> Best Regards
> Kevin Lee.


Maybe you are looking for a WKT Parser? Like this one:

http://www.vividsolutions.com/jts/javadoc/com/vividsolutions/jts/io/WKTReader.html

If you are using the postGIS driver you should be able to use the geometry 
directly on your query, instead of using as_text.

-- 
María Arias de Reyna Domínguez
Área de Operaciones

Emergya Consultoría 
Tfno: +34 954 51 75 77 / +34 607 43 74 27
Fax: +34 954 51 64 73 
www.emergya.com
___
postgis-users mailing list
postgis-users@postgis.refractions.net
http://postgis.refractions.net/mailman/listinfo/postgis-users