Re: [postgis-users] Performance issue with ST_INTERSECTS
Hi Paul, Thank you for your suggestion. I gave it a try and now the query is running in about 22 seconds. Amazing! If you have time, can you shed some light on why this makes such a difference? What is it about the spatial indexes of smaller regions that make them more effective? Thank you again, Shira - Original Message - From: "Paul Ramsey" To: "PostGIS Users Discussion" Sent: Thursday, November 3, 2011 3:26:24 PM Subject: Re: [postgis-users] Performance issue with ST_INTERSECTS Just from your description, my guess is that your few, large, regions with many vertices are the problem. Since they are static, I'd recommend cutting them up into a much larger number of small regions, by for example intersecting them with a regular grid. Then you can get much more effect from your indexes. P. On Thu, Nov 3, 2011 at 2:30 PM, Shira Bezalel wrote: > Hello List, > > This is my first post so feel free to let me know if I'm missing any critical > info or if this is better suited for the pgsql-performance list. > > Essentially, I'm troubleshooting a performance issue with a spatial > intersection query. The query sums the lengths of line features that > intersect polygon features. The line layer is very dense. The polygon layer > only has 8 features, but each polygon does have a lot of points. > > It's taking almost 50 minutes to run and I'm wondering if the planner is > choosing a less than optimal query plan. One reason I'm thinking this is > based on a bad row estimate in the explain analyze output. Can anything be > done to resolve this bad estimate? Or is it unavoidable? And more > importantly: Is the query plan that the planner is choosing the fastest one > that it could use despite the bad row estimate? > > -Version Info- > Production Server: > PostgreSQL 8.4.7 on x86_64-pc-linux-gnu, compiled by GCC gcc-4.4.real (Ubuntu > 4.4.3-4ubuntu5) 4.4.3, 64-bit > POSTGIS="1.5.1" GEOS="3.2.2-CAPI-1.6.2" PROJ="Rel. 4.7.1, 23 September 2009 " > LIBXML="2.7.6" USE_STATS > > We also tested the query against a test server with more recent postgres > product versions (the same query plan and bad row estimate occurred): > PostgreSQL 9.1.1 on x86_64-pc-linux-gnu, compiled by gcc-4.6.real > (Ubuntu/Linaro 4.6.1-9ubuntu3) 4.6.1, 64-bit > POSTGIS="1.5.3" GEOS="3.2.2-CAPI-1.6.2" PROJ="Rel. 4.7.1, 23 September 2009 " > LIBXML="2.7.8" USE_STATS (procs from 1.5 r5385 need upgrade) > > -SQL Statement- > SELECT r.gid, SUM(ST_LENGTH(ST_INTERSECTION(r.the_geom,n.the_geom))) * > 0.000621371192 AS milesum > FROM psaregions r join nhd100kstreams n ON ST_INTERSECTS(r.the_geom, > n.the_geom) > WHERE n.fcode = 46003 > GROUP BY r.gid > > -Explain Analyze Output- > > See here: http://explain.depesz.com/s/Esx > > HashAggregate (cost=509.88..512.02 rows=8 width=1195362) (actual time=2823674 > .455..2823674 .458 rows=8 loops=1) > -> Nested Loop (cost=0.00..70.36 rows=87904 width=1195362) (actual > time=23.893..675119.593 rows=90268 loops=1) > Join Filter: _st_intersects(r.the_geom, n.the_geom) > -> Seq Scan on psaregions r (cost=0.00..1.08 rows=8 width=1193798) (actual > time=0.005..0.019 rows=8 loops=1) > -> Index Scan using nhd100kstreams_the_geom_gist on nhd100kstreams n > (cost=0.00..8.40 rows=1 width=1564) (actual time=0.043..214.728 rows=41631 > loops=8) > Index Cond: (r.the_geom && n.the_geom) > Filter: (n.fcode = 46003) > Total runtime: 2823674.517 ms > > ***Notice actual rows in the first index scan is 41631, but the estimate is > 1. > > -Additional Info- > - I've run vacuum analyze on the tables in question. That has not helped. > (Autovacuum does run regularly also.) > - In terms of history, this query has always been slow. > - Various configuration settings: > shared_buffers: 3GB (Machine RAM: 12GB) > work_mem: 2GB > maintenance_work_mem: 1GB > effective_cache_size: 2GB (this was originally set to the default; we > increased it to 2GB and restarted the server, but it didn't change the query > plan.) > - Hardware info: Dell PowerEdge R710 running Ubuntu 10.04.2 LTS. > > -Table and Index Schema- > > Table "public.psaregions" > Column | Type | Modifiers > +---+-- > > gid | integer | not null default nextval('psaregions_gid_seq'::regclass) > psa_region | character varying(50) | > np_length | numeric | > xxx | numeric | > the_geom | geometry | > sm_geom | geometry | > miles46006 | numeric | > miles46003 | numeric | > llextent | character varying | > Indexes: > "psaregions_pkey" PRIMARY KEY, btree (gid) > "psaregions_region_idx" btree (psa_region) > "psaregions_sm_geom_gist" gist (sm_geom) > "psaregions_the_geom_gist" gist (the_geom) > > > Table "public.nhd100kstreams" > Column | Type | Modifiers > +---+---
Re: [postgis-users] Slow Query Times for Split Tiles
On 4 November 2011 12:50, elliott wrote: > Yes, I am certain that the table has been indexed. If I use the > raster2pgsql.py script with the -I option to load additional data, it > complains that the table is already indexed. > > Querying for 10,000 lat/lons on locations that have not been imported into > the database is very fast returning in 2 seconds. It can very quickly > determine that there are no intersections. However, when there are > intersections for all 10,000 lat/lons, the query is taking over 2 minutes. Do you store rasters in- or out-database? Anyway, query itself is fast, I'm sure. But deserialization of 1 rasters to find and fetch value of single pixel, that sounds expansive on its own. Try to imagine, or prototype it with GDAL and Python, opening 1 .tif files, one by one and RasterIO'ing value of pixel. Best regards, -- Mateusz Loskot, http://mateusz.loskot.net Charter Member of OSGeo, http://osgeo.org Member of ACCU, http://accu.org ___ postgis-users mailing list postgis-users@postgis.refractions.net http://postgis.refractions.net/mailman/listinfo/postgis-users
Re: [postgis-users] Performance issue with ST_INTERSECTS
This should be making use of PreparedGeometry, right? Meaning that the actual intersection computation should be fairly performant? So is it just the fact that the index is not very selective that is causing the performance issue? Just asking in case this is revealing some kind of issue with the PreparedGeometry implementation... Martin On 11/3/2011 3:26 PM, Paul Ramsey wrote: Just from your description, my guess is that your few, large, regions with many vertices are the problem. Since they are static, I'd recommend cutting them up into a much larger number of small regions, by for example intersecting them with a regular grid. Then you can get much more effect from your indexes. P. On Thu, Nov 3, 2011 at 2:30 PM, Shira Bezalel wrote: Hello List, This is my first post so feel free to let me know if I'm missing any critical info or if this is better suited for the pgsql-performance list. Essentially, I'm troubleshooting a performance issue with a spatial intersection query. The query sums the lengths of line features that intersect polygon features. The line layer is very dense. The polygon layer only has 8 features, but each polygon does have a lot of points. It's taking almost 50 minutes to run and I'm wondering if the planner is choosing a less than optimal query plan. One reason I'm thinking this is based on a bad row estimate in the explain analyze output. Can anything be done to resolve this bad estimate? Or is it unavoidable? And more importantly: Is the query plan that the planner is choosing the fastest one that it could use despite the bad row estimate? -Version Info- Production Server: PostgreSQL 8.4.7 on x86_64-pc-linux-gnu, compiled by GCC gcc-4.4.real (Ubuntu 4.4.3-4ubuntu5) 4.4.3, 64-bit POSTGIS="1.5.1" GEOS="3.2.2-CAPI-1.6.2" PROJ="Rel. 4.7.1, 23 September 2009 " LIBXML="2.7.6" USE_STATS We also tested the query against a test server with more recent postgres product versions (the same query plan and bad row estimate occurred): PostgreSQL 9.1.1 on x86_64-pc-linux-gnu, compiled by gcc-4.6.real (Ubuntu/Linaro 4.6.1-9ubuntu3) 4.6.1, 64-bit POSTGIS="1.5.3" GEOS="3.2.2-CAPI-1.6.2" PROJ="Rel. 4.7.1, 23 September 2009 " LIBXML="2.7.8" USE_STATS (procs from 1.5 r5385 need upgrade) -SQL Statement- SELECT r.gid, SUM(ST_LENGTH(ST_INTERSECTION(r.the_geom,n.the_geom))) * 0.000621371192 AS milesum FROM psaregions r join nhd100kstreams n ON ST_INTERSECTS(r.the_geom, n.the_geom) WHERE n.fcode = 46003 GROUP BY r.gid -Explain Analyze Output- See here: http://explain.depesz.com/s/Esx HashAggregate (cost=509.88..512.02 rows=8 width=1195362) (actual time=2823674 .455..2823674 .458 rows=8 loops=1) -> Nested Loop (cost=0.00..70.36 rows=87904 width=1195362) (actual time=23.893..675119.593 rows=90268 loops=1) Join Filter: _st_intersects(r.the_geom, n.the_geom) -> Seq Scan on psaregions r (cost=0.00..1.08 rows=8 width=1193798) (actual time=0.005..0.019 rows=8 loops=1) -> Index Scan using nhd100kstreams_the_geom_gist on nhd100kstreams n (cost=0.00..8.40 rows=1 width=1564) (actual time=0.043..214.728 rows=41631 loops=8) Index Cond: (r.the_geom&& n.the_geom) Filter: (n.fcode = 46003) Total runtime: 2823674.517 ms ***Notice actual rows in the first index scan is 41631, but the estimate is 1. -Additional Info- - I've run vacuum analyze on the tables in question. That has not helped. (Autovacuum does run regularly also.) - In terms of history, this query has always been slow. - Various configuration settings: shared_buffers: 3GB (Machine RAM: 12GB) work_mem: 2GB maintenance_work_mem: 1GB effective_cache_size: 2GB (this was originally set to the default; we increased it to 2GB and restarted the server, but it didn't change the query plan.) - Hardware info: Dell PowerEdge R710 running Ubuntu 10.04.2 LTS. -Table and Index Schema- Table "public.psaregions" Column | Type | Modifiers +---+-- gid | integer | not null default nextval('psaregions_gid_seq'::regclass) psa_region | character varying(50) | np_length | numeric | xxx | numeric | the_geom | geometry | sm_geom | geometry | miles46006 | numeric | miles46003 | numeric | llextent | character varying | Indexes: "psaregions_pkey" PRIMARY KEY, btree (gid) "psaregions_region_idx" btree (psa_region) "psaregions_sm_geom_gist" gist (sm_geom) "psaregions_the_geom_gist" gist (the_geom) Table "public.nhd100kstreams" Column | Type | Modifiers +---+-- gid | integer | not null default nextval('nhd100kstreams_gid_seq'::regclass) objectid | integer | comid | integer | fdate | date | resolution | integer | gnis_id | character varying(10) | gnis_name | character varying(65) | lengthkm | numeric | reachcode | character varying(14) | flowdir | integer | wbareacomi | integer | ftype | integer | fcode | integer | shape_leng
[postgis-users] st_linetocurve problem
hi, i have a problem with this square geometry MULTILINESTRING(( 715476.865376701 92357.3121736097, 715476.712298064 92356.8361831361, 715477.188288538 92356.6831044996, 715477.341367174 92357.1590949732, 715476.865376701 92357.3121736097)) when i convert it with st_linetocurve i receive this geometry: MULTICURVE(CIRCULARSTRING( 715476.865376701 92357.3121736097, 715477.188288538 92356.6831044996, 715476.865376701 92357.3121736097)) with the first point equal to the last point. is this a bug? how can i fix it? ___ postgis-users mailing list postgis-users@postgis.refractions.net http://postgis.refractions.net/mailman/listinfo/postgis-users
Re: [postgis-users] Slow Query Times for Split Tiles
I think 2 minutes for 50x50 tiles is reasonable. If you want faster results reload your raster tiling it 10x10 using the -k option. > -Original Message- > From: postgis-users-boun...@postgis.refractions.net [mailto:postgis-users- > boun...@postgis.refractions.net] On Behalf Of elliott > Sent: Friday, November 04, 2011 8:50 AM > To: PostGIS Users Discussion > Subject: Re: [postgis-users] Slow Query Times for Split Tiles > > Yes, I am certain that the table has been indexed. If I use the > raster2pgsql.py > script with the -I option to load additional data, it complains that the > table is > already indexed. > > Querying for 10,000 lat/lons on locations that have not been imported > into the database is very fast returning in 2 seconds. It can very > quickly determine that there are no intersections. However, when there are > intersections for all 10,000 lat/lons, the query is taking over 2 minutes. > Would > you expect the performance to be better than that? > > On 11/3/2011 4:32 PM, Pierre Racine wrote: > >> Thanks for the suggestion. From the output, it doesn't look like it > >> is using the index even though an index was created with the > >> raster2pgsql.py script. Is there a specific way to tell the query to use > >> the > index? > >> > > Could you make sure you really created an index on the 50x50 pixel table. I > don't see the -I option in your raster2pgsql.py command... > > > > Pierre > > ___ > > postgis-users mailing list > > postgis-users@postgis.refractions.net > > http://postgis.refractions.net/mailman/listinfo/postgis-users > > > > > > ___ > postgis-users mailing list > postgis-users@postgis.refractions.net > http://postgis.refractions.net/mailman/listinfo/postgis-users ___ postgis-users mailing list postgis-users@postgis.refractions.net http://postgis.refractions.net/mailman/listinfo/postgis-users
Re: [postgis-users] Slow Query Times for Split Tiles
Yes, I am certain that the table has been indexed. If I use the raster2pgsql.py script with the -I option to load additional data, it complains that the table is already indexed. Querying for 10,000 lat/lons on locations that have not been imported into the database is very fast returning in 2 seconds. It can very quickly determine that there are no intersections. However, when there are intersections for all 10,000 lat/lons, the query is taking over 2 minutes. Would you expect the performance to be better than that? On 11/3/2011 4:32 PM, Pierre Racine wrote: Thanks for the suggestion. From the output, it doesn't look like it is using the index even though an index was created with the raster2pgsql.py script. Is there a specific way to tell the query to use the index? Could you make sure you really created an index on the 50x50 pixel table. I don't see the -I option in your raster2pgsql.py command... Pierre ___ postgis-users mailing list postgis-users@postgis.refractions.net http://postgis.refractions.net/mailman/listinfo/postgis-users ___ postgis-users mailing list postgis-users@postgis.refractions.net http://postgis.refractions.net/mailman/listinfo/postgis-users
[postgis-users] ERROR: Error performing intersection.
What could be the reason for this query: SELECT p.gid, array_to_string(array_agg(distinct(CASE when (g.cowgroupid <> 0) then g.cowgroupid end)), ';'), CAST(SUM(ST_Area(ST_Intersection(p.cell, g.geom))) AS decimal (6,4)), g.type, g.startyear, g.endyear FROM priogrid p, "geoepr-20100212-1248" g WHERE ST_Intersects(p.cell, g.geom) GROUP BY p.gid, g.cowgroupid, g.type, g.startyear, g.endyear; to produce the following error? ERROR: Error performing intersection. ** Error ** ERROR: Error performing intersection. SQL state: XX000 Andreas ___ postgis-users mailing list postgis-users@postgis.refractions.net http://postgis.refractions.net/mailman/listinfo/postgis-users
Re: [postgis-users] how to get point, polygon value in postgis JDBC?
El Viernes, 4 de Noviembre de 2011, LeeHyung Joo - Kevin escribió: > Hi all. > > I have a big problem on my work. > > First of all, the version based on my work is PostgreSQL : 8.1.11, PostGIS > : 1.5.2. > > The problem is I can not get the value of Point and Polygon. > > The code is below : > > blah blah > -- > -- public Test find(String Id) throws DAOException, > ClassNotFoundException, SQLException { > Class.forName("org.postgresql.Driver"); > String url = "jdbc:postgresql://localhost:8080/testdao"; > Connection conn = DriverManager.getConnection(url, admin, apple); > PreparedStatement ps = conn.prepareStatement("select name, addr, > ST_AsText(position) from listFriend where id=?"); ps.setString(1, Id); > > ResultSet rs = ps.executeQuery(); > > // > ((org.postgresql.PGConnection)conn).addDataType("point",Class.forName("org > .postgis.Point")); > > while(rs.next()) { > test = new Test(); > p = new Point(); > > test.setName(rs.getString(1)); > test.setAddr(rs.getString(2)); > test.setPosition(p); > //System.out.println(rs.getString(3)); > --- > --- > > If I run the sentence, "System.out.println(rs.getString(3));, it works. It > prints right output like "Point(13.123 11.123)". > > But I don't know how to fix test.setPosition(p); > > any helps will be thank. > > > P.S. is it possible to be reason why the low or high version of PostgreSQL > and PostGIS? > > > Best Regards > Kevin Lee. Maybe you are looking for a WKT Parser? Like this one: http://www.vividsolutions.com/jts/javadoc/com/vividsolutions/jts/io/WKTReader.html If you are using the postGIS driver you should be able to use the geometry directly on your query, instead of using as_text. -- María Arias de Reyna Domínguez Área de Operaciones Emergya Consultoría Tfno: +34 954 51 75 77 / +34 607 43 74 27 Fax: +34 954 51 64 73 www.emergya.com ___ postgis-users mailing list postgis-users@postgis.refractions.net http://postgis.refractions.net/mailman/listinfo/postgis-users