Re: [postgis-users] Increase query performance

2011-03-09 Thread Mark Cave-Ayland

On 08/03/11 17:01, Andreas Forø Tollefsen wrote:


After a suggestion from pgsql_performance i tried with ST_Simplify to
speed things up.
However this gives me a:
NOTICE: ptarray_simplify returned a 2 pts array

Then server connection terminates.

Like this:

SELECT ST_Intersection(priogrid_land.cell,
ST_Simplify(cshapeswdate.geom,0.1)) AS geom,
priogrid_land.gid AS divider, gwcode, gwsyear, gweyear, startdate,
enddate, capname, caplong, caplat, col, row, xcoord, ycoord
FROM priogrid_land, cshapeswdate WHERE ST_Intersects(priogrid_land.cell,
ST_Simplify(cshapeswdate.geom,0.1)) AND cshapeswdate.gwsyear =1946 AND
cshapeswdate.gweyear =1946 AND cshapeswdate.startdate = '1946/1/1';


Yup that's a bug related to the new code in trunk - I think we've 
already got something similar filed in the bug tracker. This is probably 
going to require a Paul to take a look at it.



ATB,

Mark.

--
Mark Cave-Ayland - Senior Technical Architect
PostgreSQL - PostGIS
Sirius Corporation plc - control through freedom
http://www.siriusit.co.uk
t: +44 870 608 0063

Sirius Labs: http://www.siriusit.co.uk/labs
___
postgis-users mailing list
postgis-users@postgis.refractions.net
http://postgis.refractions.net/mailman/listinfo/postgis-users


Re: [postgis-users] Increase query performance

2011-03-09 Thread Andreas Forø Tollefsen
Thanks for all the suggestions.
I first thought about doing the simplify, but after some reconsideration I
realized that this would change the geometry of the country borders, and
Area measurements might be wrong.

Then i tried the query without the ST_Intersection and only the conditional
ST_Intersects.
This completed in 87 seconds, so this was very fast compared to 40-50
minutes per year.

I also tried the UNION ALL suggestion by Nicklas which had almost similar
performance as the original query. I will test this some more to see whether
i can save some time.

My guess is that the calculation time required to create the new intersected
geometry is what takes so long, and this is probably something i need to
accept (or get more cpu power :)

Andreas

2011/3/9 Mark Cave-Ayland mark.cave-ayl...@siriusit.co.uk

 On 08/03/11 17:01, Andreas Forø Tollefsen wrote:

  After a suggestion from pgsql_performance i tried with ST_Simplify to
 speed things up.
 However this gives me a:
 NOTICE: ptarray_simplify returned a 2 pts array

 Then server connection terminates.

 Like this:

 SELECT ST_Intersection(priogrid_land.cell,
 ST_Simplify(cshapeswdate.geom,0.1)) AS geom,
 priogrid_land.gid AS divider, gwcode, gwsyear, gweyear, startdate,
 enddate, capname, caplong, caplat, col, row, xcoord, ycoord
 FROM priogrid_land, cshapeswdate WHERE ST_Intersects(priogrid_land.cell,
 ST_Simplify(cshapeswdate.geom,0.1)) AND cshapeswdate.gwsyear =1946 AND
 cshapeswdate.gweyear =1946 AND cshapeswdate.startdate = '1946/1/1';


 Yup that's a bug related to the new code in trunk - I think we've already
 got something similar filed in the bug tracker. This is probably going to
 require a Paul to take a look at it.


 ATB,

 Mark.

 --
 Mark Cave-Ayland - Senior Technical Architect
 PostgreSQL - PostGIS
 Sirius Corporation plc - control through freedom
 http://www.siriusit.co.uk
 t: +44 870 608 0063

 Sirius Labs: http://www.siriusit.co.uk/labs

 ___
 postgis-users mailing list
 postgis-users@postgis.refractions.net
 http://postgis.refractions.net/mailman/listinfo/postgis-users

___
postgis-users mailing list
postgis-users@postgis.refractions.net
http://postgis.refractions.net/mailman/listinfo/postgis-users


[postgis-users] Increase query performance

2011-03-08 Thread Andreas Forø Tollefsen
Hi all,

Another question on postgis query performance. I did some discussion on the
pgsql performance list on how i could increase the performance on my server.
I did manage to increase the transactions per second, but came to the
conclusion that this did not help the ST_Intersection query which i was
trying to speed up.

Any suggestions on how to speed up this query? Basically i want to create an
intersection between a vector grid and country shapefiles. Then calculate
the area of all the polygons in the intersected table, and finally selecting
the country code for each cell which represents the largest area within each
cell. I does work as it is, but i would like to increase the speed. The
query which takes a lot of time is the ST_Intersection.

PostgreSQL 8.4.7 on i686-pc-linux-gnu, compiled by GCC gcc-4.4.real
(Ubuntu/Linaro 4.4.4-14ubuntu5) 4.4.5, 32-bit

POSTGIS=2.0.0SVN GEOS=3.2.2-CAPI-1.6.2 PROJ=Rel. 4.7.1, 23 September
2009 LIBXML=2.7.7 USE_STATS


DROP TABLE IF EXISTS cshapesgrid1946;

SELECT ST_Intersection(priogrid_land.cell, cshapeswdate.geom) AS geom,
priogrid_land.gid AS divider, gwcode, gwsyear, gweyear, startdate, enddate,
capname, caplong, caplat, col, row, xcoord, ycoord INTO cshapesgrid1946 FROM
priogrid_land, cshapeswdate WHERE ST_Intersects(priogrid_land.cell,
cshapeswdate.geom) AND cshapeswdate.gwsyear =1946 AND cshapeswdate.gweyear
=1946 AND cshapeswdate.startdate = '1946/1/1';

ALTER TABLE cshapesgrid1946 ADD COLUMN area float;

UPDATE cshapesgrid1946 SET area = ST_Area(cshapesgrid1946.geom);

DROP TABLE IF EXISTS pg1946;

SELECT * INTO pg1946 FROM cshapesgrid1946 a WHERE a.area IN (SELECT
MAX(area) FROM cshapesgrid1946 b GROUP BY divider);

CREATE INDEX idx_pg1946 ON pg1946 USING gist(geom);


Best,
Andreas
___
postgis-users mailing list
postgis-users@postgis.refractions.net
http://postgis.refractions.net/mailman/listinfo/postgis-users


Re: [postgis-users] Increase query performance

2011-03-08 Thread Paragon Corporation
Andreas,
I don't see anything glaringly wrong with your query, but hard to tell
without seeing an explain plan or what you mean by takes a lot of time.  Is
a lot of time minutes, hours, days?
How many records are we talking about here?  What's the max number of points
you have in any geometry.  Often times its just one huge mega geometry with
a 100,000 points or more slowing everything down.
 
Do you have indexes on your year fields and date columns?
 
Hope that helps,
Regina 
http://www.postgis.us
 

  _  

From: postgis-users-boun...@postgis.refractions.net
[mailto:postgis-users-boun...@postgis.refractions.net] On Behalf Of Andreas
Forø Tollefsen
Sent: Tuesday, March 08, 2011 4:15 AM
To: PostGIS Users Discussion
Subject: [postgis-users] Increase query performance


Hi all, 

Another question on postgis query performance. I did some discussion on the
pgsql performance list on how i could increase the performance on my server.
I did manage to increase the transactions per second, but came to the
conclusion that this did not help the ST_Intersection query which i was
trying to speed up.

Any suggestions on how to speed up this query? Basically i want to create an
intersection between a vector grid and country shapefiles. Then calculate
the area of all the polygons in the intersected table, and finally selecting
the country code for each cell which represents the largest area within each
cell. I does work as it is, but i would like to increase the speed. The
query which takes a lot of time is the ST_Intersection.

PostgreSQL 8.4.7 on i686-pc-linux-gnu, compiled by GCC gcc-4.4.real
(Ubuntu/Linaro 4.4.4-14ubuntu5) 4.4.5, 32-bit

POSTGIS=2.0.0SVN GEOS=3.2.2-CAPI-1.6.2 PROJ=Rel. 4.7.1, 23 September
2009 LIBXML=2.7.7 USE_STATS


DROP TABLE IF EXISTS cshapesgrid1946;

SELECT ST_Intersection(priogrid_land.cell, cshapeswdate.geom) AS geom,
priogrid_land.gid AS divider, gwcode, gwsyear, gweyear, startdate, enddate,
capname, caplong, caplat, col, row, xcoord, ycoord INTO cshapesgrid1946 FROM
priogrid_land, cshapeswdate WHERE ST_Intersects(priogrid_land.cell,
cshapeswdate.geom) AND cshapeswdate.gwsyear =1946 AND cshapeswdate.gweyear
=1946 AND cshapeswdate.startdate = '1946/1/1';

ALTER TABLE cshapesgrid1946 ADD COLUMN area float;

UPDATE cshapesgrid1946 SET area = ST_Area(cshapesgrid1946.geom);

DROP TABLE IF EXISTS pg1946;

SELECT * INTO pg1946 FROM cshapesgrid1946 a WHERE a.area IN (SELECT
MAX(area) FROM cshapesgrid1946 b GROUP BY divider);

CREATE INDEX idx_pg1946 ON pg1946 USING gist(geom);


Best,
Andreas
___
postgis-users mailing list
postgis-users@postgis.refractions.net
http://postgis.refractions.net/mailman/listinfo/postgis-users


Re: [postgis-users] Increase query performance

2011-03-08 Thread Andreas Forø Tollefsen
Hi,
This query takes about 41 minutes per year. Doing this for every year from
1946 to 2008 takes a lot of time.
The grid consists of 64818 cell polygons, while the country table has about
210 polygons.

I will add indexes on year and date and try again. An do an explain analyze.

Cheers,
Andreas

2011/3/8 Paragon Corporation l...@pcorp.us

  Andreas,
 I don't see anything glaringly wrong with your query, but hard to tell
 without seeing an explain plan or what you mean by takes a lot of time.  Is
 a lot of time minutes, hours, days?
 How many records are we talking about here?  What's the max number of
 points you have in any geometry.  Often times its just one huge mega
 geometry with a 100,000 points or more slowing everything down.

 Do you have indexes on your year fields and date columns?

 Hope that helps,
 Regina
 http://www.postgis.us


  --
 *From:* postgis-users-boun...@postgis.refractions.net [mailto:
 postgis-users-boun...@postgis.refractions.net] *On Behalf Of *Andreas Forø
 Tollefsen
 *Sent:* Tuesday, March 08, 2011 4:15 AM
 *To:* PostGIS Users Discussion
 *Subject:* [postgis-users] Increase query performance

 Hi all,

 Another question on postgis query performance. I did some discussion on the
 pgsql performance list on how i could increase the performance on my server.
 I did manage to increase the transactions per second, but came to the
 conclusion that this did not help the ST_Intersection query which i was
 trying to speed up.

 Any suggestions on how to speed up this query? Basically i want to create
 an intersection between a vector grid and country shapefiles. Then calculate
 the area of all the polygons in the intersected table, and finally selecting
 the country code for each cell which represents the largest area within each
 cell. I does work as it is, but i would like to increase the speed. The
 query which takes a lot of time is the ST_Intersection.

 PostgreSQL 8.4.7 on i686-pc-linux-gnu, compiled by GCC gcc-4.4.real
 (Ubuntu/Linaro 4.4.4-14ubuntu5) 4.4.5, 32-bit

 POSTGIS=2.0.0SVN GEOS=3.2.2-CAPI-1.6.2 PROJ=Rel. 4.7.1, 23 September
 2009 LIBXML=2.7.7 USE_STATS


  DROP TABLE IF EXISTS cshapesgrid1946;

 SELECT ST_Intersection(priogrid_land.cell, cshapeswdate.geom) AS geom,
 priogrid_land.gid AS divider, gwcode, gwsyear, gweyear, startdate, enddate,
 capname, caplong, caplat, col, row, xcoord, ycoord INTO cshapesgrid1946 FROM
 priogrid_land, cshapeswdate WHERE ST_Intersects(priogrid_land.cell,
 cshapeswdate.geom) AND cshapeswdate.gwsyear =1946 AND cshapeswdate.gweyear
 =1946 AND cshapeswdate.startdate = '1946/1/1';

 ALTER TABLE cshapesgrid1946 ADD COLUMN area float;

 UPDATE cshapesgrid1946 SET area = ST_Area(cshapesgrid1946.geom);

 DROP TABLE IF EXISTS pg1946;

 SELECT * INTO pg1946 FROM cshapesgrid1946 a WHERE a.area IN (SELECT
 MAX(area) FROM cshapesgrid1946 b GROUP BY divider);

 CREATE INDEX idx_pg1946 ON pg1946 USING gist(geom);


 Best,
 Andreas

 ___
 postgis-users mailing list
 postgis-users@postgis.refractions.net
 http://postgis.refractions.net/mailman/listinfo/postgis-users


___
postgis-users mailing list
postgis-users@postgis.refractions.net
http://postgis.refractions.net/mailman/listinfo/postgis-users


Re: [postgis-users] Increase query performance

2011-03-08 Thread Andreas Forø Tollefsen
Results from the EXPLAIN ANALYZE:

Nested Loop  (cost=0.00..1189.72 rows=3941 width=87790) (actual
time=7.091..2524830.264 rows=54145 loops=1)
  Join Filter: _st_intersects(priogrid_land.cell, cshapeswdate.geom)
  -  Seq Scan on cshapeswdate  (cost=0.00..16.23 rows=22 width=87304)
(actual time=0.011..0.542 rows=72 loops=1)
Filter: ((gwsyear = 1946::numeric) AND (gweyear = 1946::numeric)
AND (startdate = '1946-01-01'::date))
  -  Index Scan using idx_priogrid_land_cell on priogrid_land
 (cost=0.00..8.29 rows=1 width=486) (actual time=3.026..30.152 rows=1338
loops=72)
Index Cond: (priogrid_land.cell  cshapeswdate.geom)
Total runtime: 2524889.630 ms


2011/3/8 Andreas Forø Tollefsen andrea...@gmail.com

 Hi,
 This query takes about 41 minutes per year. Doing this for every year from
 1946 to 2008 takes a lot of time.
 The grid consists of 64818 cell polygons, while the country table has about
 210 polygons.

 I will add indexes on year and date and try again. An do an explain
 analyze.

 Cheers,
 Andreas

 2011/3/8 Paragon Corporation l...@pcorp.us

  Andreas,
 I don't see anything glaringly wrong with your query, but hard to tell
 without seeing an explain plan or what you mean by takes a lot of time.  Is
 a lot of time minutes, hours, days?
 How many records are we talking about here?  What's the max number of
 points you have in any geometry.  Often times its just one huge mega
 geometry with a 100,000 points or more slowing everything down.

 Do you have indexes on your year fields and date columns?

 Hope that helps,
 Regina
 http://www.postgis.us


  --
 *From:* postgis-users-boun...@postgis.refractions.net [mailto:
 postgis-users-boun...@postgis.refractions.net] *On Behalf Of *Andreas
 Forø Tollefsen
 *Sent:* Tuesday, March 08, 2011 4:15 AM
 *To:* PostGIS Users Discussion
 *Subject:* [postgis-users] Increase query performance

 Hi all,

 Another question on postgis query performance. I did some discussion on
 the pgsql performance list on how i could increase the performance on my
 server.
 I did manage to increase the transactions per second, but came to the
 conclusion that this did not help the ST_Intersection query which i was
 trying to speed up.

 Any suggestions on how to speed up this query? Basically i want to create
 an intersection between a vector grid and country shapefiles. Then calculate
 the area of all the polygons in the intersected table, and finally selecting
 the country code for each cell which represents the largest area within each
 cell. I does work as it is, but i would like to increase the speed. The
 query which takes a lot of time is the ST_Intersection.

 PostgreSQL 8.4.7 on i686-pc-linux-gnu, compiled by GCC gcc-4.4.real
 (Ubuntu/Linaro 4.4.4-14ubuntu5) 4.4.5, 32-bit

 POSTGIS=2.0.0SVN GEOS=3.2.2-CAPI-1.6.2 PROJ=Rel. 4.7.1, 23 September
 2009 LIBXML=2.7.7 USE_STATS


  DROP TABLE IF EXISTS cshapesgrid1946;

 SELECT ST_Intersection(priogrid_land.cell, cshapeswdate.geom) AS geom,
 priogrid_land.gid AS divider, gwcode, gwsyear, gweyear, startdate, enddate,
 capname, caplong, caplat, col, row, xcoord, ycoord INTO cshapesgrid1946 FROM
 priogrid_land, cshapeswdate WHERE ST_Intersects(priogrid_land.cell,
 cshapeswdate.geom) AND cshapeswdate.gwsyear =1946 AND cshapeswdate.gweyear
 =1946 AND cshapeswdate.startdate = '1946/1/1';

 ALTER TABLE cshapesgrid1946 ADD COLUMN area float;

 UPDATE cshapesgrid1946 SET area = ST_Area(cshapesgrid1946.geom);

 DROP TABLE IF EXISTS pg1946;

 SELECT * INTO pg1946 FROM cshapesgrid1946 a WHERE a.area IN (SELECT
 MAX(area) FROM cshapesgrid1946 b GROUP BY divider);

 CREATE INDEX idx_pg1946 ON pg1946 USING gist(geom);


 Best,
 Andreas

 ___
 postgis-users mailing list
 postgis-users@postgis.refractions.net
 http://postgis.refractions.net/mailman/listinfo/postgis-users



___
postgis-users mailing list
postgis-users@postgis.refractions.net
http://postgis.refractions.net/mailman/listinfo/postgis-users


Re: [postgis-users] Increase query performance

2011-03-08 Thread Paragon Corporation
Andreas,
 
Try ST_SimplifyPreserveTolerance.  The ST_Simplify often simplifies to
nothing or close.
 
Like I mentioned in the other post, it could be some huge geometries causing
your problems.  You don't necessarily want to simplify everything.
 
We usually do a conditional simplify like
 
CASE WHEN ST_NPoints(geom)  3000 THEN ST_SimplyfyPreserveTopology(geom,0.1)
ELSE geom END
 
 
It probably wouldn't hurt to do a max check on your tables or a count to see
how many have more than n number of points.
 
SELECT MAX(ST_NPoints(geom)) As biggest, COUNT(CASE WHEN ST_NPoints(geom) 
3000 THEN 1 ELSE NULL END) as cnt_big_geoms
FROM yourtable
 
To get a sense of the largest geometry you are dealing with.
 
Hope that helps,
Regina
http://www.postgis.us

  _  

From: Andreas Forø Tollefsen [mailto:andrea...@gmail.com] 
Sent: Tuesday, March 08, 2011 12:02 PM
To: PostGIS Users Discussion
Cc: Paragon Corporation
Subject: Re: [postgis-users] Increase query performance


After a suggestion from pgsql_performance i tried with ST_Simplify to speed
things up. 
However this gives me a:
NOTICE: ptarray_simplify returned a 2 pts array

Then server connection terminates.


Like this:  

SELECT ST_Intersection(priogrid_land.cell,
ST_Simplify(cshapeswdate.geom,0.1)) AS geom, 
priogrid_land.gid AS divider, gwcode, gwsyear, gweyear, startdate, enddate,
capname, caplong, caplat, col, row, xcoord, ycoord 
FROM priogrid_land, cshapeswdate WHERE ST_Intersects(priogrid_land.cell,
ST_Simplify(cshapeswdate.geom,0.1)) AND cshapeswdate.gwsyear =1946 AND
cshapeswdate.gweyear =1946 AND cshapeswdate.startdate = '1946/1/1';


2011/3/8 Andreas Forø Tollefsen andrea...@gmail.com


Results from the EXPLAIN ANALYZE: 

Nested Loop  (cost=0.00..1189.72 rows=3941 width=87790) (actual
time=7.091..2524830.264 rows=54145 loops=1)
  Join Filter: _st_intersects(priogrid_land.cell, cshapeswdate.geom)
  -  Seq Scan on cshapeswdate  (cost=0.00..16.23 rows=22 width=87304)
(actual time=0.011..0.542 rows=72 loops=1)
Filter: ((gwsyear = 1946::numeric) AND (gweyear = 1946::numeric)
AND (startdate = '1946-01-01'::date))
  -  Index Scan using idx_priogrid_land_cell on priogrid_land
(cost=0.00..8.29 rows=1 width=486) (actual time=3.026..30.152 rows=1338
loops=72)
Index Cond: (priogrid_land.cell  cshapeswdate.geom)
Total runtime: 2524889.630 ms


2011/3/8 Andreas Forø Tollefsen andrea...@gmail.com 


Hi, 
This query takes about 41 minutes per year. Doing this for every year from
1946 to 2008 takes a lot of time.
The grid consists of 64818 cell polygons, while the country table has about
210 polygons.

I will add indexes on year and date and try again. An do an explain analyze.

Cheers,
Andreas


2011/3/8 Paragon Corporation l...@pcorp.us


Andreas,
I don't see anything glaringly wrong with your query, but hard to tell
without seeing an explain plan or what you mean by takes a lot of time.  Is
a lot of time minutes, hours, days?
How many records are we talking about here?  What's the max number of points
you have in any geometry.  Often times its just one huge mega geometry with
a 100,000 points or more slowing everything down.
 
Do you have indexes on your year fields and date columns?
 
Hope that helps,
Regina 
http://www.postgis.us
 

  _  

From: postgis-users-boun...@postgis.refractions.net
[mailto:postgis-users-boun...@postgis.refractions.net] On Behalf Of Andreas
Forø Tollefsen
Sent: Tuesday, March 08, 2011 4:15 AM
To: PostGIS Users Discussion
Subject: [postgis-users] Increase query performance


Hi all, 

Another question on postgis query performance. I did some discussion on the
pgsql performance list on how i could increase the performance on my server.
I did manage to increase the transactions per second, but came to the
conclusion that this did not help the ST_Intersection query which i was
trying to speed up.

Any suggestions on how to speed up this query? Basically i want to create an
intersection between a vector grid and country shapefiles. Then calculate
the area of all the polygons in the intersected table, and finally selecting
the country code for each cell which represents the largest area within each
cell. I does work as it is, but i would like to increase the speed. The
query which takes a lot of time is the ST_Intersection.

PostgreSQL 8.4.7 on i686-pc-linux-gnu, compiled by GCC gcc-4.4.real
(Ubuntu/Linaro 4.4.4-14ubuntu5) 4.4.5, 32-bit

POSTGIS=2.0.0SVN GEOS=3.2.2-CAPI-1.6.2 PROJ=Rel. 4.7.1, 23 September
2009 LIBXML=2.7.7 USE_STATS


DROP TABLE IF EXISTS cshapesgrid1946;

SELECT ST_Intersection(priogrid_land.cell, cshapeswdate.geom) AS geom,
priogrid_land.gid AS divider, gwcode, gwsyear, gweyear, startdate, enddate,
capname, caplong, caplat, col, row, xcoord, ycoord INTO cshapesgrid1946 FROM
priogrid_land, cshapeswdate WHERE ST_Intersects(priogrid_land.cell,
cshapeswdate.geom) AND cshapeswdate.gwsyear =1946 AND cshapeswdate.gweyear
=1946 AND cshapeswdate.startdate = '1946/1/1';

ALTER TABLE cshapesgrid1946 ADD COLUMN