Re: [postgis-users] No index usage on geography query plan?
On Mon, May 31, 2010 at 10:27 PM, Paragon Corporation l...@pcorp.us wrote: On that thought. Remember how geometry intersects performance significantly increased with prepared geometry algorithm, are we using that same kind of prepared geometry logic for geography. No, we are not. The algorithm is currently brute force O(nm). My ideas are very similar to the prepared geometry approach, but with some changes necessary to work in spherical land (like, what are the index leaves? not rectangles...) P. ___ postgis-users mailing list postgis-users@postgis.refractions.net http://postgis.refractions.net/mailman/listinfo/postgis-users
Re: [postgis-users] No index usage on geography query plan?
Perhaps it could use an in-memory bounding prism index? You're using a disk-based one used for geography types, right? Paul Ramsey wrote: On Mon, May 31, 2010 at 10:27 PM, Paragon Corporation l...@pcorp.us wrote: On that thought. Remember how geometry intersects performance significantly increased with prepared geometry algorithm, are we using that same kind of prepared geometry logic for geography. No, we are not. The algorithm is currently brute force O(nm). My ideas are very similar to the prepared geometry approach, but with some changes necessary to work in spherical land (like, what are the index leaves? not rectangles...) P. ___ postgis-users mailing list postgis-users@postgis.refractions.net http://postgis.refractions.net/mailman/listinfo/postgis-users -- Martin Davis Senior Technical Architect Refractions Research, Inc. (250) 383-3022 ___ postgis-users mailing list postgis-users@postgis.refractions.net http://postgis.refractions.net/mailman/listinfo/postgis-users
Re: [postgis-users] No index usage on geography query plan?
Bounding cube you mean? Could do. The problem with that is, it turns out that computing the bounding cube is actually the most computationally intensive part of the geography code! So we don't really want to do that for every edge of a shape. My approach is to use bounding circles for each edge. I haven't given any thought to whether they would be useful in the large for disk-based indexes (and we have, in any event, a perfectly workable bounding cube implementation), but for the in-memory problem they seem to make sense. P. On Tue, Jun 1, 2010 at 11:38 AM, Martin Davis mbda...@refractions.net wrote: Perhaps it could use an in-memory bounding prism index? You're using a disk-based one used for geography types, right? Paul Ramsey wrote: On Mon, May 31, 2010 at 10:27 PM, Paragon Corporation l...@pcorp.us wrote: On that thought. Remember how geometry intersects performance significantly increased with prepared geometry algorithm, are we using that same kind of prepared geometry logic for geography. No, we are not. The algorithm is currently brute force O(nm). My ideas are very similar to the prepared geometry approach, but with some changes necessary to work in spherical land (like, what are the index leaves? not rectangles...) P. ___ postgis-users mailing list postgis-users@postgis.refractions.net http://postgis.refractions.net/mailman/listinfo/postgis-users -- Martin Davis Senior Technical Architect Refractions Research, Inc. (250) 383-3022 ___ postgis-users mailing list postgis-users@postgis.refractions.net http://postgis.refractions.net/mailman/listinfo/postgis-users ___ postgis-users mailing list postgis-users@postgis.refractions.net http://postgis.refractions.net/mailman/listinfo/postgis-users
Re: [postgis-users] No index usage on geography query plan?
Paul Ramsey wrote: FYI, it is not the index that is slower, it is the op. The index is actually (surprisingly) faster. Wow, that's unexpected. Do you know why this is? Is there a message there for the current 2D GIST index? -- Martin Davis Senior Technical Architect Refractions Research, Inc. (250) 383-3022 ___ postgis-users mailing list postgis-users@postgis.refractions.net http://postgis.refractions.net/mailman/listinfo/postgis-users
Re: [postgis-users] No index usage on geography query plan?
Nicholas, I fear that may be the way it is. The intersects functionality of geography is slower than intersects of geometry because it piggy backs on the distance function rather than using an intersection matrix. With the index scan you are left with 10,347 records to check via the slower distance function. One thought is to create an ST_Intersects that uses the geometry _ST_Intersects instead of geography _ST_Distance you might get better speed or you might not. CREATE OR REPLACE FUNCTION st_intersectswitht(geography, geography) RETURNS boolean AS $$SELECT $1 $2 AND _ST_Intersects(ST_Transform(geometry($1), _ST_BestSRID($1)), ST_Transform(geometry($2), _ST_BestSRID($1)))$$ LANGUAGE 'sql' IMMUTABLE; _ From: postgis-users-boun...@postgis.refractions.net [mailto:postgis-users-boun...@postgis.refractions.net] On Behalf Of Nicholas Bower Sent: Sunday, May 30, 2010 7:38 PM To: PostGIS Users Discussion Subject: Re: [postgis-users] No index usage on geography query plan? Well the index says it is being used, however I'm still quite suspicious because of performance results below. I attach 3 versions of a simply query (Geography ST_Intersects, Geometry ST_Intersects, Geography ) which is a simple square ROI intersection over 150k rows, each having a single polygon around 50-80 verticies. 1) Geography ST_Intersects gives 13s :-( wastac=# explain analyze SELECT count(1) AS count_1 wastac-# FROM wastac.t_swath_metadata wastac-# WHERE wastac.t_swath_metadata.quicklook = True wastac-# AND ST_Intersects(swath_bounding,ST_GeographyFromText('SRID=4326;POLYGON((104.76 5625 -39.0277188402,131.30859375 -39.0277188402,131.30859375 -15.7076627696,104.765625 -15.7076627696,104.765625 -39.0277188402))')); Aggregate (cost=13556.17..13556.18 rows=1 width=0) (actual time=12886.056..12886.057 rows=1 loops=1) - Bitmap Heap Scan on t_swath_metadata (cost=506.07..13554.65 rows=607 width=0) (actual time=17.168..12883.162 rows=8462 loops=1) Recheck Cond: (swath_bounding '010320E610010005315A402F127C4A8C8343C0E 06960402F127C4A8C8343C00 000E0696040BD0E48C6526A2FC000315A40BD0E48C6526A2FC000315 A402F127C4A8C8343C0'::geography) Filter: (quicklook AND (_st_distance(swath_bounding, '010320E610010005315A402F127C4A8C8343C0E 0696 0402F127C4A8C8343C0E0696040BD0E48C6526A2FC000315A40BD0E48C65 26A2FC000315A402F127C4A8C8343C0'::geography, 0::double pre cision, false) 1e-05::double precision)) - Bitmap Index Scan on t_swath_metadata_swath_bounding_key (cost=0.00..505.91 rows=10347 width=0) (actual time=8.148..8.148 rows=1 4261 loops=1) Index Cond: (swath_bounding '010320E610010005315A402F127C4A8C8343C0E 06960402F127C4A8C834 3C0E0696040BD0E48C6526A2FC000315A40BD0E48C6526A2FC00 0315A402F127C4A8C8343C0'::geography) Total runtime: 12886.287 ms (7 rows) 2) Geometry ST_Intersects gives 1s :-) wastac=# explain analyze SELECT count(1) AS count_1 wastac-# FROM wastac.t_swath_metadata_old wastac-# WHERE quicklook = True wastac-# AND ST_Intersects(swath_bounding,ST_GeomFromText('POLYGON((104.765625 -39.0277188402,131.30859375 -39.0277188402,131.30859375 -15.7076627696,104.765625 -15.7076627696,104.765625 -39.0277188402))', -1)); Aggregate (cost=9505.13..9505.14 rows=1 width=0) (actual time=95.681..95.682 rows=1 loops=1) - Bitmap Heap Scan on t_swath_metadata_old (cost=506.77..9503.27 rows=745 width=0) (actual time=4.198..93.366 rows=7274 loops=1) Recheck Cond: (swath_bounding '010300010005315A402F127C4A8C8343C0E06960402 F127C4A8C8343C0E 0696040BD0E48C6526A2FC000315A40BD0E48C6526A2FC000315A402F127 C4A8C8343C0'::geometry) Filter: (quicklook AND _st_intersects(swath_bounding, '010300010005315A402F127C4A8C8343C0E06960402 F12 7C4A8C8343C0E0696040BD0E48C6526A2FC000315A40BD0E48C6526A2FC0 00315A402F127C4A8C8343C0'::geometry)) - Bitmap Index Scan on t_swath_metadata_old_swath_bounding_key (cost=0.00..506.58 rows=16840 width=0) (actual time=3.557..3.557 ro ws=9020 loops=1) Index Cond: (swath_bounding '010300010005315A402F127C4A8C8343C0E06960402 F127C4A8C8343C00 000E0696040BD0E48C6526A2FC000315A40BD0E48C6526A2FC000315A402 F127C4A8C8343C0'::geometry) Total runtime: 95.757 ms (7 rows) 3) Geography bounding box 1s: wastac=# explain analyze SELECT count(1) AS count_1 wastac-# FROM wastac.t_swath_metadata wastac-# WHERE wastac.t_swath_metadata.quicklook = True wastac-# AND swath_bounding ST_GeographyFromText('SRID=4326;POLYGON((104.765625 -39.0277188402,131.30859375 -39.0277188402,131.30859375 -15.7076627696,104.765625 -15.7076627696,104.765625 -39.0277188402))'); Aggregate (cost=10948.03
Re: [postgis-users] No index usage on geography query plan?
Actually it's slower because the calculations require lots of transcendental math. Anyhow, it's slower. If anyone wants a quote on speed improvements, I'm happy to provide one, I have some good ideas about how to speed things up with some better data structures and caching. P. On Mon, May 31, 2010 at 2:48 PM, Paragon Corporation l...@pcorp.us wrote: Nicholas, I fear that may be the way it is. The intersects functionality of geography is slower than intersects of geometry because it piggy backs on the distance function rather than using an intersection matrix. With the index scan you are left with 10,347 records to check via the slower distance function. One thought is to create an ST_Intersects that uses the geometry _ST_Intersects instead of geography _ST_Distance you might get better speed or you might not. CREATE OR REPLACE FUNCTION st_intersectswitht(geography, geography) RETURNS boolean AS $$SELECT $1 $2 AND _ST_Intersects(ST_Transform(geometry($1), _ST_BestSRID($1)), ST_Transform(geometry($2), _ST_BestSRID($1)))$$ LANGUAGE 'sql' IMMUTABLE; From: postgis-users-boun...@postgis.refractions.net [mailto:postgis-users-boun...@postgis.refractions.net] On Behalf Of Nicholas Bower Sent: Sunday, May 30, 2010 7:38 PM To: PostGIS Users Discussion Subject: Re: [postgis-users] No index usage on geography query plan? Well the index says it is being used, however I'm still quite suspicious because of performance results below. I attach 3 versions of a simply query (Geography ST_Intersects, Geometry ST_Intersects, Geography ) which is a simple square ROI intersection over 150k rows, each having a single polygon around 50-80 verticies. 1) Geography ST_Intersects gives 13s :-( wastac=# explain analyze SELECT count(1) AS count_1 wastac-# FROM wastac.t_swath_metadata wastac-# WHERE wastac.t_swath_metadata.quicklook = True wastac-# AND ST_Intersects(swath_bounding,ST_GeographyFromText('SRID=4326;POLYGON((104.765625 -39.0277188402,131.30859375 -39.0277188402,131.30859375 -15.7076627696,104.765625 -15.7076627696,104.765625 -39.0277188402))')); Aggregate (cost=13556.17..13556.18 rows=1 width=0) (actual time=12886.056..12886.057 rows=1 loops=1) - Bitmap Heap Scan on t_swath_metadata (cost=506.07..13554.65 rows=607 width=0) (actual time=17.168..12883.162 rows=8462 loops=1) Recheck Cond: (swath_bounding '010320E610010005315A402F127C4A8C8343C0E06960402F127C4A8C8343C00 000E0696040BD0E48C6526A2FC000315A40BD0E48C6526A2FC000315A402F127C4A8C8343C0'::geography) Filter: (quicklook AND (_st_distance(swath_bounding, '010320E610010005315A402F127C4A8C8343C0E0696 0402F127C4A8C8343C0E0696040BD0E48C6526A2FC000315A40BD0E48C6526A2FC000315A402F127C4A8C8343C0'::geography, 0::double pre cision, false) 1e-05::double precision)) - Bitmap Index Scan on t_swath_metadata_swath_bounding_key (cost=0.00..505.91 rows=10347 width=0) (actual time=8.148..8.148 rows=1 4261 loops=1) Index Cond: (swath_bounding '010320E610010005315A402F127C4A8C8343C0E06960402F127C4A8C834 3C0E0696040BD0E48C6526A2FC000315A40BD0E48C6526A2FC000315A402F127C4A8C8343C0'::geography) Total runtime: 12886.287 ms (7 rows) 2) Geometry ST_Intersects gives 1s :-) wastac=# explain analyze SELECT count(1) AS count_1 wastac-# FROM wastac.t_swath_metadata_old wastac-# WHERE quicklook = True wastac-# AND ST_Intersects(swath_bounding,ST_GeomFromText('POLYGON((104.765625 -39.0277188402,131.30859375 -39.0277188402,131.30859375 -15.7076627696,104.765625 -15.7076627696,104.765625 -39.0277188402))', -1)); Aggregate (cost=9505.13..9505.14 rows=1 width=0) (actual time=95.681..95.682 rows=1 loops=1) - Bitmap Heap Scan on t_swath_metadata_old (cost=506.77..9503.27 rows=745 width=0) (actual time=4.198..93.366 rows=7274 loops=1) Recheck Cond: (swath_bounding '010300010005315A402F127C4A8C8343C0E06960402F127C4A8C8343C0E 0696040BD0E48C6526A2FC000315A40BD0E48C6526A2FC000315A402F127C4A8C8343C0'::geometry) Filter: (quicklook AND _st_intersects(swath_bounding, '010300010005315A402F127C4A8C8343C0E06960402F12 7C4A8C8343C0E0696040BD0E48C6526A2FC000315A40BD0E48C6526A2FC000315A402F127C4A8C8343C0'::geometry)) - Bitmap Index Scan on t_swath_metadata_old_swath_bounding_key (cost=0.00..506.58 rows=16840 width=0) (actual time=3.557..3.557 ro ws=9020 loops=1) Index Cond: (swath_bounding '010300010005315A402F127C4A8C8343C0E06960402F127C4A8C8343C00 000E0696040BD0E48C6526A2FC000315A40BD0E48C6526A2FC000315A402F127C4A8C8343C0'::geometry) Total runtime: 95.757 ms
Re: [postgis-users] No index usage on geography query plan?
In the meantime perhaps someone could add a warning to the docs that geography indexes are an order of magnitude slower than geometries for intersections? It's pretty important stuff for anyone appraising a migration. Actually I was after containment (db polygons enclose search ROI) but that's a feature yet to come I know. No complaints about on the basis it's free, but people have to admit 10s for intersecting 150k polygons is not stellar given how we've been spoiled with geometries in the past. Anyway, maybe a warning is in order for people considering moving across what do you think? On 1 June 2010 07:00, Paul Ramsey pram...@opengeo.org wrote: Actually it's slower because the calculations require lots of transcendental math. Anyhow, it's slower. If anyone wants a quote on speed improvements, I'm happy to provide one, I have some good ideas about how to speed things up with some better data structures and caching. P. On Mon, May 31, 2010 at 2:48 PM, Paragon Corporation l...@pcorp.us wrote: Nicholas, I fear that may be the way it is. The intersects functionality of geography is slower than intersects of geometry because it piggy backs on the distance function rather than using an intersection matrix. With the index scan you are left with 10,347 records to check via the slower distance function. One thought is to create an ST_Intersects that uses the geometry _ST_Intersects instead of geography _ST_Distance you might get better speed or you might not. CREATE OR REPLACE FUNCTION st_intersectswitht(geography, geography) RETURNS boolean AS $$SELECT $1 $2 AND _ST_Intersects(ST_Transform(geometry($1), _ST_BestSRID($1)), ST_Transform(geometry($2), _ST_BestSRID($1)))$$ LANGUAGE 'sql' IMMUTABLE; From: postgis-users-boun...@postgis.refractions.net [mailto:postgis-users-boun...@postgis.refractions.net] On Behalf Of Nicholas Bower Sent: Sunday, May 30, 2010 7:38 PM To: PostGIS Users Discussion Subject: Re: [postgis-users] No index usage on geography query plan? Well the index says it is being used, however I'm still quite suspicious because of performance results below. I attach 3 versions of a simply query (Geography ST_Intersects, Geometry ST_Intersects, Geography ) which is a simple square ROI intersection over 150k rows, each having a single polygon around 50-80 verticies. 1) Geography ST_Intersects gives 13s :-( wastac=# explain analyze SELECT count(1) AS count_1 wastac-# FROM wastac.t_swath_metadata wastac-# WHERE wastac.t_swath_metadata.quicklook = True wastac-# AND ST_Intersects(swath_bounding,ST_GeographyFromText('SRID=4326;POLYGON((104.765625 -39.0277188402,131.30859375 -39.0277188402,131.30859375 -15.7076627696,104.765625 -15.7076627696,104.765625 -39.0277188402))')); Aggregate (cost=13556.17..13556.18 rows=1 width=0) (actual time=12886.056..12886.057 rows=1 loops=1) - Bitmap Heap Scan on t_swath_metadata (cost=506.07..13554.65 rows=607 width=0) (actual time=17.168..12883.162 rows=8462 loops=1) Recheck Cond: (swath_bounding '010320E610010005315A402F127C4A8C8343C0E06960402F127C4A8C8343C00 000E0696040BD0E48C6526A2FC000315A40BD0E48C6526A2FC000315A402F127C4A8C8343C0'::geography) Filter: (quicklook AND (_st_distance(swath_bounding, '010320E610010005315A402F127C4A8C8343C0E0696 0402F127C4A8C8343C0E0696040BD0E48C6526A2FC000315A40BD0E48C6526A2FC000315A402F127C4A8C8343C0'::geography, 0::double pre cision, false) 1e-05::double precision)) - Bitmap Index Scan on t_swath_metadata_swath_bounding_key (cost=0.00..505.91 rows=10347 width=0) (actual time=8.148..8.148 rows=1 4261 loops=1) Index Cond: (swath_bounding '010320E610010005315A402F127C4A8C8343C0E06960402F127C4A8C834 3C0E0696040BD0E48C6526A2FC000315A40BD0E48C6526A2FC000315A402F127C4A8C8343C0'::geography) Total runtime: 12886.287 ms (7 rows) 2) Geometry ST_Intersects gives 1s :-) wastac=# explain analyze SELECT count(1) AS count_1 wastac-# FROM wastac.t_swath_metadata_old wastac-# WHERE quicklook = True wastac-# AND ST_Intersects(swath_bounding,ST_GeomFromText('POLYGON((104.765625 -39.0277188402,131.30859375 -39.0277188402,131.30859375 -15.7076627696,104.765625 -15.7076627696,104.765625 -39.0277188402))', -1)); Aggregate (cost=9505.13..9505.14 rows=1 width=0) (actual time=95.681..95.682 rows=1 loops=1) - Bitmap Heap Scan on t_swath_metadata_old (cost=506.77..9503.27 rows=745 width=0) (actual time=4.198..93.366 rows=7274 loops=1) Recheck Cond: (swath_bounding '010300010005315A402F127C4A8C8343C0E06960402F127C4A8C8343C0E
Re: [postgis-users] No index usage on geography query plan?
FYI, it is not the index that is slower, it is the op. The index is actually (surprisingly) faster. P On May 31, 2010, at 5:06 PM, Nicholas Bower n...@petangent.net wrote: In the meantime perhaps someone could add a warning to the docs that geography indexes are an order of magnitude slower than geometries for intersections? It's pretty important stuff for anyone appraising a migration. Actually I was after containment (db polygons enclose search ROI) but that's a feature yet to come I know. No complaints about on the basis it's free, but people have to admit 10s for intersecting 150k polygons is not stellar given how we've been spoiled with geometries in the past. Anyway, maybe a warning is in order for people considering moving across what do you think? On 1 June 2010 07:00, Paul Ramsey pram...@opengeo.org wrote: Actually it's slower because the calculations require lots of transcendental math. Anyhow, it's slower. If anyone wants a quote on speed improvements, I'm happy to provide one, I have some good ideas about how to speed things up with some better data structures and caching. P. On Mon, May 31, 2010 at 2:48 PM, Paragon Corporation l...@pcorp.us wrote: Nicholas, I fear that may be the way it is. The intersects functionality of geography is slower than intersects of geometry because it piggy backs on the distance function rather than using an intersection matrix. With the index scan you are left with 10,347 records to check via the slower distance function. One thought is to create an ST_Intersects that uses the geometry _ST_Intersects instead of geography _ST_Distance you might get better speed or you might not. CREATE OR REPLACE FUNCTION st_intersectswitht(geography, geography) RETURNS boolean AS $$SELECT $1 $2 AND _ST_Intersects(ST_Transform(geometry($1), _ST_BestSRID($1)), ST_Transform(geometry($2), _ST_BestSRID($1)))$$ LANGUAGE 'sql' IMMUTABLE; From: postgis-users-boun...@postgis.refractions.net [mailto:postgis-users-boun...@postgis.refractions.net] On Behalf Of Nicholas Bower Sent: Sunday, May 30, 2010 7:38 PM To: PostGIS Users Discussion Subject: Re: [postgis-users] No index usage on geography query plan? Well the index says it is being used, however I'm still quite suspicious because of performance results below. I attach 3 versions of a simply query (Geography ST_Intersects, Geometry ST_Intersects, Geography ) which is a simple square ROI intersection over 150k rows, each having a single polygon around 50-80 verticies. 1) Geography ST_Intersects gives 13s :-( wastac=# explain analyze SELECT count(1) AS count_1 wastac-# FROM wastac.t_swath_metadata wastac-# WHERE wastac.t_swath_metadata.quicklook = True wastac-# AND ST_Intersects(swath_bounding,ST_GeographyFromText ('SRID=4326;POLYGON((104.765625 -39.0277188402,131.30859375 -39.0277188402,131.30859375 -15.7076627696,104.765625 -15.7076627696,104.765625 -39.0277188402))')); Aggregate (cost=13556.17..13556.18 rows=1 width=0) (actual time=12886.056..12886.057 rows=1 loops=1) - Bitmap Heap Scan on t_swath_metadata (cost=506.07..13554.65 rows=607 width=0) (actual time=17.168..12883.162 rows=8462 loops=1) Recheck Cond: (swath_bounding '010320E610010005315A402F127C4A8C8343C0E06960402F127C4A8C8343C00 000E0696040BD0E48C6526A2FC000315A40BD0E48C6526A2FC000315A402F127C4A8C8343C0' ::geography) Filter: (quicklook AND (_st_distance(swath_bounding, '010320E610010005315A402F127C4A8C8343C0E0696 0402F127C4A8C8343C0E0696040BD0E48C6526A2FC000315A40BD0E48C6526A2FC000315A402F127C4A8C8343C0' ::geography, 0::double pre cision, false) 1e-05::double precision)) - Bitmap Index Scan on t_swath_metadata_swath_bounding_key (cost=0.00..505.91 rows=10347 width=0) (actual time=8.148..8.148 rows=1 4261 loops=1) Index Cond: (swath_bounding '010320E610010005315A402F127C4A8C8343C0E06960402F127C4A8C834 3C0E0696040BD0E48C6526A2FC000315A40BD0E48C6526A2FC000315A402F127C4A8C8343C0' ::geography) Total runtime: 12886.287 ms (7 rows) 2) Geometry ST_Intersects gives 1s :-) wastac=# explain analyze SELECT count(1) AS count_1 wastac-# FROM wastac.t_swath_metadata_old wastac-# WHERE quicklook = True wastac-# AND ST_Intersects(swath_bounding,ST_GeomFromText('POLYGON((104.765625 -39.0277188402,131.30859375 -39.0277188402,131.30859375 -15.7076627696,104.765625 -15.7076627696,104.765625 -39.0277188402))', -1)); Aggregate (cost=9505.13..9505.14 rows=1 width=0) (actual time=95.681..95.682 rows=1 loops=1) - Bitmap Heap Scan on t_swath_metadata_old (cost=506.77..9503.27 rows=745 width=0) (actual time=4.198..93.366 rows=7274 loops=1) Recheck
Re: [postgis-users] No index usage on geography query plan?
Paul, On that thought. Remember how geometry intersects performance significantly increased with prepared geometry algorithm, are we using that same kind of prepared geometry logic for geography. Just thinking out loud that aside from the slower algorithm, for cases like these where thousands of records need to be checked by the non-index check, we are losing performance there too. I imagine that may be an even easier enhancement. Not sure it makes a difference in this particular case since Nicholas bounding constant geography is pretty simple. Nicholas -- as Paul stated -- its not the index that is orders of magnitude slower, its the secondary check. For most use cases (needle in a hay stack where you are trying to get rid of 1,000,000 records and check 100 or so candidates, the speed is pretty decent and not too far off from geometry). Decent once we fix the ST_Intersects minor bug that prevents the index from being used. Though we should have a caveat somewhere explaining in detail these scenarios. Hope that helps, Regina _ From: postgis-users-boun...@postgis.refractions.net [mailto:postgis-users-boun...@postgis.refractions.net] On Behalf Of Paul Ramsey Sent: Monday, May 31, 2010 9:20 PM To: PostGIS Users Discussion Subject: Re: [postgis-users] No index usage on geography query plan? FYI, it is not the index that is slower, it is the op. The index is actually (surprisingly) faster. P On May 31, 2010, at 5:06 PM, Nicholas Bower n...@petangent.net wrote: In the meantime perhaps someone could add a warning to the docs that geography indexes are an order of magnitude slower than geometries for intersections? It's pretty important stuff for anyone appraising a migration. Actually I was after containment (db polygons enclose search ROI) but that's a feature yet to come I know. No complaints about on the basis it's free, but people have to admit 10s for intersecting 150k polygons is not stellar given how we've been spoiled with geometries in the past. Anyway, maybe a warning is in order for people considering moving across what do you think? On 1 June 2010 07:00, Paul Ramsey mailto:pram...@opengeo.org pram...@opengeo.org wrote: Actually it's slower because the calculations require lots of transcendental math. Anyhow, it's slower. If anyone wants a quote on speed improvements, I'm happy to provide one, I have some good ideas about how to speed things up with some better data structures and caching. P. On Mon, May 31, 2010 at 2:48 PM, Paragon Corporation mailto:l...@pcorp.us l...@pcorp.us wrote: Nicholas, I fear that may be the way it is. The intersects functionality of geography is slower than intersects of geometry because it piggy backs on the distance function rather than using an intersection matrix. With the index scan you are left with 10,347 records to check via the slower distance function. One thought is to create an ST_Intersects that uses the geometry _ST_Intersects instead of geography _ST_Distance you might get better speed or you might not. CREATE OR REPLACE FUNCTION st_intersectswitht(geography, geography) RETURNS boolean AS $$SELECT $1 $2 AND _ST_Intersects(ST_Transform(geometry($1), _ST_BestSRID($1)), ST_Transform(geometry($2), _ST_BestSRID($1)))$$ LANGUAGE 'sql' IMMUTABLE; From: mailto:postgis-users-boun...@postgis.refractions.net postgis-users-boun...@postgis.refractions.net [mailto: mailto:postgis-users-boun...@postgis.refractions.net postgis-users-boun...@postgis.refractions.net] On Behalf Of Nicholas Bower Sent: Sunday, May 30, 2010 7:38 PM To: PostGIS Users Discussion Subject: Re: [postgis-users] No index usage on geography query plan? Well the index says it is being used, however I'm still quite suspicious because of performance results below. I attach 3 versions of a simply query (Geography ST_Intersects, Geometry ST_Intersects, Geography ) which is a simple square ROI intersection over 150k rows, each having a single polygon around 50-80 verticies. 1) Geography ST_Intersects gives 13s :-( wastac=# explain analyze SELECT count(1) AS count_1 wastac-# FROM wastac.t_swath_metadata wastac-# WHERE wastac.t_swath_metadata.quicklook = True wastac-# AND ST_Intersects(swath_bounding,ST_GeographyFromText('SRID=4326;POLYGON((104.76 5625 -39.0277188402,131.30859375 -39.0277188402,131.30859375 -15.7076627696,104.765625 -15.7076627696,104.765625 -39.0277188402))')); Aggregate (cost=13556.17..13556.18 rows=1 width=0) (actual time=12886.056..12886.057 rows=1 loops=1) - Bitmap Heap Scan on t_swath_metadata (cost=506.07..13554.65 rows=607 width=0) (actual time=17.168..12883.162 rows=8462 loops=1) Recheck Cond: (swath_bounding '010320E610010005315A402F127C4A8C8343C0E 06960402F127C4A8C8343C00 000E0696040BD0E48C6526A2FC000315A40BD0E48C6526A2FC000315 A402F127C4A8C8343C0'::geography
Re: [postgis-users] No index usage on geography query plan?
I understand your explanation now - it's why our alternative queries over a pre-calculated 20km grid from overlaying the large boundary approached geometry in spatial performance (but then we incur far worse overall performance because of the sheer number of features in the query). The reason we are querying geographies instead of geometries is because we have large-scale global coverages that cross IDL and get warped near poles. To me, querying large scale features is the whole reason for geography, and so ideally my scenario should reflect best case performance not worst for geography. Otherwise if I had small features (minimal warping - eg my 20km grid), I'd just continue as I have for years and use geometry operators and handle IDL wrapping by adding 360 onto any negative longitude. Not fancy but fast. On 1 June 2010 12:27, Paragon Corporation l...@pcorp.us wrote: Paul, On that thought. Remember how geometry intersects performance significantly increased with prepared geometry algorithm, are we using that same kind of prepared geometry logic for geography. Just thinking out loud that aside from the slower algorithm, for cases like these where thousands of records need to be checked by the non-index check, we are losing performance there too. I imagine that may be an even easier enhancement. Not sure it makes a difference in this particular case since Nicholas bounding constant geography is pretty simple. Nicholas -- as Paul stated -- its not the index that is orders of magnitude slower, its the secondary check. For most use cases (needle in a hay stack where you are trying to get rid of 1,000,000 records and check 100 or so candidates, the speed is pretty decent and not too far off from geometry). Decent once we fix the ST_Intersects minor bug that prevents the index from being used. Though we should have a caveat somewhere explaining in detail these scenarios. Hope that helps, Regina -- *From:* postgis-users-boun...@postgis.refractions.net [mailto: postgis-users-boun...@postgis.refractions.net] *On Behalf Of *Paul Ramsey *Sent:* Monday, May 31, 2010 9:20 PM *To:* PostGIS Users Discussion *Subject:* Re: [postgis-users] No index usage on geography query plan? FYI, it is not the index that is slower, it is the op. The index is actually (surprisingly) faster. P On May 31, 2010, at 5:06 PM, Nicholas Bower n...@petangent.net wrote: In the meantime perhaps someone could add a warning to the docs that geography indexes are an order of magnitude slower than geometries for intersections? It's pretty important stuff for anyone appraising a migration. Actually I was after containment (db polygons enclose search ROI) but that's a feature yet to come I know. No complaints about on the basis it's free, but people have to admit 10s for intersecting 150k polygons is not stellar given how we've been spoiled with geometries in the past. Anyway, maybe a warning is in order for people considering moving across what do you think? On 1 June 2010 07:00, Paul Ramsey pram...@opengeo.org pram...@opengeo.org wrote: Actually it's slower because the calculations require lots of transcendental math. Anyhow, it's slower. If anyone wants a quote on speed improvements, I'm happy to provide one, I have some good ideas about how to speed things up with some better data structures and caching. P. On Mon, May 31, 2010 at 2:48 PM, Paragon Corporation l...@pcorp.us l...@pcorp.us wrote: Nicholas, I fear that may be the way it is. The intersects functionality of geography is slower than intersects of geometry because it piggy backs on the distance function rather than using an intersection matrix. With the index scan you are left with 10,347 records to check via the slower distance function. One thought is to create an ST_Intersects that uses the geometry _ST_Intersects instead of geography _ST_Distance you might get better speed or you might not. CREATE OR REPLACE FUNCTION st_intersectswitht(geography, geography) RETURNS boolean AS $$SELECT $1 $2 AND _ST_Intersects(ST_Transform(geometry($1), _ST_BestSRID($1)), ST_Transform(geometry($2), _ST_BestSRID($1)))$$ LANGUAGE 'sql' IMMUTABLE; From: postgis-users-boun...@postgis.refractions.net postgis-users-boun...@postgis.refractions.net [mailto: postgis-users-boun...@postgis.refractions.net postgis-users-boun...@postgis.refractions.net] On Behalf Of Nicholas Bower Sent: Sunday, May 30, 2010 7:38 PM To: PostGIS Users Discussion Subject: Re: [postgis-users] No index usage on geography query plan? Well the index says it is being used, however I'm still quite suspicious because of performance results below. I attach 3 versions of a simply query (Geography ST_Intersects, Geometry ST_Intersects, Geography ) which is a simple square ROI intersection over 150k rows, each having
Re: [postgis-users] No index usage on geography query plan?
Well the index says it is being used, however I'm still quite suspicious because of performance results below. I attach 3 versions of a simply query (Geography ST_Intersects, Geometry ST_Intersects, Geography ) which is a simple square ROI intersection over 150k rows, each having a single polygon around 50-80 verticies. 1) Geography ST_Intersects gives 13s :-( wastac=# explain analyze SELECT count(1) AS count_1 wastac-# FROM wastac.t_swath_metadata wastac-# WHERE wastac.t_swath_metadata.quicklook = True wastac-# AND ST_Intersects(swath_bounding,ST_GeographyFromText('SRID=4326;POLYGON((104.765625 -39.0277188402,131.30859375 -39.0277188402,131.30859375 -15.7076627696,104.765625 -15.7076627696,104.765625 -39.0277188402))')); Aggregate (cost=13556.17..13556.18 rows=1 width=0) (actual time=12886.056..12886.057 rows=1 loops=1) - Bitmap Heap Scan on t_swath_metadata (cost=506.07..13554.65 rows=607 width=0) (actual time=17.168..12883.162 rows=8462 loops=1) Recheck Cond: (swath_bounding '010320E610010005315A402F127C4A8C8343C0E06960402F127C4A8C8343C00 000E0696040BD0E48C6526A2FC000315A40BD0E48C6526A2FC000315A402F127C4A8C8343C0'::geography) Filter: (quicklook AND (_st_distance(swath_bounding, '010320E610010005315A402F127C4A8C8343C0E0696 0402F127C4A8C8343C0E0696040BD0E48C6526A2FC000315A40BD0E48C6526A2FC000315A402F127C4A8C8343C0'::geography, 0::double pre cision, false) 1e-05::double precision)) - Bitmap Index Scan on t_swath_metadata_swath_bounding_key (cost=0.00..505.91 rows=10347 width=0) (actual time=8.148..8.148 rows=1 4261 loops=1) Index Cond: (swath_bounding '010320E610010005315A402F127C4A8C8343C0E06960402F127C4A8C834 3C0E0696040BD0E48C6526A2FC000315A40BD0E48C6526A2FC000315A402F127C4A8C8343C0'::geography) Total runtime: 12886.287 ms (7 rows) 2) Geometry ST_Intersects gives 1s :-) wastac=# explain analyze SELECT count(1) AS count_1 wastac-# FROM wastac.t_swath_metadata_old wastac-# WHERE quicklook = True wastac-# AND ST_Intersects(swath_bounding,ST_GeomFromText('POLYGON((104.765625 -39.0277188402,131.30859375 -39.0277188402,131.30859375 -15.7076627696,104.765625 -15.7076627696,104.765625 -39.0277188402))', -1)); Aggregate (cost=9505.13..9505.14 rows=1 width=0) (actual time=95.681..95.682 rows=1 loops=1) - Bitmap Heap Scan on t_swath_metadata_old (cost=506.77..9503.27 rows=745 width=0) (actual time=4.198..93.366 rows=7274 loops=1) Recheck Cond: (swath_bounding '010300010005315A402F127C4A8C8343C0E06960402F127C4A8C8343C0E 0696040BD0E48C6526A2FC000315A40BD0E48C6526A2FC000315A402F127C4A8C8343C0'::geometry) Filter: (quicklook AND _st_intersects(swath_bounding, '010300010005315A402F127C4A8C8343C0E06960402F12 7C4A8C8343C0E0696040BD0E48C6526A2FC000315A40BD0E48C6526A2FC000315A402F127C4A8C8343C0'::geometry)) - Bitmap Index Scan on t_swath_metadata_old_swath_bounding_key (cost=0.00..506.58 rows=16840 width=0) (actual time=3.557..3.557 ro ws=9020 loops=1) Index Cond: (swath_bounding '010300010005315A402F127C4A8C8343C0E06960402F127C4A8C8343C00 000E0696040BD0E48C6526A2FC000315A40BD0E48C6526A2FC000315A402F127C4A8C8343C0'::geometry) Total runtime: 95.757 ms (7 rows) 3) Geography bounding box 1s: wastac=# explain analyze SELECT count(1) AS count_1 wastac-# FROM wastac.t_swath_metadata wastac-# WHERE wastac.t_swath_metadata.quicklook = True wastac-# AND swath_bounding ST_GeographyFromText('SRID=4326;POLYGON((104.765625 -39.0277188402,131.30859375 -39.0277188402,131.30859375 -15.7076627696,104.765625 -15.7076627696,104.765625 -39.0277188402))'); Aggregate (cost=10948.03..10948.04 rows=1 width=0) (actual time=30.583..30.584 rows=1 loops=1) - Bitmap Heap Scan on t_swath_metadata (cost=506.38..10943.48 rows=1820 width=0) (actual time=8.884..27.786 rows=9806 loops=1) Recheck Cond: (swath_bounding '010320E610010005315A402F127C4A8C8343C0E06960402F127C4A8C8343C00 000E0696040BD0E48C6526A2FC000315A40BD0E48C6526A2FC000315A402F127C4A8C8343C0'::geography) Filter: quicklook - Bitmap Index Scan on t_swath_metadata_swath_bounding_key (cost=0.00..505.92 rows=10348 width=0) (actual time=7.403..7.403 rows=1 4263 loops=1) Index Cond: (swath_bounding '010320E610010005315A402F127C4A8C8343C0E06960402F127C4A8C834 3C0E0696040BD0E48C6526A2FC000315A40BD0E48C6526A2FC000315A402F127C4A8C8343C0'::geography) Total runtime: 30.637 ms (7 rows) On 28 May 2010 16:31, Paragon Corporation l...@pcorp.us wrote: Okay I think the fix is a really
Re: [postgis-users] No index usage on geography query plan?
Okay I think the fix is a really simple one Change your ST_Intersects function to this and see if it behaves right CREATE OR REPLACE FUNCTION st_intersects(geography, geography) RETURNS boolean AS 'SELECT $1 $2 AND _ST_Distance($1, $2, 0.0, false) 0.1' LANGUAGE 'sql' IMMUTABLE COST 100; It should no longer need the help to use the index. _ From: postgis-users-boun...@postgis.refractions.net [mailto:postgis-users-boun...@postgis.refractions.net] On Behalf Of Nicholas Bower Sent: Thursday, May 27, 2010 6:24 PM To: PostGIS Users Discussion Subject: Re: [postgis-users] No index usage on geography query plan? Right you are. wastac= explain analyze select count(*) from wastac.t_tile_geometry where border ST_GeographyFromText('SRID=4326;POLYGON((116.751709 -31.381779,116.883545 -32.676373,114.741211 -32.796510,114.796143 -31.316101,116.751709 -31.381779))') wastac- AND ST_Intersects(border, ST_GeographyFromText('SRID=4326;POLYGON((116.751709 -31.381779,116.883545 -32.676373,114.741211 -32.796510,114.796143-31.316101,116.751709 -31.381779))')); Aggregate (cost=30.48..30.49 rows=1 width=0) (actual time=11.459..11.459 rows=1 loops=1) - Index Scan using t_tile_geometry_border_key on t_tile_geometry (cost=0.00..30.47 rows=2 width=0) (actual time=0.697..11.416 rows=95 loops=1) Index Cond: (border '010320E61001000500F8C610001C305D40B24CBF44BC613FC0D6E253008 C385D40DE72F563935640C0DF1B430070AF5C40EE77280AF4654 0C01EFCC401F4B25C409F3BC1FEEB503FC0F8C610001C305D40B24CBF44BC613FC0'::geogra phy) Filter: st_intersects(border, '010320E61001000500F8C610001C305D40B24CBF44BC613FC0D6E253008 C385D40DE72F563935640C0DF1B430070AF5C40EE7728 0AF46540C01EFCC401F4B25C409F3BC1FEEB503FC0F8C610001C305D40B24CBF44BC613FC0': :geography) Total runtime: 11.537 ms (5 rows) In case it's relevant this was how I created the geography column from a geometry one - I noticed the Polygon restriction had to be added in the last statement (ie you don't get it for free from the cast in the first statement). create table wastac.t_tile_geometry as select *, geography(ST_SetSRID(border, 4326)) as border_new, geography(ST_SetSRID(centre, 4326)) as centre_new from wastac.t_tile_geometry; alter table wastac.t_tile_geometry_new drop column border; alter table wastac.t_tile_geometry_new rename column border_new to border; alter table wastac.t_tile_geometry alter column border set data type geography(Polygon,4326); On 27 May 2010 15:42, Paragon Corporation l...@pcorp.us wrote: Nick, Okay we are seeing the same issue with our fastfoods data even with smaller windows. I think the clue is the plan here. The ST_Intersects geography function seems to be treated as a blackbox rather than a transparent function composed of and _ST_Distance The magic of the geometry ST_Intersects is that the query plan can see into the function and decomposes it into + _ST_Intersects For some reason, that's just not happening with geography. Maybe Paul or Mark have a clue why that is. So I bet this uses an index and is much faster select count(*) from wastac.t_tile_geometry where border ST_GeographyFromText('SRID=4326;POLYGON((116.751709 -31.381779,116.883545 -32.676373,114.741211 -32.796510,114.796143 -31.316101,116.751709 -31.381779))') AND ST_Intersects(border, ST_GeographyFromText('SRID=4326;POLYGON((116.751709 -31.381779,116.883545 -32.676373,114.741211 -32.796510,114.796143-31.316101,116.751709 -31.381779))')); Thanks Regina and Leo h http://www.postgis.us ttp://www.postgis.us _ From: postgis-users-boun...@postgis.refractions.net [mailto:postgis-users-boun...@postgis.refractions.net] On Behalf Of Nicholas Bower Sent: Wednesday, May 26, 2010 6:42 PM To: PostGIS Users Discussion Subject: Re: [postgis-users] No index usage on geography query plan? That does look like a pretty huge bounding polygon, but the geography we agree should still be using the spatial index, so probably making the index cost higher than it should Fyi the border values are are simply composed of a regular 20km grid of ajoining polygon squares covering Australia (about 1.3M tiles). Incidentally, with the border of tiles being square, the bounding box *is* the shape I guess. It's not relevant, but the query returns a count 95 of these 20km tiles. We're dealing with satellite data - the datasets themselves have far bigger coverages (different table entirely), but this is why you've noticed the region of interest in the query is large. Nick, Can you just for contrast, try to force it to use the index by doing set enable_seqscan = off; explain analyze select count(*) from wastac.t_tile_geometry where ST_Intersects(border, ST_GeographyFromText('SRID=4326;POLYGON((116.751709 -31.381779,116.883545 -32.676373,114.741211 -32.796510,114.796143 -31.316101,116.751709 -31.381779))')); Strange - no difference. wastac= set enable_seqscan = off; SET wastac
Re: [postgis-users] No index usage on geography query plan?
Paragon Corporation wrote: Mark, He did include it in an earlier email wastac= explain analyze select count(*) from wastac.t_tile_geometry where ST_Intersects(border, ST_GeographyFromText('SRID=4326;POLYGON((116.751709 -31.381779,116.883545 -32.676373,114.741211 -32.796510,114.796143 -31.316101,116.751709 -31.381779))')); Aggregate (cost=364056.80..364056.81 rows=1 width=0) (actual time=8854.932..8854.933 rows=1 loops=1) - Seq Scan on t_tile_geometry (cost=0.00..362993.14 rows=425463 width=0) (actual time=2017.629..8854.879 rows=95 loops=1) Filter: st_intersects(border, '010320E61001000500F8C610001C305D40B24CBF44BC613FC0D6E253008 C385D40DE72F563935640C0DF1B430070AF5C40EE77280A F46540C01EFCC401F4B25C409F3BC1FEEB503FC0F8C610001C305D40B24CBF44BC613FC0'::g eography) Total runtime: 8854.976 ms (4 rows) Ooops - looks like I missed the query switch somewhere in the thread ;) ATB, Mark. -- Mark Cave-Ayland - Senior Technical Architect PostgreSQL - PostGIS Sirius Corporation plc - control through freedom http://www.siriusit.co.uk t: +44 870 608 0063 Sirius Labs: http://www.siriusit.co.uk/labs ___ postgis-users mailing list postgis-users@postgis.refractions.net http://postgis.refractions.net/mailman/listinfo/postgis-users
Re: [postgis-users] No index usage on geography query plan?
Paragon Corporation wrote: Nick, Okay we are seeing the same issue with our fastfoods data even with smaller windows. I think the clue is the plan here. The ST_Intersects geography function seems to be treated as a blackbox rather than a transparent function composed of and _ST_Distance The magic of the geometry ST_Intersects is that the query plan can see into the function and decomposes it into + _ST_Intersects Right. I'd quite like to see the explain analyze plan generated when performing the ST_Intersects() substitution manually, e.g. explain analyze select count(*) from wastac.t_tile_geometry where border ST_GeographyFromText('SRID=4326;POLYGON((116.751709 -31.381779,116.883545 -32.676373,114.741211 -32.796510,114.796143 -31.316101,116.751709 -31.381779))') AND _ST_Distance(border, ST_GeographyFromText('SRID=4326;POLYGON((116.751709 -31.381779,116.883545 -32.676373,114.741211 -32.796510,114.796143 -31.316101,116.751709 -31.381779))'), 0.0, false) 0.1; This should give us a better idea as to why this particular plan is being chosen. ATB, Mark. -- Mark Cave-Ayland - Senior Technical Architect PostgreSQL - PostGIS Sirius Corporation plc - control through freedom http://www.siriusit.co.uk t: +44 870 608 0063 Sirius Labs: http://www.siriusit.co.uk/labs ___ postgis-users mailing list postgis-users@postgis.refractions.net http://postgis.refractions.net/mailman/listinfo/postgis-users
Re: [postgis-users] No index usage on geography query plan?
Right you are. wastac= explain analyze select count(*) from wastac.t_tile_geometry where border ST_GeographyFromText('SRID=4326;POLYGON((116.751709 -31.381779,116.883545 -32.676373,114.741211 -32.796510,114.796143 -31.316101,116.751709 -31.381779))') wastac- AND ST_Intersects(border, ST_GeographyFromText('SRID=4326;POLYGON((116.751709 -31.381779,116.883545 -32.676373,114.741211 -32.796510,114.796143-31.316101,116.751709 -31.381779))')); Aggregate (cost=30.48..30.49 rows=1 width=0) (actual time=11.459..11.459 rows=1 loops=1) - Index Scan using t_tile_geometry_border_key on t_tile_geometry (cost=0.00..30.47 rows=2 width=0) (actual time=0.697..11.416 rows=95 loops=1) Index Cond: (border '010320E61001000500F8C610001C305D40B24CBF44BC613FC0D6E253008C385D40DE72F563935640C0DF1B430070AF5C40EE77280AF4654 0C01EFCC401F4B25C409F3BC1FEEB503FC0F8C610001C305D40B24CBF44BC613FC0'::geography) Filter: st_intersects(border, '010320E61001000500F8C610001C305D40B24CBF44BC613FC0D6E253008C385D40DE72F563935640C0DF1B430070AF5C40EE7728 0AF46540C01EFCC401F4B25C409F3BC1FEEB503FC0F8C610001C305D40B24CBF44BC613FC0'::geography) Total runtime: 11.537 ms (5 rows) In case it's relevant this was how I created the geography column from a geometry one - I noticed the Polygon restriction had to be added in the last statement (ie you don't get it for free from the cast in the first statement). create table wastac.t_tile_geometry as select *, geography(ST_SetSRID(border, 4326)) as border_new, geography(ST_SetSRID(centre, 4326)) as centre_new from wastac.t_tile_geometry; alter table wastac.t_tile_geometry_new drop column border; alter table wastac.t_tile_geometry_new rename column border_new to border; alter table wastac.t_tile_geometry alter column border set data type geography(Polygon,4326); On 27 May 2010 15:42, Paragon Corporation l...@pcorp.us wrote: Nick, Okay we are seeing the same issue with our fastfoods data even with smaller windows. I think the clue is the plan here. The ST_Intersects geography function seems to be treated as a blackbox rather than a transparent function composed of and _ST_Distance The magic of the geometry ST_Intersects is that the query plan can see into the function and decomposes it into + _ST_Intersects For some reason, that's just not happening with geography. Maybe Paul or Mark have a clue why that is. So I bet this uses an index and is much faster select count(*) from wastac.t_tile_geometry where border ST_GeographyFromText('SRID=4326;POLYGON((116.751709 -31.381779,116.883545 -32.676373,114.741211 -32.796510,114.796143 -31.316101,116.751709 -31.381779))') AND ST_Intersects(border, ST_GeographyFromText('SRID=4326;POLYGON((116.751709 -31.381779,116.883545 -32.676373,114.741211 -32.796510,114.796143-31.316101,116.751709 -31.381779))')); Thanks Regina and Leo http://www.postgis.us -- *From:* postgis-users-boun...@postgis.refractions.net [mailto: postgis-users-boun...@postgis.refractions.net] *On Behalf Of *Nicholas Bower *Sent:* Wednesday, May 26, 2010 6:42 PM *To:* PostGIS Users Discussion *Subject:* Re: [postgis-users] No index usage on geography query plan? That does look like a pretty huge bounding polygon, but the geography we agree should still be using the spatial index, so probably making the index cost higher than it should Fyi the border values are are simply composed of a regular 20km grid of ajoining polygon squares covering Australia (about 1.3M tiles). Incidentally, with the border of tiles being square, the bounding box *is* the shape I guess. It's not relevant, but the query returns a count 95 of these 20km tiles. We're dealing with satellite data - the datasets themselves have far bigger coverages (different table entirely), but this is why you've noticed the region of interest in the query is large. Nick, Can you just for contrast, try to force it to use the index by doing set enable_seqscan = off; explain analyze select count(*) from wastac.t_tile_geometry where ST_Intersects(border, ST_GeographyFromText('SRID=4326;POLYGON((116.751709 -31.381779,116.883545 -32.676373,114.741211 -32.796510,114.796143 -31.316101,116.751709 -31.381779))')); Strange - no difference. wastac= set enable_seqscan = off; SET wastac= explain analyze select count(*) from wastac.t_tile_geometry where ST_Intersects(border, ST_GeographyFromText('SRID=4326;POLYGON((116.751709 -31.381779,116.883545 -32.676373,114.741211 -32.796510,114.796143-31.316101,116.751709 -31.381779))')); QUERY PLAN Aggregate (cost=1364056.80..1364056.81 rows=1
Re: [postgis-users] No index usage on geography query plan?
Nicholas Bower wrote: But simply swapping the query region above from geometry to geography we're back to no index usage, explain analyze select count(*) from wastac.t_tile_geometry_old where ST_Intersects(border, ST_GeographyFromText('SRID=4316;POLYGON((116.751709 -31.381779,116.883545 -32.676373,114.741211 -32.796510,114.796143 -31.316101,116.751709 -31.381779))')); Aggregate (cost=362164.77..362164.78 rows=1 width=0) (actual time=80302.237..80302.237 rows=1 loops=1) - Seq Scan on t_tile_geometry_old (cost=0.00..361101.11 rows=425463 width=0) (actual time=19680.252..80302.172 rows=95 loo ps=1) Filter: st_intersects((border)::geography, '010320DC1001000500F8C610001C305D40B24CBF44BC613FC0D6E253008 C385D40DE72F563935640C0DF1B430070AF5C40EE77280AF46540C01EFCC401F4B25C409F3BC1FEEB503FC0F8C610001C305D40B24CBF44BC613FC0'::geogra phy) Total runtime: 80302.285 ms Right. So the give away in the above query is the cast in the st_intersects() filter; it looks as if your border column in wastac.t_tile_geometry_old is a *geometry* while your intersection geometry is a *geography*. Since PostgreSQL sees that you are trying to perform an operation on two different types, it cannot make use of the indexes on those two columns. Hence it performs a sequential scan across the whole table converting on the fly, which is going to take a lot more time. Try making a copy of your wastac.t_tile_geometry_old table but with a geography instead of geometry column for border, and you should see an improvement. HTH, Mark. -- Mark Cave-Ayland - Senior Technical Architect PostgreSQL - PostGIS Sirius Corporation plc - control through freedom http://www.siriusit.co.uk t: +44 870 608 0063 Sirius Labs: http://www.siriusit.co.uk/labs ___ postgis-users mailing list postgis-users@postgis.refractions.net http://postgis.refractions.net/mailman/listinfo/postgis-users
Re: [postgis-users] No index usage on geography query plan?
Try making a copy of your wastac.t_tile_geometry_old table but with a geography instead of geometry column for border, and you should see an improvement. That's precisely what I showed in the original post - geography intersecting geography column. See the table def. I was outlining in this latter geometry example that a 9s intersection is not right and there is a problem with geograhies. ___ postgis-users mailing list postgis-users@postgis.refractions.net http://postgis.refractions.net/mailman/listinfo/postgis-users
Re: [postgis-users] No index usage on geography query plan?
Mark, He did include it in an earlier email wastac= explain analyze select count(*) from wastac.t_tile_geometry where ST_Intersects(border, ST_GeographyFromText('SRID=4326;POLYGON((116.751709 -31.381779,116.883545 -32.676373,114.741211 -32.796510,114.796143 -31.316101,116.751709 -31.381779))')); Aggregate (cost=364056.80..364056.81 rows=1 width=0) (actual time=8854.932..8854.933 rows=1 loops=1) - Seq Scan on t_tile_geometry (cost=0.00..362993.14 rows=425463 width=0) (actual time=2017.629..8854.879 rows=95 loops=1) Filter: st_intersects(border, '010320E61001000500F8C610001C305D40B24CBF44BC613FC0D6E253008 C385D40DE72F563935640C0DF1B430070AF5C40EE77280A F46540C01EFCC401F4B25C409F3BC1FEEB503FC0F8C610001C305D40B24CBF44BC613FC0'::g eography) Total runtime: 8854.976 ms (4 rows) That does look like a pretty huge bounding polygon, but the geography we agree should still be using the spatial index, so probably making the index cost higher than it should Nick, Can you just for contrast, try to force it to use the index by doing set enable_seqscan = off; explain analyze select count(*) from wastac.t_tile_geometry where ST_Intersects(border, ST_GeographyFromText('SRID=4326;POLYGON((116.751709 -31.381779,116.883545 -32.676373,114.741211 -32.796510,114.796143 -31.316101,116.751709 -31.381779))')); That should force it to use the index unless it can't for some other reason. Leo and Regina, http://www.postgis.us -Original Message- From: postgis-users-boun...@postgis.refractions.net [mailto:postgis-users-boun...@postgis.refractions.net] On Behalf Of Mark Cave-Ayland Sent: Wednesday, May 26, 2010 8:44 AM To: PostGIS Users Discussion Subject: Re: [postgis-users] No index usage on geography query plan? Nick Bower wrote: Try making a copy of your wastac.t_tile_geometry_old table but with a geography instead of geometry column for border, and you should see an improvement. That's precisely what I showed in the original post - geography intersecting geography column. See the table def. I was outlining in this latter geometry example that a 9s intersection is not right and there is a problem with geograhies. Okay - so can we see the explain analyze output for your last query for the geography intersecting geography case for comparison? ATB, Mark. -- Mark Cave-Ayland - Senior Technical Architect PostgreSQL - PostGIS Sirius Corporation plc - control through freedom http://www.siriusit.co.uk t: +44 870 608 0063 Sirius Labs: http://www.siriusit.co.uk/labs ___ postgis-users mailing list postgis-users@postgis.refractions.net http://postgis.refractions.net/mailman/listinfo/postgis-users ___ postgis-users mailing list postgis-users@postgis.refractions.net http://postgis.refractions.net/mailman/listinfo/postgis-users
Re: [postgis-users] No index usage on geography query plan?
That does look like a pretty huge bounding polygon, but the geography we agree should still be using the spatial index, so probably making the index cost higher than it should Fyi the border values are are simply composed of a regular 20km grid of ajoining polygon squares covering Australia (about 1.3M tiles). Incidentally, with the border of tiles being square, the bounding box *is* the shape I guess. It's not relevant, but the query returns a count 95 of these 20km tiles. We're dealing with satellite data - the datasets themselves have far bigger coverages (different table entirely), but this is why you've noticed the region of interest in the query is large. Nick, Can you just for contrast, try to force it to use the index by doing set enable_seqscan = off; explain analyze select count(*) from wastac.t_tile_geometry where ST_Intersects(border, ST_GeographyFromText('SRID=4326;POLYGON((116.751709 -31.381779,116.883545 -32.676373,114.741211 -32.796510,114.796143 -31.316101,116.751709 -31.381779))')); Strange - no difference. wastac= set enable_seqscan = off; SET wastac= explain analyze select count(*) from wastac.t_tile_geometry where ST_Intersects(border, ST_GeographyFromText('SRID=4326;POLYGON((116.751709 -31.381779,116.883545 -32.676373,114.741211 -32.796510,114.796143-31.316101,116.751709 -31.381779))')); QUERY PLAN Aggregate (cost=1364056.80..1364056.81 rows=1 width=0) (actual time=8909.585..8909.586 rows=1 loops=1) - Seq Scan on t_tile_geometry (cost=100.00..1362993.14 rows=425463 width=0) (actual time=2028.827..8909.533 rows=95 loops=1) Filter: st_intersects(border, '010320E61001000500F8C610001C305D40B24CBF44BC613FC0D6E253008C385D40DE72F563935640C0DF1B430070AF5C40EE77280AF46540C01EFCC401F4B25C409F3BC1FEEB503FC0F8C610001C305D40B24CBF44BC613FC0'::geography) Total runtime: 8909.626 ms (4 rows) ___ postgis-users mailing list postgis-users@postgis.refractions.net http://postgis.refractions.net/mailman/listinfo/postgis-users
Re: [postgis-users] No index usage on geography query plan?
Nick, Okay we are seeing the same issue with our fastfoods data even with smaller windows. I think the clue is the plan here. The ST_Intersects geography function seems to be treated as a blackbox rather than a transparent function composed of and _ST_Distance The magic of the geometry ST_Intersects is that the query plan can see into the function and decomposes it into + _ST_Intersects For some reason, that's just not happening with geography. Maybe Paul or Mark have a clue why that is. So I bet this uses an index and is much faster select count(*) from wastac.t_tile_geometry where border ST_GeographyFromText('SRID=4326;POLYGON((116.751709 -31.381779,116.883545 -32.676373,114.741211 -32.796510,114.796143 -31.316101,116.751709 -31.381779))') AND ST_Intersects(border, ST_GeographyFromText('SRID=4326;POLYGON((116.751709 -31.381779,116.883545 -32.676373,114.741211 -32.796510,114.796143-31.316101,116.751709 -31.381779))')); Thanks Regina and Leo h http://www.postgis.us ttp://www.postgis.us _ From: postgis-users-boun...@postgis.refractions.net [mailto:postgis-users-boun...@postgis.refractions.net] On Behalf Of Nicholas Bower Sent: Wednesday, May 26, 2010 6:42 PM To: PostGIS Users Discussion Subject: Re: [postgis-users] No index usage on geography query plan? That does look like a pretty huge bounding polygon, but the geography we agree should still be using the spatial index, so probably making the index cost higher than it should Fyi the border values are are simply composed of a regular 20km grid of ajoining polygon squares covering Australia (about 1.3M tiles). Incidentally, with the border of tiles being square, the bounding box *is* the shape I guess. It's not relevant, but the query returns a count 95 of these 20km tiles. We're dealing with satellite data - the datasets themselves have far bigger coverages (different table entirely), but this is why you've noticed the region of interest in the query is large. Nick, Can you just for contrast, try to force it to use the index by doing set enable_seqscan = off; explain analyze select count(*) from wastac.t_tile_geometry where ST_Intersects(border, ST_GeographyFromText('SRID=4326;POLYGON((116.751709 -31.381779,116.883545 -32.676373,114.741211 -32.796510,114.796143 -31.316101,116.751709 -31.381779))')); Strange - no difference. wastac= set enable_seqscan = off; SET wastac= explain analyze select count(*) from wastac.t_tile_geometry where ST_Intersects(border, ST_GeographyFromText('SRID=4326;POLYGON((116.751709 -31.381779,116.883545 -32.676373,114.741211 -32.796510,114.796143-31.316101,116.751709 -31.381779))')); QUERY PLAN Aggregate (cost=1364056.80..1364056.81 rows=1 width=0) (actual time=8909.585..8909.586 rows=1 loops=1) - Seq Scan on t_tile_geometry (cost=100.00..1362993.14 rows=425463 width=0) (actual time=2028.827..8909.533 rows=95 loops=1) Filter: st_intersects(border, '010320E61001000500F8C610001C305D40B24CBF44BC613FC0D6E253008 C385D40DE72F563935640C0DF1B430070AF5C40EE77280AF46540C01EFCC401F4B25C409F3BC 1FEEB503FC0F8C610001C305D40B24CBF44BC613FC0'::geography) Total runtime: 8909.626 ms (4 rows) ___ postgis-users mailing list postgis-users@postgis.refractions.net http://postgis.refractions.net/mailman/listinfo/postgis-users
Re: [postgis-users] No index usage on geography query plan?
On 25 May 2010 22:35, Mark Cave-Ayland mark.cave-ayl...@siriusit.co.ukwrote: Nicholas Bower wrote: Neither of the ST_Intersects clauses below invoke index usage according to explain output, despite docs saying they should automatically be doing bbox on the index; (cut) What's going on - the difference in total cost above proves to me the indexes are not being used. I think you're missing the point here; the aim of the planner is to work out which join order will produce the query that executes in the shortest time. Therefore just because an index is present does not necessarily mean it is correct to use it. From your first query (where the spatial index is not being used): Total runtime: 44744.241 ms From your second query (where the spatial index is being used): Total runtime: 65952.140 ms So I'd say that this is working exactly as it should be, since the join order chosen by the planner has resulted in the shortest query time. A repeated explain analyze on the first query showed about 45s second time around (same). However for the version, second time around was just 2s (from 65s). Saving 15s and losing caching seems like a bit of a false win no? ___ postgis-users mailing list postgis-users@postgis.refractions.net http://postgis.refractions.net/mailman/listinfo/postgis-users
Re: [postgis-users] No index usage on geography query plan?
Perhaps this makes it more obvious - 9s to query a table of just 1.3M rows with ST_Intersects and 20ms using . explain analyze select count(*) from wastac.t_tile_geometry where ST_Intersects(border, ST_GeographyFromText('SRID=4326;POLYGON((116.751709 -31.381779,116.883545 -32.676373,114.741211 -32.796510,114.796143 -31.316101,116.751709 -31.381779))')); Aggregate (cost=364056.80..364056.81 rows=1 width=0) (actual time=9175.760..9175.761 rows=1 loops=1) - Seq Scan on t_tile_geometry (cost=0.00..362993.14 rows=425463 width=0) (actual time=2164.049..9175.706 rows=95 loops=1) Filter: st_intersects(border, '010320E61001000500F8C610001C305D40B24CBF44BC613FC0D6E253008C385D40DE72F563935640C0DF1B430070AF5C40EE77280A F46540C01EFCC401F4B25C409F3BC1FEEB503FC0F8C610001C305D40B24CBF44BC613FC0'::geography) Total runtime: 9175.859 ms (4 rows) wastac= explain analyze select count(*) from wastac.t_tile_geometry where ST_Intersects(border, ST_GeographyFromText('SRID=4326;POLYGON((116.751709 -31.381779,116.883545 -32.676373,114.741211 -32.796510,114.796143 -31.316101,116.751709 -31.381779))')); Aggregate (cost=364056.80..364056.81 rows=1 width=0) (actual time=8854.932..8854.933 rows=1 loops=1) - Seq Scan on t_tile_geometry (cost=0.00..362993.14 rows=425463 width=0) (actual time=2017.629..8854.879 rows=95 loops=1) Filter: st_intersects(border, '010320E61001000500F8C610001C305D40B24CBF44BC613FC0D6E253008C385D40DE72F563935640C0DF1B430070AF5C40EE77280A F46540C01EFCC401F4B25C409F3BC1FEEB503FC0F8C610001C305D40B24CBF44BC613FC0'::geography) Total runtime: 8854.976 ms (4 rows) wastac= explain analyze select count(*) from wastac.t_tile_geometry where border ST_GeographyFromText('SRID=4326;POLYGON((116.751709 -31.381779,116.883545 -32.676373,114.741211 -32.796510,114.796143 -31.316101,116.751709 -31.381779))'); Aggregate (cost=28.99..29.00 rows=1 width=0) (actual time=24.443..24.443 rows=1 loops=1) - Index Scan using t_tile_geometry_border_key on t_tile_geometry (cost=0.00..28.97 rows=6 width=0) (actual time=9.952..24.407 rows=112 loops=1) Index Cond: (border '010320E61001000500F8C610001C305D40B24CBF44BC613FC0D6E253008C385D40DE72F563935640C0DF1B430070AF5C40EE77280AF46540C 01EFCC401F4B25C409F3BC1FEEB503FC0F8C610001C305D40B24CBF44BC613FC0'::geography) Total runtime: 24.681 ms On 25 May 2010 22:45, Nicholas Bower n...@petangent.net wrote: On 25 May 2010 22:35, Mark Cave-Ayland mark.cave-ayl...@siriusit.co.ukwrote: Nicholas Bower wrote: Neither of the ST_Intersects clauses below invoke index usage according to explain output, despite docs saying they should automatically be doing bbox on the index; (cut) What's going on - the difference in total cost above proves to me the indexes are not being used. I think you're missing the point here; the aim of the planner is to work out which join order will produce the query that executes in the shortest time. Therefore just because an index is present does not necessarily mean it is correct to use it. From your first query (where the spatial index is not being used): Total runtime: 44744.241 ms From your second query (where the spatial index is being used): Total runtime: 65952.140 ms So I'd say that this is working exactly as it should be, since the join order chosen by the planner has resulted in the shortest query time. A repeated explain analyze on the first query showed about 45s second time around (same). However for the version, second time around was just 2s (from 65s). Saving 15s and losing caching seems like a bit of a false win no? ___ postgis-users mailing list postgis-users@postgis.refractions.net http://postgis.refractions.net/mailman/listinfo/postgis-users
Re: [postgis-users] No index usage on geography query plan?
What does the output of: SELECT version(), postgis_full_version(); return? PostgreSQL 8.4.3 on x86_64-pc-solaris2.10, compiled by GCC gcc (GCC) 3.4.3 (csl-sol210-3_4-branch+sol_rpath), 64-bit | POSTGIS=1.5.1 GEOS=3.2.1-CAPI-1.6.1 PROJ=Rel. 4.7.1, 23 September 2009 LIBXML=2.6.23 USE_STATS I went back to the pre-geography verison of the table that still had geometries (SRID=-1) and it's a different story with ST_Intersects. Quite clearly there's a problem here - ST_Intersects printing out index usage fine when geometries are used. explain analyze select count(*) from wastac.t_tile_geometry_old where ST_Intersects(border, ST_GeometryFromText('POLYGON((116.751709 -31.381779,116.883545 -32.676373,114.741211 -32.796510,114.796143 -31.316101,116.751709 -31.381779))', -1)); Aggregate (cost=256.30..256.31 rows=1 width=0) (actual time=81.997..81.998 rows=1 loops=1) - Bitmap Heap Scan on t_tile_geometry_old (cost=5.09..256.25 rows=20 width=0) (actual time=76.303..81.950 rows=96 loops=1) Recheck Cond: (border '01030001000500F8C610001C305D40B24CBF44BC613FC0D6E253008C385D40DE72F563935640C0DF1 B430070AF5C40EE77280AF46540C01EFCC401F4B25C409F3BC1FEEB503FC0F8C610001C305D40B24CBF44BC613FC0'::geometry) Filter: _st_intersects(border, '01030001000500F8C610001C305D40B24CBF44BC613FC0D6E253008C385D40DE72F56393564 0C0DF1B430070AF5C40EE77280AF46540C01EFCC401F4B25C409F3BC1FEEB503FC0F8C610001C305D40B24CBF44BC613FC0'::geometry) - Bitmap Index Scan on t_tile_geometry_old_border_key (cost=0.00..5.09 rows=61 width=0) (actual time=0.307..0.307 ro ws=100 loops=1) Index Cond: (border '01030001000500F8C610001C305D40B24CBF44BC613FC0D6E253008C385D40DE72F563935640C 0DF1B430070AF5C40EE77280AF46540C01EFCC401F4B25C409F3BC1FEEB503FC0F8C610001C305D40B24CBF44BC613FC0'::geometry) Total runtime: 82.107 ms But simply swapping the query region above from geometry to geography we're back to no index usage, explain analyze select count(*) from wastac.t_tile_geometry_old where ST_Intersects(border, ST_GeographyFromText('SRID=4316;POLYGON((116.751709 -31.381779,116.883545 -32.676373,114.741211 -32.796510,114.796143 -31.316101,116.751709 -31.381779))')); Aggregate (cost=362164.77..362164.78 rows=1 width=0) (actual time=80302.237..80302.237 rows=1 loops=1) - Seq Scan on t_tile_geometry_old (cost=0.00..361101.11 rows=425463 width=0) (actual time=19680.252..80302.172 rows=95 loo ps=1) Filter: st_intersects((border)::geography, '010320DC1001000500F8C610001C305D40B24CBF44BC613FC0D6E253008 C385D40DE72F563935640C0DF1B430070AF5C40EE77280AF46540C01EFCC401F4B25C409F3BC1FEEB503FC0F8C610001C305D40B24CBF44BC613FC0'::geogra phy) Total runtime: 80302.285 ms \d wastac.t_tile_geometry_old Table wastac.t_tile_geometry_old Column | Type | Modifiers -+--+--- tile_id | integer | not null grid_id | integer | not null centre | geometry | border | geometry | Indexes: t_tile_geometry_old_border_key gist (border) ___ postgis-users mailing list postgis-users@postgis.refractions.net http://postgis.refractions.net/mailman/listinfo/postgis-users