Greg,
Can you elaborate on your statement "2530 in a sinlge polygon"? Do you
mean holes or subPolygons in a MultiPolygon?
Either way, currently JTS is not optimized for working with very large
numbers of holes and sub-polygons. As Paul says, we are currently
working on improving this. Hopefully we can at least match the
performance of the Other DMBS(es) for this situation!
Would it be possible for me to obtain this geometry for testing?
Martin
Gregory Williamson wrote:
Dear peoples,
I have a problem with a query that uses an absurdly large geometry
(2530 in a single polygon). This is srid -1 (part of a large test of
postgres vs some other database product). Everything has been vacuumed
and analyzed.
The initial search to find candidates in a target table is quite fast:
catest=# select count(*) from wtm_sub_cell w, order_line_item x WHERE
x.bbox && w.geometry AND x.id_as_int = 114672;
count
-------
13168
(1 row)
Time: 9.472 ms
Trying to get the list narrowed to geometries that are completely
contained by the requested shape is slow:
catest=# select count(*) from wtm_sub_cell w, order_line_item x WHERE
x.bbox && w.geometry AND distance(x.geometry,w.geometry) = 0 and
x.id_as_int = 114672;
count
-------
1112
(1 row)
Time: 69277.780 ms
So I have two questions:
a) anything better to use than "distance(x,y) = 0) ? I tried
st_within -- it is about the same speed but returns no polys, which is
strange to me, but I also haven't looked at these in detail yet. For
example:
catest=# select count(*) from wtm_sub_cell w, order_line_item x WHERE
x.bbox && w.geometry AND st_within(x.geometry,w.geometry) and
x.id_as_int = 114672;
count
-------
0
(1 row)
Time: 1173.185 ms
(same results with st_within(w.geometry,x.geometry):
catest=# select count(*) from wtm_sub_cell w, order_line_item x WHERE
x.bbox && w.geometry AND st_within(w.geometry,x.geometry) and
x.id_as_int = 114672;
count
-------
0
(1 row)
b) anything I can do to speed things up ? I have tried boosting work
mem to 16 megs (from 1) and it made no apparent difference.
I have a self contained test case that shows the same behavior -- the
one large poly and all the candidates in another table. Apologies for
the size; hopefully it's not been mangled in the transfers.
Explain analyze of the sample (the sequential is sensible since there
is only one row in the table):
catest=# explain analyze select count(*) from wsc_candidates w,
oli_req x WHERE w.geometry && x.bbox AND
distance(w.geometry,x.oli_req_geom) > 0 AND x.oli_req_id = 114672;
QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------------------------
Aggregate (cost=20.28..20.29 rows=1 width=0) (actual
time=77232.858..77232.859 rows=1 loops=1)
-> Nested Loop (cost=0.00..9.30 rows=4389 width=0) (actual
time=6.389..77221.506 rows=12056 loops=1)
Join Filter: (distance(w.geometry, x.oli_req_geom) >
0::double precision)
-> Seq Scan on oli_req x (cost=0.00..1.01 rows=1
width=40602) (actual time=0.007..0.009 rows=1 loops=1)
Filter: (oli_req_id = 114672)
-> Index Scan using wsc_c_spatial_ndx on wsc_candidates w
(cost=0.00..8.27 rows=1 width=109) (actual time=0.022..25.991
rows=13168 loops=1)
Index Cond: (w.geometry && x.bbox)
Filter: (w.geometry && x.bbox)
Total runtime: 77232.901 ms
(9 rows)
Time: 77233.773 ms
And for the real thing:
catest=# explain analyze select count(*) from wtm_sub_cell w,
order_line_item x WHERE w.geometry && x.bbox AND
distance(w.geometry,x.geometry) = 0 AND x.id_as_int = 114672;
QUERY
PLAN
--------------------------------------------------------------------------------------------------------------------------------------------
Aggregate (cost=141.83..141.84 rows=1 width=0) (actual
time=77457.587..77457.588 rows=1 loops=1)
-> Nested Loop (cost=5.99..141.83 rows=1 width=0) (actual
time=15.682..77456.541 rows=1112 loops=1)
Join Filter: (distance(w.geometry, x.geometry) = 0::double
precision)
-> Index Scan using oli_id_ndx on order_line_item x
(cost=0.00..8.30 rows=1 width=383) (actual time=0.012..0.018 rows=1
loops=1)
Index Cond: (id_as_int = 114672)
-> Bitmap Heap Scan on wtm_sub_cell w (cost=5.99..132.97
rows=32 width=109) (actual time=2.988..21.796 rows=13168 loops=1)
Filter: (w.geometry && x.bbox)
-> Bitmap Index Scan on wsc_geom_idx1
(cost=0.00..5.98 rows=32 width=0) (actual time=2.828..2.828 rows=13168
loops=1)
Index Cond: (w.geometry && x.bbox)
Total runtime: 77457.633 ms
(10 rows)
Time: 77458.458 ms
The tables involved by size:
catest=# select count(*) from wsc_candidates;
count
-------
13168
(1 row)
Time: 2.586 ms
catest=# select count(*) from oli_req;
count
-------
1
(1 row)
Time: 0.193 ms
catest=# select count(*) from wtm_sub_cell;
count
---------
6399928
(1 row)
Time: 1776.508 ms
catest=# select count(*) from order_line_item;
count
--------
395921
(1 row)
Time: 176.083 ms
Many thanks for your time and bandwidth!
Greg Williamson
Senior DBA
GlobeXplorer LLC, a DigitalGlobe company
Confidentiality Notice: This e-mail message, including any
attachments, is for the sole use of the intended recipient(s) and may
contain confidential and privileged information and must be protected
in accordance with those provisions. Any unauthorized review, use,
disclosure or distribution is prohibited. If you are not the intended
recipient, please contact the sender by reply e-mail and destroy all
copies of the original message.
(My corporate masters made me say this.)
------------------------------------------------------------------------
_______________________________________________
postgis-users mailing list
[email protected]
http://postgis.refractions.net/mailman/listinfo/postgis-users
--
Martin Davis
Senior Technical Architect
Refractions Research, Inc.
(250) 383-3022
_______________________________________________
postgis-users mailing list
[email protected]
http://postgis.refractions.net/mailman/listinfo/postgis-users