Tom, Thanks for the report. I've ticketed as a bug - https://trac.osgeo.org/postgis/ticket/4635
Thanks, Regina -----Original Message----- From: postgis-users [mailto:[email protected]] On Behalf Of Tom Kazimiers Sent: Monday, February 3, 2020 5:06 PM To: PostGIS Users Discussion <[email protected]> Subject: Re: [postgis-users] Changed intersection test speed after PG 11 -> 12 and PostGIS 2.5 -> 3 upgrade I took a bit of published data (in my case neuron morphologies from [0]) and created an isolated database that only contained the minimal versions of the tables referenced in this query. Both tables have 328738 rows and I crated all the indices that are there in the original tables. With this, I believe, the effect is visible. The 12MB SQL dump (in Postgres custom format) can be found here: https://github.com/tomka/tmp/blob/master/postgres/catmaid-postgis-test.pgsql If I run the query from my initial email in both Postgres 11 / PostGIS 2.5 and Postgres 12 / PostGIS 3, I get the following query plans on the machine that also ran the databases of the original query. First, Postgres 11/ PostGIS 2.5: Limit (cost=51.71..498.33 rows=102 width=16) (actual time=2.023..2.697 rows=215 loops=1) Buffers: shared hit=1159 CTE bb_edge -> Bitmap Heap Scan on treenode_edge te (cost=2.43..44.77 rows=1 width=16) (actual time=1.348..1.804 rows=143 loops=1) Recheck Cond: (edge &&& '01020000800200000000000000F0C71C41000000007026144100000000C095034100000000F88D20410000000070B511410000000080940341'::geometry) Filter: ((edge && '0103000080010000000500000000000000A0C71C410000000020B51141000000008094034100000000A0C71C4100000000C0261441000000008094034100000000208E204100000000C026144100000000C095034100000000208E20410000000020B5114100000000C095034100000000A0C71C410000000020B511410000000080940341'::geometry) AND (project_id = 1) AND ('0103000080010000000500000000000000F0C71C410000000070B51141000000002095034100000000F88D20410000000070B51141000000002095034100000000F88D20410000000070261441000000002095034100000000F0C71C410000000070261441000000002095034100000000F0C71C410000000070B511410000000020950341'::geometry && st_expand(edge, '20'::double precision)) AND _st_3ddwithin(edge, '0103000080010000000500000000000000F0C71C410000000070B51141000000002095034100000000F88D20410000000070B51141000000002095034100000000F88D20410000000070261441000000002095034100000000F0C71C410000000070261441000000002095034100000000F0C71C410000000070B511410000000020950341'::geometry, '20'::double precision)) Heap Blocks: exact=119 Buffers: shared hit=167 -> Bitmap Index Scan on treenode_edge_gix (cost=0.00..2.43 rows=19 width=0) (actual time=1.321..1.321 rows=143 loops=1) Index Cond: (edge &&& '01020000800200000000000000F0C71C41000000007026144100000000C095034100000000F88D20410000000070B511410000000080940341'::geometry) Buffers: shared hit=48 -> Nested Loop (cost=6.93..453.55 rows=102 width=16) (actual time=2.022..2.674 rows=215 loops=1) Buffers: shared hit=1159 -> HashAggregate (cost=6.51..9.57 rows=102 width=8) (actual time=2.004..2.045 rows=259 loops=1) Group Key: bb_edge.id Buffers: shared hit=167 -> Append (cost=0.00..6.26 rows=102 width=8) (actual time=1.351..1.917 rows=338 loops=1) Buffers: shared hit=167 -> CTE Scan on bb_edge (cost=0.00..0.06 rows=1 width=8) (actual time=1.351..1.863 rows=143 loops=1) Buffers: shared hit=167 -> CTE Scan on bb_edge bb_edge_1 (cost=0.00..0.06 rows=1 width=8) (actual time=0.000..0.017 rows=143 loops=1) -> ProjectSet (cost=0.00..1.55 rows=100 width=8) (actual time=0.003..0.008 rows=52 loops=1) -> Result (cost=0.00..0.03 rows=1 width=0) (actual time=0.000..0.000 rows=1 loops=1) -> Index Scan using treenode_pkey on treenode t1 (cost=0.42..4.32 rows=1 width=16) (actual time=0.002..0.002 rows=1 loops=259) Index Cond: (id = bb_edge.id) Buffers: shared hit=992 Planning Time: 0.755 ms Execution Time: 2.765 ms And this is Postgres 12 / PostGIS 3: Limit (cost=32.51..220.67 rows=54 width=16) (actual time=2.934..3.876 rows=215 loops=1) Buffers: shared hit=1190 CTE bb_edge -> Index Scan using treenode_edge_gix on treenode_edge te (cost=0.53..28.58 rows=1 width=16) (actual time=0.276..2.641 rows=143 loops=1) Index Cond: ((edge &&& '01020000800200000000000000F0C71C41000000007026144100000000C095034100000000F88D20410000000070B511410000000080940341'::geometry) AND (edge &&& st_expand('0103000080010000000500000000000000F0C71C410000000070B51141000000002095034100000000F88D20410000000070B51141000000002095034100000000F88D20410000000070261441000000002095034100000000F0C71C410000000070261441000000002095034100000000F0C71C410000000070B511410000000020950341'::geometry, '20'::double precision))) Filter: ((project_id = 1) AND st_3ddwithin(edge, '0103000080010000000500000000000000F0C71C410000000070B51141000000002095034100000000F88D20410000000070B51141000000002095034100000000F88D20410000000070261441000000002095034100000000F0C71C410000000070261441000000002095034100000000F0C71C410000000070B511410000000020950341'::geometry, '20'::double precision)) Buffers: shared hit=198 -> Nested Loop (cost=3.94..192.09 rows=54 width=16) (actual time=2.931..3.847 rows=215 loops=1) Buffers: shared hit=1190 -> HashAggregate (cost=3.51..5.13 rows=54 width=8) (actual time=2.893..2.948 rows=259 loops=1) Group Key: bb_edge.id Buffers: shared hit=198 -> Append (cost=0.00..3.38 rows=54 width=8) (actual time=0.285..2.768 rows=338 loops=1) Buffers: shared hit=198 -> CTE Scan on bb_edge (cost=0.00..0.06 rows=1 width=8) (actual time=0.284..2.704 rows=143 loops=1) Buffers: shared hit=198 -> CTE Scan on bb_edge bb_edge_1 (cost=0.00..0.06 rows=1 width=8) (actual time=0.001..0.020 rows=143 loops=1) -> ProjectSet (cost=0.00..0.83 rows=52 width=8) (actual time=0.006..0.013 rows=52 loops=1) -> Result (cost=0.00..0.03 rows=1 width=0) (actual time=0.001..0.001 rows=1 loops=1) -> Index Scan using treenode_pkey on treenode t1 (cost=0.42..3.43 rows=1 width=16) (actual time=0.003..0.003 rows=1 loops=259) Index Cond: (id = bb_edge.id) Buffers: shared hit=992 Planning Time: 1.148 ms Execution Time: 3.981 ms And just for easier context, this is the query I used (from the first mail): EXPLAIN (ANALYZE, BUFFERS) WITH bb_edge AS ( SELECT te.id, te.parent_id FROM treenode_edge te WHERE te.edge &&& ST_MakeLine(ARRAY[ ST_MakePoint(471548.0, 330140.0, 160440.0), ST_MakePoint(542460.0, 290140.0, 160400.0)] ::geometry[]) AND ST_3DDWithin(te.edge, ST_MakePolygon(ST_MakeLine(ARRAY[ ST_MakePoint(471548.0, 290140.0, 160420.0), ST_MakePoint(542460.0, 290140.0, 160420.0), ST_MakePoint(542460.0, 330140.0, 160420.0), ST_MakePoint(471548.0, 330140.0, 160420.0), ST_MakePoint(471548.0, 290140.0, 160420.0)]::geometry[])), 20.0) AND te.project_id = 1 ) SELECT t1.id, t1.parent_id FROM ( SELECT id FROM bb_edge UNION SELECT parent_id FROM bb_edge UNION SELECT UNNEST(ARRAY[41851780,25932042,25932043,14274317,25932045, 25932050,25932052,41876633,25932070,42769832,25932073,25932076, 42769838,25932082,25932084,27641652,25220534,25932087,41421110, 26002235,36359611,25932094,36359614,41421118,41420481,41876677, 25932101,27641926,27641927,27642056,25932233,41420487,25932235, 25932108,27642059,40327244,25932111,25932243,25967062,25967066, 25967067,25967069,25967070,25932130,27642082,25932132,27642085, 25967080,25967082,25967084,27642099,28873207]::bigint[]) LIMIT 80000; Let me know if I can be of more help. Thanks, Tom [0] https://www.cell.com/cell/pdf/S0092-8674(18)30787-6.pdf On Mon, Feb 03, 2020 at 02:46:59AM -0500, Regina Obe wrote: >It could be the change in logic of _ST_3DDWithin. That was changed to support >more 3d types, so that might have caused a slow down elsewhere. >If you can provide a dataset that would be great. I'll try to do some >benchmarks on some 3d sets I have lying around. > >Thanks, >Regina > >-----Original Message----- >From: postgis-users [mailto:[email protected]] On >Behalf Of Tom Kazimiers >Sent: Monday, February 3, 2020 12:01 AM >To: PostGIS Users Discussion <[email protected]> >Subject: Re: [postgis-users] Changed intersection test speed after PG >11 -> 12 and PostGIS 2.5 -> 3 upgrade > >Hi Regina and Paul, > >On Sun, Feb 02, 2020 at 07:42:06PM -0800, Paul Ramsey wrote: >>> On Feb 2, 2020, at 7:37 PM, Regina Obe <[email protected]> wrote: >>> >>> If it does, can you try changing the function to your old 11 definition and >>> see if that makes the answers the same. Then at least we'll know it's the >>> change in definition and can work from there. > >Thanks for your replies. As far as I can tell, the definition I have for >ST_3DDWithin is the first version Regina posted (i.e. PostGIS 3.0): > ># \df+ ST_3DDWithin > > List of functions > Schema | Name | Result data type | Argument data > types | Type | Volatility | Parallel | Owner | Security | > Access privileges | Language | Source code | Description > > --------+--------------+------------------+--------------------------------------------------+------+------------+----------+----------+----------+-------------------+----------+------------------+------------- > public | st_3ddwithin | boolean | geom1 geometry, geom2 geometry, > double precision | func | immutable | safe | postgres | invoker | > | c | LWGEOM_dwithin3d | > >For the sake of completeness, this is my exact PostGIS version in my >Postgres >12.1 environment: > ># select PostGIS_full_version(); > > postgis_full_version > > ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- > POSTGIS="3.0.0 r17983" [EXTENSION] PGSQL="120" GEOS="3.7.1-CAPI-1.11.1 > 27a5e771" PROJ="Rel. 4.9.2, 08 September 2015" LIBXML="2.9.3" > LIBJSON="0.11.99" LIBPROTOBUF="1.2.1" WAGYU="0.4.3 (Internal)" > >With this and the default ST_3DDWithin version, my query plan is: > > Limit (cost=32.79..230.14 rows=54 width=54) (actual time=193.840..202.393 > rows=1569 loops=1) > Buffers: shared hit=14391 > CTE bb_edge > -> Index Scan using treenode_edge_gix on treenode_edge te > (cost=0.67..28.71 rows=1 width=16) (actual time=0.809..192.138 rows=1004 > loops=1) > Index Cond: ((edge &&& > '01020000800200000000000000F0C71C41000000007026144100000000C095034100000000F88D20410000000070B511410000000080940341'::geometry) > AND (edge &&& > st_expand('0103000080010000000500000000000000F0C71C410000000070B51141000000002095034100000000F88D20410000000070B51141000000002095034100000000F88D20410000000070261441000000002095034100000000F0C71C410000000070261441000000002095034100000000F0C71C410000000070B511410000000020950341'::geometry, > '20'::double precision))) > Filter: ((project_id = 1) AND st_3ddwithin(edge, > '0103000080010000000500000000000000F0C71C410000000070B51141000000002095034100000000F88D20410000000070B51141000000002095034100000000F88D20410000000070261441000000002095034100000000F0C71C410000000070261441000000002095034100000000F0C71C410000000070B511410000000020950341'::geometry, > '20'::double precision)) > Rows Removed by Filter: 5 > Buffers: shared hit=6546 > -> Nested Loop (cost=4.08..201.42 rows=54 width=54) (actual > time=193.837..202.205 rows=1569 loops=1) > Buffers: shared hit=14391 > -> HashAggregate (cost=3.51..5.13 rows=54 width=8) (actual > time=193.796..194.101 rows=1570 loops=1) > Group Key: bb_edge.id > Buffers: shared hit=6546 > -> Append (cost=0.00..3.38 rows=54 width=8) (actual > time=0.817..193.016 rows=2060 loops=1) > Buffers: shared hit=6546 > -> CTE Scan on bb_edge (cost=0.00..0.06 rows=1 > width=8) (actual time=0.816..192.664 rows=1004 loops=1) > Buffers: shared hit=6546 > -> CTE Scan on bb_edge bb_edge_1 (cost=0.00..0.06 > rows=1 width=8) (actual time=0.002..0.130 rows=1004 loops=1) > -> ProjectSet (cost=0.00..0.83 rows=52 width=8) > (actual time=0.008..0.015 rows=52 loops=1) > -> Result (cost=0.00..0.03 rows=1 width=0) > (actual time=0.001..0.001 rows=1 loops=1) > -> Index Scan using treenode_id_pkey on treenode t1 > (cost=0.56..3.60 rows=1 width=54) (actual time=0.004..0.004 rows=1 loops=1570) > Index Cond: (id = bb_edge.id) > Buffers: shared hit=7845 Planning Time: 1.211 ms > Execution Time: 202.791 ms > >If I run it again, the query time drops by ~15ms. Once I apply the older >definition of ST_3DDWithin, I can't observer any other change in the timing. > >The index condition does indeed change back to a version wihout the >st_expand(), but timing wise it doesn't seem to have any real effect: > > Limit (cost=4483.06..4680.40 rows=54 width=54) (actual > time=186.089..194.138 rows=1569 loops=1) > Buffers: shared hit=14391 > CTE bb_edge > -> Index Scan using treenode_edge_gix on treenode_edge te > (cost=0.42..4478.98 rows=1 width=16) (actual time=0.880..184.426 rows=1004 > loops=1) > Index Cond: (edge &&& > '01020000800200000000000000F0C71C41000000007026144100000000C095034100000000F88D20410000000070B511410000000080940341'::geometry) > Filter: ((project_id = 1) AND (edge && > '0103000080010000000500000000000000A0C71C410000000020B51141000000008094034100000000A0C71C4100000000C0261441000000008094034100000000208E204100000000C026144100000000C095034100000000208E20410000000020B5114100000000C095034100000000A0C71C410000000020B511410000000080940341'::geometry) > AND > ('0103000080010000000500000000000000F0C71C410000000070B51141000000002095034100000000F88D20410000000070B51141000000002095034100000000F88D20410000000070261441000000002095034100000000F0C71C410000000070261441000000002095034100000000F0C71C410000000070B511410000000020950341'::geometry > && st_expand(edge, '20'::double precision)) AND _st_3ddwithin(edge, > '0103000080010000000500000000000000F0C71C410000000070B51141000000002095034100000000F88D20410000000070B51141000000002095034100000000F88D20410000000070261441000000002095034100000000F0C71C410000000070261441000000002095034100000000F0C71C410000000070B511410000000020950341'::geometry, > '20'::double precision)) > Rows Removed by Filter: 5 > Buffers: shared hit=6546 > -> Nested Loop (cost=4.08..201.42 rows=54 width=54) (actual > time=186.087..193.954 rows=1569 loops=1) > Buffers: shared hit=14391 > -> HashAggregate (cost=3.51..5.13 rows=54 width=8) (actual > time=186.051..186.330 rows=1570 loops=1) > Group Key: bb_edge.id > Buffers: shared hit=6546 > -> Append (cost=0.00..3.38 rows=54 width=8) (actual > time=0.888..185.322 rows=2060 loops=1) > Buffers: shared hit=6546 > -> CTE Scan on bb_edge (cost=0.00..0.06 rows=1 > width=8) (actual time=0.887..184.984 rows=1004 loops=1) > Buffers: shared hit=6546 > -> CTE Scan on bb_edge bb_edge_1 (cost=0.00..0.06 > rows=1 width=8) (actual time=0.002..0.126 rows=1004 loops=1) > -> ProjectSet (cost=0.00..0.83 rows=52 width=8) > (actual time=0.006..0.013 rows=52 loops=1) > -> Result (cost=0.00..0.03 rows=1 width=0) > (actual time=0.001..0.001 rows=1 loops=1) > -> Index Scan using treenode_id_pkey on treenode t1 > (cost=0.56..3.60 rows=1 width=54) (actual time=0.004..0.004 rows=1 loops=1570) > Index Cond: (id = bb_edge.id) > Buffers: shared hit=7845 Planning Time: 1.004 ms > Execution Time: 194.332 ms > >And for comparison, this is the index scan in the 11/2.5 plan: > > -> Index Scan using treenode_edge_gix on treenode_edge te > (cost=0.42..344.70 rows=1 width=16) (actual time=0.569..26.662 rows=1004 > loops=1) > Index Cond: (edge &&& > '01020000800200000000000000F0C71C41000000007026144100000000C095034100000000F88D20410000000070B511410000000080940341'::geometry) > Filter: ((edge && >'0103000080010000000500000000000000A0C71C410000000020B51141000000008094 >034100000000A0C71C4100000000C0261441000000008094034100000000208E2041000 >00000C026144100000000C095034100000000208E20410000000020B5114100000000C0 >95034100000000A0C71C410000000020B511410000 >000080940341'::geometry) AND (project_id = 1) AND >('0103000080010000000500000000000000F0C71C410000000070B5114100000000209 >5034100000000F88D20410000000070B51141000000002095034100000000F88D204100 >00000070261441000000002095034100000000F0C71C410000000070261441000000002 >095034100000000F0C71C410000000070B51141000 >0000020950341'::geometry && st_expand(edge, '20'::double precision)) >AND _st_3ddwithin(edge, >'0103000080010000000500000000000000F0C71C410000000070B51141000000002095 >034100000000F88D20410000000070B51141000000002095034100000000F88D2041000 >0000070261441000000002095034100000000F0C71C4100000000702614410000000020 >95034100000000F0C71C410000000070B511410000 >000020950341'::geometry, '20'::double precision)) > Rows Removed by Filter: 5 > Buffers: shared hit=1616 > >To me it seems the index condition and the filter are the same now and aren't >actually the main source of the change. > >>If this is the case, can you share your data or some mangled version >>that exhibits the same performance gradient change from 2.5 to 3? We >>ll need something to profile to figure out where the extra time is >>going > >I think this should be possible, I try to find a small public example data set >where I can recreate this problem. The one I am working with contains >unpublished data and might be too big to transfer easily. > >If you have any other ideas, I am happy to test them in the meantime. > >Cheers, >Tom >_______________________________________________ >postgis-users mailing list >[email protected] >https://lists.osgeo.org/mailman/listinfo/postgis-users > >_______________________________________________ >postgis-users mailing list >[email protected] >https://lists.osgeo.org/mailman/listinfo/postgis-users _______________________________________________ postgis-users mailing list [email protected] https://lists.osgeo.org/mailman/listinfo/postgis-users _______________________________________________ postgis-users mailing list [email protected] https://lists.osgeo.org/mailman/listinfo/postgis-users
