here they are: (I replace the column and table names) also I post 2 more remarks, one on left join, and one on the test I did on postgres 15 postgis 3.3...
2023=# explain (analyze, buffers) select A.x, count(B.x) from A left join B on st_within(B.geom, A.geom) group by A.x; QUERY PLAN ------------------------------------------------------------------------------------------------------------------------------------------ GroupAggregate (cost=212638398.98..212701792.16 rows=20 width=16) (actual time=86717.857..86757.820 rows=11 loops=1) Group Key: A.x Buffers: shared hit=4243867 -> Sort (cost=212638398.98..212659529.97 rows=8452398 width=16) (actual time=86717.851..86727.334 rows=421307 loops=1) Sort Key: A.x Sort Method: quicksort Memory: 37963kB Buffers: shared hit=4243867 -> Nested Loop Left Join (cost=0.00..211521459.31 rows=8452398 width=16) (actual time=17.473..86642.332 rows=421307 loops=1) Join Filter: st_within(B.geom, A.geom) Rows Removed by Join Filter: 4229377 Buffers: shared hit=4243867 -> Seq Scan on A (cost=0.00..9.20 rows=20 width=17752) (actual time=0.009..0.043 rows=11 loops=1) Buffers: shared hit=9 -> Materialize (cost=0.00..22309.83 rows=422789 width=40) (actual time=0.001..23.392 rows=422789 loops=11) Buffers: shared hit=15968 -> Seq Scan on B (cost=0.00..20195.89 rows=422789 width=40) (actual time=0.006..57.651 rows=422789 loops=1) Buffers: shared hit=15968 Planning Time: 0.693 ms Execution Time: 86763.087 ms (19 lignes) 2023=# explain (analyze, buffers) select A.x, count(B.x) from A, B where st_within(B.geom, A.geom) group by A.x; QUERY PLAN -------------------------------------------------------------------------------------------------------------------------------------------------------------------- Finalize GroupAggregate (cost=6301606.00..6301608.60 rows=20 width=16) (actual time=11857.363..11863.212 rows=6 loops=1) Group Key: A.x Buffers: shared hit=2128836 -> Gather Merge (cost=6301606.00..6301608.30 rows=20 width=16) (actual time=11857.359..11863.207 rows=12 loops=1) Workers Planned: 1 Workers Launched: 1 Buffers: shared hit=2128836 -> Sort (cost=6300605.99..6300606.04 rows=20 width=16) (actual time=11840.355..11840.356 rows=6 loops=2) Sort Key: A.x Sort Method: quicksort Memory: 25kB Worker 0: Sort Method: quicksort Memory: 25kB Buffers: shared hit=2128836 -> Partial HashAggregate (cost=6300605.36..6300605.56 rows=20 width=16) (actual time=11840.331..11840.332 rows=6 loops=2) Group Key: A.x Buffers: shared hit=2128825 -> Nested Loop (cost=0.13..6275745.36 rows=4971999 width=16) (actual time=0.505..11781.817 rows=210651 loops=2) Buffers: shared hit=2128825 -> Parallel Seq Scan on B (cost=0.00..18454.99 rows=248699 width=40) (actual time=0.005..22.859 rows=211395 loops=2) Buffers: shared hit=15968 -> Index Scan using A_geom_idx on A (cost=0.13..25.15 rows=1 width=17752) (actual time=0.054..0.055 rows=1 loops=422789) Index Cond: (geom ~ B.geom) Filter: st_within(B.geom, geom) Rows Removed by Filter: 0 Buffers: shared hit=2112857 Planning Time: 0.252 ms Execution Time: 11863.357 ms (26 lignes) Marc MILLAS Senior Architect +33607850334 www.mokadb.com On Sat, Jan 7, 2023 at 9:40 PM Erik Wienhold <e...@ewie.name> wrote: > > On 07/01/2023 20:46 CET Marc Millas <marc.mil...@mokadb.com> wrote: > > > > Hi, > > > > postgres 12, postgis 3.0 > > > > I have a small table A, 11 rows with a varchar column x and a geometry > column y. > > gist index on the geometry column. > > the geometry do contains multipolygons (regions on a map) > > I have a second table B , same structure, around 420 000 rows. > > no index, > > the geometry do contains points. > > all geometries are on 4326 srid. > > > > If i ask to count points in each multipolygons: > > > > select A.x, count(B.x) from A, B where st_within(B.y, A.y) group by A.x; > > it takes 11 seconds (everything in shared buffers). > > If I do the very same thing as: > > select A.x, count(B.x) from A left join B on st_within(B.y, A.y) group > by A.x; > > same result, but 85 seconds (every thing in shared buffers, again) > > if I redo asking with explain analyze, buffers, the plan is very > different. > > > > > > if I do create a gist index on geometry column of the big table, > both syntax takes 21 seconds. > > > > I get the feeling I am missing something.. (at least 2 things...) > > can someone shed some light ?? > > Please provide the executions plans for both queries with and without the > index on B.y. > > -- > Erik >