Hello everyone,

I just want to report the outcomes of my latest performance research.

Postgres 12.1 and Postgres 12.2 are both restrictively slow in performing our 
geometry processing queries - no matter what postgis version (3.0.0, 2.5) I use 
and how the raw data came into the db. Setting jit on or off and/or allow 
parallel workers and ANALYZE before the queries hasn’t a significant effect on 
my test query. 

I installed Postgres 10.4 with postgis 2.4 on the same machine on a separate 
cluster and it runs the queries in fraction of time. See explain analyze output 
below.

I haven’t tested the newly released postgis 3.0.1 since there is no easy 
installation so far (as long as I see).
As long as I don’t see a clear way to break it down further we will stay with 
postgres 10.4.

Regards,

Stefan

—————————
NEW SETUP WITH POSTGRES 12.4, POSTGIS 2.4


EXPLAIN ANALYZE CREATE TABLE osm_admin_lines_split AS
SELECT osm_id, name, admin_level, name_en, name_fr, area, iso, is_in, 
ST_SubDivide(ST_Collect(ST_ExteriorRing(the_geom)),5000) AS geometry
FROM (SELECT osm_id, name, admin_level, name_en, name_fr, area, iso, is_in, 
(ST_Dump(geometry)).geom AS the_geom FROM osm_admin_areas) AS foo
GROUP BY osm_id, name, admin_level, name_en, name_fr, area, iso, is_in;
                                                                                
                      QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 ProjectSet  (cost=272668581.44..3513224466.44 rows=612003000000 width=65) 
(actual time=56997.276..91803.763 rows=739733 loops=1)
   ->  GroupAggregate  (cost=272668581.44..297148701.44 rows=612003000 
width=65) (actual time=56997.263..69736.892 rows=611704 loops=1)
         Group Key: osm_admin_areas.osm_id, osm_admin_areas.name, 
osm_admin_areas.admin_level, osm_admin_areas.name_en, osm_admin_areas.name_fr, 
osm_admin_areas.area, osm_admin_areas.iso, osm_admin_areas.is_in
         ->  Sort  (cost=272668581.44..274198588.94 rows=612003000 width=65) 
(actual time=56997.204..61484.384 rows=708178 loops=1)
               Sort Key: osm_admin_areas.osm_id, osm_admin_areas.name, 
osm_admin_areas.admin_level, osm_admin_areas.name_en, osm_admin_areas.name_fr, 
osm_admin_areas.area, osm_admin_areas.iso, osm_admin_areas.is_in
               Sort Method: external merge  Disk: 4087800kB
               ->  Result  (cost=0.00..162527467.79 rows=612003000 width=65) 
(actual time=6.103..44923.073 rows=708178 loops=1)
                     ->  ProjectSet  (cost=0.00..3406687.79 rows=612003000 
width=65) (actual time=6.095..44633.240 rows=708178 loops=1)
                           ->  Seq Scan on osm_admin_areas  
(cost=0.00..190612.03 rows=612003 width=6983) (actual time=0.103..4156.204 
rows=611707 loops=1)
 Planning time: 4.031 ms
 Execution time: 193917.392 ms
(11 rows)

> Am 20.02.2020 um 14:49 schrieb Stefan Duling <stefan.dul...@mapz.com>:
> 
> Hi Imre,
> 
> thanks for showing the other discussions. It really seems like it’s a very 
> similar problem.
> 
> I tried:
> 
> - jit = off
> - max_parallel_workers_per_gather = 0
> - jit = off with max_parallel_workers_per_gather = 0
> - ANALYZE osm_admin_areas; before the query
> 
> Sadly those options have not the effect to bring the query time close to the 
> old setup with postgres 10
> 
> Next ideas I plan to test:
> - There is one further difference between the 2 setups: I use Imposm to 
> import the data. On the old setup (postgres10) I use v0.8.1, on the new setup 
> (postgres12) I use v0.10.0. Although the geometry column contains exactly the 
> same data in both setups, it could be worth to test it. Imposm was used by 
> the others in your linked discussions too.
> - Try to upgrade Postgres to 12.2
> 
> Regards,
> Stefan
> 
> 
>> Am 20.02.2020 um 13:08 schrieb Imre Samu <pella.s...@gmail.com 
>> <mailto:pella.s...@gmail.com>>:
>> 
>> Hi Stefan,
>> 
>> > I tried Postgis 2.5 on the new setup with Postgres 12.1. I can’t see any 
>> > significant differences.
>> 
>> Please re-test with:
>> -   "ANALYZE  osm_admin_areas; "   AND   " jit=off   
>> max_parallel_workers_per_gather=0" parameters 
>> based on similar problem:  
>> https://github.com/openstreetmap/osm2pgsql/issues/1045#issuecomment-570165252
>>  
>> <https://github.com/openstreetmap/osm2pgsql/issues/1045#issuecomment-570165252>
>>   
>> And check the debugging thread: 
>> https://www.postgresql.org/message-id/flat/16183-64843a4ebc44d9d2%40postgresql.org
>>  
>> <https://www.postgresql.org/message-id/flat/16183-64843a4ebc44d9d2%40postgresql.org>
>>  ( maybe there is a fix )
>> 
>> --------
>> 
>> And is it possible to upgrade to PG12.2 ( from 12.1 ) ? 
>> - There are lot of fixes : https://www.postgresql.org/docs/release/12.2/ 
>> <https://www.postgresql.org/docs/release/12.2/>
>> imho:  your old database:  "Postgres 10.12" is the latest PG10 ; and it is 
>> correct to compare with the latest PG12 )
>> 
>> regards,
>>  Imre
>> 
>> 
>> 
>> 
>> Stefan Duling <stefan.dul...@mapz.com <mailto:stefan.dul...@mapz.com>> ezt 
>> írta (időpont: 2020. febr. 20., Cs, 11:36):
>> Hi Paul,
>> 
>> thanks for your response!
>> 
>> I tried Postgis 2.5 on the new setup with Postgres 12.1. I can’t see any 
>> significant differences.
>> I also tried to use only one worker per gatherer. Although there is a little 
>> improvement, the query time still is doubled compared to the old setup with 
>> Postgres 10.
>> 
>> Almost all geometry related queries of my data processing pipeline need 
>> multiple calculation times while indexing and clustering is faster with 
>> Postgres 12.
>> 
>> I continue in trying to break it down further.
>> 
>> Regards,
>> 
>> Stefan
>> 
>> —————————
>> 
>> NEW SETUP WITH POSTGIS 2.5
>> 
>> 
>> EXPLAIN ANALYZE CREATE TABLE public.osm_admin_lines_split AS
>> SELECT osm_id, name, admin_level, name_en, name_fr, area, iso, is_in, 
>> ST_SubDivide(ST_Collect(ST_ExteriorRing(the_geom)),5000) AS geometry
>> FROM (SELECT osm_id, name, admin_level, name_en, name_fr, area, iso, is_in, 
>> (ST_Dump(geometry)).geom AS the_geom FROM osm_admin_areas) AS foo
>> GROUP BY osm_id, name, admin_level, name_en, name_fr, area, iso, is_in;
>>                                                                              
>>                          QUERY PLAN
>> -----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
>>  ProjectSet  (cost=40703445.82..43928566.78 rows=609088000 width=65) (actual 
>> time=51587.707..506619.624 rows=750429 loops=1)
>>    ->  GroupAggregate  (cost=40703445.82..40727809.34 rows=609088 width=65) 
>> (actual time=51587.663..67802.117 rows=611704 loops=1)
>>          Group Key: osm_admin_areas.osm_id, osm_admin_areas.name 
>> <http://osm_admin_areas.name/>, osm_admin_areas.admin_level, 
>> osm_admin_areas.name_en, osm_admin_areas.name_fr, osm_admin_areas.area, 
>> osm_admin_areas.iso, osm_admin_areas.is_in
>>          ->  Sort  (cost=40703445.82..40704968.54 rows=609088 width=65) 
>> (actual time=51586.580..57029.848 rows=708178 loops=1)
>>                Sort Key: osm_admin_areas.osm_id, osm_admin_areas.name 
>> <http://osm_admin_areas.name/>, osm_admin_areas.admin_level, 
>> osm_admin_areas.name_en, osm_admin_areas.name_fr, osm_admin_areas.area, 
>> osm_admin_areas.iso, osm_admin_areas.is_in
>>                Sort Method: external merge  Disk: 4087808kB
>>                ->  Gather  (cost=1000.00..40638832.88 rows=609088 width=65) 
>> (actual time=1240.826..35018.389 rows=708178 loops=1)
>>                      Workers Planned: 4
>>                      Workers Launched: 4
>>                      ->  Result  (cost=0.00..40576924.08 rows=152272000 
>> width=65) (actual time=988.020..29198.984 rows=141636 loops=5)
>>                            ->  ProjectSet  (cost=0.00..986204.08 
>> rows=152272000 width=65) (actual time=988.012..29085.994 rows=141636 loops=5)
>>                                  ->  Parallel Seq Scan on osm_admin_areas  
>> (cost=0.00..186014.72 rows=152272 width=6545) (actual time=0.369..5959.617 
>> rows=122341 loops=5)
>>  Planning Time: 14.213 ms
>>  JIT:
>>    Functions: 110
>>    Options: Inlining true, Optimization true, Expressions true, Deforming 
>> true
>>    Timing: Generation 39.248 ms, Inlining 649.432 ms, Optimization 2369.206 
>> ms, Emission 1896.395 ms, Total 4954.282 ms
>>  Execution Time: 620994.272 ms
>> (18 rows)
>> 
>> —————————
>> 
>> NEW SETUP WITH POSTGIS 2.5 AND max_parallel_workers_per_gather = 1
>> 
>> 
>> EXPLAIN ANALYZE CREATE TABLE public.osm_admin_lines_split AS
>> SELECT osm_id, name, admin_level, name_en, name_fr, area, iso, is_in, 
>> ST_SubDivide(ST_Collect(ST_ExteriorRing(the_geom)),5000) AS geometry
>> FROM (SELECT osm_id, name, admin_level, name_en, name_fr, area, iso, is_in, 
>> (ST_Dump(geometry)).geom AS the_geom FROM osm_admin_areas) AS foo
>> GROUP BY osm_id, name, admin_level, name_en, name_fr, area, iso, is_in;
>>                                                                              
>>                          QUERY PLAN
>> -----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
>>  ProjectSet  (cost=95352014.80..98577135.76 rows=609088000 width=65) (actual 
>> time=51995.515..493885.155 rows=750429 loops=1)
>>    ->  GroupAggregate  (cost=95352014.80..95376378.32 rows=609088 width=65) 
>> (actual time=51995.438..66093.453 rows=611704 loops=1)
>>          Group Key: osm_admin_areas.osm_id, osm_admin_areas.name 
>> <http://osm_admin_areas.name/>, osm_admin_areas.admin_level, 
>> osm_admin_areas.name_en, osm_admin_areas.name_fr, osm_admin_areas.area, 
>> osm_admin_areas.iso, osm_admin_areas.is_in
>>          ->  Sort  (cost=95352014.80..95353537.52 rows=609088 width=65) 
>> (actual time=51995.267..55931.535 rows=708178 loops=1)
>>                Sort Key: osm_admin_areas.osm_id, osm_admin_areas.name 
>> <http://osm_admin_areas.name/>, osm_admin_areas.admin_level, 
>> osm_admin_areas.name_en, osm_admin_areas.name_fr, osm_admin_areas.area, 
>> osm_admin_areas.iso, osm_admin_areas.is_in
>>                Sort Method: external merge  Disk: 4087808kB
>>                ->  Gather  (cost=1000.00..95287401.86 rows=609088 width=65) 
>> (actual time=969.364..38077.753 rows=708178 loops=1)
>>                      Workers Planned: 1
>>                      Workers Launched: 1
>>                      ->  Result  (cost=0.00..95225493.06 rows=358287000 
>> width=65) (actual time=755.966..34508.959 rows=354089 loops=2)
>>                            ->  ProjectSet  (cost=0.00..2070873.06 
>> rows=358287000 width=65) (actual time=755.958..34344.231 rows=354089 loops=2)
>>                                  ->  Parallel Seq Scan on osm_admin_areas  
>> (cost=0.00..188074.87 rows=358287 width=6545) (actual time=4.870..4186.864 
>> rows=305854 loops=2)
>>  Planning Time: 0.310 ms
>>  JIT:
>>    Functions: 53
>>    Options: Inlining true, Optimization true, Expressions true, Deforming 
>> true
>>    Timing: Generation 10.729 ms, Inlining 235.111 ms, Optimization 789.374 
>> ms, Emission 455.139 ms, Total 1490.353 ms
>>  Execution Time: 602556.200 ms
>> (18 rows)
>> 
>> 
>> 
>> > Am 19.02.2020 um 12:24 schrieb Paul Ramsey <pram...@cleverelephant.ca 
>> > <mailto:pram...@cleverelephant.ca>>:
>> > 
>> > If you can cut this down to a smaller, shorter query that shows the same 
>> > characteristics, I would like to see it in a profiler to determine if the 
>> > hot spots have moved. The explain seems more or less structurally the 
>> > same, which leaves a couple possibilities:
>> > - the st_collect running in parallel is actually an antipattern, (there’s 
>> > no performance benefit in the collect itself, so the only potential win is 
>> > in allowing other things under the collect to go parallel)
>> > - one of the other functions in your query has gotten a lot hotter
>> > The external sort is the same size in both cases, so that’s not the issue.
>> > You can probably confirm if parallelism is the problem by just turning it 
>> > off in pg12 and re-running, see if things get better. set 
>> > max_workers_per_gather to 1 or one of the other parallel config options.
>> > Thanks for gathering data,
>> > P
>> > 
>> >> On Feb 19, 2020, at 5:37 AM, Stefan Duling <stefan.dul...@mapz.com 
>> >> <mailto:stefan.dul...@mapz.com>> wrote:
>> >> 
>> >> Hi everyone,
>> >> 
>> >> at the moment I am trying out a new setup for our geo database server. 
>> >> Primarily I upgraded from Postgres and Postgis to newer versions. Beside 
>> >> of the memory both databases run on the same machine setups and contain 
>> >> identical data. Differences are:
>> >> 
>> >> OLD SETUP
>> >> 
>> >> 16GB RAM
>> >> Postgres 10.12
>> >> Postgis 2.4
>> >> shared_buffers = 4GB
>> >> work_mem = 128MB
>> >> maintenance_work_mem = 1536MB  
>> >> 
>> >> NEW SETUP
>> >> 
>> >> 32GB RAM
>> >> Postgres 12.1
>> >> Postgis 3.0
>> >> shared_buffers = 6GB
>> >> work_mem = 256MB
>> >> maintenance_work_mem = 3GB
>> >> 
>> >> 
>> >> Sadly I observe several queries that take multiple times to execute with 
>> >> the new setup. While the most expensive query took ~10h on the old setup, 
>> >> nearly 3 days are needed on the new setup. Currently I am analysing a 
>> >> more lightweight query, that execution time doubled with the new setup.
>> >> 
>> >> Can someone help me with the EXPLAIN ANALYZE results? Is Postgres 12 / 
>> >> Postgis 3 slower in general? Is the use of parallel workers ineffective 
>> >> in my use case?
>> >> 
>> >> Thanks in advance!
>> >> 
>> >> Stefan Duling
>> >> 
>> >> —————————
>> >> 
>> >> OLD SETUP
>> >> 
>> >> EXPLAIN ANALYZE CREATE TABLE osm_admin_lines_split AS
>> >> SELECT osm_id, name, admin_level, name_en, name_fr, area, iso, is_in, 
>> >> ST_SubDivide(ST_Collect(ST_ExteriorRing(the_geom)),5000) AS geometry
>> >> FROM (SELECT osm_id, name, admin_level, name_en, name_fr, area, iso, 
>> >> is_in, (ST_Dump(geometry)).geom AS the_geom FROM osm_admin_areas) AS foo
>> >> GROUP BY osm_id, name, admin_level, name_en, name_fr, area, iso, is_in;
>> >>                                                                           
>> >>                           QUERY PLAN
>> >> -----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
>> >> ProjectSet  (cost=274906731.77..3540274986.77 rows=616689000000 width=65) 
>> >> (actual time=97024.487..141100.410 rows=739733 loops=1)
>> >>  ->  GroupAggregate  (cost=274906731.77..299574291.77 rows=616689000 
>> >> width=65) (actual time=97024.476..118509.455 rows=611704 loops=1)
>> >>        Group Key: osm_admin_areas.osm_id, osm_admin_areas.name 
>> >> <http://osm_admin_areas.name/>, osm_admin_areas.admin_level, 
>> >> osm_admin_areas.name_en, osm_admin_areas.name_fr, osm_admin_areas.area, 
>> >> osm_admin_areas.iso, osm_admin_areas.is_in
>> >>        ->  Sort  (cost=274906731.77..276448454.27 rows=616689000 
>> >> width=65) (actual time=97024.431..108902.529 rows=708178 loops=1)
>> >>              Sort Key: osm_admin_areas.osm_id, osm_admin_areas.name 
>> >> <http://osm_admin_areas.name/>, osm_admin_areas.admin_level, 
>> >> osm_admin_areas.name_en, osm_admin_areas.name_fr, osm_admin_areas.area, 
>> >> osm_admin_areas.iso, osm_admin_areas.is_in
>> >>              Sort Method: external merge  Disk: 4087720kB
>> >>              ->  Result  (cost=0.00..163888355.59 rows=616689000 
>> >> width=65) (actual time=42.029..84168.985 rows=708178 loops=1)
>> >>                    ->  ProjectSet  (cost=0.00..3549215.58 rows=616689000 
>> >> width=65) (actual time=42.019..83866.290 rows=708178 loops=1)
>> >>                          ->  Seq Scan on osm_admin_areas  
>> >> (cost=0.00..308514.89 rows=616689 width=6931) (actual 
>> >> time=16.288..9536.779 rows=611707 loops=1)
>> >> Planning time: 0.364 ms
>> >> Execution time: 244169.613 ms
>> >> (11 rows)
>> >> 
>> >> —————————
>> >> 
>> >> NEW SETUP
>> >> 
>> >> EXPLAIN ANALYZE CREATE TABLE public.osm_admin_lines_split AS
>> >> SELECT osm_id, name, admin_level, name_en, name_fr, area, iso, is_in, 
>> >> ST_SubDivide(ST_Collect(ST_ExteriorRing(the_geom)),5000) AS geometry
>> >> FROM (SELECT osm_id, name, admin_level, name_en, name_fr, area, iso, 
>> >> is_in, (ST_Dump(geometry)).geom AS the_geom FROM osm_admin_areas) AS foo
>> >> GROUP BY osm_id, name, admin_level, name_en, name_fr, area, iso, is_in;
>> >>                                                                           
>> >>                           QUERY PLAN
>> >> -----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
>> >> ProjectSet  (cost=390996247.38..409798165.99 rows=620679000 width=65) 
>> >> (actual time=38272.152..512018.424 rows=750429 loops=1)
>> >>  ->  GroupAggregate  (cost=390996247.38..391174692.59 rows=620679 
>> >> width=65) (actual time=38272.119..58302.727 rows=611704 loops=1)
>> >>        Group Key: osm_admin_areas.osm_id, osm_admin_areas.name 
>> >> <http://osm_admin_areas.name/>, osm_admin_areas.admin_level, 
>> >> osm_admin_areas.name_en, osm_admin_areas.name_fr, osm_admin_areas.area, 
>> >> osm_admin_areas.iso, osm_admin_areas.is_in
>> >>        ->  Sort  (cost=390996247.38..390997799.08 rows=620679 width=65) 
>> >> (actual time=38272.021..46632.706 rows=708178 loops=1)
>> >>              Sort Key: osm_admin_areas.osm_id, osm_admin_areas.name 
>> >> <http://osm_admin_areas.name/>, osm_admin_areas.admin_level, 
>> >> osm_admin_areas.name_en, osm_admin_areas.name_fr, osm_admin_areas.area, 
>> >> osm_admin_areas.iso, osm_admin_areas.is_in
>> >>              Sort Method: external merge  Disk: 4087328kB
>> >>              ->  Gather  (cost=1000.00..390930320.45 rows=620679 
>> >> width=65) (actual time=1035.032..23429.953 rows=708178 loops=1)
>> >>                    Workers Planned: 4
>> >>                    Workers Launched: 4
>> >>                    ->  Result  (cost=0.00..390867252.55 rows=155170000 
>> >> width=65) (actual time=817.294..20634.253 rows=141636 loops=5)
>> >>                          ->  ProjectSet  (cost=0.00..1390552.55 
>> >> rows=155170000 width=65) (actual time=817.287..20546.850 rows=141636 
>> >> loops=5)
>> >>                                ->  Parallel Seq Scan on osm_admin_areas  
>> >> (cost=0.00..226001.70 rows=155170 width=6702) (actual 
>> >> time=0.381..3937.298 rows=122341 loops=5)
>> >> Planning Time: 13.173 ms
>> >> JIT:
>> >>  Functions: 110
>> >>  Options: Inlining true, Optimization true, Expressions true, Deforming 
>> >> true
>> >>  Timing: Generation 27.663 ms, Inlining 580.814 ms, Optimization 2265.257 
>> >> ms, Emission 1222.527 ms, Total 4096.260 ms
>> >> Execution Time: 629042.211 ms
>> >> (18 rows)
>> >> _______________________________________________
>> >> postgis-users mailing list
>> >> postgis-users@lists.osgeo.org <mailto:postgis-users@lists.osgeo.org>
>> >> https://lists.osgeo.org/mailman/listinfo/postgis-users 
>> >> <https://lists.osgeo.org/mailman/listinfo/postgis-users>
>> > 
>> > _______________________________________________
>> > postgis-users mailing list
>> > postgis-users@lists.osgeo.org <mailto:postgis-users@lists.osgeo.org>
>> > https://lists.osgeo.org/mailman/listinfo/postgis-users 
>> > <https://lists.osgeo.org/mailman/listinfo/postgis-users>
>> 
>> _______________________________________________
>> postgis-users mailing list
>> postgis-users@lists.osgeo.org <mailto:postgis-users@lists.osgeo.org>
>> https://lists.osgeo.org/mailman/listinfo/postgis-users 
>> <https://lists.osgeo.org/mailman/listinfo/postgis-users>_______________________________________________
>> postgis-users mailing list
>> postgis-users@lists.osgeo.org <mailto:postgis-users@lists.osgeo.org>
>> https://lists.osgeo.org/mailman/listinfo/postgis-users
> 
> _______________________________________________
> postgis-users mailing list
> postgis-users@lists.osgeo.org
> https://lists.osgeo.org/mailman/listinfo/postgis-users

_______________________________________________
postgis-users mailing list
postgis-users@lists.osgeo.org
https://lists.osgeo.org/mailman/listinfo/postgis-users

Reply via email to