Hi,

Thanks to your efforts the performance of the partitioned table has improved 
greatly.
Since I evaluated the performance by combining the performance improvement 
patches proposed in PG12, I share it.

The purpose of this evaluation is to organize performance issues of the table 
which is partitioned large number like over thousand and I want to bring the 
partitioned table performance close to the no partitioned table performance.

I used pgbench.
On the premise that tables don't have data and plan_cache_mode is auto.

- source:

master(38763d6778 Date:   Thu Sep 20 15:52:39 2018 +1200) + v9 patch[1] + v5 
patch[2] + v3 patch[3]

[1] Reduce partition tuple routing overheads
    https://commitfest.postgresql.org/19/1690/

[2] Revise executor's handling of range table relations
    https://commitfest.postgresql.org/19/1758/

[3] Speed up planning with partitions
    https://commitfest.postgresql.org/19/1778/


- table definition:

When 6400 items of data is inserted to parent table, the each leaf partitions 
have the same number of items.
For example, the following DDL is the number of leaf partitions is a hundred.

create table test.accounts_history(id serial, aid int, delta int, mtime 
timestamp without time zone) partition by range(aid);

create table test.account_part_1 partition of test.accounts for values from (1) 
to (65);
create table test.account_part_2 partition of test.accounts for values from 
(65) to (129);
...
create table test.account_part_100 partition of test.accounts for values from 
(6337) to (6400);

create table test.ah_part_1 partition of test.accounts_history for values from 
(1) to (65);
create table test.ah_part_2 partition of test.accounts_history for values from 
(65) to (129);
...
create table test.ah_part_100 partition of test.accounts_history for values 
from (6337) to (6400);


- benchmark script:

I make SQL which only one leaf partition is targeted in each case of 
SELECT/UPDATE/DELETE/INSERT.

\set aid random(1, 6400)
\set delta random(-5000, 5000)

SELECT abalance FROM test.accounts WHERE aid = :aid;
  or
INSERT INTO test.accounts_history (aid, delta, mtime) VALUES (:aid, :delta, 
CURRENT_TIMESTAMP);
  or
UPDATE test.accounts SET abalance = abalance + :delta WHERE aid = :aid;
  or
DELETE FROM test.accounts where aid = :aid;


- results:

1. simple mode results:

part_num is the number of partition and 0 means no partitioned case.
tps_ex is tps of excluding connections establishing.

Also, after pgbench, I evaluate Planning time and Execution Time using explain 
analyze.
plan_time_avg and execute_time_avg are average of explain analyze when executed 
ten times.

pgbench -n -T 60 -r -f select.sql

part_num |   tps_ex    | plan_time_avg | execute_time_avg
----------+-------------+---------------+------------------
        0 |  8285.83582 |        0.0528 |           0.0222
      100 |   5948.1711 |        0.1342 |           0.0306
      200 | 5436.438478 |          0.15 |           0.0298
      400 | 4523.867744 |         0.148 |             0.03
      800 | 3460.625739 |        0.1447 |           0.0305
     1600 | 2428.795542 |        0.1528 |           0.0303
     3200 | 1539.672214 |        0.1552 |           0.0316
     6400 |  880.965232 |        0.1704 |           0.0288
(8 rows)

pgbench -n -T 60 -r -f update.sql

part_num |   tps_ex    | plan_time_avg | execute_time_avg
----------+-------------+---------------+------------------
        0 |  7360.58261 |        0.0596 |           0.0417
      100 | 4633.880563 |        0.1564 |            0.105
      200 | 3972.737702 |         0.152 |           0.1007
      400 |  3000.23471 |        0.1594 |           0.1039
      800 | 2139.676379 |        0.1664 |           0.1055
     1600 | 1348.553673 |         0.165 |           0.1056
     3200 |   787.48559 |        0.1774 |           0.1124
     6400 |  411.575671 |        0.1823 |           0.1089
(8 rows)

pgbench -n -T 60 -r -f delete.sql

part_num |   tps_ex    | plan_time_avg | execute_time_avg
----------+-------------+---------------+------------------
        0 |  8133.84019 |         0.057 |           0.0403
      100 | 5150.452458 |        0.1398 |           0.0936
      200 |  4352.69018 |        0.1414 |           0.0964
      400 |  3298.86364 |        0.1459 |            0.099
      800 | 2245.946178 |        0.1559 |           0.1029
     1600 |  1386.92366 |        0.1591 |           0.1048
     3200 |  802.024765 |        0.1617 |           0.1042
     6400 |  407.214158 |         0.168 |           0.1087
(8 rows)

pgbench -n -T 60 -r -f insert.sql

part_num |   tps_ex    | plan_time_avg | execute_time_avg
----------+-------------+---------------+------------------
        0 | 5246.142416 |        0.0307 |           0.0601
      100 | 2190.331571 |        0.0311 |           0.3587
      200 | 1452.601752 |        0.0301 |           0.5065
      400 |  863.771879 |         0.031 |           0.7864
      800 |  482.528223 |        0.0308 |            1.361
     1600 |  254.026173 |        0.0325 |           2.4999
     3200 |   133.04952 |        0.0327 |             4.94
     6400 |   68.541207 |        0.0335 |           9.8271
(8 rows)

SELECT/UPDATE/DELETE case

Planning Time and Execution Time are almost same. but, tps decreases as the 
number of partitions increases. why?

INSERT case

Execution Time is longer as the number of partitions increase.
This is why find_all_inheritors locks all leaf partitions. Improvement idea was 
already proposed in [1] but, not discussion started yet.


2. prepared mode results:

part_num and tps_ex are same with simple mode results.
plan_time_avg and execute_time are the average of the 7th to 10th out of 10 
times.

pgbench -n -M prepared -T 60 -r -f select.sql

part_num |   tps_ex    | plan_time_avg | execute_time_avg
----------+-------------+---------------+------------------
        0 | 10939.35898 |         0.007 |            0.021
      100 | 4705.814419 |         0.063 |            0.025
      200 | 3884.922764 |         0.067 |           0.0255
      400 | 2762.569518 |       0.06575 |           0.0255
      800 | 1826.824659 |          0.07 |          0.02625
     1600 | 1119.961247 |         0.065 |            0.021
     3200 |  588.001498 |       0.08725 |          0.02775
     6400 |  218.817738 |        0.1325 |          0.03325
(8 rows)

pgbench -n -M prepared -T 60 -r -f update.sql

part_num |    tps_ex    | plan_time_avg | execute_time_avg
----------+--------------+---------------+------------------
        0 | 10996.995001 |       0.01475 |            0.029
      100 |  3711.790226 |       0.07975 |            0.074
      200 |  2724.792289 |        0.0865 |           0.1055
      400 |  1837.348771 |       0.09175 |            0.104
      800 |  1105.378276 |        0.0925 |          0.10425
     1600 |   619.924085 |       0.09425 |           0.1115
     3200 |   289.202665 |         0.109 |          0.12475
     6400 |   110.149098 |        0.1695 |           0.1585
(8 rows)

pgbench -n -M prepared -T 60 -r -f delete.sql

part_num |    tps_ex    | plan_time_avg | execute_time_avg
----------+--------------+---------------+------------------
        0 | 11398.458657 |         0.009 |          0.02625
      100 |  3997.105459 |         0.072 |          0.08675
      200 |  3124.704202 |        0.0745 |          0.09725
      400 |  2113.749027 |         0.083 |          0.10125
      800 |  1365.142607 |         0.081 |            0.107
     1600 |   740.248148 |        0.0895 |          0.10925
     3200 |   363.054579 |       0.09525 |          0.11375
     6400 |   124.795721 |       0.14725 |            0.158
(8 rows)

pgbench -n -M prepared -T 60 -r -f insert.sql

part_num |   tps_ex    | plan_time_avg | execute_time_avg
----------+-------------+---------------+------------------
        0 |  6308.66377 |       0.01175 |          0.04225
      100 | 2476.919703 |         0.014 |          0.26375
      200 |  1569.59956 |       0.01325 |          0.38625
      400 |    912.4026 |         0.014 |           0.6055
      800 |  502.504371 |        0.0145 |          1.05725
     1600 |   259.97686 |       0.01425 |           1.9835
     3200 |  136.809811 |       0.01225 |            3.877
     6400 |   70.489791 |        0.0105 |          7.76575
(8 rows)

SELECT/UPDATE/DELETE case

As with simple mode results, tps decreases as the number of partitions 
increases. why?
Also, I wonder why no prepared case is faster than prepared case.

When I see all 10 times results of the explain analyze, I realize Planning Time 
of 6th explain analyze is much longer.
why does it take so long to build a new generic plan?

part_num | num | plan_time | execute_time
----------+-----+-----------+--------------
     6400 |   1 |     0.459 |        0.043
     6400 |   2 |     0.168 |        0.028
     6400 |   3 |     0.169 |        0.028
     6400 |   4 |     0.151 |        0.027
     6400 |   5 |     0.148 |        0.027
     6400 |   6 |   449.213 |        0.052
     6400 |   7 |     0.142 |        0.033
     6400 |   8 |      0.12 |        0.034
     6400 |   9 |     0.136 |        0.033
     6400 |  10 |     0.132 |        0.033

INSERT case

I think the cause of performance degradation is the same as simple mode.


In the case of prepared statement, even if the target of SELECT statement is 
one leaf partition, since LockRelationOid of AcquireExecutorLocks is executed 
for all leaves, the performance decreases as the number of partitions increases.
Can AcquireExecutorLocks lock only the target leaf partition?

I am planning to investigate using a system TAP etc. for other bottlenecks.
If you have any other convenient method, please let me know.
Also, if there is something already known as a bottleneck, please let me know.

regards,

--
Sho Kato

Reply via email to