Hi hackers, # Background
PostgreSQL maintains a list of temporary tables for 'on commit drop/delete rows' via an on_commits list in the session. Once a transaction accesses a temp table or namespace, the XACT_FLAGS_ACCESSEDTEMPNAMESPACE flag is set. Before committing, the PreCommit_on_commit_actions function truncates all 'commit delete rows' temp tables, even those not accessed in the current transaction. Commit performance can degrade if there are many such temp tables. In practice, users created many 'commit delete rows' temp tables in a session, but each transaction only accessed a few. With varied access frequency, users were reluctant to change to 'on commit drop'. Below is an example showing the effect of the number of temp tables on commit performance: ``` -- 100 DO $$ DECLARE begin FOR i IN 1..100 LOOP EXECUTE format('CREATE TEMP TABLE temp_table_%s (id int) on commit delete ROWS', i) ; END LOOP; END; $$; postgres=# insert into temp_table_1 select 1; INSERT 0 1 Time: 1.325 ms postgres=# insert into temp_table_1 select 1; INSERT 0 1 Time: 1.330 ms ``` ``` -- 1000 DO $$ DECLARE begin FOR i IN 1..1000 LOOP EXECUTE format('CREATE TEMP TABLE temp_table_%s (id int) on commit delete ROWS', i) ; END LOOP; END; $$; postgres=# insert into temp_table_1 select 1; INSERT 0 1 Time: 10.939 ms postgres=# insert into temp_table_1 select 1; INSERT 0 1 Time: 10.955 ms ``` ``` -- 10000 DO $$ DECLARE begin FOR i IN 1..10000 LOOP EXECUTE format('CREATE TEMP TABLE temp_table_%s (id int) on commit delete ROWS', i) ; END LOOP; END; $$; postgres=# insert into temp_table_1 select 1; INSERT 0 1 Time: 110.253 ms postgres=# insert into temp_table_1 select 1; INSERT 0 1 Time: 175.875 ms ``` # Solution An intuitive solution is to truncate only the temp tables that the current process has accessed upon transaction commit. In the attached patch (based on HEAD): - A Bloom filter (can also be a list or hash table) maintains the temp tables accessed by the current transaction. - Only temp tables filtered through the Bloom filter need truncation. False positives may occur, but they are acceptable. - The Bloom filter is reset at the start of the transaction, indicating no temp tables have been accessed by the current transaction yet. After optimization, the performance for the same case is as follows: ``` -- 100 DO $$ DECLARE begin FOR i IN 1..100 LOOP EXECUTE format('CREATE TEMP TABLE temp_table_%s (id int) on commit delete ROWS', i) ; END LOOP; END; $$; postgres=# insert into temp_table_1 select 1; INSERT 0 1 Time: 0.447 ms postgres=# insert into temp_table_1 select 1; INSERT 0 1 Time: 0.453 ms ``` ``` -- 1000 DO $$ DECLARE begin FOR i IN 1..1000 LOOP EXECUTE format('CREATE TEMP TABLE temp_table_%s (id int) on commit delete ROWS', i) ; END LOOP; END; $$; postgres=# insert into temp_table_1 select 1; INSERT 0 1 Time: 0.531 ms postgres=# insert into temp_table_1 select 1; INSERT 0 1 Time: 0.567 ms ``` ``` -- 10000 DO $$ DECLARE begin FOR i IN 1..10000 LOOP EXECUTE format('CREATE TEMP TABLE temp_table_%s (id int) on commit delete ROWS', i) ; END LOOP; END; $$; postgres=# insert into temp_table_1 select 1; INSERT 0 1 Time: 1.370 ms postgres=# insert into temp_table_1 select 1; INSERT 0 1 Time: 1.362 ms ``` Hoping for some suggestions from hackers. Best Regards, Fei Changhong
v1-0000-Optimize-commit-with-temp-tables.patch
Description: Binary data