Hi hackers,
  
# Background

PostgreSQL maintains a list of temporary tables for 'on commit
drop/delete rows' via an on_commits list in the session. Once a
transaction accesses a temp table or namespace, the
XACT_FLAGS_ACCESSEDTEMPNAMESPACE flag is set. Before committing, the
PreCommit_on_commit_actions function truncates all 'commit delete
rows' temp tables, even those not accessed in the current transaction.
Commit performance can degrade if there are many such temp tables.


In practice, users created many 'commit delete rows' temp tables in a
session, but each transaction only accessed a few. With varied access
frequency, users were reluctant to change to 'on commit drop'.


Below is an example showing the effect of the number of temp tables
on commit performance:
```
-- 100
DO $$
DECLARE
    begin
        FOR i IN 1..100 LOOP
            EXECUTE format('CREATE TEMP TABLE 
temp_table_%s (id int) on commit delete ROWS', i) ;
        END LOOP;
    END;
$$;
postgres=# insert into temp_table_1 select 1;
INSERT 0 1
Time: 1.325 ms
postgres=# insert into temp_table_1 select 1;
INSERT 0 1
Time: 1.330 ms
```


```
-- 1000
DO $$
DECLARE
    begin
        FOR i IN 1..1000 LOOP
            EXECUTE format('CREATE TEMP TABLE 
temp_table_%s (id int) on commit delete ROWS', i) ;
        END LOOP;
    END;
$$;
postgres=# insert into temp_table_1 select 1;
INSERT 0 1
Time: 10.939 ms
postgres=# insert into temp_table_1 select 1;
INSERT 0 1
Time: 10.955 ms
```


```
-- 10000
DO $$
DECLARE
    begin
        FOR i IN 1..10000 LOOP
            EXECUTE format('CREATE TEMP TABLE 
temp_table_%s (id int) on commit delete ROWS', i) ;
        END LOOP;
    END;
$$;
postgres=# insert into temp_table_1 select 1;
INSERT 0 1
Time: 110.253 ms
postgres=# insert into temp_table_1 select 1;
INSERT 0 1
Time: 175.875 ms
```


# Solution


An intuitive solution is to truncate only the temp tables that
the current process has accessed upon transaction commit.


In the attached patch (based on HEAD):
- A Bloom filter (can also be a list or hash table) maintains
the temp tables accessed by the current transaction.
- Only temp tables filtered through the Bloom filter need
truncation. False positives may occur, but they are
acceptable.
- The Bloom filter is reset at the start of the transaction,
indicating no temp tables have been accessed by the
current transaction yet.


After optimization, the performance for the same case is as
follows:
```
-- 100
DO $$
DECLARE
    begin
        FOR i IN 1..100 LOOP
            EXECUTE format('CREATE TEMP TABLE 
temp_table_%s (id int) on commit delete ROWS', i) ;
        END LOOP;
    END;
$$;
postgres=# insert into temp_table_1 select 1;
INSERT 0 1
Time: 0.447 ms
postgres=# insert into temp_table_1 select 1;
INSERT 0 1
Time: 0.453 ms
```


```
-- 1000
DO $$
DECLARE
    begin
        FOR i IN 1..1000 LOOP
            EXECUTE format('CREATE TEMP TABLE 
temp_table_%s (id int) on commit delete ROWS', i) ;
        END LOOP;
    END;
$$;
postgres=# insert into temp_table_1 select 1;
INSERT 0 1
Time: 0.531 ms
postgres=# insert into temp_table_1 select 1;
INSERT 0 1
Time: 0.567 ms
```


```
-- 10000
DO $$
DECLARE
    begin
        FOR i IN 1..10000 LOOP
            EXECUTE format('CREATE TEMP TABLE 
temp_table_%s (id int) on commit delete ROWS', i) ;
        END LOOP;
    END;
$$;
postgres=# insert into temp_table_1 select 1;
INSERT 0 1
Time: 1.370 ms
postgres=# insert into temp_table_1 select 1;
INSERT 0 1
Time: 1.362 ms
```


Hoping for some suggestions from hackers.


Best Regards,
Fei Changhong


 

Attachment: v1-0000-Optimize-commit-with-temp-tables.patch
Description: Binary data

Reply via email to