Confine RI fast-path batching to the top transaction level The FK fast-path batching added in b7b27eb41a5 buffers rows in a transaction-lived cache (ri_fastpath_cache) keyed by constraint OID. Running user-defined cast and equality functions during a batch flush, together with the cache's lifetime and iteration, exposed two defects reachable by an unprivileged table owner.
First, on subtransaction abort ri_FastPathSubXactCallback discarded the entire cache. An entry's batch holds rows buffered by the enclosing transaction, not just the aborting subxact -- the cache is keyed by constraint, so a single entry can mix rows from multiple subxact levels. An internal subxact abort during after-trigger firing (e.g. a PL/pgSQL BEGIN ... EXCEPTION block) therefore dropped buffered rows of the outer transaction without running their FK checks, letting orphan rows commit behind a constraint that still reported itself valid. The discard also left relations opened by the batch unclosed, producing "resource was not closed" warnings. Second, ri_FastPathEndBatch flushes by iterating the cache with hash_seq_search. If flush-time user code inserts into a different fast-path FK table, a new entry is added to the cache mid-scan; it may land in a bucket the scan has already passed and never be reached, and ri_FastPathTeardown then destroys the cache without flushing it, silently dropping that check. Cleanly unwinding the cache on subxact abort would require tracking the originating subxact of each buffered row, since rows from different levels share an entry (the cache is keyed by constraint) and deferred constraints cannot be flushed early at a subxact boundary. Rather than add that bookkeeping, confine batching to the top transaction level: in RI_FKey_check, when GetCurrentTransactionNestLevel() > 1, use the per-row fast path (ri_FastPathCheck) instead of buffering. Rows checked inside a subtransaction are then verified immediately and roll back cleanly with their subtransaction, and the cache only ever holds top-level rows. With the cache confined to the top level, a subtransaction abort has nothing of its own to discard, so ri_FastPathSubXactCallback is removed along with its registration. For the second defect, add a cache-wide flag (ri_fastpath_flushing) set while ri_FastPathEndBatch iterates the cache. A re-entrant FK check arriving while the flag is set takes the per-row path rather than adding an entry to the cache being scanned, so no entry can be missed and torn down unflushed. The flag is cleared in a PG_FINALLY so a flush that throws (a reported violation or an error from user code) does not leave it stuck. As defensive insurance it is also cleared in ri_FastPathXactCallback() at transaction end. The per-row fast path still bypasses SPI and stays well ahead of the pre-19 SPI-based check. A fuller fix that preserves batching across subtransactions -- whether by tracking the originating subxact of each buffered row or by per-subxact cache stacks merged into the parent on commit -- is left for a future release. The subtransaction-abort case is covered by a new regression test. The mid-scan cross-table case depends on hash bucket placement and so is not reliably reproducible in a portable test, but the flag prevents it by construction. Reported-by: Nikolay Samokhvalov <[email protected]> Reviewed-by: Nikolay Samokhvalov <[email protected]> Reviewed-by: Ayush Tiwari <[email protected]> Reviewed-by: Junwang Zhao <[email protected]> Discussion: https://postgr.es/m/CAM527d9exRCdWrhJOnAxk_vACg7sr_yPoaJp_+uCFY0qP8v=a...@mail.gmail.com Branch ------ master Details ------- https://git.postgresql.org/pg/commitdiff/4113873a5ab0fb83a6f772f455b2842359d5ec50 Modified Files -------------- src/backend/utils/adt/ri_triggers.c | 86 +++++++++++++++++++++---------- src/test/regress/expected/foreign_key.out | 24 +++++++++ src/test/regress/sql/foreign_key.sql | 23 +++++++++ 3 files changed, 105 insertions(+), 28 deletions(-)
