Hi Junwang, On Mon, Dec 1, 2025 at 3:09 PM Junwang Zhao <[email protected]> wrote: > As Amit has already stated, we are approaching a hybrid "fast-path + fallback" > design. > > 0001 adds a fast path optimization for foreign key constraint checks > that bypasses the SPI executor, the fast path applies when the referenced > table is not partitioned, and the constraint does not involve temporal > semantics. > > With the following test: > > create table pk (a numeric primary key); > create table fk (a bigint references pk); > insert into pk select generate_series(1, 2000000); > > head: > > [local] zhjwpku@postgres:5432-90419=# insert into fk select > generate_series(1, 2000000, 2); > INSERT 0 1000000 > Time: 13516.177 ms (00:13.516) > > [local] zhjwpku@postgres:5432-90419=# update fk set a = a + 1; > UPDATE 1000000 > Time: 15057.638 ms (00:15.058) > > patched: > > [local] zhjwpku@postgres:5432-98673=# insert into fk select > generate_series(1, 2000000, 2); > INSERT 0 1000000 > Time: 8248.777 ms (00:08.249) > > [local] zhjwpku@postgres:5432-98673=# update fk set a = a + 1; > UPDATE 1000000 > Time: 10117.002 ms (00:10.117) > > 0002 cache fast-path metadata used by the index probe, at the current > time only comparison operator hash entries, operator function OIDs > and strategy numbers and subtypes for index scans. But this cache > doesn't buy any performance improvement. > > Caching additional metadata should improve performance for foreign key checks. > > Amit suggested introducing a mechanism for ri_triggers.c to register a > cleanup callback in the EState, which AfterTriggerEndQuery() could then > invoke to release per-statement cached metadata (such as the IndexScanDesc). > However, I haven't been able to implement this mechanism yet.
Thanks for working on this. I've taken your patches as a starting point and reworked the series into two patches (attached): 1st is your 0001+0002 as the core patch that adds a gated fast-path alternative to SPI and 2nd where I added per-statement resource caching. Doing the latter turned out to be not so hard thanks to the structure you chose to build the core fast path. Good call on adding the RLS and ACL test cases, btw. So, 0001 is a functionally complete fast path: concurrency handling, REPEATABLE READ crosscheck, cross-type operators, security context, and metadata caching. 0002 implements the per-statement resource caching we discussed, though instead of sharing the EState between trigger.c and ri_triggers.c it uses a new AfterTriggerBatchCallback mechanism that fires at the end of each trigger-firing cycle (per-statement for immediate constraints, or until COMMIT for deferred ones). It layers resource caching on top so that the PK relation, index, scan descriptor, and snapshot stay open across all FK trigger invocations within a single trigger-firing cycle rather than being opened and closed per row. Note that phe previous 0002 (metadata caching) is folded into 0001, and most of the new fast-path logic added in 0001 now lives in ri_FastPathCheck() rather than inline in RI_FKey_check(), so the RI_FKey_check diff is just the gating call and SPI fallback. I re-ran the benchmarks (same test as yours, different machine): create table pk (a numeric primary key); create table fk (a bigint references pk); insert into pk select generate_series(1, 2000000); insert into fk select generate_series(1, 2000000, 2); master: 2444 ms (median of 3 runs) 0001: 1382 ms (43% faster) 0001+0002: 1202 ms (51% faster, 13% over 0001 alone) Also, with int PK / int FK (1M rows): create table pk (a int primary key); create table fk (a int references pk); insert into pk select generate_series(1, 1000000); insert into fk select generate_series(1, 1000000); master: 1000 ms 0001: 520 ms (48% faster) 0001+0002: 432 ms (57% faster, 17% over 0001 alone) The incremental gain from 0002 comes from eliminating per-row relation open/close, scan begin/end, slot alloc/free, and replacing per-row GetSnapshotData() with only curcid adjustment on the registered snapshot copy in the cache. The two current limitations are partitioned referenced tables and temporal foreign keys. Partitioned PKs are relatively uncommon in practice, so the non-partitioned case should cover most FK workloads, so I'm not sure it's worth the added complexity to support them. Temporal FKs are inherently multi-row, so they're a poor fit for a single-probe fast path. David Rowley mentioned off-list that it might be worth batching multiple FK values into a single index probe, leveraging the ScalarArrayOp btree improvements from PostgreSQL 17. The idea would be to buffer FK values across trigger invocations in the per-constraint cache (0002 already has the right structure for this), build a SK_SEARCHARRAY scan key, and let the btree AM walk the matching leaf pages in one sorted traversal instead of one tree descent per row. The locking and recheck would still be per-tuple, but the index traversal cost drops significantly. Single-column FKs are the obvious starting point. That seems worth exploring but can be done as a separate patch on top of this. I think the series is in reasonable shape but would appreciate extra eyeballs, especially on the concurrency handling in ri_LockPKTuple() in 0001 and the snapshot lifecycle in 0002. Or anything else that catches one's eye. -- Thanks, Amit Langote
v3-0001-Add-fast-path-for-foreign-key-constraint-checks.patch
Description: Binary data
v3-0002-Cache-per-batch-resources-for-fast-path-foreign-k.patch
Description: Binary data
