On Tue, Mar 24, 2026 at 8:47 PM Amit Langote <[email protected]> wrote:
>
> Hi Junwang,
>
> On Fri, Mar 20, 2026 at 1:20 AM Junwang Zhao <[email protected]> wrote:
> > I squashed 0004 into 0003 so that each file can be committed independently.
> > I also ran pgindent for each file.
>
> Thanks for that.
>
> Here's another version.
>
> In 0001, I noticed that the condition change in ri_HashCompareOp could
> be simplified further.  Also improved the commentary surrounding that.
> I also updated the commit message to clarify parity with the SPI path.
>
> Updated the commit message of 0002 to explain why caching the
> snapshot for the entire trigger firing cycle of a given constraint
> involves a trade-off compared to the SPI path, which retakes the
> snapshot for every row checked and so could in principle avoid
> failure for FK rows whose corresponding PK row was added by a
> concurrently committed transaction, at least in the READ COMMITTED
> case.
>
> Updated the commit message of 0003 to clarify that it replaces
> ri_FastPathCheckCached() from 0002 with the BatchAdd/BatchFlush pair,
> and that the cached resources are used unchanged -- only the probing
> cadence changes from per-row to per-flush.  Per-flush CCI is safe
> because all AFTER triggers for the buffered rows have already fired
> by flush time; a new test case is added to show that.

Kept thinking about this on a walk after I sent this and came to the
conclusion that it might be better to just not cache the snapshot,
given that the above is the only argument in its favor.  If repeated
GetSnapshotData() is expensive, the solution should be to fix that
rather than simply side-stepping it.

By taking a snapshot per batch without caching it, and likewise the
IndexScanDesc, I'm seeing the same ~3x speedup in the batched
SK_SEARCHARRAY case, so I don't see much point in being stubborn
about snapshot caching.  See the attached (which also includes an
unrelated memory context switch thinko fix).  Note that the relations
(pk_rel, idx_rel) and the slots remain cached across batches; only
the snapshot and scandesc are taken fresh per flush.

I'll post an updated version tomorrow morning.  I think it might be
better to just merge 0003 into 0002, because without snapshot and
scandesc caching the standalone value of 0002 is mostly just relation
and slot caching -- the interesting parts (batch callbacks, lifecycle
management) are all scaffolding for the batching.  So v10 will be two
patches: 0001 core fast path, 0002 everything else.

-- 
Thanks, Amit Langote
diff --git a/src/backend/utils/adt/ri_triggers.c b/src/backend/utils/adt/ri_triggers.c
index 993c3ac49a3..f271ffccc00 100644
--- a/src/backend/utils/adt/ri_triggers.c
+++ b/src/backend/utils/adt/ri_triggers.c
@@ -224,10 +224,8 @@ typedef struct RI_FastPathEntry
        Oid                     conoid;                 /* hash key: pg_constraint OID */
        Relation        pk_rel;
        Relation        idx_rel;
-       IndexScanDesc scandesc;
        TupleTableSlot *pk_slot;
        TupleTableSlot *fk_slot;
-       Snapshot        snapshot;               /* registered snapshot for the scan */
        MemoryContext scan_cxt;         /* index scan allocations */
        MemoryContext flush_cxt;        /* short-lived context for per-flush work */
 
@@ -301,9 +299,11 @@ static void ri_FastPathCheck(const RI_ConstraintInfo *riinfo,
 static void ri_FastPathBatchAdd(const RI_ConstraintInfo *riinfo,
                                                                Relation fk_rel, TupleTableSlot *newslot);
 static void ri_FastPathFlushArray(RI_FastPathEntry *fpentry, TupleTableSlot *fk_slot,
-                                                                 const RI_ConstraintInfo *riinfo, Relation fk_rel);
+                                                                 const RI_ConstraintInfo *riinfo, Relation fk_rel,
+                                                                 Snapshot snapshot);
 static void ri_FastPathFlushLoop(RI_FastPathEntry *fpentry, TupleTableSlot *fk_slot,
-                                                                const RI_ConstraintInfo *riinfo, Relation fk_rel);
+                                                                const RI_ConstraintInfo *riinfo, Relation fk_rel,
+                                                                Snapshot snapshot);
 static void ri_FastPathBatchFlush(RI_FastPathEntry *fpentry,
                                                                  Relation fk_rel);
 static bool ri_FastPathProbeOne(Relation pk_rel, Relation idx_rel,
@@ -2857,8 +2857,8 @@ ri_FastPathBatchFlush(RI_FastPathEntry *fpentry, Relation fk_rel)
        const RI_ConstraintInfo *riinfo = fpentry->riinfo;
        Relation        pk_rel = fpentry->pk_rel;
        Relation        idx_rel = fpentry->idx_rel;
-       Snapshot        snapshot = fpentry->snapshot;
        TupleTableSlot *fk_slot = fpentry->fk_slot;
+       Snapshot        snapshot;
        Oid                     saved_userid;
        int                     saved_sec_context;
        MemoryContext oldcxt = CurrentMemoryContext;
@@ -2882,7 +2882,7 @@ ri_FastPathBatchFlush(RI_FastPathEntry *fpentry, Relation fk_rel)
         * context per row whereas we do it once around the whole batch.
         */
        CommandCounterIncrement();
-       snapshot->curcid = GetCurrentCommandId(false);
+       snapshot = RegisterSnapshot(GetTransactionSnapshot());
 
        GetUserIdAndSecContext(&saved_userid, &saved_sec_context);
        SetUserIdAndSecContext(RelationGetForm(pk_rel)->relowner,
@@ -2891,11 +2891,12 @@ ri_FastPathBatchFlush(RI_FastPathEntry *fpentry, Relation fk_rel)
                                                   SECURITY_NOFORCE_RLS);
 
        if (riinfo->nkeys == 1)
-               ri_FastPathFlushArray(fpentry, fk_slot, riinfo, fk_rel);
+               ri_FastPathFlushArray(fpentry, fk_slot, riinfo, fk_rel, snapshot);
        else
-               ri_FastPathFlushLoop(fpentry, fk_slot, riinfo, fk_rel);
+               ri_FastPathFlushLoop(fpentry, fk_slot, riinfo, fk_rel, snapshot);
        MemoryContextSwitchTo(oldcxt);
        SetUserIdAndSecContext(saved_userid, saved_sec_context);
+       UnregisterSnapshot(snapshot);
 
        /* Free materialized tuples and reset */
        for (int i = 0; i < fpentry->batch_count; i++)
@@ -2912,29 +2913,30 @@ ri_FastPathBatchFlush(RI_FastPathEntry *fpentry, Relation fk_rel)
  */
 static void
 ri_FastPathFlushLoop(RI_FastPathEntry *fpentry, TupleTableSlot *fk_slot,
-                                        const RI_ConstraintInfo *riinfo, Relation fk_rel)
+                                        const RI_ConstraintInfo *riinfo, Relation fk_rel,
+                                        Snapshot snapshot)
 {
        Relation        pk_rel = fpentry->pk_rel;
        Relation        idx_rel = fpentry->idx_rel;
-       IndexScanDesc scandesc = fpentry->scandesc;
        TupleTableSlot *pk_slot = fpentry->pk_slot;
-       Snapshot        snapshot = fpentry->snapshot;
+       IndexScanDesc scandesc;
        Datum           pk_vals[INDEX_MAX_KEYS];
        char            pk_nulls[INDEX_MAX_KEYS];
        ScanKeyData skey[INDEX_MAX_KEYS];
+       bool            found = true;
+       /*
+        * build_index_scankeys() may palloc cast results for cross-type FKs.
+        * Use the entry's short-lived flush context so these don't accumulate
+        * across batches.
+        */
+       MemoryContext oldcxt = MemoryContextSwitchTo(fpentry->flush_cxt);
 
+       scandesc = index_beginscan(pk_rel, idx_rel, snapshot, NULL,
+                                                          riinfo->nkeys, 0);
        for (int i = 0; i < fpentry->batch_count; i++)
        {
-               bool            found = false;
 
                ExecStoreHeapTuple(fpentry->batch[i], fk_slot, false);
-
-               /*
-                * build_index_scankeys() may palloc cast results for cross-type FKs.
-                * Use the entry's short-lived flush context so these don't accumulate
-                * across batches.
-                */
-               MemoryContextSwitchTo(fpentry->flush_cxt);
                ri_ExtractValues(fk_rel, fk_slot, riinfo, false, pk_vals, pk_nulls);
                build_index_scankeys(riinfo, idx_rel, pk_vals, pk_nulls, skey);
                MemoryContextSwitchTo(fpentry->scan_cxt);
@@ -2943,11 +2945,15 @@ ri_FastPathFlushLoop(RI_FastPathEntry *fpentry, TupleTableSlot *fk_slot,
                                                                        snapshot, riinfo, skey, riinfo->nkeys);
 
                if (!found)
-                       ri_ReportViolation(riinfo, pk_rel, fk_rel,
-                                                          fk_slot, NULL,
-                                                          RI_PLAN_CHECK_LOOKUPPK, false, false);
+                       break;
        }
+       index_endscan(scandesc);
        MemoryContextReset(fpentry->flush_cxt);
+       MemoryContextSwitchTo(oldcxt);
+       if (!found)
+               ri_ReportViolation(riinfo, pk_rel, fk_rel,
+                                                  fk_slot, NULL,
+                                                  RI_PLAN_CHECK_LOOKUPPK, false, false);
 }
 
 /*
@@ -2962,14 +2968,14 @@ ri_FastPathFlushLoop(RI_FastPathEntry *fpentry, TupleTableSlot *fk_slot,
  */
 static void
 ri_FastPathFlushArray(RI_FastPathEntry *fpentry, TupleTableSlot *fk_slot,
-                                         const RI_ConstraintInfo *riinfo, Relation fk_rel)
+                                         const RI_ConstraintInfo *riinfo, Relation fk_rel,
+                                         Snapshot snapshot)
 {
        FastPathMeta *fpmeta = riinfo->fpmeta;
        Relation        pk_rel = fpentry->pk_rel;
        Relation        idx_rel = fpentry->idx_rel;
-       IndexScanDesc scandesc = fpentry->scandesc;
        TupleTableSlot *pk_slot = fpentry->pk_slot;
-       Snapshot        snapshot = fpentry->snapshot;
+       IndexScanDesc scandesc;
        Datum           search_vals[RI_FASTPATH_BATCH_SIZE];
        bool            matched[RI_FASTPATH_BATCH_SIZE];
        int                     nvals = fpentry->batch_count;
@@ -2983,16 +2989,19 @@ ri_FastPathFlushArray(RI_FastPathEntry *fpentry, TupleTableSlot *fk_slot,
        char            elem_align;
        ArrayType  *arr;
 
-       Assert(fpmeta);
-
-       memset(matched, 0, nvals * sizeof(bool));
-
        /*
         * Transient per-flush allocations (cast results, the search array) must
         * not accumulate across repeated flushes.  Use the entry's short-lived
         * flush context, reset after each flush.
         */
-       MemoryContextSwitchTo(fpentry->flush_cxt);
+       MemoryContext oldcxt = MemoryContextSwitchTo(fpentry->flush_cxt);
+
+       Assert(fpmeta);
+
+       memset(matched, 0, nvals * sizeof(bool));
+
+       scandesc = index_beginscan(pk_rel, idx_rel, snapshot, NULL,
+                                                          riinfo->nkeys, 0);
 
        /*
        * Extract FK values, casting to the operator's expected input type if
@@ -3103,6 +3112,11 @@ ri_FastPathFlushArray(RI_FastPathEntry *fpentry, TupleTableSlot *fk_slot,
                }
        }
 
+       index_endscan(scandesc);
+
+       MemoryContextReset(fpentry->flush_cxt);
+       MemoryContextSwitchTo(oldcxt);
+
        /* Report first unmatched row */
        for (int i = 0; i < nvals; i++)
        {
@@ -3114,8 +3128,6 @@ ri_FastPathFlushArray(RI_FastPathEntry *fpentry, TupleTableSlot *fk_slot,
                                                           RI_PLAN_CHECK_LOOKUPPK, false, false);
                }
        }
-
-       MemoryContextReset(fpentry->flush_cxt);
 }
 
 /*
@@ -4106,9 +4118,6 @@ ri_FastPathTeardown(void)
        hash_seq_init(&status, ri_fastpath_cache);
        while ((entry = hash_seq_search(&status)) != NULL)
        {
-               /* Close both scans before closing idx_rel. */
-               if (entry->scandesc)
-                       index_endscan(entry->scandesc);
                if (entry->idx_rel)
                        index_close(entry->idx_rel, NoLock);
                if (entry->pk_rel)
@@ -4117,8 +4126,6 @@ ri_FastPathTeardown(void)
                        ExecDropSingleTupleTableSlot(entry->pk_slot);
                if (entry->fk_slot)
                        ExecDropSingleTupleTableSlot(entry->fk_slot);
-               if (entry->snapshot)
-                       UnregisterSnapshot(entry->snapshot);
                if (entry->scan_cxt)
                        MemoryContextDelete(entry->scan_cxt);
        }
@@ -4232,21 +4239,10 @@ ri_FastPathGetEntry(const RI_ConstraintInfo *riinfo, Relation fk_rel)
                entry->pk_rel = table_open(riinfo->pk_relid, RowShareLock);
                entry->idx_rel = index_open(riinfo->conindid, AccessShareLock);
 
-               /*
-                * Register an initial snapshot.  Its curcid will be patched in place
-                * on each subsequent row (see ri_FastPathBatchFlush()), avoiding
-                * per-row GetSnapshotData() overhead.
-                */
-               entry->snapshot = RegisterSnapshot(GetTransactionSnapshot());
-
                entry->pk_slot = table_slot_create(entry->pk_rel, NULL);
                entry->fk_slot = MakeSingleTupleTableSlot(RelationGetDescr(fk_rel),
                                                                                                  &TTSOpsHeapTuple);
 
-               entry->scandesc = index_beginscan(entry->pk_rel, entry->idx_rel,
-                                                                                 entry->snapshot, NULL,
-                                                                                 riinfo->nkeys, 0);
-
                entry->scan_cxt = AllocSetContextCreate(TopTransactionContext,
                                                                                                "RI fast path scan context",
                                                                                                ALLOCSET_DEFAULT_SIZES);
