On Sat, Feb 14, 2026, at 2:39 PM, Jeff Davis wrote: > On Fri, 2026-02-13 at 16:06 -0500, Greg Burd wrote: >> Here's my thinking, this patch set can be thought of as: >> >> a) moving HeapDetermineColumnsInfo() into the executor
Hey Jeff, thanks for taking a look at this! :) > This feels like the core of the series: moving the logic into the > executor make it possible to be smarter about whether HOT can be > applied or not. Yes, but put a different way this change moves what is not heap-specific outside the heap and leaves heap-isms inside heap. > I think that is a good direction to go. I don't think HOT is > fundamentally a heap concept because other AMs may do something similar > and would want similar information. The ability for a table AM to skip writing new index entries for a subset of updates is the non-heap-specific feature I'm making more general in this patch. Heap calls this "HOT" (heap-only tuple) but while the name is a heap-ism, the concept isn't. Other table AM implementations might have different behavior as they might implement MVCC differently, have UNDO logs, or even be Index-oriented tables (IoT). What's baked into the code today is very tightly aligned with how heap works. The first thing I'd like to adjust is the identification of the "modified indexed attributes" on update. The second will be the TU_UpdateIndexes enum. > There are two parts to the decision > of whether to use HOT: first, is there a logical change to the indexed > value; and second, does the AM need to break the change for some other > reason (e.g. the current page is full). I view this as, "agreeing on which indexes require index writes during an update" and I think there are three parts of the system that work together to determine this: 1. indexes 2. types 3. tables (1) Indexes influence this decision when they are summarizing. Otherwise it is assumed they only need updates when the table AM signals TU_All. I think indexes need more influence over this, but that's for a later commit. (2) Types don't influence this decision today. Their equality operators are not used, attributes are memcmp(). This is a requirement for index-only scans. I think it's insufficient for types like JSONB which have internal structure that can be extracted to form index keys, but that's for a later commit. (3) Tables, really only heap today, own this decision in our code as it stands now. The heap informs the executor: TU_All, TU_None, TU_Summarizing (all, none, some). The choice between these three outcomes is the "HOT decision" in the heap_update() code. All indexes are updated when any indexed attribute is modified. No indexes are updated when there are no modifications to indexed attributes. If only attributes referenced by summarizing indexes were modified then all summarizing indexes are updated (even those without modified attributes). Of course if the newly updated tuple (even after being compressed/TOASTed) won't fit on the same page as the old one then the result is TU_All. This is where HEAD's logic is today. Future AMs, or even heap, might be able to: - chain updates across pages - serve as the primary key index as well as tuple storage (IoT) Indexes might be able to: - identify that they only index portions of the key datum provided - ask types if that portion of the data changed or not Types might be able to: - record how they've been indexed on a relation - record during mutation on update if the changes intersect with what's provided to indexes But none of that is in this patch now. The only change that might be useful is to avoid updating unmodified summarized indexes on a HOT/summarized update. That's a simple addition on top of this patch, but not in the attached patch. > It seems reasonable to me that the executor is the right place for the > first check, because it can be more precise. The motivating example > here is a JSON document where one field is indexed and unrelated fields > are being updated, in which case we can still do a HOT update because > the indexed value isn't actually changing. Yes, JSON is a good example of how a type plays a role in deciding which indexes need updates. That's where this thread started, expression indexes on JSONB that are unchanged during update should allow HOT updates when unchanged (IMO). > As far as the patch itself, it seems like you're moving a lot of code > out of heap_update() and into simple_heap_update() & > heapam_tuple_update(). Can you explain why that's needed? Perhaps I > just need to look closer. I can see how this might be confusing, you're asking a good question. Why not just add the mix_attrs as an argument to the table AM update call and be done? 1. Bitmapsets that are NULL mean empty, so how would simple_heap_update() signal to heap_update() that it needs to determine the modified indexed attributes? We'd have to add a bool along with the mix_attrs Bitmapset to indicate: "we've not calculated the set yet, you need to do that." 2. After fetching the exclusive buffer lock there is the test `!ItemIdIsNormal(lp)` to cover the case where a simple_heap_update() the otid origin is the syscache, there is no pin or snapshot, and so there might be LP_* states other than LP_NORMAL due to concurrent pruning. This only happens when updating catalog tuples, so this logic need not be present at all in the heapam_tuple_update(). Yes, the if() branch will be fast (frequently predicted by the CPU) but this feels like logic specific to the update of catalog tuples. 3. HeapDetermineColumnsInfo() actually does more than find the modified indexed attributes, it also performs half of the check for the requirement to WAL log the replica identity attributes. The replacement function in the executor doesn't do this work, so that is coded into heapam_tuple_update() but not simple_heap_update(). The second half is in ExtractReplicaIdentity() that happens later in the heap_update() function after determining if HOT is a possibility or not. I have moved these changes back into heap_update(), add the mix_attrs and mix_attrs_valid to see how things look, that's the attached patch. >> b) all that HOT nonsense >> >> I feel that (a) has value even without (b), that removing a chunk of >> work from within an exclusive buffer lock to outside that lock is a >> Good Thing(TM) and could in this case result in more concurrency. > > Right. It would be nice to see some numbers here, but moving work > outside of the buffer lock seems like a good idea. No numbers yet. >> To that end, present to you a single patch that *only* does (a), it >> moves the logic of HeapDeterminColumnsInfo() into the executor and >> doesn't change anything else. Meaning that what goes HOT today >> (without this patch), should continue to be HOT tomorrow (with this >> patch) and nothing else. > > Why are there test diffs? Removed test differences. > Also, does this move us (slightly) closer to PHOT? I'd argue that it does move the code closer to something WARM/PHOT-like meaning an ability to only update a subset of indexes on update; a change from "all/none/some" to "the required subset, and no more." That's a ways off. > Regards, > Jeff Davis thanks again for looking Jeff, -greg
From 086a73cdd9dd2dad359fb283f9ed992739f8b666 Mon Sep 17 00:00:00 2001 From: Greg Burd <[email protected]> Date: Sun, 2 Nov 2025 11:36:20 -0500 Subject: [PATCH v20260215] Idenfity modified indexed attributes in the executor on UPDATE Refactor executor update logic to determine which indexed columns have actually changed during an UPDATE operation rather than leaving this up to HeapDetermineColumnsInfo() in heap_update(). ExecWhichIndexesRequireUpdates() replaces HeapDeterminesColumnsInfo() and is called before table_tuple_update() crucially without the need for an exclusive buffer lock on the page that holds the tuple being updated. This reduces the time the lock is held later within heapam_tuple_update() and heap_update(). Besides identifying the set of modified indexed attributes HeapDetermineColumnsInfo() was also responsible for part of the logic involed in the decision to include the replica identity key or not. This now happens in heap_update() when modified_attrs_valid is false. Catalog tuple updates use simple_heap_update() and don't pass in a modified_attrs Bitmapset indicated by the modified_attrs_valid bool set to false. Updates stemming from logical replication now avoid calling HeapDetermineColumnsInfo() as well. The modified indexed attributes for these updates are now simply the intersection of the attrbutes returned from slot_modify_data() with the set of indexed attributes on a relation passed on to the table AM update via ri_ChangedIndexedCols and into ExecSimpleRelationUpdate() which calls simple_table_tuple_update() and then table_tuple_update() with that as the modified indexed attributes Bitmapset. --- src/backend/access/heap/heapam.c | 75 ++++++++-- src/backend/access/heap/heapam_handler.c | 7 +- src/backend/access/table/tableam.c | 5 +- src/backend/executor/execMain.c | 1 + src/backend/executor/execReplication.c | 7 + src/backend/executor/nodeModifyTable.c | 169 +++++++++++++++++++++-- src/backend/replication/logical/worker.c | 69 ++++++++- src/backend/utils/cache/relcache.c | 15 ++ src/include/access/heapam.h | 1 + src/include/access/tableam.h | 8 +- src/include/catalog/index.h | 1 + src/include/executor/executor.h | 3 + src/include/nodes/execnodes.h | 6 + src/include/utils/rel.h | 1 + src/include/utils/relcache.h | 1 + 15 files changed, 339 insertions(+), 30 deletions(-) diff --git a/src/backend/access/heap/heapam.c b/src/backend/access/heap/heapam.c index 98d53caeea8..d78e8ae9150 100644 --- a/src/backend/access/heap/heapam.c +++ b/src/backend/access/heap/heapam.c @@ -3311,6 +3311,7 @@ TM_Result heap_update(Relation relation, const ItemPointerData *otid, HeapTuple newtup, CommandId cid, Snapshot crosscheck, bool wait, TM_FailureData *tmfd, LockTupleMode *lockmode, + const Bitmapset *modified_attrs, bool modified_attrs_valid, TU_UpdateIndexes *update_indexes) { TM_Result result; @@ -3320,7 +3321,6 @@ heap_update(Relation relation, const ItemPointerData *otid, HeapTuple newtup, Bitmapset *key_attrs; Bitmapset *id_attrs; Bitmapset *interesting_attrs; - Bitmapset *modified_attrs; ItemId lp; HeapTupleData oldtup; HeapTuple heaptup; @@ -3346,6 +3346,7 @@ heap_update(Relation relation, const ItemPointerData *otid, HeapTuple newtup, bool checked_lockers; bool locker_remains; bool id_has_external = false; + bool rep_id_key_required = false; TransactionId xmax_new_tuple, xmax_old_tuple; uint16 infomask_old_tuple, @@ -3487,9 +3488,65 @@ heap_update(Relation relation, const ItemPointerData *otid, HeapTuple newtup, * new tuple so we must include it as part of the old_key_tuple. See * ExtractReplicaIdentity. */ - modified_attrs = HeapDetermineColumnsInfo(relation, interesting_attrs, - id_attrs, &oldtup, - newtup, &id_has_external); + if (!modified_attrs_valid) + { + modified_attrs = HeapDetermineColumnsInfo(relation, interesting_attrs, + id_attrs, &oldtup, + newtup, &id_has_external); + rep_id_key_required = bms_overlap(modified_attrs, id_attrs) || + id_has_external; + } + else + { + rep_id_key_required = bms_overlap(modified_attrs, id_attrs); + if (!rep_id_key_required) + { + Bitmapset *attrs; + TupleDesc tupdesc = RelationGetDescr(relation); + int attidx = -1; + + /* + * We don't own idx_attrs so we'll copy it and remove the modified + * set to reduce the attributes we need to test in the while loop + * and avoid a two branches in the loop. + */ + attrs = bms_difference(interesting_attrs, modified_attrs); + attrs = bms_int_members(attrs, id_attrs); + + while ((attidx = bms_next_member(attrs, attidx)) >= 0) + { + /* + * attidx is zero-based, attrnum is the normal attribute + * number + */ + AttrNumber attrnum = attidx + FirstLowInvalidHeapAttributeNumber; + Datum value; + bool isnull; + + /* + * System attributes are not added into interesting_attrs in + * relcache + */ + Assert(attrnum > 0); + + value = heap_getattr(&oldtup, attrnum, tupdesc, &isnull); + + /* No need to check attributes that can't be stored externally */ + if (isnull || + TupleDescCompactAttr(tupdesc, attrnum - 1)->attlen != -1) + continue; + + /* Check if the old tuple's attribute is stored externally */ + if (VARATT_IS_EXTERNAL((struct varlena *) DatumGetPointer(value))) + { + rep_id_key_required = true; + break; + } + } + + bms_free(attrs); + } + } /* * If we're not updating any "key" column, we can grab a weaker lock type. @@ -3763,7 +3820,7 @@ l2: bms_free(sum_attrs); bms_free(key_attrs); bms_free(id_attrs); - bms_free(modified_attrs); + /* modified attrs is passed in and free'd by the caller, or NULL */ bms_free(interesting_attrs); return result; } @@ -4111,8 +4168,7 @@ l2: * columns are modified or it has external data. */ old_key_tuple = ExtractReplicaIdentity(relation, &oldtup, - bms_overlap(modified_attrs, id_attrs) || - id_has_external, + rep_id_key_required, &old_key_copied); /* NO EREPORT(ERROR) from here till changes are logged */ @@ -4278,7 +4334,7 @@ l2: bms_free(sum_attrs); bms_free(key_attrs); bms_free(id_attrs); - bms_free(modified_attrs); + /* modified attrs is passed in and free'd by the caller, or NULL */ bms_free(interesting_attrs); return TM_Ok; @@ -4562,7 +4618,8 @@ simple_heap_update(Relation relation, const ItemPointerData *otid, HeapTuple tup result = heap_update(relation, otid, tup, GetCurrentCommandId(true), InvalidSnapshot, true /* wait for commit */ , - &tmfd, &lockmode, update_indexes); + &tmfd, &lockmode, + NULL, false, update_indexes); switch (result) { case TM_SelfModified: diff --git a/src/backend/access/heap/heapam_handler.c b/src/backend/access/heap/heapam_handler.c index cbef73e5d4b..2d74fa90c7f 100644 --- a/src/backend/access/heap/heapam_handler.c +++ b/src/backend/access/heap/heapam_handler.c @@ -312,12 +312,11 @@ heapam_tuple_delete(Relation relation, ItemPointer tid, CommandId cid, return heap_delete(relation, tid, cid, crosscheck, wait, tmfd, changingPart); } - static TM_Result heapam_tuple_update(Relation relation, ItemPointer otid, TupleTableSlot *slot, CommandId cid, Snapshot snapshot, Snapshot crosscheck, - bool wait, TM_FailureData *tmfd, - LockTupleMode *lockmode, TU_UpdateIndexes *update_indexes) + bool wait, TM_FailureData *tmfd, LockTupleMode *lockmode, + const Bitmapset *modified_attrs, TU_UpdateIndexes *update_indexes) { bool shouldFree = true; HeapTuple tuple = ExecFetchSlotHeapTuple(slot, true, &shouldFree); @@ -328,7 +327,7 @@ heapam_tuple_update(Relation relation, ItemPointer otid, TupleTableSlot *slot, tuple->t_tableOid = slot->tts_tableOid; result = heap_update(relation, otid, tuple, cid, crosscheck, wait, - tmfd, lockmode, update_indexes); + tmfd, lockmode, modified_attrs, true, update_indexes); ItemPointerCopy(&tuple->t_self, &slot->tts_tid); /* diff --git a/src/backend/access/table/tableam.c b/src/backend/access/table/tableam.c index dfda1af412e..42acd5b17a9 100644 --- a/src/backend/access/table/tableam.c +++ b/src/backend/access/table/tableam.c @@ -359,6 +359,7 @@ void simple_table_tuple_update(Relation rel, ItemPointer otid, TupleTableSlot *slot, Snapshot snapshot, + const Bitmapset *mix_attrs, TU_UpdateIndexes *update_indexes) { TM_Result result; @@ -369,7 +370,9 @@ simple_table_tuple_update(Relation rel, ItemPointer otid, GetCurrentCommandId(true), snapshot, InvalidSnapshot, true /* wait for commit */ , - &tmfd, &lockmode, update_indexes); + &tmfd, &lockmode, + mix_attrs, + update_indexes); switch (result) { diff --git a/src/backend/executor/execMain.c b/src/backend/executor/execMain.c index bfd3ebc601e..cd7ae4aeec2 100644 --- a/src/backend/executor/execMain.c +++ b/src/backend/executor/execMain.c @@ -1286,6 +1286,7 @@ InitResultRelInfo(ResultRelInfo *resultRelInfo, /* The following fields are set later if needed */ resultRelInfo->ri_RowIdAttNo = 0; resultRelInfo->ri_extraUpdatedCols = NULL; + resultRelInfo->ri_ChangedIndexedCols = NULL; resultRelInfo->ri_projectNew = NULL; resultRelInfo->ri_newTupleSlot = NULL; resultRelInfo->ri_oldTupleSlot = NULL; diff --git a/src/backend/executor/execReplication.c b/src/backend/executor/execReplication.c index 743b1ee2b28..7040a69c275 100644 --- a/src/backend/executor/execReplication.c +++ b/src/backend/executor/execReplication.c @@ -33,6 +33,7 @@ #include "utils/builtins.h" #include "utils/lsyscache.h" #include "utils/rel.h" +#include "utils/relcache.h" #include "utils/snapmgr.h" #include "utils/syscache.h" #include "utils/typcache.h" @@ -937,7 +938,13 @@ ExecSimpleRelationUpdate(ResultRelInfo *resultRelInfo, if (rel->rd_rel->relispartition) ExecPartitionCheck(resultRelInfo, slot, estate, true); + /* + * We're not going to call ExecCheckIndexedAttrsForChanges here + * because we've already identified the changes earlier on thanks to + * slot_modify_data. + */ simple_table_tuple_update(rel, tid, slot, estate->es_snapshot, + resultRelInfo->ri_ChangedIndexedCols, &update_indexes); conflictindexes = resultRelInfo->ri_onConflictArbiterIndexes; diff --git a/src/backend/executor/nodeModifyTable.c b/src/backend/executor/nodeModifyTable.c index 6802fc13e95..d099ec79375 100644 --- a/src/backend/executor/nodeModifyTable.c +++ b/src/backend/executor/nodeModifyTable.c @@ -17,6 +17,7 @@ * ExecModifyTable - retrieve the next tuple from the node * ExecEndModifyTable - shut down the ModifyTable node * ExecReScanModifyTable - rescan the ModifyTable node + * ExecCheckIndexedAttrsForChanges - find set of updated indexed columns * * NOTES * The ModifyTable node receives input from its outerPlan, which is @@ -54,6 +55,7 @@ #include "access/htup_details.h" #include "access/tableam.h" +#include "access/tupdesc.h" #include "access/xact.h" #include "commands/trigger.h" #include "executor/execPartition.h" @@ -188,6 +190,131 @@ static TupleTableSlot *ExecMergeNotMatched(ModifyTableContext *context, ResultRelInfo *resultRelInfo, bool canSetTag); +/* + * ExecCheckIndexedAttrsForChanges + * + * Determine which indexes need updating by finding the set of modified indexed + * attributes. + * + * The goal is for the executor to know, ahead of calling into the table AM to + * process the update and before calling into the index AM for inserting new + * index tuples, which attributes in the new TupleTableSlot, if any, truely + * necessitate a new index tuple. + * + * Returns a Bitmapset of attributes that intersects with indexes which require + * a new index tuple. + */ +Bitmapset * +ExecCheckIndexedAttrsForChanges(ResultRelInfo *resultRelInfo, + TupleTableSlot *old_tts, + TupleTableSlot *new_tts) +{ + int attidx = -1; + Relation relation = resultRelInfo->ri_RelationDesc; + TupleDesc tupdesc = RelationGetDescr(relation); + Bitmapset *idx_attrs; /* interesting attrs */ + Bitmapset *mix_attrs = NULL; /* modified, indexed attributes */ + + /* If no indexes, we're done */ + if (resultRelInfo->ri_NumIndices == 0) + return NULL; + + /* + * Fetch the set of all attributes referenced across all indexes on the + * relation as well as the set of attributes referenced in expressions + * that generate attributes. + */ + idx_attrs = RelationGetIndexAttrBitmap(relation, INDEX_ATTR_BITMAP_INDEXED); + + /* Review all attributes referenced in indexes on this relation */ + while ((attidx = bms_next_member(idx_attrs, attidx)) >= 0) + { + /* attidx is zero-based, attnum is the normal attribute number */ + AttrNumber attnum = attidx + FirstLowInvalidHeapAttributeNumber; + Datum old_value, + new_value; + bool old_null, + new_null; + CompactAttribute *att; + + /* + * If it's a whole-tuple reference, say "not equal". It's not really + * worth supporting this case, since it could only succeed after a + * no-op update, which is hardly a case worth optimizing for. + */ + if (attnum == 0) + { + mix_attrs = bms_add_member(mix_attrs, attidx); + + continue; + } + + /* + * Likewise, automatically say "not equal" for any system attribute + * other than tableOID; we cannot expect these to be consistent in a + * HOT chain, or even to be set correctly yet in the new tuple. + */ + if (attnum < 0) + { + if (attnum != TableOidAttributeNumber) + mix_attrs = bms_add_member(mix_attrs, attidx); + + continue; + } + + /* Extract the corresponding values */ + att = TupleDescCompactAttr(tupdesc, attnum - 1); + + + /* + * Skip checking generated columns because if any of the base columns + * referenced in the generation expression have changed. If none have + * changed, the generated column value is unchanged. + */ + if (att->attgenerated) + continue; + + /* Extract values from both slots for this attribute */ + old_value = slot_getattr(old_tts, attnum, &old_null); + new_value = slot_getattr(new_tts, attnum, &new_null); + + /* A change to/from NULL, so not equal */ + if (old_null != new_null) + { + mix_attrs = bms_add_member(mix_attrs, attidx); + continue; + } + + /* Both NULL, no change/unmodified */ + if (old_null) + continue; + + /* + * We do simple binary comparison of the two datums. This may be + * overly strict because there can be multiple binary representations + * for the same logical value. But we should be OK as long as there + * are no false positives. Using a type-specific equality operator is + * messy because there could be multiple notions of equality in + * different operator classes; furthermore, we cannot safely invoke + * user-defined functions while holding exclusive buffer lock. + */ + if (attnum <= 0) + { + /* The only allowed system columns are OIDs, so do this */ + if (DatumGetObjectId(old_value) != DatumGetObjectId(new_value)) + mix_attrs = bms_add_member(mix_attrs, attidx); + } + else + { + if (!datum_image_eq(old_value, new_value, att->attbyval, att->attlen)) + mix_attrs = bms_add_member(mix_attrs, attidx); + } + } + + bms_free(idx_attrs); + + return mix_attrs; +} /* * Verify that the tuples to be produced by INSERT match the @@ -2197,14 +2324,17 @@ ExecUpdatePrepareSlot(ResultRelInfo *resultRelInfo, */ static TM_Result ExecUpdateAct(ModifyTableContext *context, ResultRelInfo *resultRelInfo, - ItemPointer tupleid, HeapTuple oldtuple, TupleTableSlot *slot, - bool canSetTag, UpdateContext *updateCxt) + ItemPointer tupleid, HeapTuple oldtuple, TupleTableSlot *oldSlot, + TupleTableSlot *slot, bool canSetTag, UpdateContext *updateCxt) { EState *estate = context->estate; Relation resultRelationDesc = resultRelInfo->ri_RelationDesc; bool partition_constraint_failed; TM_Result result; + /* The set of modified indexed attributes that trigger new index entries */ + Bitmapset *mix_attrs = NULL; + updateCxt->crossPartUpdate = false; /* @@ -2321,7 +2451,23 @@ lreplace: ExecConstraints(resultRelInfo, slot, estate); /* - * replace the heap tuple + * Identify which, if any, indexed attributes were modified here so that + * we might reuse it in a few places. + */ + bms_free(resultRelInfo->ri_ChangedIndexedCols); + resultRelInfo->ri_ChangedIndexedCols = NULL; + + /* + * Next up we need to find out the set of indexed attributes that have + * changed in value and should trigger a new index tuple. We could start + * with the set of updated columns via ExecGetUpdatedCols(), but if we do + * we will overlook attributes directly modified by heap_modify_tuple() + * which are not known to ExecGetUpdatedCols(). + */ + mix_attrs = ExecCheckIndexedAttrsForChanges(resultRelInfo, oldSlot, slot); + + /* + * Call into the table AM to update the heap tuple. * * Note: if es_crosscheck_snapshot isn't InvalidSnapshot, we check that * the row to be updated is visible to that snapshot, and throw a @@ -2335,8 +2481,12 @@ lreplace: estate->es_crosscheck_snapshot, true /* wait for commit */ , &context->tmfd, &updateCxt->lockmode, + mix_attrs, &updateCxt->updateIndexes); + Assert(bms_is_empty(resultRelInfo->ri_ChangedIndexedCols)); + resultRelInfo->ri_ChangedIndexedCols = mix_attrs; + return result; } @@ -2553,8 +2703,9 @@ ExecUpdate(ModifyTableContext *context, ResultRelInfo *resultRelInfo, */ redo_act: lockedtid = *tupleid; - result = ExecUpdateAct(context, resultRelInfo, tupleid, oldtuple, slot, - canSetTag, &updateCxt); + + result = ExecUpdateAct(context, resultRelInfo, tupleid, oldtuple, oldSlot, + slot, canSetTag, &updateCxt); /* * If ExecUpdateAct reports that a cross-partition update was done, @@ -3404,8 +3555,8 @@ lmerge_matched: Assert(oldtuple == NULL); result = ExecUpdateAct(context, resultRelInfo, tupleid, - NULL, newslot, canSetTag, - &updateCxt); + NULL, resultRelInfo->ri_oldTupleSlot, + newslot, canSetTag, &updateCxt); /* * As in ExecUpdate(), if ExecUpdateAct() reports that a @@ -3430,6 +3581,7 @@ lmerge_matched: tupleid, NULL, newslot); mtstate->mt_merge_updated += 1; } + break; case CMD_DELETE: @@ -4537,7 +4689,7 @@ ExecModifyTable(PlanState *pstate) * For UPDATE/DELETE/MERGE, fetch the row identity info for the tuple * to be updated/deleted/merged. For a heap relation, that's a TID; * otherwise we may have a wholerow junk attr that carries the old - * tuple in toto. Keep this in step with the part of + * tuple in total. Keep this in step with the part of * ExecInitModifyTable that sets up ri_RowIdAttNo. */ if (operation == CMD_UPDATE || operation == CMD_DELETE || @@ -4717,6 +4869,7 @@ ExecModifyTable(PlanState *pstate) /* Now apply the update. */ slot = ExecUpdate(&context, resultRelInfo, tupleid, oldtuple, oldSlot, slot, node->canSetTag); + if (tuplock) UnlockTuple(resultRelInfo->ri_RelationDesc, tupleid, InplaceUpdateTupleLock); diff --git a/src/backend/replication/logical/worker.c b/src/backend/replication/logical/worker.c index 32725c48623..968ff626f04 100644 --- a/src/backend/replication/logical/worker.c +++ b/src/backend/replication/logical/worker.c @@ -285,12 +285,14 @@ #include "storage/procarray.h" #include "tcop/tcopprot.h" #include "utils/acl.h" +#include "utils/datum.h" #include "utils/guc.h" #include "utils/inval.h" #include "utils/lsyscache.h" #include "utils/memutils.h" #include "utils/pg_lsn.h" #include "utils/rel.h" +#include "utils/relcache.h" #include "utils/rls.h" #include "utils/snapmgr.h" #include "utils/syscache.h" @@ -1110,15 +1112,18 @@ slot_store_data(TupleTableSlot *slot, LogicalRepRelMapEntry *rel, * "slot" is filled with a copy of the tuple in "srcslot", replacing * columns provided in "tupleData" and leaving others as-is. * + * Returns a bitmap of the modified columns. + * * Caution: unreplaced pass-by-ref columns in "slot" will point into the * storage for "srcslot". This is OK for current usage, but someday we may * need to materialize "slot" at the end to make it independent of "srcslot". */ -static void +static Bitmapset * slot_modify_data(TupleTableSlot *slot, TupleTableSlot *srcslot, LogicalRepRelMapEntry *rel, LogicalRepTupleData *tupleData) { + Bitmapset *modified = NULL; int natts = slot->tts_tupleDescriptor->natts; int i; @@ -1195,6 +1200,27 @@ slot_modify_data(TupleTableSlot *slot, TupleTableSlot *srcslot, slot->tts_isnull[i] = true; } + /* + * Determine if the replicated value changed the local value by + * comparing slots. This is a subset of + * ExecCheckIndexedAttrsForChanges. + */ + if (srcslot->tts_isnull[i] != slot->tts_isnull[i]) + { + /* One is NULL, the other is not so the value changed */ + modified = bms_add_member(modified, i + 1 - FirstLowInvalidHeapAttributeNumber); + } + else if (!srcslot->tts_isnull[i]) + { + /* Both are not NULL, compare their values */ + + if (!datumIsEqual(srcslot->tts_values[i], + slot->tts_values[i], + att->attbyval, + att->attlen)) + modified = bms_add_member(modified, i + 1 - FirstLowInvalidHeapAttributeNumber); + } + /* Reset attnum for error callback */ apply_error_callback_arg.remote_attnum = -1; } @@ -1202,6 +1228,8 @@ slot_modify_data(TupleTableSlot *slot, TupleTableSlot *srcslot, /* And finally, declare that "slot" contains a valid virtual tuple */ ExecStoreVirtualTuple(slot); + + return modified; } /* @@ -2918,6 +2946,7 @@ apply_handle_update_internal(ApplyExecutionData *edata, ConflictTupleInfo conflicttuple = {0}; bool found; MemoryContext oldctx; + Bitmapset *indexed = NULL; EvalPlanQualInit(&epqstate, estate, NULL, NIL, -1, NIL); ExecOpenIndices(relinfo, false); @@ -2934,6 +2963,8 @@ apply_handle_update_internal(ApplyExecutionData *edata, */ if (found) { + Bitmapset *modified = NULL; + /* * Report the conflict if the tuple was modified by a different * origin. @@ -2957,15 +2988,29 @@ apply_handle_update_internal(ApplyExecutionData *edata, /* Process and store remote tuple in the slot */ oldctx = MemoryContextSwitchTo(GetPerTupleMemoryContext(estate)); - slot_modify_data(remoteslot, localslot, relmapentry, newtup); + modified = slot_modify_data(remoteslot, localslot, relmapentry, newtup); MemoryContextSwitchTo(oldctx); + /* + * Normally we'd call ExecCheckIndexedAttrForChanges but here we have + * the record of changed columns in the replication state, so let's + * use that instead. + */ + indexed = RelationGetIndexAttrBitmap(relinfo->ri_RelationDesc, + INDEX_ATTR_BITMAP_INDEXED); + + bms_free(relinfo->ri_ChangedIndexedCols); + relinfo->ri_ChangedIndexedCols = bms_int_members(modified, indexed); + bms_free(indexed); + EvalPlanQualSetSlot(&epqstate, remoteslot); InitConflictIndexes(relinfo); - /* Do the actual update. */ + /* First check privileges */ TargetPrivilegesCheck(relinfo->ri_RelationDesc, ACL_UPDATE); + + /* Then do the actual update. */ ExecSimpleRelationUpdate(relinfo, estate, &epqstate, localslot, remoteslot); } @@ -3455,6 +3500,8 @@ apply_handle_tuple_routing(ApplyExecutionData *edata, bool found; EPQState epqstate; ConflictTupleInfo conflicttuple = {0}; + Bitmapset *modified = NULL; + Bitmapset *indexed; /* Get the matching local tuple from the partition. */ found = FindReplTupleInLocalRel(edata, partrel, @@ -3523,8 +3570,8 @@ apply_handle_tuple_routing(ApplyExecutionData *edata, * remoteslot_part. */ oldctx = MemoryContextSwitchTo(GetPerTupleMemoryContext(estate)); - slot_modify_data(remoteslot_part, localslot, part_entry, - newtup); + modified = slot_modify_data(remoteslot_part, localslot, part_entry, + newtup); MemoryContextSwitchTo(oldctx); EvalPlanQualInit(&epqstate, estate, NULL, NIL, -1, NIL); @@ -3549,6 +3596,18 @@ apply_handle_tuple_routing(ApplyExecutionData *edata, EvalPlanQualSetSlot(&epqstate, remoteslot_part); TargetPrivilegesCheck(partrelinfo->ri_RelationDesc, ACL_UPDATE); + + /* + * Normally we'd call ExecCheckIndexedAttrForChanges but + * here we have the record of changed columns in the + * replication state, so let's use that instead. + */ + indexed = RelationGetIndexAttrBitmap(partrelinfo->ri_RelationDesc, + INDEX_ATTR_BITMAP_INDEXED); + bms_free(partrelinfo->ri_ChangedIndexedCols); + partrelinfo->ri_ChangedIndexedCols = bms_int_members(modified, indexed); + bms_free(indexed); + ExecSimpleRelationUpdate(partrelinfo, estate, &epqstate, localslot, remoteslot_part); } diff --git a/src/backend/utils/cache/relcache.c b/src/backend/utils/cache/relcache.c index 6b634c9fff1..547cf1d054d 100644 --- a/src/backend/utils/cache/relcache.c +++ b/src/backend/utils/cache/relcache.c @@ -2477,6 +2477,7 @@ RelationDestroyRelation(Relation relation, bool remember_tupdesc) bms_free(relation->rd_idattr); bms_free(relation->rd_hotblockingattr); bms_free(relation->rd_summarizedattr); + bms_free(relation->rd_indexedattr); if (relation->rd_pubdesc) pfree(relation->rd_pubdesc); if (relation->rd_options) @@ -5278,6 +5279,7 @@ RelationGetIndexPredicate(Relation relation) * index (empty if FULL) * INDEX_ATTR_BITMAP_HOT_BLOCKING Columns that block updates from being HOT * INDEX_ATTR_BITMAP_SUMMARIZED Columns included in summarizing indexes + * INDEX_ATTR_BITMAP_INDEXED Columns referenced by indexes * * Attribute numbers are offset by FirstLowInvalidHeapAttributeNumber so that * we can include system attributes (e.g., OID) in the bitmap representation. @@ -5301,6 +5303,7 @@ RelationGetIndexAttrBitmap(Relation relation, IndexAttrBitmapKind attrKind) Bitmapset *pkindexattrs; /* columns in the primary index */ Bitmapset *idindexattrs; /* columns in the replica identity */ Bitmapset *hotblockingattrs; /* columns with HOT blocking indexes */ + Bitmapset *indexedattrs; /* columns referenced by indexes */ Bitmapset *summarizedattrs; /* columns with summarizing indexes */ List *indexoidlist; List *newindexoidlist; @@ -5324,6 +5327,8 @@ RelationGetIndexAttrBitmap(Relation relation, IndexAttrBitmapKind attrKind) return bms_copy(relation->rd_hotblockingattr); case INDEX_ATTR_BITMAP_SUMMARIZED: return bms_copy(relation->rd_summarizedattr); + case INDEX_ATTR_BITMAP_INDEXED: + return bms_copy(relation->rd_indexedattr); default: elog(ERROR, "unknown attrKind %u", attrKind); } @@ -5368,6 +5373,7 @@ restart: idindexattrs = NULL; hotblockingattrs = NULL; summarizedattrs = NULL; + indexedattrs = NULL; foreach(l, indexoidlist) { Oid indexOid = lfirst_oid(l); @@ -5500,10 +5506,14 @@ restart: bms_free(idindexattrs); bms_free(hotblockingattrs); bms_free(summarizedattrs); + /* indexedattrs not yet initialized */ goto restart; } + /* Set indexed attributes to track all referenced attributes */ + indexedattrs = bms_union(hotblockingattrs, summarizedattrs); + /* Don't leak the old values of these bitmaps, if any */ relation->rd_attrsvalid = false; bms_free(relation->rd_keyattr); @@ -5516,6 +5526,8 @@ restart: relation->rd_hotblockingattr = NULL; bms_free(relation->rd_summarizedattr); relation->rd_summarizedattr = NULL; + bms_free(relation->rd_indexedattr); + relation->rd_indexedattr = NULL; /* * Now save copies of the bitmaps in the relcache entry. We intentionally @@ -5530,6 +5542,7 @@ restart: relation->rd_idattr = bms_copy(idindexattrs); relation->rd_hotblockingattr = bms_copy(hotblockingattrs); relation->rd_summarizedattr = bms_copy(summarizedattrs); + relation->rd_indexedattr = bms_copy(indexedattrs); relation->rd_attrsvalid = true; MemoryContextSwitchTo(oldcxt); @@ -5546,6 +5559,8 @@ restart: return hotblockingattrs; case INDEX_ATTR_BITMAP_SUMMARIZED: return summarizedattrs; + case INDEX_ATTR_BITMAP_INDEXED: + return indexedattrs; default: elog(ERROR, "unknown attrKind %u", attrKind); return NULL; diff --git a/src/include/access/heapam.h b/src/include/access/heapam.h index 3c0961ab36b..a56f3d1f378 100644 --- a/src/include/access/heapam.h +++ b/src/include/access/heapam.h @@ -368,6 +368,7 @@ extern TM_Result heap_update(Relation relation, const ItemPointerData *otid, HeapTuple newtup, CommandId cid, Snapshot crosscheck, bool wait, TM_FailureData *tmfd, LockTupleMode *lockmode, + const Bitmapset *mix_attrs, bool mix_attrs_valid, TU_UpdateIndexes *update_indexes); extern TM_Result heap_lock_tuple(Relation relation, HeapTuple tuple, CommandId cid, LockTupleMode mode, LockWaitPolicy wait_policy, diff --git a/src/include/access/tableam.h b/src/include/access/tableam.h index 251379016b0..3b080aa3711 100644 --- a/src/include/access/tableam.h +++ b/src/include/access/tableam.h @@ -549,6 +549,7 @@ typedef struct TableAmRoutine bool wait, TM_FailureData *tmfd, LockTupleMode *lockmode, + const Bitmapset *updated_cols, TU_UpdateIndexes *update_indexes); /* see table_tuple_lock() for reference about parameters */ @@ -1524,12 +1525,12 @@ static inline TM_Result table_tuple_update(Relation rel, ItemPointer otid, TupleTableSlot *slot, CommandId cid, Snapshot snapshot, Snapshot crosscheck, bool wait, TM_FailureData *tmfd, LockTupleMode *lockmode, - TU_UpdateIndexes *update_indexes) + const Bitmapset *mix_cols, TU_UpdateIndexes *update_indexes) { return rel->rd_tableam->tuple_update(rel, otid, slot, cid, snapshot, crosscheck, - wait, tmfd, - lockmode, update_indexes); + wait, tmfd, lockmode, + mix_cols, update_indexes); } /* @@ -2010,6 +2011,7 @@ extern void simple_table_tuple_delete(Relation rel, ItemPointer tid, Snapshot snapshot); extern void simple_table_tuple_update(Relation rel, ItemPointer otid, TupleTableSlot *slot, Snapshot snapshot, + const Bitmapset *mix_attrs, TU_UpdateIndexes *update_indexes); diff --git a/src/include/catalog/index.h b/src/include/catalog/index.h index b259c4141ed..14a39beab6e 100644 --- a/src/include/catalog/index.h +++ b/src/include/catalog/index.h @@ -132,6 +132,7 @@ extern bool CompareIndexInfo(const IndexInfo *info1, const IndexInfo *info2, const AttrMap *attmap); extern void BuildSpeculativeIndexInfo(Relation index, IndexInfo *ii); +extern void BuildUpdateIndexInfo(ResultRelInfo *resultRelInfo); extern void FormIndexDatum(IndexInfo *indexInfo, TupleTableSlot *slot, diff --git a/src/include/executor/executor.h b/src/include/executor/executor.h index 55a7d930d26..67ecb1771c3 100644 --- a/src/include/executor/executor.h +++ b/src/include/executor/executor.h @@ -800,5 +800,8 @@ extern ResultRelInfo *ExecLookupResultRelByOid(ModifyTableState *node, Oid resultoid, bool missing_ok, bool update_cache); +extern Bitmapset *ExecCheckIndexedAttrsForChanges(ResultRelInfo *relinfo, + TupleTableSlot *old_tts, + TupleTableSlot *new_tts); #endif /* EXECUTOR_H */ diff --git a/src/include/nodes/execnodes.h b/src/include/nodes/execnodes.h index 63c067d5aae..13284dbd70b 100644 --- a/src/include/nodes/execnodes.h +++ b/src/include/nodes/execnodes.h @@ -502,6 +502,12 @@ typedef struct ResultRelInfo /* true if the above has been computed */ bool ri_extraUpdatedCols_valid; + /* + * For UPDATE a Bitmapset of the attributes that are both indexed and have + * changed in value. + */ + Bitmapset *ri_ChangedIndexedCols; + /* Projection to generate new tuple in an INSERT/UPDATE */ ProjectionInfo *ri_projectNew; /* Slot to hold that tuple */ diff --git a/src/include/utils/rel.h b/src/include/utils/rel.h index 236830f6b93..df5426fd7fb 100644 --- a/src/include/utils/rel.h +++ b/src/include/utils/rel.h @@ -164,6 +164,7 @@ typedef struct RelationData Bitmapset *rd_idattr; /* included in replica identity index */ Bitmapset *rd_hotblockingattr; /* cols blocking HOT update */ Bitmapset *rd_summarizedattr; /* cols indexed by summarizing indexes */ + Bitmapset *rd_indexedattr; /* all cols referenced by indexes */ PublicationDesc *rd_pubdesc; /* publication descriptor, or NULL */ diff --git a/src/include/utils/relcache.h b/src/include/utils/relcache.h index 2700224939a..5834ab7b903 100644 --- a/src/include/utils/relcache.h +++ b/src/include/utils/relcache.h @@ -71,6 +71,7 @@ typedef enum IndexAttrBitmapKind INDEX_ATTR_BITMAP_IDENTITY_KEY, INDEX_ATTR_BITMAP_HOT_BLOCKING, INDEX_ATTR_BITMAP_SUMMARIZED, + INDEX_ATTR_BITMAP_INDEXED, } IndexAttrBitmapKind; extern Bitmapset *RelationGetIndexAttrBitmap(Relation relation, -- 2.51.2
