Ewan Young <[email protected]> wrote: > On Mon, Jun 22, 2026 at 7:12 PM Antonin Houska <[email protected]> wrote: > > > > Ewan Young <[email protected]> wrote: > > > > > The transient heap built by make_new_heap() is intentionally created > > > without the old table's defaults and constraints, so it has no generation > > > expressions for its generated columns, even though the tuple descriptor > > > still has attgenerated set. > > > > > > When apply_concurrent_update() replays a non-HOT update, it calls > > > ExecInsertIndexTuples() with EIIT_IS_UPDATE. To decide whether to pass > > > the "indexUnchanged" hint, that calls index_unchanged_by_update() -> > > > ExecGetExtraUpdatedCols() -> ExecInitGenerated(), which looks up the > > > generation expression of each generated column via build_column_default() > > > and errors out when it finds none on the transient heap. > > > > > > The apply path does not need to recompute generated columns at all: the > > > decoded tuple already carries the correct value, and it is only inserted. > > > Note also that ExecGetUpdatedCols() already returns an empty set for this > > > ResultRelInfo, because it is not part of any range table -- so the > > > indexUnchanged determination here is already approximate. > > > > I'm sorry for the confusion, but the fact that ExecGetUpdatedCols() returns > > an > > empty set is an omission rather than deliberate choice. Assuming we fix > > that, > > the result of ExecGetExtraUpdatedCols() does matter. Thus we should copy the > > related pg_attrdef entries, as I suggest in this patch. > > > > Another question is how serious problem it is that ExecGetUpdatedCols() > > returns empty set. AFAICS, "indexUnchanged" does not affect correctness - > > it's > > is only a hint that helps the btree AM decide whent the bottom-up deletion > > and > > de-duplication techniques should (not) be used. I'm not sure it's easy to > > update the set for individual UPDATEs: the UPDATE commands REPACK replays > > originate from different SQL queries and the logical decoding does not > > transfer this information. > > > > Even then, I think it'd be "less bad" to have ExecGetUpdatedCols() return a > > set containing all the attributes rather than empty set. That is, avoid > > using > > the btree optimizations altogether rather than allow them them when not > > appropriate. However, per index_unchanged_by_update(), if > > ExecGetUpdatedCols() > > tells that all columns are updated, the result of ExecGetExtraUpdatedCols() > > does not matter. Nevertheless, I'd still slightly prefer copying the > > pg_attrdef entries to hacking the executor. > > Agreed, thanks for the correction. Relying on the empty ExecGetUpdatedCols() > set was the weak point of my version -- it's an omission, not something to > build a second approximation on. Copying the pg_attrdef entries fixes the > inconsistency at the root, so let's go with your approach. > > I applied the patch and ran it through an injection-point reproducer > (cassert). Without the fix the bug reproduces (ERROR: no generation > expression found for column number 3 ...); with it, REPACK CONCURRENTLY > succeeds under a concurrent non-HOT UPDATE for a STORED generated column, an > index directly on the generated column, and a VIRTUAL column, with correct > values afterwards. Your repack.spec change passes. > > The approach is right and I've confirmed it fixes the bug, so +1 from me in > this direction.
Thanks for checking. Here, 0002 tries to fix the problem of empty updatedCols, as I proposed above. 0001 fixes two minor coding issues that I found when writing 0002. -- Antonin Houska Web: https://www.cybertec-postgresql.com
>From 620b806eb50738837bba2f8c0cbf838bc360c61f Mon Sep 17 00:00:00 2001 From: Antonin Houska <[email protected]> Date: Wed, 1 Jul 2026 14:58:29 +0200 Subject: [PATCH 1/2] Minor cleanup in initialize_change_context(). First, use makeNode() to create an instance of ResultRelInfo, like we do elsewhere in the tree. Second, pass NULL for the 'partition_root_rri' argument of InitResultRelInfo instead of 0. --- src/backend/commands/repack.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/src/backend/commands/repack.c b/src/backend/commands/repack.c index 4d177c868bb..c9b0c047477 100644 --- a/src/backend/commands/repack.c +++ b/src/backend/commands/repack.c @@ -3010,8 +3010,8 @@ initialize_change_context(ChangeContext *chgcxt, /* Only initialize fields needed by ExecInsertIndexTuples(). */ chgcxt->cc_estate = CreateExecutorState(); - chgcxt->cc_rri = (ResultRelInfo *) palloc(sizeof(ResultRelInfo)); - InitResultRelInfo(chgcxt->cc_rri, relation, 0, 0, 0); + chgcxt->cc_rri = makeNode(ResultRelInfo); + InitResultRelInfo(chgcxt->cc_rri, relation, 0, NULL, 0); ExecOpenIndices(chgcxt->cc_rri, false); /* -- 2.52.0
>From b6e397e7d9d81ab0d13f31cc236a0e04b6c2d5bd Mon Sep 17 00:00:00 2001 From: Antonin Houska <[email protected]> Date: Wed, 1 Jul 2026 15:18:10 +0200 Subject: [PATCH 2/2] Provide the executor with information on updated columns. When replaying data changes, REPACK calls ExecInsertIndexTuples() for each INSERT and UPDATE. In the latter case, the function needs to determine the value of the 'indexUnchanged' hint for the index AM. (The hint is always false for INSERT.) Therefore, REPACK is supposed to specify which columns are changed by the UPDATE. So far, REPACK missed to specify the set of updated columns, so the 'indexUnchanged' hint could incorrectly evaluate to true. Although it should not affect correctness, it can make the index AM use optimizations that are not appropriate. Ideally, we should compute the set of updated columns for each individual UPDATE, however the comparison of the old and new tuple might add too much overhead. This patch initializes the set as if all columns were updated. Thus 'indexUnchanged' always evaluates to false. This way the index AM never uses the related optimizations. It might result in worse structure of the index, however it seems better to not use the optimizations than to misuse them. --- src/backend/commands/repack.c | 40 ++++++++++++++++++++++++++++++++++- 1 file changed, 39 insertions(+), 1 deletion(-) diff --git a/src/backend/commands/repack.c b/src/backend/commands/repack.c index c9b0c047477..f37bb9a21af 100644 --- a/src/backend/commands/repack.c +++ b/src/backend/commands/repack.c @@ -62,6 +62,7 @@ #include "libpq/pqmq.h" #include "miscadmin.h" #include "optimizer/optimizer.h" +#include "parser/parse_relation.h" #include "pgstat.h" #include "replication/logicalrelation.h" #include "storage/bufmgr.h" @@ -3005,13 +3006,50 @@ static void initialize_change_context(ChangeContext *chgcxt, Relation relation, Oid ident_index_id) { + Bitmapset *updatedCols = NULL; + RangeTblEntry *rte; + List *perminfos = NIL; + RTEPermissionInfo *perminfo; + chgcxt->cc_rel = relation; /* Only initialize fields needed by ExecInsertIndexTuples(). */ chgcxt->cc_estate = CreateExecutorState(); + /* + * Initialize updatedCols. + * + * The point is that ExecInsertIndexTuples() should not pass the + * indexUnchanged hint to the index AM unless there's a reason to do + * so. For simplicity, we consider all columns updated. + * + * XXX Should we spend more effort to compare the old and new tuple when + * replaying UPDATE, or at least exclude unchanged TOAST values, like we + * do in logicalrep_write_tuple()? + */ + for (int i = 0; i < RelationGetDescr(relation)->natts; i++) + updatedCols = bms_add_member(updatedCols, + i + 1 - FirstLowInvalidHeapAttributeNumber); + + /* + * In this case, RTE only needs to have ->perminfoindex initialized, but + * there's no reason to not set the fields whose values we have at hand. + */ + rte = makeNode(RangeTblEntry); + rte->rtekind = RTE_RELATION; + rte->relid = RelationGetRelid(relation); + rte->relkind = RelationGetForm(relation)->relkind; + /* Create the RTEPermissionInfo instance (and set ->perminfoindex). */ + addRTEPermissionInfo(&perminfos, rte); + /* Make updatedCols available to the executor functions. */ + perminfo = getRTEPermissionInfo(perminfos, rte); + perminfo->updatedCols = updatedCols; + + ExecInitRangeTable(chgcxt->cc_estate, list_make1(rte), perminfos, + bms_make_singleton(1)); + chgcxt->cc_rri = makeNode(ResultRelInfo); - InitResultRelInfo(chgcxt->cc_rri, relation, 0, NULL, 0); + InitResultRelInfo(chgcxt->cc_rri, relation, 1, NULL, 0); ExecOpenIndices(chgcxt->cc_rri, false); /* -- 2.52.0
