On Wed, Jul 1, 2026 at 9:57 PM Antonin Houska <[email protected]> wrote: > > Ewan Young <[email protected]> wrote: > > > On Mon, Jun 22, 2026 at 7:12 PM Antonin Houska <[email protected]> wrote: > > > > > > Ewan Young <[email protected]> wrote: > > > > > > > The transient heap built by make_new_heap() is intentionally created > > > > without the old table's defaults and constraints, so it has no > > > > generation > > > > expressions for its generated columns, even though the tuple descriptor > > > > still has attgenerated set. > > > > > > > > When apply_concurrent_update() replays a non-HOT update, it calls > > > > ExecInsertIndexTuples() with EIIT_IS_UPDATE. To decide whether to pass > > > > the "indexUnchanged" hint, that calls index_unchanged_by_update() -> > > > > ExecGetExtraUpdatedCols() -> ExecInitGenerated(), which looks up the > > > > generation expression of each generated column via > > > > build_column_default() > > > > and errors out when it finds none on the transient heap. > > > > > > > > The apply path does not need to recompute generated columns at all: the > > > > decoded tuple already carries the correct value, and it is only > > > > inserted. > > > > Note also that ExecGetUpdatedCols() already returns an empty set for > > > > this > > > > ResultRelInfo, because it is not part of any range table -- so the > > > > indexUnchanged determination here is already approximate. > > > > > > I'm sorry for the confusion, but the fact that ExecGetUpdatedCols() > > > returns an > > > empty set is an omission rather than deliberate choice. Assuming we fix > > > that, > > > the result of ExecGetExtraUpdatedCols() does matter. Thus we should copy > > > the > > > related pg_attrdef entries, as I suggest in this patch. > > > > > > Another question is how serious problem it is that ExecGetUpdatedCols() > > > returns empty set. AFAICS, "indexUnchanged" does not affect correctness - > > > it's > > > is only a hint that helps the btree AM decide whent the bottom-up > > > deletion and > > > de-duplication techniques should (not) be used. I'm not sure it's easy to > > > update the set for individual UPDATEs: the UPDATE commands REPACK replays > > > originate from different SQL queries and the logical decoding does not > > > transfer this information. > > > > > > Even then, I think it'd be "less bad" to have ExecGetUpdatedCols() return > > > a > > > set containing all the attributes rather than empty set. That is, avoid > > > using > > > the btree optimizations altogether rather than allow them them when not > > > appropriate. However, per index_unchanged_by_update(), if > > > ExecGetUpdatedCols() > > > tells that all columns are updated, the result of > > > ExecGetExtraUpdatedCols() > > > does not matter. Nevertheless, I'd still slightly prefer copying the > > > pg_attrdef entries to hacking the executor. > > > > Agreed, thanks for the correction. Relying on the empty ExecGetUpdatedCols() > > set was the weak point of my version -- it's an omission, not something to > > build a second approximation on. Copying the pg_attrdef entries fixes the > > inconsistency at the root, so let's go with your approach. > > > > I applied the patch and ran it through an injection-point reproducer > > (cassert). Without the fix the bug reproduces (ERROR: no generation > > expression found for column number 3 ...); with it, REPACK CONCURRENTLY > > succeeds under a concurrent non-HOT UPDATE for a STORED generated column, an > > index directly on the generated column, and a VIRTUAL column, with correct > > values afterwards. Your repack.spec change passes. > > > > The approach is right and I've confirmed it fixes the bug, so +1 from me in > > this direction. > > Thanks for checking. Here, 0002 tries to fix the problem of empty updatedCols, > as I proposed above. > > 0001 fixes two minor coding issues that I found when writing 0002.
Both look good to me. 0001 is a clear improvement — makeNode() is the right idiom, and NULL rather than 0 for the pointer argument. For 0002, I agree with the approach: since indexUnchanged only feeds the btree bottom-up-deletion / dedup heuristics, treating all columns as updated (so the hint is always false) is the safe choice — better to forgo those optimizations than to misuse them. I checked the mechanism against the current code and it holds: with ri_RangeTableIndex = 1, ExecGetUpdatedCols() returns the full set, so index_unchanged_by_update() always sees a changed key column. +1 on both. > > -- > Antonin Houska > Web: https://www.cybertec-postgresql.com > -- Regards, Ewan Young
