Hi Antonin, On Mon, Jun 22, 2026 at 7:12 PM Antonin Houska <[email protected]> wrote: > > Ewan Young <[email protected]> wrote: > > > The transient heap built by make_new_heap() is intentionally created > > without the old table's defaults and constraints, so it has no generation > > expressions for its generated columns, even though the tuple descriptor > > still has attgenerated set. > > > > When apply_concurrent_update() replays a non-HOT update, it calls > > ExecInsertIndexTuples() with EIIT_IS_UPDATE. To decide whether to pass > > the "indexUnchanged" hint, that calls index_unchanged_by_update() -> > > ExecGetExtraUpdatedCols() -> ExecInitGenerated(), which looks up the > > generation expression of each generated column via build_column_default() > > and errors out when it finds none on the transient heap. > > > > The apply path does not need to recompute generated columns at all: the > > decoded tuple already carries the correct value, and it is only inserted. > > Note also that ExecGetUpdatedCols() already returns an empty set for this > > ResultRelInfo, because it is not part of any range table -- so the > > indexUnchanged determination here is already approximate. > > I'm sorry for the confusion, but the fact that ExecGetUpdatedCols() returns an > empty set is an omission rather than deliberate choice. Assuming we fix that, > the result of ExecGetExtraUpdatedCols() does matter. Thus we should copy the > related pg_attrdef entries, as I suggest in this patch. > > Another question is how serious problem it is that ExecGetUpdatedCols() > returns empty set. AFAICS, "indexUnchanged" does not affect correctness - it's > is only a hint that helps the btree AM decide whent the bottom-up deletion and > de-duplication techniques should (not) be used. I'm not sure it's easy to > update the set for individual UPDATEs: the UPDATE commands REPACK replays > originate from different SQL queries and the logical decoding does not > transfer this information. > > Even then, I think it'd be "less bad" to have ExecGetUpdatedCols() return a > set containing all the attributes rather than empty set. That is, avoid using > the btree optimizations altogether rather than allow them them when not > appropriate. However, per index_unchanged_by_update(), if ExecGetUpdatedCols() > tells that all columns are updated, the result of ExecGetExtraUpdatedCols() > does not matter. Nevertheless, I'd still slightly prefer copying the > pg_attrdef entries to hacking the executor.
Agreed, thanks for the correction. Relying on the empty ExecGetUpdatedCols() set was the weak point of my version -- it's an omission, not something to build a second approximation on. Copying the pg_attrdef entries fixes the inconsistency at the root, so let's go with your approach. I applied the patch and ran it through an injection-point reproducer (cassert). Without the fix the bug reproduces (ERROR: no generation expression found for column number 3 ...); with it, REPACK CONCURRENTLY succeeds under a concurrent non-HOT UPDATE for a STORED generated column, an index directly on the generated column, and a VIRTUAL column, with correct values afterwards. Your repack.spec change passes. The approach is right and I've confirmed it fixes the bug, so +1 from me in this direction. > > -- > Antonin Houska > Web: https://www.cybertec-postgresql.com > Regards, Ewan Young
