On Thu, May 7, 2026 at 8:54 AM Tomas Vondra <[email protected]> wrote: > > On 5/6/26 23:46, Melanie Plageman wrote: > > > Based on an LLM-assisted self-review of this code, I think there is a > > pre-existing bug in heap_lock_updated_tuple_rec() where we don't reset > > cleared_all_frozen and may have a stale value if we iterate more than > > once (new tuple versions) or use the l4 goto label (retries for the > > same tuple version). This could mean we unnecessarily clear all-frozen > > on the standby (and unnecessarily register the VM buffer after my > > patch). I couldn't come up with a repro though, so I didn't include a > > patch to fix it, as I wanted to confirm that someone else also thought > > this is a bug. > > Hmm, would this be a correctness or efficiency issue? > > I'm not very familiar with this code, so I can't say if there is an > issue or not. Could you explain the sequence of steps leading to an > issue? Maybe that assumes something that's not quite possible, or it > might help constructing a reproducer. > > I don't quite see how could it be related to the "l4" goto label, when > all the "goto l4" cases are before setting "cleared_all_frozen = true" > (in any given iteration).
You're right that it couldn't happen with goto l4. However, it is possible in the loop when walking an update chain, it can set cleared_all_frozen to true, clear the page's all frozen bits and then later, on the next iteration, clear it again (even though it's already clear). It's only an efficiency issue in master, actually (as you suggest). In my patch it would be a serious bug because it would try to register the VM page and set an LSN on that VM page when it isn't locked or marked dirty. - Melanie
