> On 26 May 2026, at 20:12, Tomas Vondra <[email protected]> wrote:

> I suppose this means we should not be updating the checksum state
> without emitting the barrier? I think all other places do that.

Good catch, it's indeed a bug: any state change must emit a procsignalbarrier to maintain cluster consistency. I ended up writing a test for this very case as well.

> I'm still not sure if it really is an issue or just an annoyance,
> because I've not been able to find a case where it'd lead to checksum
> failures (or obviously incorrect final state after recovery).

I've tried to get it to reach an incorrect end state and failed, but I do agree that maybe we need an improved locking protocol around state updates. I need to spend some more time thinking about this.

> I still don't understand why this needs DELAY_CHKPT_START ...

Having stared at this for some time, and having gone over old threads, I think this is a mistake. AFAICT, though, it cannot cause any error, so I'd lean towards erring on the safe side by leaving it as is and looking at removing it in v20. What do you think?

> I also noticed a couple minor comment issues, per attached patch (this
> may need pgindent).

I ended up splitting this into two, one for the comment fixes and one for the data type change. I propose applying the three patches below to v19 to fix the promotion issue before we wrap beta1.

--
Daniel Gustafsson
0003-Use-correct-datatype-for-PID.patch
0002-Improve-comments-in-online-checksums-code.patch
0001-Fix-checksum-state-transition-during-promotion.patch
