On Wed, Mar 2, 2022 at 3:00 PM Andres Freund <and...@anarazel.de> wrote:
> What I am stuck on is what we can do for the released branches. Data
> corruption after two consecutive ALTER DATABASE SET TABLESPACEs seems like
> something we need to address.

I think we should consider back-porting the ProcSignalBarrier stuff eventually. I realize that it's not really battle-tested yet and I am not saying we should do it right now, but I think that if we get these changes into v15, back-porting it in let's say May of next year could be a reasonable thing to do. Sure, there is some risk there, but on the other hand, coming up with completely different fixes for the back-branches is not risk-free either, nor is it clear that there is any alternative fix that is nearly as good. In the long run, I am fairly convinced that ProcSignalBarrier is the way forward not only for this purpose but for other things as well, and everybody's got to get on the train or be left behind.

Also, I am aware of multiple instances where the project waited a long, long time to fix bugs because we didn't have a back-patchable fix. I disagree with that on principle. A master-only fix now is better than a back-patchable fix two or three years from now. Of course a back-patchable fix now is better still, but we have to pick from the options we have, not the ones we'd like to have.

<digressing a bit>
It seems to me that if we were going to try to construct an alternative fix for the back-branches, it would have to be something that didn't involve a new invalidation mechanism -- because the ProcSignalBarrier stuff is an invalidation mechanism in effect, and I feel that it can't be better to invent two new invalidation mechanisms rather than one. And the only idea I have is trying to detect a dangerous sequence of operations and just outright block it. We have some cases sort of like that already - e.g. you can't prepare a transaction if it's done certain things. But, the existing precedents that occur to me are, I think, all cases where all of the related actions are being performed in the same backend.
It doesn't sound crazy to me to have some rule like "you can't ALTER TABLESPACE on the same tablespace in the same backend twice in a row without an intervening checkpoint", or whatever, and install the book-keeping to enforce that. But I don't think anything like that can work, both because the two ALTER TABLESPACE commands could be performed in different sessions, and also because an intervening checkpoint is no guarantee of safety anyway, IIUC. So I'm just not really seeing a reasonable strategy that isn't basically the barrier stuff.

--
Robert Haas
EDB: http://www.enterprisedb.com
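
As a rough illustration of the book-keeping rule discussed above, here is a toy Python model. It is not PostgreSQL code. The `Backend` class, its `set_tablespace` and `note_checkpoint` methods, and the session names are all hypothetical, invented only to show why a check kept in backend-local state cannot see a second command issued from another session. It does not model the corruption itself, nor the point that a checkpoint is no guarantee of safety.

```python
class Backend:
    """Toy stand-in for one session; all book-keeping is backend-local."""

    def __init__(self, name):
        self.name = name
        # Hypothetical per-backend record of databases this backend has
        # moved since the last checkpoint it was told about.
        self.moved_since_checkpoint = set()

    def set_tablespace(self, database, tablespace):
        # Refuse a second move of the same database from this backend.
        if database in self.moved_since_checkpoint:
            raise RuntimeError(
                f"{self.name}: refusing second move of {database} "
                "before a checkpoint"
            )
        self.moved_since_checkpoint.add(database)
        return f"{self.name}: moved {database} to {tablespace}"

    def note_checkpoint(self):
        # Forget earlier moves once a checkpoint has been observed.
        self.moved_since_checkpoint.clear()


def run_demo():
    results = []

    # Same backend, twice in a row: the local record catches it.
    a = Backend("session A")
    a.set_tablespace("db1", "ts2")
    try:
        a.set_tablespace("db1", "ts1")
        results.append("same session: allowed")
    except RuntimeError:
        results.append("same session: blocked")

    # Two backends: each starts with its own empty record, so the
    # second move is not detected.
    c, d = Backend("session C"), Backend("session D")
    c.set_tablespace("db2", "ts2")
    try:
        d.set_tablespace("db2", "ts1")
        results.append("two sessions: allowed")
    except RuntimeError:
        results.append("two sessions: blocked")

    return results


if __name__ == "__main__":
    for line in run_demo():
        print(line)
```

In this model, closing the cross-session gap would require state that every backend can see and agree on, which is the kind of shared mechanism the message argues a barrier already provides.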