On Tue, Dec 17, 2019 at 6:40 PM Michael Paquier <mich...@paquier.xyz> wrote: > So that's how you prevent piling up multiple registrations of this > callback compared to v1. FWIW, I think that it is a cleaner approach > to remove the callback once a non-exclusive backup is done, because a > session has no need for it once it is done with its non-exclusive > backup, and this session may remain around for some time.
The fact that the session may remain around for some time isn't really relevant, because the callback isn't consuming any resources. It does not increase memory usage by a single byte, nor CPU consumption either. It does consume a few CPU cycles when the backend finally exits, but the number of such cycles is very small and unrelated to the length of the session. And removing the callback isn't entirely free, either. I think the real point for me is that it's bad to have two sources of truth. We have the sessionBackupState variable that tells us whether we're in a backup, and then we separately have whether or not the callback is registered. If those two things ever get out of sync, as they do in the test cases that I've proposed, then we have problems -- so it's better not to maintain the state in two separate ways. The way it's set up right now actually seems quite fragile even apart from the problem with cancel_before_shmem_exit(). do_pg_stop_backup() sets sessionBackupState to SESSION_BACKUP_NONE and then does things that might fail. If they do, then the cancel_before_shmem_exit() callback will leave the callback installed, which can lead to a spurious warning or assertion failure later as in the original report. The only way to avoid that problem would be to move the cancel_before_shmem_exit() callback so that it happens right next to setting sessionBackupState to SESSION_BACKUP_NONE -- note the comments there saying we can't even do WALInsertLockRelease() between updating XLogCtl and updating sessionBackupState. But that would be very ugly, because we'd then have to pass a flag to do_pg_stop_backup() saying whether to remove the callback, since only one caller wants that behavior. And, we'd have to change cancel_before_shmem_exit() to search the whole array, which would almost certainly cost more cycles than leaving a do-nothing callback around. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company