On 07.08.24 00:46, Greg Sabino Mullane wrote:
Currently, initdb only enables data checksums if passed the --data-checksums or -k argument. There was some hesitation years ago when this feature was first added, leading to the current situation where the default is off. However, many years later, there is wide consensus that this is an extraordinarily safe, desirable setting. Indeed, most (if not all) of the major commercial and open source Postgres systems currently turn this on by default. I posit you would be hard-pressed to find many systems these days in which it has NOT been turned on. So basically we have a de-facto standard, and I think it's time we flipped the switch to make it on by default.

I'm sympathetic to this proposal, but I want to raise some concerns.

My understanding was that the reason for some hesitation about adopting data checksums was the performance impact. Not the checksumming itself, but the overhead from hint bit logging. The last time I looked into that, you could get performance impacts on the order of 5% tps. Maybe that's acceptable, and you of course can turn it off if you want the extra performance. But I think this should be discussed in this thread.

As for the claim that it's already the de-facto standard: maybe that is approximately true for "serious" installations. But AFAICT, the popular packagings don't enable checksums by default, so there is likely a significant middle tier between "just trying it out" and serious production use that doesn't have it turned on.

For those uses, this change would render pg_upgrade useless for upgrades from an old instance with default settings to a new instance with default settings, because pg_upgrade requires the checksum setting of the two clusters to match. Users would then either need to re-initdb the new instance with checksums turned back off, or I suppose run pg_checksums on the old instance before upgrading? This is a significant additional complication. And packagers who have built abstractions on top of pg_upgrade (such as Debian's pg_upgradecluster) would also need to implement something to manage this somehow.
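
To make the extra step concrete, here is a rough sketch of the decision a wrapper around pg_upgrade would have to make. The function name and its parameters are made up for illustration and are not part of any existing tool; the only real commands assumed are `pg_checksums --enable` and `pg_checksums --disable` with `-D`, both of which require the cluster to be cleanly shut down.

```python
from typing import List


def checksum_reconcile_commands(old_has_checksums: bool,
                                new_has_checksums: bool,
                                old_datadir: str,
                                new_datadir: str,
                                keep_checksums: bool = True) -> List[List[str]]:
    """Hypothetical helper: list the commands needed so that both clusters
    agree on data checksums before pg_upgrade is run.

    Both clusters must be cleanly shut down before running the result.
    """
    if old_has_checksums == new_has_checksums:
        # Settings already match, nothing to do.
        return []

    # Identify which cluster currently has checksums and which does not.
    with_dir = old_datadir if old_has_checksums else new_datadir
    without_dir = new_datadir if old_has_checksums else old_datadir

    if keep_checksums:
        # Rewrite every block of the cluster that lacks checksums.
        # Cheap on a freshly initialized cluster, slow on a large old one.
        return [["pg_checksums", "--enable", "-D", without_dir]]

    # Otherwise drop checksums on the cluster that has them. On a fresh
    # new cluster this has the same effect as re-running initdb without them.
    return [["pg_checksums", "--disable", "-D", with_dir]]
```

For the default-to-default case described above (old instance without checksums, new instance with them), this yields either a `pg_checksums --enable` run over the old data directory or a `pg_checksums --disable` on the new one, and that is precisely the step users and packagers would now have to know about.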

So I think we need to think through the upgrade experience a bit more. Unfortunately, pg_checksums hasn't gotten to the point we were perhaps once hoping for, where you could enable checksums on a live system. I'm thinking pg_upgrade could have a mode where it adds the checksums during the upgrade as it copies the files (essentially a subset of pg_checksums). I think that would be useful for that middle tier of users who just want a good default experience.


