Try to handle torn reads of pg_control in frontend.

Some of our src/bin tools read the control file without any kind of
interlocking against concurrent writes from the server.  At least ext4
and ntfs can expose partially modified contents when you do that.

For now, we'll try to tolerate this by retrying up to 10 times if the
checksum doesn't match, until we get two reads in a row with the same
bad checksum.  This is not guaranteed to reach the right conclusion, but
it seems very likely to.  Thanks to Tom Lane for this suggestion.

Various ideas for interlocking or atomicity were considered too
complicated, unportable or expensive given the lack of field reports,
but remain open for future reconsideration.

Back-patch as far as 12.  It doesn't seem like a good idea to put a
heuristic change for a very rare problem into the final release of 11.

Reviewed-by: Anton A. Melnikov <aamelni...@inbox.ru>
Reviewed-by: David Steele <da...@pgmasters.net>
Reviewed-by: Michael Paquier <mich...@paquier.xyz>
Discussion: https://postgr.es/m/20221123014224.xisi44byq3cf5psi%40awork3.anarazel.de

Branch
------
REL_13_STABLE

Details
-------
https://git.postgresql.org/pg/commitdiff/67060be3df34f451591f745ab942fe19addd385f

Modified Files
--------------
src/common/controldata_utils.c | 30 ++++++++++++++++++++++++++++++
1 file changed, 30 insertions(+)
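
For illustration only, the retry heuristic described in the commit message could be sketched roughly as below. This is not the code added to src/common/controldata_utils.c. The struct layout, the FNV-1a checksum standing in for the real CRC, and every demo_* name are hypothetical, chosen so that the example is self-contained. A real tool would probably also pause briefly between attempts, which is omitted here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define DEMO_MAX_RETRIES 10		/* "retrying up to 10 times" */

/* Hypothetical file image: a payload followed by its stored checksum. */
typedef struct DemoControl
{
	uint8_t		payload[16];
	uint32_t	stored_sum;
} DemoControl;

/* Hypothetical callback that reads one snapshot of the file into *out. */
typedef void (*demo_read_fn) (DemoControl *out, void *arg);

/* Toy checksum (FNV-1a), standing in for the real CRC. */
static uint32_t
demo_checksum(const uint8_t *p, size_t n)
{
	uint32_t	h = 2166136261u;

	for (size_t i = 0; i < n; i++)
	{
		h ^= p[i];
		h *= 16777619u;
	}
	return h;
}

/*
 * Read until the checksum matches, the same bad checksum is seen twice in
 * a row, or the retry budget is used up.  Returns true if the final read
 * had a valid checksum; *retries_used reports how many re-reads were made.
 */
static bool
demo_read_with_retry(demo_read_fn read_once, void *arg,
					 DemoControl *out, int *retries_used)
{
	uint32_t	last_sum = 0;
	int			retries = 0;

	assert(read_once != NULL);

	for (;;)
	{
		uint32_t	sum;

		read_once(out, arg);
		sum = demo_checksum(out->payload, sizeof(out->payload));
		*retries_used = retries;

		if (sum == out->stored_sum)
			return true;		/* consistent read */

		/* Same bad checksum twice in a row: probably real corruption. */
		if ((retries > 0 && sum == last_sum) || retries >= DEMO_MAX_RETRIES)
			return false;

		last_sum = sum;
		retries++;
	}
}

/* Demo source: the first torn_reads calls return a damaged image. */
typedef struct DemoSource
{
	int			torn_reads;
	bool		same_damage;	/* true: identical damage on every read */
	int			calls;
} DemoSource;

static void
demo_read_once(DemoControl *out, void *arg)
{
	DemoSource *src = arg;

	memset(out->payload, 0xAB, sizeof(out->payload));
	out->stored_sum = demo_checksum(out->payload, sizeof(out->payload));
	if (src->calls < src->torn_reads)
		out->payload[0] = src->same_damage ? 0 : (uint8_t) src->calls;
	src->calls++;
}
```

With this demo source, three differently torn reads followed by a clean one end in success after three retries, while damage that looks identical on two consecutive reads is reported as a checksum failure after a single retry. The second case also shows why the heuristic is not guaranteed to be right: two torn reads that happen to produce the same bad checksum would be misclassified as corruption.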