pgsql: Try to handle torn reads of pg_control in frontend.

2023-10-15 Thomas Munro
Try to handle torn reads of pg_control in frontend.

Some of our src/bin tools read the control file without any kind of
interlocking against concurrent writes from the server.  At least ext4
and ntfs can expose partially modified contents when you do that.

For now, we'll try to tolerate this by retrying up to 10 times if the
checksum doesn't match, until we get two reads in a row with the same
bad checksum.  This is not guaranteed to reach the right conclusion, but
it seems very likely to.  Thanks to Tom Lane for this suggestion.
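
As a rough illustration of the retry rule described above, here is a minimal
sketch in C.  It is not the code from this commit: ControlImage, read_fn,
toy_checksum, read_with_retry and demo_reader are hypothetical names made up
for the example, and a toy FNV-1a checksum stands in for the real CRC.  Only
the limit of 10 retries and the "same bad checksum twice in a row" stopping
rule come from the message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_RETRIES 10          /* retry limit named in the commit message */
#define PAYLOAD_LEN 16          /* hypothetical payload size for this sketch */

/* Hypothetical stand-in for the control file: payload plus stored checksum. */
typedef struct
{
    uint8_t     payload[PAYLOAD_LEN];
    uint32_t    checksum;
} ControlImage;

/* Hypothetical reader callback: fills *out with one read of the file. */
typedef void (*read_fn) (ControlImage *out, void *state);

/* Toy checksum (FNV-1a), an assumption standing in for the real CRC. */
static uint32_t
toy_checksum(const uint8_t *data, size_t len)
{
    uint32_t    h = 2166136261u;

    for (size_t i = 0; i < len; i++)
    {
        h ^= data[i];
        h *= 16777619u;
    }
    return h;
}

/*
 * Reread until the checksum matches, until two reads in a row compute the
 * same bad checksum, or until MAX_RETRIES retries have been used.  Returns
 * true if the last read verified.
 */
static bool
read_with_retry(read_fn reader, void *state, ControlImage *out,
                int *retries_used)
{
    uint32_t    last_sum = 0;
    int         retries = 0;
    bool        ok;

    assert(reader != NULL && out != NULL);

    for (;;)
    {
        uint32_t    sum;

        reader(out, state);
        sum = toy_checksum(out->payload, PAYLOAD_LEN);
        ok = (sum == out->checksum);

        /* Stop on success, on a repeated bad checksum, or at the limit. */
        if (ok || (retries > 0 && sum == last_sum) || retries >= MAX_RETRIES)
            break;

        last_sum = sum;
        retries++;
        /* A real tool would probably sleep briefly here before rereading. */
    }

    if (retries_used != NULL)
        *retries_used = retries;
    return ok;
}

/* Demo reader: the first torn_reads reads are damaged, later ones clean. */
typedef struct
{
    int         calls;          /* reads performed so far */
    int         torn_reads;     /* how many initial reads are damaged */
    int         vary;           /* nonzero: each damaged read differs */
} DemoState;

static void
demo_reader(ControlImage *out, void *state)
{
    DemoState  *s = state;

    for (size_t i = 0; i < PAYLOAD_LEN; i++)
        out->payload[i] = (uint8_t) i;
    out->checksum = toy_checksum(out->payload, PAYLOAD_LEN);

    /* Damage the payload after the checksum was stored. */
    if (s->calls < s->torn_reads)
        out->payload[0] = (uint8_t) (0x80 + (s->vary ? s->calls : 0));
    s->calls++;
}
```

With demo_reader, a file that is torn on the first two reads verifies on the
third.  A file that is damaged the same way every time is given up on after
the second read, which is the case where the failure is unlikely to be due to
a concurrent write.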

Various ideas for interlocking or atomicity were considered too
complicated, unportable or expensive given the lack of field reports,
but remain open for future reconsideration.

Back-patch as far as 12.  It doesn't seem like a good idea to put a
heuristic change for a very rare problem into the final release of 11.

Reviewed-by: Anton A. Melnikov 
Reviewed-by: David Steele 
Reviewed-by: Michael Paquier 
Discussion: https://postgr.es/m/20221123014224.xisi44byq3cf5psi%40awork3.anarazel.de

Branch
--
REL_12_STABLE

Details
---
https://git.postgresql.org/pg/commitdiff/43c979086825e5df1816307fea9b41378cdc31bd

Modified Files
--
src/common/controldata_utils.c | 30 ++
1 file changed, 30 insertions(+)



Branch
--
REL_13_STABLE

Details
---
https://git.postgresql.org/pg/commitdiff/67060be3df34f451591f745ab942fe19addd385f

Modified Files
--
src/common/controldata_utils.c | 30 ++
1 file changed, 30 insertions(+)



Branch
--
REL_14_STABLE

Details
---
https://git.postgresql.org/pg/commitdiff/dc75748a918e3ba0fccf0aeb090068e1f172b3e9

Modified Files
--
src/common/controldata_utils.c | 30 ++
1 file changed, 30 insertions(+)



Branch
--
REL_15_STABLE

Details
---
https://git.postgresql.org/pg/commitdiff/5e39884d322a7dd1dd595fbaaeb1f88c8907d3a6

Modified Files
--
src/common/controldata_utils.c | 30 ++
1 file changed, 30 insertions(+)



Branch
--
REL_16_STABLE

Details
---
https://git.postgresql.org/pg/commitdiff/5725e4ebe7a936f724f21e7ee1e84e54a70bfd83

Modified Files
--
src/common/controldata_utils.c | 30 ++
1 file changed, 30 insertions(+)



Branch
--
master

Details
---
https://git.postgresql.org/pg/commitdiff/63a58c6b3db2b1103ddf67a04b31a8f8e9bb

Modified Files
--
src/common/controldata_utils.c | 30 ++
1 file changed, 30 insertions(+)