Fix second race with timeline selection during promotion

read_local_xlog_page_guts has the same race as logical_read_xlog_page:
RecoveryInProgress() can return true during promotion, impacting the
availability of the operations doing WAL page reads with this callback.

This problem is similar to eb4e7224a1c6 that has addressed the issue for
logical replication, impacting more areas of the code where this WAL
page callback can be used (same narrow window during promotion, same
availability issue):
- pg_walinspect.
- Slot advance (SQL function).
- Slot creation.

Repack workers (v19~) and 2PC files (since forever) can also use this
callback, but they are irrelevant as far as I know.

A test is added with the SQL lookup functions.  This part relies on
injection points, and is backpatched down to v18, like the test added
for eb4e7224a1c6.

This issue could probably be fixed as well in v14 and v15 for
pg_walinspect.  However, I also feel that there is a conservative
argument about consistency here due to the support of logical decoding
on standbys, so let's limit ourselves to v16 for now.  pg_walinspect is
used less in the field compared to the two other operations, making
addressing this problem less attractive in these two older branches.

Reported-by: Xuneng Zhou <[email protected]>
Author: Bertrand Drouvot <[email protected]>
Reviewed-by: Xuneng Zhou <[email protected]>
Reviewed-by: Hayato Kuroda <[email protected]>
Discussion: https://postgr.es/m/7daef094-abf3-4672-bc23-3df4763b16a3%40gmail.com
Backpatch-through: 16

Branch
------
master

Details
-------
https://git.postgresql.org/pg/commitdiff/a8ee70bd5e0069c5550b0b0c5418638507fa0ed7

Modified Files
--------------
src/backend/access/transam/xlogutils.c              | 12 ++++++++++++
src/test/recovery/t/035_standby_logical_decoding.pl | 13 +++++++++++++
2 files changed, 25 insertions(+)
