Hello!

I want to add one loosely related issue to the thread, attached as
0002. The patches are independent: either one can be applied without
the other. The only relation is that we found 0002 because of the changes
in 0001. After reviewing the details, however, I believe 0002 isn't a
regression caused by 0001, but rather a previously hidden bug that
0001 exposed.

Sometimes pg_rewind can produce a state where the stated
minRecoveryPoint is beyond the WAL that is actually available. In that
situation, the original recovery checks simply let postgres continue.
With 0001 applied, it detects that we didn't reach the stated
minRecoveryPoint and reports an error.

Regardless of 0001, pg_rewind shouldn't produce inconsistent output
like that. The attached 0002 aims to solve this by capturing the
expected minRecoveryPoint earlier, before traversing the WAL files.
With the previous, opposite order, new segment files could be added on
the source between retrieving the file list and querying the LSN,
which resulted in missing WAL data in the output.
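
To make the ordering problem concrete, here is a toy model of the two
orderings. It is only an illustrative sketch, not the code in 0002:
ToySource, SEGMENT_SIZE and all function names are hypothetical, LSNs
are plain integers, and the "concurrent write" is simulated as a single
call between the two steps.

```python
# Toy model of the race; none of these names exist in pg_rewind.
SEGMENT_SIZE = 16  # hypothetical segment size, in made-up LSN units


class ToySource:
    """A source server that keeps writing WAL while we copy from it."""

    def __init__(self, insert_lsn):
        self.insert_lsn = insert_lsn

    def list_wal_segments(self):
        # Segment n covers LSNs [n * SEGMENT_SIZE, (n + 1) * SEGMENT_SIZE).
        return list(range(self.insert_lsn // SEGMENT_SIZE + 1))

    def current_lsn(self):
        return self.insert_lsn

    def write_wal(self, nbytes):
        self.insert_lsn += nbytes


def list_then_lsn(source, concurrent_write):
    # Old order: file list first, LSN second.
    segments = source.list_wal_segments()
    source.write_wal(concurrent_write)  # activity between the two steps
    min_recovery_point = source.current_lsn()
    return segments, min_recovery_point


def lsn_then_list(source, concurrent_write):
    # New order: LSN first, file list second.
    min_recovery_point = source.current_lsn()
    source.write_wal(concurrent_write)  # activity between the two steps
    segments = source.list_wal_segments()
    return segments, min_recovery_point


def is_consistent(segments, min_recovery_point):
    # The copied WAL must extend at least to the stated minRecoveryPoint.
    copied_end = (max(segments) + 1) * SEGMENT_SIZE
    return min_recovery_point <= copied_end
```

In this model, list_then_lsn(ToySource(10), 20) records an LSN that
lies in a segment that was never listed, so is_consistent() fails,
while lsn_then_list() with the same inputs can only end up with extra
WAL beyond the recorded point.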

Attachment: 0002-pg_rewind-fix-remote-source-WAL-race-condition.patch

Attachment: 0001-Enforce-minRecoveryPoint-check-regardless-of-archive.patch
