>> On Thu, 2026-01-15 at 14:14 +0800, [email protected] wrote:
>> Now if we run pg_rewind on server A, it examines the local WAL to find all
>> the blocks that were modified after the last common checkpoint (which
>> happened in step 3 above).
>> If neither wal_log_hints = on nor checksums are enabled (which effectively
>> forces WAL-logging hint bit changes), there is no track of step 5 in the
>> WAL, and pg_rewind fails to copy that block from server B.  The consequence
>> is that after pg_rewind, the row is *still* visible on server A because of
>> the hint bits.  That is data corruption.
>> Therefore, the requirement cannot be relaxed.



>Currently pg_rewind searches the WAL starting at the checkpoint LSN or redo
>LSN. I mean to search more WAL so that it covers all related transactions;
>then every related page gets copied, and we never have to worry about the
>hint bits issue.





Based on the discussion, I wrote a patch. Here is how it works:

Currently pg_rewind first searches for the checkpoint, starting at divergerec
and walking backward. Then it collects the changed pages, reading forward from
that checkpoint to divergerec.

We modify the second step so that it also collects the minimal committed
transaction ID, named min_commited_xid, and the 'first appeared' transaction
ID taken from an XLOG_RUNNING_XACTS WAL record, named base_xid. If
base_xid <= min_commited_xid, we can do a safe rewind.
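
To make the second step easier to follow, here is a rough Python sketch. It
is not the patch code. WalRecord, scan_forward and rewind_is_safe are
hypothetical names, and I am assuming that base_xid is the oldest running XID
from the first XLOG_RUNNING_XACTS record seen in the scan and that XIDs are
compared modulo 2^32, the way PostgreSQL compares normal XIDs.

```python
from dataclasses import dataclass
from typing import Iterable, Optional, Set, Tuple

Block = Tuple[int, int]  # (relfilenode, block number); simplified page identity


@dataclass
class WalRecord:
    """Hypothetical, simplified stand-in for a decoded WAL record."""
    kind: str                       # "commit", "running_xacts" or "other"
    xid: Optional[int] = None       # committed XID, or oldest running XID
    blocks: Tuple[Block, ...] = ()  # pages the record modifies


def xid_le(a: int, b: int) -> bool:
    """Wraparound-aware a <= b for 32-bit XIDs (modulo-2^32 comparison)."""
    diff = (a - b) & 0xFFFFFFFF
    return diff == 0 or diff >= 0x80000000


def scan_forward(records: Iterable[WalRecord]):
    """Scan from the checkpoint forward to the divergence point."""
    min_commited_xid: Optional[int] = None
    base_xid: Optional[int] = None
    changed: Set[Block] = set()
    for rec in records:
        changed.update(rec.blocks)  # existing behaviour: collect changed pages
        if rec.kind == "commit" and rec.xid is not None:
            if min_commited_xid is None or xid_le(rec.xid, min_commited_xid):
                min_commited_xid = rec.xid
        elif rec.kind == "running_xacts" and base_xid is None:
            base_xid = rec.xid  # assumption: the first such record defines base_xid
    return min_commited_xid, base_xid, changed


def rewind_is_safe(min_commited_xid: Optional[int], base_xid: Optional[int]) -> bool:
    # Assumption: if either value is missing, safety is treated as unproven.
    if min_commited_xid is None or base_xid is None:
        return False
    return xid_le(base_xid, min_commited_xid)


wal = [
    WalRecord("running_xacts", xid=100),
    WalRecord("commit", xid=105, blocks=((16384, 7),)),
    WalRecord("commit", xid=102, blocks=((16384, 9),)),
]
min_xid, base, pages = scan_forward(wal)
print(min_xid, base, rewind_is_safe(min_xid, base))
```

The modulo comparison matters only near XID wraparound; with small XIDs it
behaves like a plain integer comparison.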

However, if 'base_xid <= min_commited_xid' does not hold, we read the WAL
backward from the checkpoint until the condition is met, and of course we
keep collecting changed pages during this third step. If the condition still
cannot be met in the end, we report an error because the rewind cannot finish.
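
The backward walk can be sketched roughly as below. This is only an
illustration, not the patch code: dig_backward and the dict-based record
layout are hypothetical, I assume min_commited_xid stays fixed at the value
found in the second step, and plain integer comparison is used, so XID
wraparound is ignored for brevity.

```python
from typing import Dict, List, Set, Tuple

Block = Tuple[int, int]  # (relfilenode, block number); simplified page identity


def dig_backward(earlier: List[Dict], min_commited_xid: int,
                 changed: Set[Block]) -> bool:
    """Walk backward from the checkpoint over `earlier`.

    `earlier` holds the records that precede the checkpoint, oldest first.
    Pages touched along the way are added to `changed`. Returns True once a
    running-xacts record with xid <= min_commited_xid is found, and False if
    the available WAL is exhausted first.
    """
    for rec in reversed(earlier):  # newest to oldest
        changed.update(rec.get("blocks", ()))
        # Plain <= here: wraparound handling is deliberately left out.
        if rec["kind"] == "running_xacts" and rec["xid"] <= min_commited_xid:
            return True
    return False


earlier = [
    {"kind": "running_xacts", "xid": 90},
    {"kind": "other", "blocks": ((16384, 3),)},
    {"kind": "running_xacts", "xid": 97},
    {"kind": "other", "blocks": ((16384, 5),)},
]
changed = {(16384, 7)}              # pages already found by the forward scan
found = dig_backward(earlier, 95, changed)
print(found, sorted(changed))
```

In this example the record with xid 97 is too new, so the walk continues to
the one with xid 90 and picks up both extra pages on the way.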


The third step may be slow, so I added an option (-d or --deep-dig). By
default, pg_rewind stops if the condition is not met in the second step; the
user has to pass -d to run the third step.
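
The resulting control flow looks roughly like the sketch below. Again this is
an illustration and not the patch code: run_rewind_checks, RewindError and
the error messages are made-up names and texts, and only the -d/--deep-dig
switch itself comes from the patch.

```python
from typing import Callable


class RewindError(Exception):
    """Hypothetical error type for a rewind that cannot proceed."""


def run_rewind_checks(forward_ok: bool, deep_dig: bool,
                      dig_backward: Callable[[], bool]) -> str:
    """Decide which step proves that the rewind is safe.

    forward_ok   -- result of the second step (base_xid <= min_commited_xid)
    deep_dig     -- True if the user passed -d / --deep-dig
    dig_backward -- callable that runs the third step and returns its result
    """
    if forward_ok:
        return "forward-scan"
    if not deep_dig:
        # Default: do not start the potentially slow backward walk.
        raise RewindError("condition not met in the second step; retry with -d")
    if dig_backward():
        return "deep-dig"
    raise RewindError("condition not met even after reading WAL backward")


print(run_rewind_checks(False, True, lambda: True))
```

Keeping the third step behind a switch means the default run costs no more
WAL reading than it does today.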

Patch attached.


----
Best Regards,
Movead Li

Attachment: 0001-Enable-pg_rewind-without-page-consistence.patch
Description: Binary data
