pg_rewind is not crash safe

Heikki Linnakangas Wed, 05 Aug 2020 11:14:08 -0700

A colleague of mine brought to my attention that pg_rewind is not crash safe. If it is interrupted for any reason, it leaves behind a data directory with a mix of data from the source and target images. If you're "lucky", the server will start up, but it can be in an inconsistent state. That's obviously not good. It would be nice to:


1. Detect the situation, and refuse to start up.


Or even better:

2. Make pg_rewind crash safe, so that you could safely restart it if it's interrupted.


Has anyone else run into this? How did you work around it?

It doesn't seem hard to detect this. pg_rewind can somehow "poison" the data directory just before it starts making irreversible changes. I'm thinking of updating the 'state' in the control file to a new PG_IN_REWIND value.
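
To make the poisoning idea concrete, here is a toy model in Python. It is not PostgreSQL code: the `rewind_state` file is a hypothetical stand-in for the 'state' field in the control file, `CLEAN` is an invented name for "no rewind in progress", and the helper functions are made up for illustration. It assumes a POSIX system where a directory can be opened and fsynced.

```python
import os

STATE_FILE = "rewind_state"  # hypothetical stand-in for the control file 'state' field
IN_REWIND = "PG_IN_REWIND"   # the proposed new state value
CLEAN = "CLEAN"              # hypothetical name for "no rewind in progress"


def write_state(datadir, state):
    """Durably replace the state file: write a temp file, fsync, rename, fsync the dir."""
    path = os.path.join(datadir, STATE_FILE)
    tmp = path + ".tmp"
    with open(tmp, "w") as f:
        f.write(state)
        f.flush()
        os.fsync(f.fileno())
    os.replace(tmp, path)
    dirfd = os.open(datadir, os.O_RDONLY)  # POSIX: make the rename itself durable
    try:
        os.fsync(dirfd)
    finally:
        os.close(dirfd)


def read_state(datadir):
    """Return the recorded state, treating a missing file as CLEAN."""
    path = os.path.join(datadir, STATE_FILE)
    if not os.path.exists(path):
        return CLEAN
    with open(path) as f:
        return f.read()


def can_start_server(datadir):
    """Startup check: refuse to start while the directory is poisoned."""
    return read_state(datadir) != IN_REWIND


def rewind(datadir, apply_changes):
    """Poison first, then change files, then clear the poison as the last step."""
    write_state(datadir, IN_REWIND)  # must be durable before the first irreversible change
    apply_changes(datadir)           # may be interrupted at any point
    write_state(datadir, CLEAN)      # only reached if every change completed
```

The ordering is the point of the sketch: the marker has to reach disk before any file is modified, and it is cleared only after all changes are done, so an interruption anywhere in between leaves the directory in a state the startup check rejects.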

It also doesn't seem too hard to make it restartable. As long as you point it to the same source server, it is already almost safe to run pg_rewind again. If we re-order the way it writes the control or backup files and makes other changes, pg_rewind can verify that you pointed it at the same or compatible primary as before.
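
One way to picture the "same source as before" check is the following toy model in Python. It is only a sketch under assumptions: the marker file name `rewind_source.json`, the functions `begin_or_resume` and `finish`, and the choice of a system identifier plus timeline as the source's identity are all hypothetical, and nothing here reflects what pg_rewind actually writes. Fsyncing the directory after the rename is omitted for brevity.

```python
import json
import os

MARKER = "rewind_source.json"  # hypothetical file name, not something pg_rewind writes


def begin_or_resume(datadir, source_id):
    """Return "fresh" or "resume"; raise if an earlier run used a different source."""
    path = os.path.join(datadir, MARKER)
    if os.path.exists(path):
        # A previous run was interrupted: only continue against the same source.
        with open(path) as f:
            recorded = json.load(f)
        if recorded != source_id:
            raise RuntimeError(
                "interrupted rewind used a different source: %r" % (recorded,)
            )
        return "resume"
    # First run: record the source identity durably before changing anything else.
    tmp = path + ".tmp"
    with open(tmp, "w") as f:
        json.dump(source_id, f)
        f.flush()
        os.fsync(f.fileno())
    os.replace(tmp, path)
    return "fresh"


def finish(datadir):
    """Remove the marker once the rewind has fully completed."""
    os.remove(os.path.join(datadir, MARKER))
```

A caller would pass something like `{"system_identifier": 123, "timeline": 2}` as `source_id`. A second call with the same value returns `"resume"`, while a call with a different value raises instead of mixing data from two sources.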

I think there's one corner case with truncated files, if pg_rewind has extended a file by copying missing "tail" from the source system, but the system crashes before it's fsynced to disk. But I think we can fix that too, by paying attention to SMGR_TRUNCATE records when scanning the source WAL.


- Heikki
