On 07/17/2015 06:28 AM, Michael Paquier wrote:
On Wed, Jul 1, 2015 at 9:31 PM, Fujii Masao <masao.fu...@gmail.com> wrote:
On Wed, Jul 1, 2015 at 2:21 AM, Heikki Linnakangas <hlinn...@iki.fi> wrote:
On 06/29/2015 09:44 AM, Michael Paquier wrote:

On Mon, Jun 29, 2015 at 4:55 AM, Heikki Linnakangas wrote:

But we'll still need to handle the pg_xlog symlink case somehow. Perhaps it
would be enough to special-case pg_xlog for now.


Well, sure, pg_rewind does not copy the soft links either way. Now it
would be nice to have an option to be able to recreate the soft link
of at least pg_xlog even if it can be scripted as well after a run.

Hmm. I'm starting to think that pg_rewind should ignore pg_xlog entirely. In
any non-trivial scenarios, just copying all the files from pg_xlog isn't
enough anyway, and you need to set up a recovery.conf after running
pg_rewind that contains a restore_command or primary_conninfo, to fetch the
WAL. So you can argue that by not copying pg_xlog automatically, we're
actually doing a favour to the DBA, by forcing him to set up the
recovery.conf file correctly. Because if you just test simple scenarios
where not much time has passed between the failover and running pg_rewind,
it might be enough to just copy all the WAL currently in pg_xlog, but it
would not be enough if more time had passed and not all the required WAL is
present in pg_xlog anymore.  And by not copying the WAL, we can avoid some
copying, as restore_command or streaming replication will only copy what's
needed, while pg_rewind would copy all WAL it can find in the target's data
directory.

pg_basebackup also doesn't include any WAL, unless you pass the --xlog
option. It would be nice to also add an optional --xlog option to pg_rewind,
but with pg_rewind it's possible that all the required WAL isn't present in
the pg_xlog directory anymore, so you wouldn't always achieve the same
effect of making the backup self-contained.

So, I propose the attached. It makes pg_rewind ignore the pg_xlog directory
in both the source and the target.

If pg_xlog is simply ignored, some old WAL files may remain in the target server.
Don't these old files cause the subsequent startup of the target server as a new
standby to fail? That is, the case where a WAL file with the same name
but different content exists in both target and source. If that's harmful,
pg_rewind should also remove the files in pg_xlog of the target server.
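
To make the case above concrete, here is a minimal sketch of how one could list WAL segments that have the same name but different content in the two pg_xlog directories. It is not part of pg_rewind; the function names and the use of SHA-256 to compare whole files are assumptions made for illustration only.

```python
import hashlib
from pathlib import Path


def file_digest(path):
    """Return the SHA-256 hex digest of a file, read in 1 MB chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()


def conflicting_segments(source_xlog, target_xlog):
    """List file names present in both directories whose contents differ.

    Hypothetical helper: both arguments are paths to pg_xlog directories.
    Files that exist on only one side are not reported.
    """
    source_xlog, target_xlog = Path(source_xlog), Path(target_xlog)
    conflicts = []
    for tgt in sorted(target_xlog.iterdir()):
        src = source_xlog / tgt.name
        if tgt.is_file() and src.is_file() and file_digest(src) != file_digest(tgt):
            conflicts.append(tgt.name)
    return conflicts
```

Running this against the source and target after a failover would show whether any such segments exist at all; whether they are actually harmful at startup is the open question.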

This would reduce usability. The rewound node will replay WAL from the
previous checkpoint where WAL forked up to the minimum recovery point
of the source node where pg_rewind has been run. Hence if we completely
remove the contents of pg_xlog we'd lose a portion of the logs
that need to be replayed until timeline is switched on the rewound
node when recovering it (while streaming from the promoted standby,
whatever). I don't really see why recycled segments would be a
problem, as that's perhaps what you are referring to, but perhaps I am
missing something.

Hmm. My thinking was that you need to set up restore_command or primary_conninfo anyway, to fetch the old WAL, so there's no need to copy any WAL. But there's a problem with that: you might have WAL files in the source server that haven't been archived yet, and you need them to recover the rewound target node. That's OK for libpq mode, I think, as the server is still running and presumably you can fetch the WAL with streaming replication, but for copy-mode, that's not a good assumption. You might be relying on a WAL archive, and the file might not be archived yet.

Perhaps it's best if we copy all the WAL files from source in copy-mode, but not in libpq mode. Regarding old WAL files in the target, it's probably best to always leave them alone. They should do no harm, and as a general principle it's best to avoid destroying evidence.
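
As a rough illustration of that policy (a simplified sketch, not the actual pg_rewind code): the names below are hypothetical, files are represented only by their relative paths, and it assumes that a rewind otherwise removes target files that do not exist in the source.

```python
from enum import Enum


class Mode(Enum):
    COPY = "copy"    # source is a local data directory
    LIBPQ = "libpq"  # source is a running server reached over libpq


WAL_DIR = "pg_xlog"


def is_wal_path(path):
    """True for pg_xlog itself and anything below it."""
    return path == WAL_DIR or path.startswith(WAL_DIR + "/")


def files_to_copy(source_files, mode):
    """Source paths to copy: WAL is included only in copy-mode."""
    if mode is Mode.COPY:
        return list(source_files)
    return [p for p in source_files if not is_wal_path(p)]


def files_to_remove(source_files, target_files):
    """Target paths to remove: never anything under pg_xlog."""
    source = set(source_files)
    return [p for p in target_files
            if p not in source and not is_wal_path(p)]
```

With this shape, old WAL in the target is left alone in both modes, and the only mode-dependent decision is whether the source's pg_xlog is copied.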

It'd be nice to get some fix for this for alpha2, so I'll commit a fix to do that on Monday, unless we come to a different conclusion before that.

- Heikki


