On 7/23/17 11:48 PM, Masahiko Sawada wrote:
On Sat, Jul 22, 2017 at 8:04 AM, Stephen Frost <sfr...@snowman.net> wrote:

I started discussing this with David off-list and he'd like a chance to
review it in a bit more detail (he's just returning from being gone for
a few weeks).  That review will be posted to this thread on Monday, and
I'll wait until then to move forward with the patch.

Before reviewing the patch, I would note that it looks like this issue was introduced in b6a323a8c before the 9.6 release. The documentation states that the pg_stop_backup() function will wait for all required WAL to be archived, which corresponds to the default in the new patch of waitforarchive = true. The commit notes that the documentation needs updating, but since that didn't happen I think we should consider this a bug in 9.6 and back patch.

While this patch brings pg_stop_backup() in line with the documentation, it also introduces a behavioral change compared to 9.6. Currently, the default 9.6 behavior on a standby is to return immediately from pg_stop_backup(), but this patch will cause it to block by default instead. Since action on the master may be required to unblock the process, I see this as a pretty significant change. I still think we should fix and backpatch, but I wanted to point this out.

The patch looks sensible to me.  A few comments:

1) I would change:

"Check if the WAL segment needed for this backup have been completed, in which case a forced segment switch may be needed on the primary."

To something like:

"If the WAL segments required for this backup have not been archived then it might be necessary to force a segment switch on the primary."

2) At backup.sgml L1015 it says that pg_stop_backup() will automatically switch the WAL segment. There should be a caveat here for standby backups, like:

When called on a master it terminates the backup mode and performs an automatic switch to the next WAL segment. The reason for the switch is to arrange for the last WAL segment written during the backup interval to be ready to archive. When called on a standby the WAL segment switch must be performed manually on the master if it does not happen due to normal write activity.

3) The fact that this fix requires "archive_mode = always" seems like a pretty big caveat, thought I don't have any ideas on how to make it better without big code changes. Maybe it would be help to change:

+ the backup is taken on a standby, <function>pg_stop_backup</> waits
+ for WAL to be archived when <varname>archive_mode</>

To:

+ the backup is taken on a standby, <function>pg_stop_backup</> waits
+ for WAL to be archived *only* when <varname>archive_mode</>

Perhaps this should be noted in the pg_basebackup docs as well?

Regards,
--
-David
da...@pgmasters.net


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Reply via email to