On 25.10.2011 08:12, Fujii Masao wrote:
On Tue, Oct 25, 2011 at 12:24 AM, Heikki Linnakangas
<heikki.linnakan...@enterprisedb.com>  wrote:
On 24.10.2011 15:29, Fujii Masao wrote:

+<listitem>
+<para>
+      Copy the pg_control file from the cluster directory to the global
+      sub-directory of the backup. For example:
+<programlisting>
+ cp $PGDATA/global/pg_control /mnt/server/backupdir/global
+</programlisting>
+</para>
+</listitem>

Why is this step required? The control file is overwritten by information
from the backup_label anyway, no?

Yes, when recovery starts, the control file is overwritten. But before that,
we retrieve the minimum recovery point from the control file. Then it's used
as the backup end location.

During recovery, pg_stop_backup() cannot write an end-of-backup record.
So, in a standby-only backup, another way to retrieve the backup end location
(instead of an end-of-backup record) is required. Ishiduka-san used the
control file for that, following your suggestion ;)
http://archives.postgresql.org/pgsql-hackers/2011-05/msg01405.php

Oh :-)

+<para>
+      Again connect to the database as a superuser, and execute
+<function>pg_stop_backup</>. This terminates the backup mode, but does not
+      perform a switch to the next WAL segment, create a backup history file and
+      wait for all required WAL segments to be archived,
+      unlike that during normal processing.
+</para>
+</listitem>

How do you ensure that all the required WAL segments have been archived,
then?

The patch doesn't provide any capability to ensure that; IOW, it assumes that's
the user's responsibility. A user who wants to ensure that needs to calculate
the backup start and end WAL files from the results of pg_start_backup()
and pg_stop_backup() respectively, and then wait until those files have
appeared in the archive. Also, if a required WAL file has not been archived
yet, the user might need to execute pg_switch_xlog() on the master.
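
The manual check described above could be sketched roughly as follows. This is only an illustration, not part of the patch: it assumes the default 16 MB WAL segment size, that the user already knows the timeline ID, and that the archive is a directory reachable from the machine running the script. The function names wal_file_name and wait_for_archived are hypothetical.

```python
import os
import time

# Assumption: the server uses the default 16 MB WAL segment size.
WAL_SEGMENT_SIZE = 16 * 1024 * 1024


def wal_file_name(timeline, lsn):
    """Map an LSN string such as '0/9000020' to a WAL segment file name.

    The name is three 8-digit hex fields: timeline, log id, segment number.
    An LSN lying exactly on a segment boundary is mapped to the segment that
    starts there, so for an end location this may name one segment more
    than is strictly needed (a conservative choice).
    """
    log_hex, offset_hex = lsn.split("/")
    log_id = int(log_hex, 16)
    segment = int(offset_hex, 16) // WAL_SEGMENT_SIZE
    return "%08X%08X%08X" % (timeline, log_id, segment)


def wait_for_archived(archive_dir, file_names, timeout=60.0, poll=1.0):
    """Poll archive_dir until every named file exists; False on timeout."""
    deadline = time.monotonic() + timeout
    while True:
        missing = [name for name in file_names
                   if not os.path.exists(os.path.join(archive_dir, name))]
        if not missing:
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(poll)
```

A user would pass the locations returned by pg_start_backup() and pg_stop_backup() to wal_file_name(), then call wait_for_archived() with both names. If it returns False, that would be the point to run pg_switch_xlog() on the master and wait again.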

Frankly, I think this whole thing is too fragile. The procedure is superficially similar to what you do on master: run pg_start_backup(), rsync data directory, run pg_stop_backup(), but is actually subtly different and more complicated. If you don't know that, and don't follow the full procedure, you get a corrupt backup. And the backup might look ok, and might even sometimes work, which means that you won't notice in quick testing. That's a *huge* foot-gun.

I think we need to step back and find a way to make this:
a) less complicated, or at least
b) more robust, so that if you don't follow the procedure, you get an error.

With pg_basebackup, we have a fighting chance of getting this right, because we have more control over how the backup is made. For example, we can co-operate with the buffer manager to avoid torn pages, eliminating the need for full_page_writes=on, and we can include a control file with the correct end-of-backup location automatically, without requiring user intervention. pg_basebackup is less flexible than the pg_start/stop_backup method, and unfortunately you're more likely to need that flexibility in a more complicated setup with a hot standby server and all, but making the generic pg_start/stop_backup method work seems infeasible at the moment.

--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com
