Re: [HACKERS] WAL Restore process during recovery

2012-01-25 Thread Fujii Masao
On Tue, Jan 24, 2012 at 6:43 PM, Fujii Masao masao.fu...@gmail.com wrote:
 Cleaned up the points noted, new patch attached in case you wish to
 review further.

 Still has bug, so still with me to fix.

 Thanks! Will review further.

v3 patch contains lots of unrelated code changes like the following.

-   structfieldpid/structfield column of the
+   structfieldprocpid/structfield column of the

You seem to have failed to extract the patch from your repository correctly.
So I used v2 patch for the review. Sorry if I comment the things which you've
already fixed in v3 patch.

Here are the comments. They are almost not serious problems.

+/*
+ * GUC parameters
+ */
+intWalRestoreDelay = 1;

You forget to change guc.c to define wal_restore_delay as a GUC parameter?
Or just that source code comment is incorrect?


+   elog(FATAL, recovery_restore_command is too long);

Typo: s/recovery_restore_command/restore_command


+   InitLatch(walrstr-WALRestoreLatch); /* initialize latch used in main 
loop */

That latch is shared one. OwnLatch() should be called instead of InitLatch()?
If yes, it's better to call DisownLatch() when walrestore exits. Though skipping
DisownLatch() would be harmless because the latch is never owned by new
process after walrestore exits.


+   {
+   /* use volatile pointer to prevent code rearrangement */
+   volatile WalRestoreData *walrstr = WalRstr;
+
+   nextFileTli = walrstr-nextFileTli;

The declaration of walrstr is not required here because it's already done
at the head of WalRestoreNextFile().


+   if (restoredFromArchive)
+   {
+   /* use volatile pointer to prevent code rearrangement */
+   volatile WalRestoreData *walrstr = WalRstr;

Same as above.


+#define TMPRECOVERYXLOGRECOVERYXLOG

ISTM that it's better to move this definition to an include file and we should
use it in all the places where the fixed value RECOVERYXLOG is still used.

Regards,

-- 
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] WAL Restore process during recovery

2012-01-24 Thread Fujii Masao
On Tue, Jan 24, 2012 at 12:23 AM, Simon Riggs si...@2ndquadrant.com wrote:
 On Mon, Jan 23, 2012 at 12:23 PM, Fujii Masao masao.fu...@gmail.com wrote:
 Why does walrestore need to be invoked even when restore_command is
 not specified? It seems to be useless. We invoke walreceiver only when
 primary_conninfo is specified now. Similarly we should invoke walrestore
 only when restore_command is specified?

 walreceiver is shutdown and restarted in case of failed connection.
 That never happens with walrestore because the command is run each
 time - when we issue system(3) a new process is forked to run the
 command. So there is no specific cleanup to perform and so no reason
 for a managed cleanup process.

 So I can't see a specific reason to change that. Do you think it makes
 a difference?

Yes. When restore_command is not specified in recovery.conf, walrestore
process doesn't do any useful activity and just wastes CPU cycle. Which
might be harmless for a functionality of recovery, but ISTM it's better not
to start up walrestore in that case to avoid the waste of cycle.

 Cleaned up the points noted, new patch attached in case you wish to
 review further.

 Still has bug, so still with me to fix.

Thanks! Will review further.

Regards,

-- 
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] WAL Restore process during recovery

2012-01-24 Thread Simon Riggs
On Tue, Jan 24, 2012 at 9:43 AM, Fujii Masao masao.fu...@gmail.com wrote:
 On Tue, Jan 24, 2012 at 12:23 AM, Simon Riggs si...@2ndquadrant.com wrote:
 On Mon, Jan 23, 2012 at 12:23 PM, Fujii Masao masao.fu...@gmail.com wrote:
 Why does walrestore need to be invoked even when restore_command is
 not specified? It seems to be useless. We invoke walreceiver only when
 primary_conninfo is specified now. Similarly we should invoke walrestore
 only when restore_command is specified?

 walreceiver is shutdown and restarted in case of failed connection.
 That never happens with walrestore because the command is run each
 time - when we issue system(3) a new process is forked to run the
 command. So there is no specific cleanup to perform and so no reason
 for a managed cleanup process.

 So I can't see a specific reason to change that. Do you think it makes
 a difference?

 Yes. When restore_command is not specified in recovery.conf, walrestore
 process doesn't do any useful activity and just wastes CPU cycle. Which
 might be harmless for a functionality of recovery, but ISTM it's better not
 to start up walrestore in that case to avoid the waste of cycle.

It just sleeps on a latch when it has nothing to do, so no wasted cycles.

Asking the postmaster seemed the easier option, I guess I could have
chosen the other way also.

I'll look at this when this is the last thing left to resolve to see
if that improves things.


 Cleaned up the points noted, new patch attached in case you wish to
 review further.

 Still has bug, so still with me to fix.

 Thanks! Will review further.

Much appreciated.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] WAL Restore process during recovery

2012-01-24 Thread Fujii Masao
On Tue, Jan 24, 2012 at 6:49 PM, Simon Riggs si...@2ndquadrant.com wrote:
 On Tue, Jan 24, 2012 at 9:43 AM, Fujii Masao masao.fu...@gmail.com wrote:
 On Tue, Jan 24, 2012 at 12:23 AM, Simon Riggs si...@2ndquadrant.com wrote:
 On Mon, Jan 23, 2012 at 12:23 PM, Fujii Masao masao.fu...@gmail.com wrote:
 Why does walrestore need to be invoked even when restore_command is
 not specified? It seems to be useless. We invoke walreceiver only when
 primary_conninfo is specified now. Similarly we should invoke walrestore
 only when restore_command is specified?

 walreceiver is shutdown and restarted in case of failed connection.
 That never happens with walrestore because the command is run each
 time - when we issue system(3) a new process is forked to run the
 command. So there is no specific cleanup to perform and so no reason
 for a managed cleanup process.

 So I can't see a specific reason to change that. Do you think it makes
 a difference?

 Yes. When restore_command is not specified in recovery.conf, walrestore
 process doesn't do any useful activity and just wastes CPU cycle. Which
 might be harmless for a functionality of recovery, but ISTM it's better not
 to start up walrestore in that case to avoid the waste of cycle.

 It just sleeps on a latch when it has nothing to do, so no wasted cycles.

Right, since walrestore process wakes up just every 10 seconds, a waste of
cycle is low. But what I feel uncomfortable is that walrestore process has
nothing to do *from start to end*, when restore_command is not specified,
but it's started up. I guess that many people would get surprised at that.
Of course, if restore_command can be changed without restarting the server,
I agree with you because walrestore process might do an useful activity
later. But currently not.

 Asking the postmaster seemed the easier option, I guess I could have
 chosen the other way also.

 I'll look at this when this is the last thing left to resolve to see
 if that improves things.

Okay.

Regards,

-- 
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] WAL Restore process during recovery

2012-01-23 Thread Fujii Masao
On Fri, Jan 20, 2012 at 7:50 PM, Fujii Masao masao.fu...@gmail.com wrote:
 On Fri, Jan 20, 2012 at 7:38 PM, Simon Riggs si...@2ndquadrant.com wrote:
 On Fri, Jan 20, 2012 at 3:43 AM, Fujii Masao masao.fu...@gmail.com wrote:

 Requested update

 Thanks! Will review.

In StartChildProcess(), the code which emits an error when fork of walrestore
fails is required.

In reaper(), the following comment needs to be updated because an unexpected
exit (including FATAL) is treated as a crash in the patch.

/*
 * Was it the wal restore?  If exit status is zero (normal) or 
one
 * (FATAL exit), we assume everything is all right just like 
normal
 * backends.
 */
if (pid == WalRestorePID)

Why does walrestore need to be invoked even when restore_command is
not specified? It seems to be useless. We invoke walreceiver only when
primary_conninfo is specified now. Similarly we should invoke walrestore
only when restore_command is specified?

When I set up the file-based log-shipping environment using pg_standby,
ran pgbench -i -s2, waited for walrestore to restore at least one WAL
file, and created the trigger file, then I encounterd the following error in
the standby.

sby LOG:  startup process requests 00010003 from archive
trigger file found: smart failover
sby LOG:  startup process sees last file was 00010003
sby FATAL:  could not rename file pg_xlog/RECOVERYXLOG to
pg_xlog/00010003: No such file or directory
sby LOG:  startup process (PID 11079) exited with exit code 1
sby LOG:  terminating any other active server processes

When I set up streaming replication with setting restore_command,
I got the following messages repeatedly. The WAL file name was always
.

sby1 LOG:  walrestore checking for next file to restore
sby1 LOG:  restore of  is already complete, so sleep

In PostmasterStateMachine(), the following comment needs to mention WALRestore.

 * PM_WAIT_READONLY state ends when we have no regular backends 
that
 * have been started during recovery.  We kill the startup and
 * walreceiver processes and transition to PM_WAIT_BACKENDS.  
Ideally,

In walrestore.c, the following comments seem to be incorrect. At least
an unexpected
exit of WALRestore doesn't start a recovery cycle in the patch.

 * If the WAL restore exits unexpectedly, the postmaster treats
that the same
 * as a backend crash: shared memory may be corrupted, so remaining backends
 * should be killed by SIGQUIT and then a recovery cycle started.

In walrestore.c
+ * Main entry point for walrestore process
+ *
+ * This is invoked from BootstrapMain, which has already created the basic
+ * execution environment, but not enabled signals yet.

BootstrapMain() doesn't exist, and it should be changed to
AuxiliaryProcessMain().
This is not a fault of the patch. There are the same typos in
bgwriter.c, walwriter.c
and checkpointer.c

In walrestore.c
+* SIGUSR1 is presently unused; keep it spare in case someday we want 
this
+* process to participate in ProcSignal signalling.

The above comment is incorrect because SIGUSR1 is presently used.

+   /*
+* From here on, elog(ERROR) should end with exit(1), 
not send
+* control back to the sigsetjmp block above
+*/
+   ExitOnAnyError = true;

The above is not required because sigsetjmp is not used in walrestore.c

+   /* Normal exit from the walwriter is here */
+   proc_exit(0);   /* done */

Typo: s/walwriter/walrestore

I've not reviewed the patch enough yet. Will review the patch tomorrow again.

Regards,

-- 
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] WAL Restore process during recovery

2012-01-23 Thread Simon Riggs
On Mon, Jan 23, 2012 at 12:23 PM, Fujii Masao masao.fu...@gmail.com wrote:

 I've not reviewed the patch enough yet. Will review the patch tomorrow again.

Thanks very much. I'm sure that's enough to keep me busy a few days.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] WAL Restore process during recovery

2012-01-20 Thread Simon Riggs
On Fri, Jan 20, 2012 at 3:43 AM, Fujii Masao masao.fu...@gmail.com wrote:

Requested update


-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services
diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index ce659ec..469e6d6 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -40,6 +40,7 @@
 #include pgstat.h
 #include postmaster/bgwriter.h
 #include postmaster/startup.h
+#include postmaster/walrestore.h
 #include replication/walreceiver.h
 #include replication/walsender.h
 #include storage/bufmgr.h
@@ -187,7 +188,6 @@ static bool InArchiveRecovery = false;
 static bool restoredFromArchive = false;
 
 /* options taken from recovery.conf for archive recovery */
-static char *recoveryRestoreCommand = NULL;
 static char *recoveryEndCommand = NULL;
 static char *archiveCleanupCommand = NULL;
 static RecoveryTargetType recoveryTarget = RECOVERY_TARGET_UNSET;
@@ -575,8 +575,8 @@ bool reachedConsistency = false;
 
 static bool InRedo = false;
 
-/* Have we launched bgwriter during recovery? */
-static bool bgwriterLaunched = false;
+/* Have we launched background procs during archive recovery yet? */
+static bool ArchRecoveryBgProcsActive = false;
 
 /*
  * Information logged when we detect a change in one of the parameters
@@ -632,8 +632,6 @@ static bool XLogPageRead(XLogRecPtr *RecPtr, int emode, bool fetching_ckpt,
 			 bool randAccess);
 static int	emode_for_corrupt_record(int emode, XLogRecPtr RecPtr);
 static void XLogFileClose(void);
-static bool RestoreArchivedFile(char *path, const char *xlogfname,
-	const char *recovername, off_t expectedSize);
 static void ExecuteRecoveryCommand(char *command, char *commandName,
 	   bool failOnerror);
 static void PreallocXlogFiles(XLogRecPtr endptr);
@@ -2706,19 +2704,47 @@ XLogFileRead(uint32 log, uint32 seg, int emode, TimeLineID tli,
 
 	XLogFileName(xlogfname, tli, log, seg);
 
+#define TMPRECOVERYXLOG	RECOVERYXLOG
+
 	switch (source)
 	{
 		case XLOG_FROM_ARCHIVE:
+			/*
+			 * Check to see if the WALRestore process has already put the
+			 * next file in place while we were working. If so, use that.
+			 * If not, get it ourselves. This makes it easier to handle
+			 * initial state before the WALRestore is active, and also
+			 * handles the stop/start logic correctly when we have both
+			 * streaming and file based replication active.
+			 *
+			 * We queue up the next task for WALRestore after we've begun to
+			 * use this file later in XLogFileRead().
+			 *
+			 * If the WALRestore process is still active, the lock wait makes
+			 * us wait, which is just like we were executing the command
+			 * ourselves and so doesn't alter the logic elsewhere.
+			 */
+			if (XLogFileIsNowFullyRestored(tli, log, seg))
+			{
+snprintf(path, MAXPGPATH, XLOGDIR /%s, TMPRECOVERYXLOG);
+restoredFromArchive = true;
+break;
+			}
+
 			/* Report recovery progress in PS display */
 			snprintf(activitymsg, sizeof(activitymsg), waiting for %s,
 	 xlogfname);
 			set_ps_display(activitymsg, false);
 
 			restoredFromArchive = RestoreArchivedFile(path, xlogfname,
-	  RECOVERYXLOG,
+	  TMPRECOVERYXLOG,
 	  XLogSegSize);
+
 			if (!restoredFromArchive)
+			{
+LWLockRelease(WALRestoreCommandLock);
 return -1;
+			}
 			break;
 
 		case XLOG_FROM_PG_XLOG:
@@ -2748,18 +2774,42 @@ XLogFileRead(uint32 log, uint32 seg, int emode, TimeLineID tli,
 		if (stat(xlogfpath, statbuf) == 0)
 		{
 			if (unlink(xlogfpath) != 0)
+			{
+LWLockRelease(WALRestoreCommandLock);
 ereport(FATAL,
 		(errcode_for_file_access(),
 		 errmsg(could not remove file \%s\: %m,
 xlogfpath)));
+			}
 			reload = true;
 		}
 
 		if (rename(path, xlogfpath)  0)
+		{
+			LWLockRelease(WALRestoreCommandLock);
 			ereport(ERROR,
 (errcode_for_file_access(),
  errmsg(could not rename file \%s\ to \%s\: %m,
 		path, xlogfpath)));
+		}
+
+		/*
+		 * Make sure we recover from the new filename, so we can reuse the
+		 * temporary filename for asynchronous restore actions.
+		 */
+		strcpy(path, xlogfpath);
+
+		/*
+		 * Tell the WALRestore process to get the next file now.
+		 * Hopefully it will be ready for use in time for the next call the
+		 * Startup process makes to XLogFileRead().
+		 *
+		 * It might seem like we should do that earlier but then there is a
+		 * race condition that might lead to replacing RECOVERYXLOG with
+		 * another file before we've copied it.
+		 */
+		SetNextWALRestoreLogSeg(tli, log, seg);
+		LWLockRelease(WALRestoreCommandLock);
 
 		/*
 		 * If the existing segment was replaced, since walsenders might have
@@ -2911,8 +2961,11 @@ XLogFileClose(void)
  * For fixed-size files, the caller may pass the expected size as an
  * additional crosscheck on successful recovery.  If the file size is not
  * known, set expectedSize = 0.
+ *
+ * Must be called with 

Re: [HACKERS] WAL Restore process during recovery

2012-01-20 Thread Fujii Masao
On Fri, Jan 20, 2012 at 7:38 PM, Simon Riggs si...@2ndquadrant.com wrote:
 On Fri, Jan 20, 2012 at 3:43 AM, Fujii Masao masao.fu...@gmail.com wrote:

 Requested update

Thanks! Will review.

Regards,

-- 
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] WAL Restore process during recovery

2012-01-19 Thread Simon Riggs
On Tue, Jan 17, 2012 at 6:52 AM, Fujii Masao masao.fu...@gmail.com wrote:
 On Mon, Jan 16, 2012 at 2:06 AM, Simon Riggs si...@2ndquadrant.com wrote:
 WALRestore process asynchronously executes restore_command while
 recovery continues working.

 Overlaps downloading of next WAL file to reduce time delays in file
 based archive recovery.

 Handles cases of file-only and streaming/file correctly.

 Though I've not reviewed the patch deeply yet, I observed the following
 two problems when I tested the patch.

 When I set up streaming replication + archive (i.e., restore_command is set)
 and started the standby, I got the following error:

    FATAL:  all AuxiliaryProcs are in use
    LOG:  walrestore process (PID 18839) exited with exit code 1

Fixed and better documented.

 When I started an archive recovery without setting restore_command,
 it successfully finished.

Not sure exactly what you mean, but I fixed a bug that might be
something you're seeing.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services
diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index ce659ec..469e6d6 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -40,6 +40,7 @@
 #include pgstat.h
 #include postmaster/bgwriter.h
 #include postmaster/startup.h
+#include postmaster/walrestore.h
 #include replication/walreceiver.h
 #include replication/walsender.h
 #include storage/bufmgr.h
@@ -187,7 +188,6 @@ static bool InArchiveRecovery = false;
 static bool restoredFromArchive = false;
 
 /* options taken from recovery.conf for archive recovery */
-static char *recoveryRestoreCommand = NULL;
 static char *recoveryEndCommand = NULL;
 static char *archiveCleanupCommand = NULL;
 static RecoveryTargetType recoveryTarget = RECOVERY_TARGET_UNSET;
@@ -575,8 +575,8 @@ bool reachedConsistency = false;
 
 static bool InRedo = false;
 
-/* Have we launched bgwriter during recovery? */
-static bool bgwriterLaunched = false;
+/* Have we launched background procs during archive recovery yet? */
+static bool ArchRecoveryBgProcsActive = false;
 
 /*
  * Information logged when we detect a change in one of the parameters
@@ -632,8 +632,6 @@ static bool XLogPageRead(XLogRecPtr *RecPtr, int emode, bool fetching_ckpt,
 			 bool randAccess);
 static int	emode_for_corrupt_record(int emode, XLogRecPtr RecPtr);
 static void XLogFileClose(void);
-static bool RestoreArchivedFile(char *path, const char *xlogfname,
-	const char *recovername, off_t expectedSize);
 static void ExecuteRecoveryCommand(char *command, char *commandName,
 	   bool failOnerror);
 static void PreallocXlogFiles(XLogRecPtr endptr);
@@ -2706,19 +2704,47 @@ XLogFileRead(uint32 log, uint32 seg, int emode, TimeLineID tli,
 
 	XLogFileName(xlogfname, tli, log, seg);
 
+#define TMPRECOVERYXLOG	RECOVERYXLOG
+
 	switch (source)
 	{
 		case XLOG_FROM_ARCHIVE:
+			/*
+			 * Check to see if the WALRestore process has already put the
+			 * next file in place while we were working. If so, use that.
+			 * If not, get it ourselves. This makes it easier to handle
+			 * initial state before the WALRestore is active, and also
+			 * handles the stop/start logic correctly when we have both
+			 * streaming and file based replication active.
+			 *
+			 * We queue up the next task for WALRestore after we've begun to
+			 * use this file later in XLogFileRead().
+			 *
+			 * If the WALRestore process is still active, the lock wait makes
+			 * us wait, which is just like we were executing the command
+			 * ourselves and so doesn't alter the logic elsewhere.
+			 */
+			if (XLogFileIsNowFullyRestored(tli, log, seg))
+			{
+snprintf(path, MAXPGPATH, XLOGDIR /%s, TMPRECOVERYXLOG);
+restoredFromArchive = true;
+break;
+			}
+
 			/* Report recovery progress in PS display */
 			snprintf(activitymsg, sizeof(activitymsg), waiting for %s,
 	 xlogfname);
 			set_ps_display(activitymsg, false);
 
 			restoredFromArchive = RestoreArchivedFile(path, xlogfname,
-	  RECOVERYXLOG,
+	  TMPRECOVERYXLOG,
 	  XLogSegSize);
+
 			if (!restoredFromArchive)
+			{
+LWLockRelease(WALRestoreCommandLock);
 return -1;
+			}
 			break;
 
 		case XLOG_FROM_PG_XLOG:
@@ -2748,18 +2774,42 @@ XLogFileRead(uint32 log, uint32 seg, int emode, TimeLineID tli,
 		if (stat(xlogfpath, statbuf) == 0)
 		{
 			if (unlink(xlogfpath) != 0)
+			{
+LWLockRelease(WALRestoreCommandLock);
 ereport(FATAL,
 		(errcode_for_file_access(),
 		 errmsg(could not remove file \%s\: %m,
 xlogfpath)));
+			}
 			reload = true;
 		}
 
 		if (rename(path, xlogfpath)  0)
+		{
+			LWLockRelease(WALRestoreCommandLock);
 			ereport(ERROR,
 (errcode_for_file_access(),
  errmsg(could not rename file \%s\ to \%s\: %m,
 		path, xlogfpath)));
+		}
+
+		/*
+		 * Make sure we recover from the new filename, so we can reuse the
+		 * temporary 

Re: [HACKERS] WAL Restore process during recovery

2012-01-19 Thread Fujii Masao
On Fri, Jan 20, 2012 at 4:17 AM, Simon Riggs si...@2ndquadrant.com wrote:
 On Tue, Jan 17, 2012 at 6:52 AM, Fujii Masao masao.fu...@gmail.com wrote:
 On Mon, Jan 16, 2012 at 2:06 AM, Simon Riggs si...@2ndquadrant.com wrote:
 WALRestore process asynchronously executes restore_command while
 recovery continues working.

 Overlaps downloading of next WAL file to reduce time delays in file
 based archive recovery.

 Handles cases of file-only and streaming/file correctly.

 Though I've not reviewed the patch deeply yet, I observed the following
 two problems when I tested the patch.

 When I set up streaming replication + archive (i.e., restore_command is set)
 and started the standby, I got the following error:

    FATAL:  all AuxiliaryProcs are in use
    LOG:  walrestore process (PID 18839) exited with exit code 1

 Fixed and better documented.

 When I started an archive recovery without setting restore_command,
 it successfully finished.

 Not sure exactly what you mean, but I fixed a bug that might be
 something you're seeing.

Thanks!

But you forgot to include walrestore.c and .h in the patch. Can you submit
the updated version of the patch?

Regards,

-- 
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] WAL Restore process during recovery

2012-01-17 Thread Simon Riggs
On Tue, Jan 17, 2012 at 6:52 AM, Fujii Masao masao.fu...@gmail.com wrote:
 On Mon, Jan 16, 2012 at 2:06 AM, Simon Riggs si...@2ndquadrant.com wrote:
 WALRestore process asynchronously executes restore_command while
 recovery continues working.

 Overlaps downloading of next WAL file to reduce time delays in file
 based archive recovery.

 Handles cases of file-only and streaming/file correctly.

 Though I've not reviewed the patch deeply yet, I observed the following
 two problems when I tested the patch.

 When I set up streaming replication + archive (i.e., restore_command is set)
 and started the standby, I got the following error:

    FATAL:  all AuxiliaryProcs are in use
    LOG:  walrestore process (PID 18839) exited with exit code 1

 When I started an archive recovery without setting restore_command,
 it successfully finished.

Oh, I did have NUM_AUXILIARY_PROCS increased at one point, but I
realised it wasn't needed and removed it. Will change that. Thanks.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] WAL Restore process during recovery

2012-01-16 Thread Fujii Masao
On Mon, Jan 16, 2012 at 2:06 AM, Simon Riggs si...@2ndquadrant.com wrote:
 WALRestore process asynchronously executes restore_command while
 recovery continues working.

 Overlaps downloading of next WAL file to reduce time delays in file
 based archive recovery.

 Handles cases of file-only and streaming/file correctly.

Though I've not reviewed the patch deeply yet, I observed the following
two problems when I tested the patch.

When I set up streaming replication + archive (i.e., restore_command is set)
and started the standby, I got the following error:

FATAL:  all AuxiliaryProcs are in use
LOG:  walrestore process (PID 18839) exited with exit code 1

When I started an archive recovery without setting restore_command,
it successfully finished.

Regards,

-- 
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] WAL Restore process during recovery

2012-01-15 Thread Simon Riggs
WALRestore process asynchronously executes restore_command while
recovery continues working.

Overlaps downloading of next WAL file to reduce time delays in file
based archive recovery.

Handles cases of file-only and streaming/file correctly.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services
diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index ce659ec..e8b0b69 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -40,6 +40,7 @@
 #include pgstat.h
 #include postmaster/bgwriter.h
 #include postmaster/startup.h
+#include postmaster/walrestore.h
 #include replication/walreceiver.h
 #include replication/walsender.h
 #include storage/bufmgr.h
@@ -187,7 +188,6 @@ static bool InArchiveRecovery = false;
 static bool restoredFromArchive = false;
 
 /* options taken from recovery.conf for archive recovery */
-static char *recoveryRestoreCommand = NULL;
 static char *recoveryEndCommand = NULL;
 static char *archiveCleanupCommand = NULL;
 static RecoveryTargetType recoveryTarget = RECOVERY_TARGET_UNSET;
@@ -575,8 +575,8 @@ bool reachedConsistency = false;
 
 static bool InRedo = false;
 
-/* Have we launched bgwriter during recovery? */
-static bool bgwriterLaunched = false;
+/* Have we launched background procs during archive recovery yet? */
+static bool ArchRecoveryBgProcsActive = false;
 
 /*
  * Information logged when we detect a change in one of the parameters
@@ -632,8 +632,6 @@ static bool XLogPageRead(XLogRecPtr *RecPtr, int emode, bool fetching_ckpt,
 			 bool randAccess);
 static int	emode_for_corrupt_record(int emode, XLogRecPtr RecPtr);
 static void XLogFileClose(void);
-static bool RestoreArchivedFile(char *path, const char *xlogfname,
-	const char *recovername, off_t expectedSize);
 static void ExecuteRecoveryCommand(char *command, char *commandName,
 	   bool failOnerror);
 static void PreallocXlogFiles(XLogRecPtr endptr);
@@ -2706,19 +2704,47 @@ XLogFileRead(uint32 log, uint32 seg, int emode, TimeLineID tli,
 
 	XLogFileName(xlogfname, tli, log, seg);
 
+#define TMPRECOVERYXLOG	RECOVERYXLOG
+
 	switch (source)
 	{
 		case XLOG_FROM_ARCHIVE:
+			/*
+			 * Check to see if the WALRestore process has already put the
+			 * next file in place while we were working. If so, use that.
+			 * If not, get it ourselves. This makes it easier to handle
+			 * initial state before the WALRestore is active, and also
+			 * handles the stop/start logic correctly when we have both
+			 * streaming and file based replication active.
+			 *
+			 * We queue up the next task for WALRestore after we've begun to
+			 * use this file later in XLogFileRead().
+			 *
+			 * If the WALRestore process is still active, the lock wait makes
+			 * us wait, which is just like we were executing the command
+			 * ourselves and so doesn't alter the logic elsewhere.
+			 */
+			if (XLogFileIsNowFullyRestored(tli, log, seg))
+			{
+snprintf(path, MAXPGPATH, XLOGDIR /%s, TMPRECOVERYXLOG);
+restoredFromArchive = true;
+break;
+			}
+
 			/* Report recovery progress in PS display */
 			snprintf(activitymsg, sizeof(activitymsg), waiting for %s,
 	 xlogfname);
 			set_ps_display(activitymsg, false);
 
 			restoredFromArchive = RestoreArchivedFile(path, xlogfname,
-	  RECOVERYXLOG,
+	  TMPRECOVERYXLOG,
 	  XLogSegSize);
+
 			if (!restoredFromArchive)
+			{
+LWLockRelease(WALRestoreCommandLock);
 return -1;
+			}
 			break;
 
 		case XLOG_FROM_PG_XLOG:
@@ -2748,18 +2774,42 @@ XLogFileRead(uint32 log, uint32 seg, int emode, TimeLineID tli,
 		if (stat(xlogfpath, statbuf) == 0)
 		{
 			if (unlink(xlogfpath) != 0)
+			{
+LWLockRelease(WALRestoreCommandLock);
 ereport(FATAL,
 		(errcode_for_file_access(),
 		 errmsg(could not remove file \%s\: %m,
 xlogfpath)));
+			}
 			reload = true;
 		}
 
 		if (rename(path, xlogfpath)  0)
+		{
+			LWLockRelease(WALRestoreCommandLock);
 			ereport(ERROR,
 (errcode_for_file_access(),
  errmsg(could not rename file \%s\ to \%s\: %m,
 		path, xlogfpath)));
+		}
+
+		/*
+		 * Make sure we recover from the new filename, so we can reuse the
+		 * temporary filename for asynchronous restore actions.
+		 */
+		strcpy(path, xlogfpath);
+
+		/*
+		 * Tell the WALRestore process to get the next file now.
+		 * Hopefully it will be ready for use in time for the next call the
+		 * Startup process makes to XLogFileRead().
+		 *
+		 * It might seem like we should do that earlier but then there is a
+		 * race condition that might lead to replacing RECOVERYXLOG with
+		 * another file before we've copied it.
+		 */
+		SetNextWALRestoreLogSeg(tli, log, seg);
+		LWLockRelease(WALRestoreCommandLock);
 
 		/*
 		 * If the existing segment was replaced, since walsenders might have
@@ -2911,8 +2961,11 @@ XLogFileClose(void)
  * For fixed-size files, the caller may pass the expected