On 08.02.2011 06:27, Robert Haas wrote:
> On Mon, Jan 24, 2011 at 2:00 AM, Fujii Masao<masao.fu...@gmail.com>  wrote:
>> On Wed, Jan 5, 2011 at 5:08 AM, Heikki Linnakangas
>> <heikki.linnakan...@enterprisedb.com>  wrote:
>>> I finally got around to look at this. I wrote a patch to validate that the
>>> TLI on xlog page header matches ThisTimeLineID during recovery, and noticed
>>> quickly in testing that it doesn't catch all the cases I'd like to catch
>>> :-(.
>>
>> The patch added into the CF hasn't solved this problem yet. Are you planning
>> to solve it in 9.1? Or are you planning to just commit the patch for 9.1, and
>> postpone the issue to 9.2 or later? I'm OK either way. Of course, the former
>> is quite better, though.
>>
>> Anyway, you have to add the documentation about this feature.
>
> This patch is erroneously marked Needs Review in the CommitFest
> application, but I think really it's Waiting on Author, and has been
> for a long time.  I'm thinking we should push this out to 9.2.

I dropped the ball on this one, but now that we have pg_basebackup and "pg_ctl promote", which make it easy to set up a standby and fail over, I think we should still do this in 9.1. Otherwise you need a restart to have a 2nd standby server track the TLI change that failover causes.

I wanted to add those extra safeguards, and to support streaming replication in addition to restoring from archive, but that's 9.2 material. However, the original patch (http://archives.postgresql.org/message-id/4cc83a50.7070...@enterprisedb.com) was non-intrusive and no-one objected. While the extra safeguards would've been nice, this patch doesn't make the situation any worse than it is already when you restart the standby.

Here's an updated version of that patch, now with a little bit of documentation. Barring objections, I'll commit this.
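
For readers who want the gist without reading the diff, here is a rough standalone sketch of the decision rescanLatestTimeLine() makes. It is not part of the patch. The names TimelineHistory, history_contains and try_switch_timeline are made up for illustration, and the typedef is only a stand-in for the backend's TimeLineID. The sketch assumes a history is a flat array holding the target timeline followed by its ancestors, newest first, and it tests membership in the new timeline's history, as the comment in the function describes.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Stand-in for the backend's TimeLineID type (assumption for this sketch). */
typedef unsigned int TimeLineID;

/*
 * Hypothetical flattened history: a timeline followed by its known
 * ancestors, newest first, similar to how expectedTLIs is described.
 */
typedef struct
{
	const TimeLineID *tlis;
	size_t		n;
} TimelineHistory;

/* Return true if tli appears anywhere in the given history. */
static bool
history_contains(const TimelineHistory *h, TimeLineID tli)
{
	for (size_t i = 0; i < h->n; i++)
	{
		if (h->tlis[i] == tli)
			return true;
	}
	return false;
}

/*
 * Switch *target to 'newest' only if the current target is an ancestor
 * of (or equal to a member of) the newest timeline's history.
 *
 * Returns true and updates *target and *expected on a switch; returns
 * false and leaves both untouched otherwise.
 */
static bool
try_switch_timeline(TimeLineID *target, TimelineHistory *expected,
					TimeLineID newest, const TimelineHistory *newest_history)
{
	/* A history always starts with the timeline it belongs to. */
	assert(newest_history->n > 0 && newest_history->tlis[0] == newest);

	if (newest == *target)
		return false;			/* nothing new appeared */

	if (!history_contains(newest_history, *target))
		return false;			/* new timeline is not a descendant */

	*target = newest;
	*expected = *newest_history;
	return true;
}
```

With a current target of 2 (history 2, 1), a newly found timeline 3 with history 3, 2, 1 is accepted, while a timeline 4 with history 4, 1 is refused because 2 is not among its ancestors. Like the patch, this sketch does not check where on the parent timeline the fork happened.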

--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com
diff --git a/doc/src/sgml/high-availability.sgml b/doc/src/sgml/high-availability.sgml
index e30552f..3c98ae6 100644
--- a/doc/src/sgml/high-availability.sgml
+++ b/doc/src/sgml/high-availability.sgml
@@ -660,7 +660,10 @@ protocol to make nodes agree on a serializable transactional order.
     command file <filename>recovery.conf</> in the standby's cluster data
     directory, and turn on <varname>standby_mode</>. Set
     <varname>restore_command</> to a simple command to copy files from
-    the WAL archive.
+    the WAL archive. If you plan to have multiple standby servers for high
+    availability purposes, set <varname>recovery_target_timeline</> to
+    <literal>latest</>, to make the standby server follow the timeline change
+    that occurs at failover to another standby.
    </para>
 
    <note>
diff --git a/doc/src/sgml/recovery-config.sgml b/doc/src/sgml/recovery-config.sgml
index 602fbe2..e9e95ac 100644
--- a/doc/src/sgml/recovery-config.sgml
+++ b/doc/src/sgml/recovery-config.sgml
@@ -240,7 +240,9 @@ restore_command = 'copy "C:\\server\\archivedir\\%f" "%p"'  # Windows
        <para>
         Specifies recovering into a particular timeline.  The default is
         to recover along the same timeline that was current when the
-        base backup was taken.  You only need to set this parameter
+        base backup was taken. Setting this to <literal>latest</> recovers
+        to the latest timeline found in the archive, which is useful in
+        a standby server. Other than that you only need to set this parameter
         in complex re-recovery situations, where you need to return to
         a state that itself was reached after a point-in-time recovery.
         See <xref linkend="backup-timelines"> for discussion.
diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index b4eb4ac..d1f69cf 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -214,6 +214,8 @@ static bool recoveryStopAfter;
  *
  * recoveryTargetTLI: the desired timeline that we want to end in.
  *
+ * recoveryTargetIsLatest: was the requested target timeline 'latest'
+ *
  * expectedTLIs: an integer list of recoveryTargetTLI and the TLIs of
  * its known parents, newest first (so recoveryTargetTLI is always the
  * first list member).	Only these TLIs are expected to be seen in the WAL
@@ -227,6 +229,7 @@ static bool recoveryStopAfter;
  * to decrease.
  */
 static TimeLineID recoveryTargetTLI;
+static bool recoveryTargetIsLatest = false;
 static List *expectedTLIs;
 static TimeLineID curFileTLI;
 
@@ -637,6 +640,7 @@ static bool ValidXLOGHeader(XLogPageHeader hdr, int emode);
 static XLogRecord *ReadCheckpointRecord(XLogRecPtr RecPtr, int whichChkpt);
 static List *readTimeLineHistory(TimeLineID targetTLI);
 static bool existsTimeLineHistory(TimeLineID probeTLI);
+static bool rescanLatestTimeLine(void);
 static TimeLineID findNewestTimeLine(TimeLineID startTLI);
 static void writeTimeLineHistory(TimeLineID newTLI, TimeLineID parentTLI,
 					 TimeLineID endTLI,
@@ -4254,6 +4258,61 @@ existsTimeLineHistory(TimeLineID probeTLI)
 }
 
 /*
+ * Scan for new timelines that might have appeared in the archive since we
+ * started recovery.
+ *
+ * If there is any, the function changes recovery target TLI to the latest
+ * one and returns 'true'.
+ */
+static bool
+rescanLatestTimeLine(void)
+{
+	TimeLineID newtarget;
+	newtarget = findNewestTimeLine(recoveryTargetTLI);
+	if (newtarget != recoveryTargetTLI)
+	{
+		/*
+		 * Determine the list of expected TLIs for the new TLI
+		 */
+		List *newExpectedTLIs;
+		newExpectedTLIs = readTimeLineHistory(newtarget);
+
+		/*
+		 * If the current timeline is not part of the history of the
+		 * new timeline, we cannot proceed to it.
+		 *
+		 * XXX This isn't foolproof: The new timeline might have forked from
+		 * the current one, but before the current recovery location. In that
+		 * case we will still switch to the new timeline and proceed replaying
+		 * from it even though the history doesn't match what we already
+		 * replayed. That's not good. We will likely notice at the next online
+		 * checkpoint, as the TLI won't match what we expected, but it's
+		 * not guaranteed. The admin needs to make sure that doesn't happen.
+		 */
+		if (!list_member_int(newExpectedTLIs,
+							 (int) recoveryTargetTLI))
+			ereport(LOG,
+					(errmsg("new timeline %u is not a child of database system timeline %u",
+							newtarget,
+							ThisTimeLineID)));
+		else
+		{
+			/* Switch target */
+			recoveryTargetTLI = newtarget;
+			expectedTLIs = newExpectedTLIs;
+
+			XLogCtl->RecoveryTargetTLI = recoveryTargetTLI;
+
+			ereport(LOG,
+					(errmsg("new target timeline is %u",
+							recoveryTargetTLI)));
+			return true;
+		}
+	}
+	return false;
+}
+
+/*
  * Find the newest existing timeline, assuming that startTLI exists.
  *
  * Note: while this is somewhat heuristic, it does positively guarantee
@@ -5327,11 +5386,13 @@ readRecoveryCommandFile(void)
 						(errmsg("recovery target timeline %u does not exist",
 								rtli)));
 			recoveryTargetTLI = rtli;
+			recoveryTargetIsLatest = false;
 		}
 		else
 		{
 			/* We start the "latest" search from pg_control's timeline */
 			recoveryTargetTLI = findNewestTimeLine(recoveryTargetTLI);
+			recoveryTargetIsLatest = true;
 		}
 	}
 
@@ -10032,13 +10093,24 @@ retry:
 					{
 						/*
 						 * We've exhausted all options for retrieving the
-						 * file. Retry ...
+						 * file. Retry.
 						 */
 						failedSources = 0;
 
 						/*
-						 * ... but sleep first if it hasn't been long since
-						 * last attempt.
+						 * Before we sleep, re-scan for possible new timelines
+						 * if we were requested to recover to the latest
+						 * timeline.
+						 */
+						if (recoveryTargetIsLatest)
+						{
+							if (rescanLatestTimeLine())
+								continue;
+						}
+
+						/*
+						 * If it hasn't been long since last attempt, sleep
+						 * to avoid busy-waiting.
 						 */
 						now = (pg_time_t) time(NULL);
 						if ((now - last_fail_time) < 5)
-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
