[HACKERS] Patch: add recovery_timeout option to control timeout of restore_command nonzero status code

Alexey Vasiliev Mon, 03 Nov 2014 03:05:15 -0800

Hello everyone.

*  Project name:  Add recovery_timeout option to control timeout of 
restore_command nonzero status code
*  Uniquely identifiable file name, so we can tell difference between your v1 
and v24:  0001-add-recovery_timeout-to-controll-timeout-between-res.patch
*  What the patch does in a short paragraph: This patch adds the option 
recovery_timeout, which controls how long the server waits before retrying 
restore_command after it returns a nonzero status code. Right now the wait is 
fixed at 5 seconds. This is useful if I restore WAL files from some external 
storage (like AWS S3) and it does not matter if the slave database lags behind 
the master. The problem is that each request to AWS S3 costs money, so N nodes 
that each try to fetch the next WAL file every 5 seconds cost more than if they 
tried, for example, every 30 seconds. 
Until now I did it this way: " if ! (/usr/local/bin/envdir /etc/wal-e.d/env 
/usr/local/bin/wal-e wal-fetch "%f" "%p"); then sleep 60; fi ". But in this 
case restarting or stopping the database is slower.
*  Whether the patch is for discussion or for application: No such thing.
*  Which branch the patch is against: master branch
*  Whether it compiles and tests successfully, so we know nothing obvious is 
broken: it compiles and passes the tests on my local machine.
*  Whether it contains any platform-specific items and if so, has it been 
tested on other platforms: none, I hope.
*  Confirm that the patch includes regression tests to check the new feature 
actually works as described: No, it does not include tests. I could not find 
out how to test new config variables.
*  Include documentation: added.
*  Describe the effect your patch has on performance, if any: it should not 
affect database performance.
This is my first patch. I am not sure about the name of the option. Maybe it 
should be called "recovery_nonzero_timeout".
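
As a rough illustration of the cost argument above, the sketch below mirrors 
the fallback used in the patched WaitLatch() call and estimates an upper bound 
on restore_command calls per day. The helper names and the worst-case 
assumption (every node polls for the whole day and every attempt fails) are 
hypothetical and are not part of the patch.

```c
#include <assert.h>

/* Default retry delay that the patch keeps when the option is unset. */
#define DEFAULT_RETRY_DELAY_MS 5000L

/*
 * Same fallback as the patched WaitLatch() argument: a non-positive
 * setting keeps the 5 second default. (Hypothetical helper name.)
 */
long
effective_retry_delay_ms(int restore_timeout_ms)
{
	return restore_timeout_ms > 0 ? restore_timeout_ms : DEFAULT_RETRY_DELAY_MS;
}

/*
 * Upper bound on restore_command calls per day for a number of standby
 * nodes, assuming every call fails and is retried after the delay.
 * (Hypothetical helper; ignores the run time of the command itself.)
 */
long
max_fetch_attempts_per_day(int nodes, int restore_timeout_ms)
{
	const long	ms_per_day = 24L * 60L * 60L * 1000L;

	assert(nodes >= 0);
	return nodes * (ms_per_day / effective_retry_delay_ms(restore_timeout_ms));
}
```

Under these assumptions, 10 nodes polling every 5 seconds make at most 172800 
calls per day, while a 30 second delay lowers the bound to 28800.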


-- 
Alexey Vasiliev

From 35abe56b2497f238a6888fe98c54aa9cb5300866 Mon Sep 17 00:00:00 2001
From: Alexey Vasiliev <leopard.not.a@gmail.com>
Date: Mon, 3 Nov 2014 00:21:14 +0200
Subject: [PATCH] add recovery_timeout to controll timeout between
 restore_command nonzero

---
 doc/src/sgml/recovery-config.sgml               | 16 ++++++++++++++++
 src/backend/access/transam/recovery.conf.sample |  5 +++++
 src/backend/access/transam/xlog.c               | 17 ++++++++++++++++-
 3 files changed, 37 insertions(+), 1 deletion(-)

diff --git a/doc/src/sgml/recovery-config.sgml b/doc/src/sgml/recovery-config.sgml
index 0f1ff34..bc410b3 100644
--- a/doc/src/sgml/recovery-config.sgml
+++ b/doc/src/sgml/recovery-config.sgml
@@ -145,6 +145,22 @@ restore_command = 'copy "C:\\server\\archivedir\\%f" "%p"'  # Windows
       </listitem>
      </varlistentry>
 
+     <varlistentry id="restore-timeout" xreflabel="restore_timeout">
+      <term><varname>restore_timeout</varname> (<type>integer</type>)
+      <indexterm>
+        <primary><varname>restore_timeout</> recovery parameter</primary>
+      </indexterm>
+      </term>
+      <listitem>
+       <para>
+        By default, if <varname>restore_command</> returns a nonzero status,
+        the server retries the command after 5 seconds. This optional
+        parameter changes that delay. This can be useful to increase or
+        decrease the number of <varname>restore_command</> calls.
+       </para>
+      </listitem>
+     </varlistentry>
+
     </variablelist>
 
   </sect1>
diff --git a/src/backend/access/transam/recovery.conf.sample b/src/backend/access/transam/recovery.conf.sample
index 7657df3..282e898 100644
--- a/src/backend/access/transam/recovery.conf.sample
+++ b/src/backend/access/transam/recovery.conf.sample
@@ -58,6 +58,11 @@
 #
 #recovery_end_command = ''
 #
+# specifies an optional delay before restore_command is retried after it
+# returns a nonzero status. Useful to tune how often restore_command is called.
+#
+#restore_timeout = ''
+#
 #---------------------------------------------------------------------------
 # RECOVERY TARGET PARAMETERS
 #---------------------------------------------------------------------------
diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index 3c9aeae..98f0fca 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -233,6 +233,7 @@ static TimestampTz recoveryTargetTime;
 static char *recoveryTargetName;
 static int	recovery_min_apply_delay = 0;
 static TimestampTz recoveryDelayUntilTime;
+static int 	restore_timeout = 0;
 
 /* options taken from recovery.conf for XLOG streaming */
 static bool StandbyModeRequested = false;
@@ -5245,6 +5246,20 @@ readRecoveryCommandFile(void)
 					(errmsg_internal("trigger_file = '%s'",
 									 TriggerFile)));
 		}
+		else if (strcmp(item->name, "restore_timeout") == 0)
+		{
+			const char *hintmsg;
+
+			if (!parse_int(item->value, &restore_timeout, GUC_UNIT_MS,
+						   &hintmsg))
+				ereport(ERROR,
+						(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+						 errmsg("parameter \"%s\" requires a temporal value",
+								"restore_timeout"),
+						 hintmsg ? errhint("%s", _(hintmsg)) : 0));
+			ereport(DEBUG2,
+					(errmsg_internal("restore_timeout = '%s'", item->value)));
+		}
 		else if (strcmp(item->name, "recovery_min_apply_delay") == 0)
 		{
 			const char *hintmsg;
@@ -11110,7 +11125,7 @@ WaitForWALToBecomeAvailable(XLogRecPtr RecPtr, bool randAccess,
 					 */
 					WaitLatch(&XLogCtl->recoveryWakeupLatch,
 							  WL_LATCH_SET | WL_TIMEOUT,
-							  5000L);
+							  restore_timeout > 0 ? restore_timeout : 5000L);
 					ResetLatch(&XLogCtl->recoveryWakeupLatch);
 					break;
 				}
-- 
1.9.3 (Apple Git-50)
