Hi,
first, thanks for the review. Comments are below.
2012-09-20 12:30 keltezéssel, Amit Kapila írta:
On Sun, 01 Jul 2012 13:02:17 +0200 Boszormenyi Zoltan wrote:
>attached is a patch that does $SUBJECT.
>It's a usability enhancement, to take a backup, write
>a minimalistic recovery.conf and start the streaming
>standby in one go.
>Comments?
*[Review of Patch]*
*Basic stuff:*
----------------------
- Patch applies OK
This is not true anymore with a newer GIT version.
Some chunk for pg_basebackup.c was rejected.
- Compiles cleanly with no warnings
*What it does:*
-------------------------
The pg_basebackup tool does the backup of Cluster from server to the specified
location.
This new functionality will also writes the recovery.conf in the database directory and
start the standby server based on options passed to pg_basebackup.
*Usability*
*----------------*
For usability aspect, I am not very sure how many users would like to start the standby
server using basebackup.
Also, Magnus raised the point that it wouldn't really work on MS Windows
where you *have to* start the service via OS facilities. This part of the patch
was killed.
According to me it can be useful for users who have automated scripts to start server
after backup can use this feature.
Well, scripting is not portable across UNIXes and Windows,
you have to spell out starting the server differently.
*Feature Testing:*
-----------------------------
1. Test pg_basebackup with option -R to check that the recovery.conf file is written to
data directory.
--recovery.conf file is created in data directory.
2. Test pg_basebackup with option -R to check that the recovery.conf file is not able to
create because of disk full.
--Error is given as recovery.conf file is not able to create.
3. Test pg_basebackup with option -S to check the standby server start on the
same/different machine.
--Starting standby server is success in if pg_basebackup is taken in
different machine.
4. Test pg_basebackup with both options -S and -R to check the standby server start on
same/different machine.
--Starting standby server is success in if pg_basebackup is taken in
different machine.
5. Test pg_basebackup with option -S including -h, -U, -p, -w and -W to check the standy
server start
and verify the recovery.conf which is created in data directory.
--Except password, rest of the primary connection info parameters are working
fine.
The password part is now fixed.
6. Test pg_basebackup with conflict options (-x or -X and -R or -S).
--Error is given when the conflict options are provided to pg_basebackup.
7. Test pg_basebackup with option -S where pg_ctl/postmaster binaries are not present in
the path.
--Error is given as not able to execute.
8. Test pg_basebackup with option -S by connecting to a standby server.
--standby server is started successfully when pg_basebackup is made from a standby
server also.
*Code Review:*
----------------------------
1. In function WriteRecoveryConf(), un-initialized filename is used.
due to which it can print junk for below line in code
printf("add password to primary_conninfo in %s if needed\n", filename);
Fixed.
2. In function WriteRecoveryConf(), in below code if fopen fails (due to disk full or
any other file related error) it will print the error and exits.
So now it can be confusing to user, in respect to can he consider backup as
successfull and proceed. IMO, either error meesage or documentation
can suggest the for such error user can proceed with backup to write his own
recovery.conf and start the standby.
+ cf = fopen(filename, "w");
+ if (cf == NULL)
+ {
+ fprintf(stderr, _("cannot create %s"), filename);
+ exit(1);
+ }
But BaseBackup() function did indicate already that it has
finished successfully with
if (verbose)
fprintf(stderr, "%s: base backup completed\n", progname);
Would it be an expected (as in: not confusing) behaviour to return 0
from pg_basebackup if the backup itself has finished, but failed to write
the recovery.conf or start the standby if those were requested?
I have modified my WriteRecoveryConf() to do exit(2) instead of exit(1)
to indicate a different error. exit(1) seems to be for reporting configuration
or connection errors. (I may be mistaken though.)
3. In function main,
instead of the following code it can be changed in two different ways,
if (startstandby)
writerecoveryconf = true;
change1:
case 'R':
writerecoveryconf = true;
break;
case 'S':
startstandby = true;
writerecoveryconf = true;
break;
change2:
case 'S':
startstandby = true;
case 'R':
writerecoveryconf = true;
break;
I went with your second variant at first but it's not needed anymore
as only "-R" exists.
4. The password is not written to primary_conninfo even if the dbpassword is present
because of this reason
connecting to the primary is failing because of authentication failure.
Fixed.
5. write the function header for the newly added functions.
Fixed.
6. execvp function is deprecated beginning in Visual C++ 2005. which is used to fork the
pg_ctl process.
http://msdn.microsoft.com/en-us/library/ms235414.aspx
This issue is now irrelevant as the standby is not started, there is no "-S"
option.
7. In StartStandby function, it is better to free the memory allocated for path (path =
xstrdup(command);)
Same as above.
*Defects:*
*-------------*
1. If the pg_basebackup is used in the same machine with the option of -S, the standby
server start
will fail as the port already in use because of using the same postgresql.conf.
Well, running initdb twice on the same machine with different data directories
would also cause the second server fail to start because of the same issue
and it's not called a bug. I think this is irrelevant as is and also because
there
is no "-S" now.
2. If the hot_standby=off in master conf file, the same is copied to subscriber and
starts the server. with that
no client connections are allowed to the server.
Well, it simply copies the source server behaviour, which can also be a
replication standby. PostgreSQL has cascading replication, you know.
*Documentation issues:*
*--------------------------------*
1. For -R option,
Conflicts with <option>--xlog
I think it is better to explain the reason of conflict.
Fixed.
2. For -S option,
"Start the standby database server. Implies -R option."
I think the above can be improved to
"Writes the recovery.conf and start the standby database server. There is no need for
user to specify -R option explicitly."
or something similar.
Not relevant anymore.
Again, thanks for the review.
The second generation of this work is now attached and contains a new
feature as was discussed and suggested by Magnus Hagander, Fujii Masao
and Peter Eisentraut. So libpq has grown a new function:
+/* return the connection options used by a live connection */
+extern PQconninfoOption *PQconninfo(PGconn *conn);
This copies all the connection parameters back from the live PGconn itself
so everything that's needed to connect is already validated.
This is used by the second patch which makes the changes in pg_basebackup
simpler and not hardcoded.
Please, review.
Best regards,
Zoltán Böszörményi
--
----------------------------------
Zoltán Böszörményi
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt, Austria
Web: http://www.postgresql-support.de
http://www.postgresql.at/
diff -durpN postgresql/src/interfaces/libpq/exports.txt postgresql.1/src/interfaces/libpq/exports.txt
--- postgresql/src/interfaces/libpq/exports.txt 2012-08-03 09:39:30.118266598 +0200
+++ postgresql.1/src/interfaces/libpq/exports.txt 2012-10-03 17:17:06.344766325 +0200
@@ -161,3 +161,4 @@ PQping 158
PQpingParams 159
PQlibVersion 160
PQsetSingleRowMode 161
+PQconninfo 162
diff -durpN postgresql/src/interfaces/libpq/fe-connect.c postgresql.1/src/interfaces/libpq/fe-connect.c
--- postgresql/src/interfaces/libpq/fe-connect.c 2012-09-09 08:11:09.470401480 +0200
+++ postgresql.1/src/interfaces/libpq/fe-connect.c 2012-10-03 15:46:44.357491440 +0200
@@ -318,6 +318,8 @@ static char *conninfo_uri_decode(const c
static bool get_hexdigit(char digit, int *value);
static const char *conninfo_getval(PQconninfoOption *connOptions,
const char *keyword);
+static void conninfo_setval(PQconninfoOption *connOptions,
+ const char *keyword, const char *val);
static PQconninfoOption *conninfo_storeval(PQconninfoOption *connOptions,
const char *keyword, const char *value,
PQExpBuffer errorMessage, bool ignoreMissing, bool uri_decode);
@@ -4985,6 +4987,24 @@ conninfo_getval(PQconninfoOption *connOp
}
/*
+ * Set an option value corresponding to the keyword in the connOptions array.
+ */
+static void
+conninfo_setval(PQconninfoOption *connOptions, const char *keyword,
+ const char *val)
+{
+ PQconninfoOption *option;
+
+ option = conninfo_find(connOptions, keyword);
+ if (option)
+ {
+ if (option->val)
+ free(option->val);
+ option->val = val ? strdup(val) : NULL;
+ }
+}
+
+/*
* Store a (new) value for an option corresponding to the keyword in
* connOptions array.
*
@@ -5066,6 +5086,69 @@ conninfo_find(PQconninfoOption *connOpti
}
+/*
+ * Return the connection options used for the connections
+ */
+PQconninfoOption *
+PQconninfo(PGconn *conn)
+{
+ PQExpBufferData errorBuf;
+ PQconninfoOption *connOptions;
+
+ if (conn == NULL)
+ return NULL;
+
+ /* We don't actually report any errors here, but callees want a buffer */
+ initPQExpBuffer(&errorBuf);
+ if (PQExpBufferDataBroken(errorBuf))
+ return NULL; /* out of memory already :-( */
+
+ connOptions = conninfo_init(&errorBuf);
+
+ termPQExpBuffer(&errorBuf);
+
+ /*
+ * Move conn values into option structure
+ */
+ conninfo_setval(connOptions, "hostaddr", conn->pghostaddr);
+ conninfo_setval(connOptions, "host", conn->pghost);
+ conninfo_setval(connOptions, "port", conn->pgport);
+ conninfo_setval(connOptions, "tty", conn->pgtty);
+ conninfo_setval(connOptions, "options", conn->pgoptions);
+ conninfo_setval(connOptions, "application_name", conn->appname);
+ conninfo_setval(connOptions, "fallback_application_name", conn->fbappname);
+ conninfo_setval(connOptions, "dbname", conn->dbName);
+ conninfo_setval(connOptions, "user", conn->pguser);
+ conninfo_setval(connOptions, "password", conn->pgpass);
+ conninfo_setval(connOptions, "connect_timeout", conn->connect_timeout);
+ conninfo_setval(connOptions, "client_encoding", conn->client_encoding_initial);
+ conninfo_setval(connOptions, "keepalives", conn->keepalives);
+ conninfo_setval(connOptions, "keepalives_idle", conn->keepalives_idle);
+ conninfo_setval(connOptions, "keepalives_interval", conn->keepalives_interval);
+ conninfo_setval(connOptions, "keepalives_count", conn->keepalives_count);
+ conninfo_setval(connOptions, "sslmode", conn->sslmode);
+ conninfo_setval(connOptions, "sslcompression", conn->sslcompression);
+ conninfo_setval(connOptions, "sslkey", conn->sslkey);
+ conninfo_setval(connOptions, "sslcert", conn->sslcert);
+ conninfo_setval(connOptions, "sslrootcert", conn->sslrootcert);
+ conninfo_setval(connOptions, "sslcrl", conn->sslcrl);
+#ifdef USE_SSL
+ conninfo_setval(connOptions, "requiressl",
+ (conn->sslmode && strcmp(conn->sslmode, "require") == 0 ? "1" : "0"));
+#endif
+ conninfo_setval(connOptions, "requirepeer", conn->requirepeer);
+#if defined(KRB5) || defined(ENABLE_GSS) || defined(ENABLE_SSPI)
+ conninfo_setval(connOptions, "krbsrvname", conn->krbsrvname);
+#endif
+#if defined(ENABLE_GSS) && defined(ENABLE_SSPI)
+ conninfo_setval(connOptions, "gsslib", conn->gsslib);
+#endif
+ conninfo_setval(connOptions, "replication", conn->replication);
+
+ return connOptions;
+}
+
+
void
PQconninfoFree(PQconninfoOption *connOptions)
{
diff -durpN postgresql/src/interfaces/libpq/libpq-fe.h postgresql.1/src/interfaces/libpq/libpq-fe.h
--- postgresql/src/interfaces/libpq/libpq-fe.h 2012-08-03 09:39:30.122266626 +0200
+++ postgresql.1/src/interfaces/libpq/libpq-fe.h 2012-10-03 15:00:45.458355068 +0200
@@ -262,6 +262,9 @@ extern PQconninfoOption *PQconndefaults(
/* parse connection options in same way as PQconnectdb */
extern PQconninfoOption *PQconninfoParse(const char *conninfo, char **errmsg);
+/* return the connection options used by a live connection */
+extern PQconninfoOption *PQconninfo(PGconn *conn);
+
/* free the data structure returned by PQconndefaults() or PQconninfoParse() */
extern void PQconninfoFree(PQconninfoOption *connOptions);
diff -durpN postgresql.1/doc/src/sgml/ref/pg_basebackup.sgml postgresql.2/doc/src/sgml/ref/pg_basebackup.sgml
--- postgresql.1/doc/src/sgml/ref/pg_basebackup.sgml 2012-08-24 09:49:22.960530329 +0200
+++ postgresql.2/doc/src/sgml/ref/pg_basebackup.sgml 2012-10-03 18:08:44.940909214 +0200
@@ -189,6 +189,27 @@ PostgreSQL documentation
</varlistentry>
<varlistentry>
+ <term><option>-R</option></term>
+ <term><option>--write-recovery-conf</option></term>
+ <listitem>
+ <para>
+ Write a minimal recovery.conf into the output directory using
+ the connection parameters from the command line to ease
+ setting up the standby. Since creating a backup for a standalone
+ server with <option>-x</option> or <option>-X</option> and adding
+ a recovery.conf to it wouldn't make a working standby, these options
+ naturally conflict.
+ </para>
+ <para>
+ When this option is specified and taking the base backup succeeded,
+ failing to write the <filename>recovery.conf</filename> results
+ in the error code 2. This way, scripts can distinguish between different
+ failure cases of <application>pg_basebackup</application>.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
<term><option>-x</option></term>
<term><option>--xlog</option></term>
<listitem>
diff -durpN postgresql.1/src/bin/pg_basebackup/pg_basebackup.c postgresql.2/src/bin/pg_basebackup/pg_basebackup.c
--- postgresql.1/src/bin/pg_basebackup/pg_basebackup.c 2012-10-03 10:40:48.297207389 +0200
+++ postgresql.2/src/bin/pg_basebackup/pg_basebackup.c 2012-10-03 17:42:41.530270627 +0200
@@ -46,6 +46,7 @@ int compresslevel = 0;
bool includewal = false;
bool streamwal = false;
bool fastcheckpoint = false;
+bool writerecoveryconf = false;
int standby_message_timeout = 10 * 1000; /* 10 sec = default */
/* Progress counters */
@@ -70,14 +71,18 @@ static int has_xlogendptr = 0;
static volatile LONG has_xlogendptr = 0;
#endif
+static PQconninfoOption *connOptions = NULL;
+
/* Function headers */
static void usage(void);
static void verify_dir_is_empty_or_create(char *dirname);
static void progress_report(int tablespacenum, const char *filename);
+static void stderr_write_error(FILE *cf, char *filename);
static void ReceiveTarFile(PGconn *conn, PGresult *res, int rownum);
static void ReceiveAndUnpackTarFile(PGconn *conn, PGresult *res, int rownum);
static void BaseBackup(void);
+static void WriteRecoveryConf(void);
static bool reached_end_position(XLogRecPtr segendpos, uint32 timeline,
bool segment_finished);
@@ -107,6 +112,8 @@ usage(void)
printf(_("\nOptions controlling the output:\n"));
printf(_(" -D, --pgdata=DIRECTORY receive base backup into directory\n"));
printf(_(" -F, --format=p|t output format (plain (default), tar)\n"));
+ printf(_(" -R, --write-recovery-conf\n"
+ " write recovery.conf after backup\n"));
printf(_(" -x, --xlog include required WAL files in backup (fetch mode)\n"));
printf(_(" -X, --xlog-method=fetch|stream\n"
" include required WAL files with specified method\n"));
@@ -960,6 +967,10 @@ BaseBackup(void)
/* Error message already written in GetConnection() */
exit(1);
+ /* If recovery.conf is to be written, keep the connection parameters for later usage */
+ if (writerecoveryconf)
+ connOptions = PQconninfo(conn);
+
/*
* Run IDENTIFY_SYSTEM so we can get the timeline
*/
@@ -1234,6 +1245,67 @@ BaseBackup(void)
}
+static void
+stderr_write_error(FILE *cf, char *filename)
+{
+ fprintf(stderr, _("cannot write to %s: %s"), filename, strerror(errno));
+ fclose(cf);
+ unlink(filename);
+ exit(2);
+}
+
+
+/*
+ * Attempt to create recovery.conf and write the expected contents to it.
+ */
+static void
+WriteRecoveryConf(void)
+{
+ char filename[MAXPGPATH];
+ PQconninfoOption *option;
+ FILE *cf;
+
+ if (!writerecoveryconf)
+ return;
+
+ sprintf(filename, "%s/recovery.conf", basedir);
+
+ cf = fopen(filename, "w");
+ if (cf == NULL)
+ {
+ fprintf(stderr, _("cannot create %s: %s"), filename, strerror(errno));
+ exit(2);
+ }
+
+ if (fprintf(cf, "standby_mode = 'on'\n") < 0)
+ stderr_write_error(cf, filename);
+
+ if (fprintf(cf, "primary_conninfo = '") < 0)
+ stderr_write_error(cf, filename);
+
+ for (option = connOptions; option && option->keyword; option++)
+ {
+ /* Do not emit this setting if not set, empty or default */
+ if (option->val == NULL ||
+ (option->val != NULL && option->val[0] == '\0') ||
+ (option->val &&
+ option->compiled &&
+ strcmp(option->val, option->compiled) == 0))
+ continue;
+
+ /* write "keyword='value'" pieces, single quotes doubled */
+ if (fprintf(cf, "%s=''%s'' ", option->keyword, option->val) < 0)
+ stderr_write_error(cf, filename);
+ }
+
+ if (fprintf(cf, "'\n") < 0)
+ stderr_write_error(cf, filename);
+
+ PQconninfoFree(connOptions);
+
+ fclose(cf);
+}
+
int
main(int argc, char **argv)
{
@@ -1243,6 +1315,7 @@ main(int argc, char **argv)
{"pgdata", required_argument, NULL, 'D'},
{"format", required_argument, NULL, 'F'},
{"checkpoint", required_argument, NULL, 'c'},
+ {"write-recovery-conf", no_argument, NULL, 'R'},
{"xlog", no_argument, NULL, 'x'},
{"xlog-method", required_argument, NULL, 'X'},
{"gzip", no_argument, NULL, 'z'},
@@ -1280,7 +1353,7 @@ main(int argc, char **argv)
}
}
- while ((c = getopt_long(argc, argv, "D:F:xX:l:zZ:c:h:p:U:s:wWvP",
+ while ((c = getopt_long(argc, argv, "D:F:RxX:l:zZ:c:h:p:U:s:wWvP",
long_options, &option_index)) != -1)
{
switch (c)
@@ -1301,6 +1374,9 @@ main(int argc, char **argv)
exit(1);
}
break;
+ case 'R':
+ writerecoveryconf = true;
+ break;
case 'x':
if (includewal)
{
@@ -1466,6 +1542,13 @@ main(int argc, char **argv)
}
#endif
+ if (writerecoveryconf && includewal)
+ {
+ fprintf(stderr,
+ _("--xlog and --writerecoveryconf are mutually exclusive\n"));
+ exit(1);
+ }
+
/*
* Verify that the target directory exists, or create it. For plaintext
* backups, always require the directory. For tar backups, require it
@@ -1476,5 +1559,7 @@ main(int argc, char **argv)
BaseBackup();
+ WriteRecoveryConf();
+
return 0;
}
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers