Re: [HACKERS] skipping pg_log in basebackup (was Re: pg_basebackup and pg_stat_tmp directory)
At 2015-06-11 14:38:03 +0900, langote_amit...@lab.ntt.co.jp wrote: > > > On the other hand, I don't like the idea of doing (3) by adding > > command line arguments to pg_basebackup and adding a new option to > > the command. I don't think that level of "flexibility" is justified; > > it would also make it easier to end up with a broken base backup (by > > inadvertently excluding more than you meant to). > > Maybe a combination of (2) and part of (3). In absence of any command > line argument, the behavior is (2), to exclude. Provide an option to > *include* it (-S/--serverlog). I don't like that idea any more than having the command-line argument to exclude pg_log. (And people who store torrented files in PGDATA may like it even less.) -- Abhijit -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] skipping pg_log in basebackup (was Re: pg_basebackup and pg_stat_tmp directory)
Michael Paquier wrote: > After spending the night thinking about that, honestly, I think that > we should go with (2) and keep the base backup as light-weight as > possible and not bother about a GUC. (3) would need some extra > intelligence to decide if some files can be skipped or not. Imagine > for example --skip-files=global/pg_control or --skip-files=pg_clog > (because it *is* a log file with much data), that would just corrupt > silently your backup, but I guess that it is what you had in mind. In > any case (3) is not worth the maintenance burden because we would need > to update the things to filter each time a new important folder is > added in PGDATA by a patch. If somebody sets log_directory=pg_clog/ they are screwed pretty badly, aren't they. (I guess this is just a case of "don't do that"). -- Álvaro Herrerahttp://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] skipping pg_log in basebackup (was Re: pg_basebackup and pg_stat_tmp directory)
On Jun 11, 2015 7:38 AM, "Amit Langote" wrote: > > On 2015-06-11 PM 02:20, Abhijit Menon-Sen wrote: > > At 2015-06-10 13:22:27 -0400, robertmh...@gmail.com wrote: > >> > >> (1) include pg_log in pg_basebackup as we do currently > >> (2) exclude it > >> (3) add a switch controlling whether or not it gets excluded > >> > >> I can live with (3), but I bet most people want (2). > > > > Thanks for spelling out the options. > > > > I strongly prefer (2), but I could live with (3) if it were done as a > > GUC setting. (And if that's what we decide to do, I'm willing to write > > up the patch.) > > > > Whether or not it's a good idea to let one's logfiles grow to >8GB, the > > fact that doing so breaks base backups means that being able to exclude > > pg_log *somehow* is more of a necessity than personal preference. > > > > On the other hand, I don't like the idea of doing (3) by adding command > > line arguments to pg_basebackup and adding a new option to the command. > > I don't think that level of "flexibility" is justified; it would also > > make it easier to end up with a broken base backup (by inadvertently > > excluding more than you meant to). > > > > Maybe a combination of (2) and part of (3). In absence of any command line > argument, the behavior is (2), to exclude. Provide an option to *include* it > (-S/--serverlog) I think it's useful enough to have a switch, but no problem to exclude it by default. So I can definitely go for Amits suggestions. I also don't feel strongly enough about it to put up any kind of fight if the majority wants different :-) /Magnus
Re: [HACKERS] skipping pg_log in basebackup (was Re: pg_basebackup and pg_stat_tmp directory)
On Thu, Jun 11, 2015 at 2:39 PM, Abhijit Menon-Sen wrote: > At 2015-06-11 14:28:36 +0900, michael.paqu...@gmail.com wrote: >> >> After spending the night thinking about that, honestly, I think that >> we should go with (2) and keep the base backup as light-weight as >> possible and not bother about a GUC. > > OK. Then the patch I posted earlier should be sufficient. Btw, one thing that 010_pg_basebackup.pl does not check is actually if the files filtered by basebackup.c are included or not in the base backup. We may want to add some extra checks regarding that... Especially with your patch that filters things depending on if log_directory is an absolute path or not. -- Michael -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] skipping pg_log in basebackup (was Re: pg_basebackup and pg_stat_tmp directory)
On 2015-06-11 PM 02:20, Abhijit Menon-Sen wrote: > At 2015-06-10 13:22:27 -0400, robertmh...@gmail.com wrote: >> >> (1) include pg_log in pg_basebackup as we do currently >> (2) exclude it >> (3) add a switch controlling whether or not it gets excluded >> >> I can live with (3), but I bet most people want (2). > > Thanks for spelling out the options. > > I strongly prefer (2), but I could live with (3) if it were done as a > GUC setting. (And if that's what we decide to do, I'm willing to write > up the patch.) > > Whether or not it's a good idea to let one's logfiles grow to >8GB, the > fact that doing so breaks base backups means that being able to exclude > pg_log *somehow* is more of a necessity than personal preference. > > On the other hand, I don't like the idea of doing (3) by adding command > line arguments to pg_basebackup and adding a new option to the command. > I don't think that level of "flexibility" is justified; it would also > make it easier to end up with a broken base backup (by inadvertently > excluding more than you meant to). > Maybe a combination of (2) and part of (3). In absence of any command line argument, the behavior is (2), to exclude. Provide an option to *include* it (-S/--serverlog). Thanks, Amit -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] skipping pg_log in basebackup (was Re: pg_basebackup and pg_stat_tmp directory)
At 2015-06-11 14:28:36 +0900, michael.paqu...@gmail.com wrote: > > After spending the night thinking about that, honestly, I think that > we should go with (2) and keep the base backup as light-weight as > possible and not bother about a GUC. OK. Then the patch I posted earlier should be sufficient. -- Abhijit -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] skipping pg_log in basebackup (was Re: pg_basebackup and pg_stat_tmp directory)
On Thu, Jun 11, 2015 at 2:20 PM, Abhijit Menon-Sen wrote: > At 2015-06-10 13:22:27 -0400, robertmh...@gmail.com wrote: >> >> I'm not clear on which of these options you are voting for: >> >> (1) include pg_log in pg_basebackup as we do currently >> (2) exclude it >> (3) add a switch controlling whether or not it gets excluded >> >> I can live with (3), but I bet most people want (2). > > Thanks for spelling out the options. > > I strongly prefer (2), but I could live with (3) if it were done as a > GUC setting. (And if that's what we decide to do, I'm willing to write > up the patch.) > > Whether or not it's a good idea to let one's logfiles grow to >8GB, the > fact that doing so breaks base backups means that being able to exclude > pg_log *somehow* is more of a necessity than personal preference. > > On the other hand, I don't like the idea of doing (3) by adding command > line arguments to pg_basebackup and adding a new option to the command. > I don't think that level of "flexibility" is justified; it would also > make it easier to end up with a broken base backup (by inadvertently > excluding more than you meant to). After spending the night thinking about that, honestly, I think that we should go with (2) and keep the base backup as light-weight as possible and not bother about a GUC. (3) would need some extra intelligence to decide if some files can be skipped or not. Imagine for example --skip-files=global/pg_control or --skip-files=pg_clog (because it *is* a log file with much data), that would just corrupt silently your backup, but I guess that it is what you had in mind. In any case (3) is not worth the maintenance burden because we would need to update the things to filter each time a new important folder is added in PGDATA by a patch. -- Michael -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] skipping pg_log in basebackup (was Re: pg_basebackup and pg_stat_tmp directory)
At 2015-06-10 13:22:27 -0400, robertmh...@gmail.com wrote: > > I'm not clear on which of these options you are voting for: > > (1) include pg_log in pg_basebackup as we do currently > (2) exclude it > (3) add a switch controlling whether or not it gets excluded > > I can live with (3), but I bet most people want (2). Thanks for spelling out the options. I strongly prefer (2), but I could live with (3) if it were done as a GUC setting. (And if that's what we decide to do, I'm willing to write up the patch.) Whether or not it's a good idea to let one's logfiles grow to >8GB, the fact that doing so breaks base backups means that being able to exclude pg_log *somehow* is more of a necessity than personal preference. On the other hand, I don't like the idea of doing (3) by adding command line arguments to pg_basebackup and adding a new option to the command. I don't think that level of "flexibility" is justified; it would also make it easier to end up with a broken base backup (by inadvertently excluding more than you meant to). -- Abhijit -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] skipping pg_log in basebackup (was Re: pg_basebackup and pg_stat_tmp directory)
On 06/10/2015 10:22 AM, Robert Haas wrote: On Wed, Jun 10, 2015 at 1:12 PM, Joshua D. Drake wrote: On 06/10/2015 10:01 AM, Andres Freund wrote: On 2015-06-10 09:57:17 -0700, Jeff Janes wrote: Mine goal isn't that. My goal is to have a consistent backup without having to shut down the server to take a cold one, or having to manually juggle the pg_start_backup, etc. commands. A basebackup won't necessarily give you a consistent log though... I am -1 on this idea. It just doesn't seem to make sense. There are too many variables where it won't work or won't be relevant. I'm not clear on which of these options you are voting for: (1) include pg_log in pg_basebackup as we do currently (2) exclude it (3) add a switch controlling whether or not it gets excluded I can live with (3), but I bet most people want (2). Sorry I wasn't clear. #2 Sincerely, JD -- Command Prompt, Inc. - http://www.commandprompt.com/ 503-667-4564 PostgreSQL Centered full stack support, consulting and development. Announcing "I'm offended" is basically telling the world you can't control your own emotions, so everyone else should do it for you. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] skipping pg_log in basebackup (was Re: pg_basebackup and pg_stat_tmp directory)
On Wed, Jun 10, 2015 at 1:12 PM, Joshua D. Drake wrote: > On 06/10/2015 10:01 AM, Andres Freund wrote: >> On 2015-06-10 09:57:17 -0700, Jeff Janes wrote: >>> Mine goal isn't that. My goal is to have a consistent backup without >>> having to shut down the server to take a cold one, or having to manually >>> juggle the pg_start_backup, etc. commands. >> >> A basebackup won't necessarily give you a consistent log though... > > I am -1 on this idea. It just doesn't seem to make sense. There are too many > variables where it won't work or won't be relevant. I'm not clear on which of these options you are voting for: (1) include pg_log in pg_basebackup as we do currently (2) exclude it (3) add a switch controlling whether or not it gets excluded I can live with (3), but I bet most people want (2). -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] skipping pg_log in basebackup (was Re: pg_basebackup and pg_stat_tmp directory)
On 06/10/2015 10:01 AM, Andres Freund wrote: On 2015-06-10 09:57:17 -0700, Jeff Janes wrote: Mine goal isn't that. My goal is to have a consistent backup without having to shut down the server to take a cold one, or having to manually juggle the pg_start_backup, etc. commands. A basebackup won't necessarily give you a consistent log though... I am -1 on this idea. It just doesn't seem to make sense. There are too many variables where it won't work or won't be relevant. JD -- Command Prompt, Inc. - http://www.commandprompt.com/ 503-667-4564 PostgreSQL Centered full stack support, consulting and development. Announcing "I'm offended" is basically telling the world you can't control your own emotions, so everyone else should do it for you. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] skipping pg_log in basebackup (was Re: pg_basebackup and pg_stat_tmp directory)
On 2015-06-10 09:57:17 -0700, Jeff Janes wrote: > Mine goal isn't that. My goal is to have a consistent backup without > having to shut down the server to take a cold one, or having to manually > juggle the pg_start_backup, etc. commands. A basebackup won't necessarily give you a consistent log though... Greetings, Andres Freund -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] skipping pg_log in basebackup (was Re: pg_basebackup and pg_stat_tmp directory)
On Wed, Jun 10, 2015 at 8:29 AM, Robert Haas wrote: > On Mon, Jun 8, 2015 at 12:09 AM, Michael Paquier > wrote: > >> Recently, one of our customers has had a basebackup fail because pg_log > >> contained files that were >8GB: > >> FATAL: archive member "pg_log/postgresql-20150119.log" too large for > tar format > >> > >> I think pg_basebackup should also skip pg_log entries, as it does for > >> pg_stats_temp and pg_replslot, etc. I've attached a patch along those > >> lines for discussion. > > > > And a recent discussion about that is this one: > > > http://www.postgresql.org/message-id/82897A1301080E4B8E461DDAA0FFCF142A1B2660@SYD1216 > > Bringing the point: some users may want to keep log files in a base > > backup, and some users may want to skip some of them, and not only > > pg_log. Hence we may want more flexibility than what is proposed here. > > That seems pretty thin. If you're taking a base backup, your goal is > to create a standby. Mine goal isn't that. My goal is to have a consistent backup without having to shut down the server to take a cold one, or having to manually juggle the pg_start_backup, etc. commands. I do occasionally use it start up a standby for training/testing purposes, but mostly it is for D-R (in which I would rather have the logs) and for cloning test/dev/QA environments (in which case I go delete the logs if I don't want them) > Copying logs is in no way an integral part of > that, and we would not copy them if they were stored outside the data > directory. If we accept the proposal that this needs to be more > complicated, will we also accept a proposal to make pg_basebackup > include relevant files from /var/log when the PostgreSQL logs are > stored there? > I think it is pretty intuitive that if you have your logs go to pg_log, they get backed up with the other pg_ stuff, and if you change it go elsewhere, then you need to handle it yourself. Cheers, Jeff
Re: [HACKERS] skipping pg_log in basebackup (was Re: pg_basebackup and pg_stat_tmp directory)
On Mon, Jun 8, 2015 at 12:09 AM, Michael Paquier wrote: >> Recently, one of our customers has had a basebackup fail because pg_log >> contained files that were >8GB: >> FATAL: archive member "pg_log/postgresql-20150119.log" too large for tar >> format >> >> I think pg_basebackup should also skip pg_log entries, as it does for >> pg_stats_temp and pg_replslot, etc. I've attached a patch along those >> lines for discussion. > > And a recent discussion about that is this one: > http://www.postgresql.org/message-id/82897A1301080E4B8E461DDAA0FFCF142A1B2660@SYD1216 > Bringing the point: some users may want to keep log files in a base > backup, and some users may want to skip some of them, and not only > pg_log. Hence we may want more flexibility than what is proposed here. That seems pretty thin. If you're taking a base backup, your goal is to create a standby. Copying logs is in no way an integral part of that, and we would not copy them if they were stored outside the data directory. If we accept the proposal that this needs to be more complicated, will we also accept a proposal to make pg_basebackup include relevant files from /var/log when the PostgreSQL logs are stored there? -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] skipping pg_log in basebackup (was Re: pg_basebackup and pg_stat_tmp directory)
On Mon, Jun 8, 2015 at 12:42 PM, Abhijit Menon-Sen wrote: > This is a followup to a 2014-02 discussion that led to pg_stats_temp > being excluded from pg_basebackup. At the time, it was discussed to > exclude pg_log as well, but nothing eventually came of that. It seems to be that: http://www.postgresql.org/message-id/cahgqgwh0okz6ckpjkcwojga3ejwffm1enrmro3dkdoteaai...@mail.gmail.com > Recently, one of our customers has had a basebackup fail because pg_log > contained files that were >8GB: > FATAL: archive member "pg_log/postgresql-20150119.log" too large for tar > format > > I think pg_basebackup should also skip pg_log entries, as it does for > pg_stats_temp and pg_replslot, etc. I've attached a patch along those > lines for discussion. And a recent discussion about that is this one: http://www.postgresql.org/message-id/82897A1301080E4B8E461DDAA0FFCF142A1B2660@SYD1216 Bringing the point: some users may want to keep log files in a base backup, and some users may want to skip some of them, and not only pg_log. Hence we may want more flexibility than what is proposed here. Regards, -- Michael -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] skipping pg_log in basebackup (was Re: pg_basebackup and pg_stat_tmp directory)
Hi. This is a followup to a 2014-02 discussion that led to pg_stats_temp being excluded from pg_basebackup. At the time, it was discussed to exclude pg_log as well, but nothing eventually came of that. Recently, one of our customers has had a basebackup fail because pg_log contained files that were >8GB: FATAL: archive member "pg_log/postgresql-20150119.log" too large for tar format I think pg_basebackup should also skip pg_log entries, as it does for pg_stats_temp and pg_replslot, etc. I've attached a patch along those lines for discussion. -- Abhijit P.S. Aren't we leaking statrelpath? >From 8db162c1385b1cdd2b0e666975b76aa814f09f35 Mon Sep 17 00:00:00 2001 From: Abhijit Menon-Sen Date: Mon, 27 Apr 2015 12:58:52 +0530 Subject: Skip files in pg_log during basebackup --- src/backend/replication/basebackup.c | 29 + 1 file changed, 29 insertions(+) diff --git a/src/backend/replication/basebackup.c b/src/backend/replication/basebackup.c index 1e86e4c..cc75a03 100644 --- a/src/backend/replication/basebackup.c +++ b/src/backend/replication/basebackup.c @@ -27,6 +27,7 @@ #include "nodes/pg_list.h" #include "pgtar.h" #include "pgstat.h" +#include "postmaster/syslogger.h" #include "replication/basebackup.h" #include "replication/walsender.h" #include "replication/walsender_private.h" @@ -72,6 +73,9 @@ static bool backup_started_in_recovery = false; /* Relative path of temporary statistics directory */ static char *statrelpath = NULL; +/* Relative path to log directory */ +static char logpath[MAXPGPATH]; + /* * Size of each block sent into the tar stream for larger files. */ @@ -157,6 +161,18 @@ perform_base_backup(basebackup_options *opt, DIR *tblspcdir) else statrelpath = pgstat_stat_directory; + /* + * Do the same for the log_directory. + */ + + if (is_absolute_path(Log_directory) && + path_is_prefix_of_path(DataDir, Log_directory)) + snprintf(logpath, MAXPGPATH, "./%s", Log_directory + datadirpathlen + 1); + else if (strncmp(Log_directory, "./", 2) != 0) + snprintf(logpath, MAXPGPATH, "./%s", Log_directory); + else + strncpy(logpath, Log_directory, MAXPGPATH); + /* Add a node for the base directory at the end */ ti = palloc0(sizeof(tablespaceinfo)); ti->size = opt->progress ? sendDir(".", 1, true, tablespaces, true) : -1; @@ -965,6 +981,19 @@ sendDir(char *path, int basepathlen, bool sizeonly, List *tablespaces, } /* + * We can skip pg_log (or whatever log_directory is set to, if + * it's under the data directory), but include it as an empty + * directory anyway, so we get permissions right. + */ + if (strcmp(pathbuf, logpath) == 0) + { + if (!sizeonly) +_tarWriteHeader(pathbuf + basepathlen + 1, NULL, &statbuf); + size += 512; + continue; + } + + /* * Skip pg_replslot, not useful to copy. But include it as an empty * directory anyway, so we get permissions right. */ -- 1.9.1 -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers