Re: [HACKERS] [PATCH] pg_basebackup: progress report max once per second
On Thu, Jan 9, 2014, at 20:05, Magnus Hagander wrote: > On Thu, Nov 14, 2013 at 10:27 AM, Mika Eloranta wrote: >> On 13 Nov 2013, at 20:51, Mika Eloranta wrote: >> >> >> > Prevent excessive progress reporting that can grow to gigabytes >> > of output with large databases. >> >> >> Same patch as an attachment. > > Would it not make more sense to instead store the last number printed, and > only print it if the percentage has changed? AIUI with this patch we still > print the same thing on top of itself if it takes >1 second to get 1% further. > > (Except for verbose mode - but if you're asking for verbose mode, you are > *asking* to get lots of output) Printing out progress periodically is IMHO slightly better as the interactive user can see that there is some progress (e.g. by pressing enter for a new empty console line) during a huge basebackup operation. The original problem I wanted to address was that I had a daemon watching over the basebackup process and reading its output to make sure that the basebackup is proceeding properly. It also wrote all the output to a logfile for postmortem analysis. The log file grew to a very big size (multiple gigabytes due to the progress prints). With my patch the log was only a few kilos, without any negative effects. The excessive progress reporting can also be an issue in an interactive session over a slow network (mobile), hogging both time and bandwidth. Cheers, Mika -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PATCH] pg_basebackup: progress report max once per second
On Thu, Jan 9, 2014, at 20:05, Magnus Hagander wrote: > On Thu, Nov 14, 2013 at 10:27 AM, Mika Eloranta wrote: >> On 13 Nov 2013, at 20:51, Mika Eloranta wrote: >> >> >> > Prevent excessive progress reporting that can grow to gigabytes >> > of output with large databases. >> >> >> Same patch as an attachment. > > Would it not make more sense to instead store the last number printed, and > only print it if the percentage has changed? AIUI with this patch we still > print the same thing on top of itself if it takes >1 second to get 1% further. > > (Except for verbose mode - but if you're asking for verbose mode, you are > *asking* to get lots of output) (re-sent response as I used the wrong sender address previously, sorry about the dupe) Printing out progress periodically is IMHO slightly better as the interactive user can see that there is some progress (e.g. by pressing enter for a new empty console line) during a huge basebackup operation. The original problem I wanted to address was that I had a daemon watching over the basebackup process and reading its output to make sure that the basebackup is proceeding properly. It also wrote all the output to a logfile for postmortem analysis. The log file grew to a very big size (multiple gigabytes due to the progress prints). With my patch the log was only a few kilos, without any negative effects. The excessive progress reporting can also be an issue in an interactive session over a slow network (mobile), hogging both time and bandwidth. Cheers, Mika -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Draft release notes for 9.3.2
On 02 Dec 2013, at 01:56, Tom Lane wrote: > > * any other items that need to be corrected or expanded? I think 2103430 (Fix parsing of xlog file name in pg_receivexlog) is worth mentioning, as all past pg_receivexlog 9.3.x versions fail to resume interrupted streaming after ~4 GiB of xlogs have been generated. Cheers, -- Mika Eloranta Ohmu Ltd. http://www.ohmu.fi/ -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PATCH] pg_basebackup: progress report max once per second
On 13 Nov 2013, at 20:51, Mika Eloranta wrote: > Prevent excessive progress reporting that can grow to gigabytes > of output with large databases. Same patch as an attachment. -- Mika Eloranta Ohmu Ltd. http://www.ohmu.fi/ 0001-pg_basebackup-progress-report-max-once-per-second.patch Description: Binary data -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] [PATCH] pg_basebackup: progress report max once per second
Prevent excessive progress reporting that can grow to gigabytes of output with large databases. --- src/bin/pg_basebackup/pg_basebackup.c | 21 - 1 file changed, 12 insertions(+), 9 deletions(-) diff --git a/src/bin/pg_basebackup/pg_basebackup.c b/src/bin/pg_basebackup/pg_basebackup.c index a1e12a8..90c4683 100644 --- a/src/bin/pg_basebackup/pg_basebackup.c +++ b/src/bin/pg_basebackup/pg_basebackup.c @@ -45,6 +45,7 @@ bool streamwal = false; bool fastcheckpoint = false; bool writerecoveryconf = false; intstandby_message_timeout = 10 * 1000;/* 10 sec = default */ +intlast_progress_report = 0; /* Progress counters */ static uint64 totalsize; @@ -74,7 +75,7 @@ static PQExpBuffer recoveryconfcontents = NULL; /* Function headers */ static void usage(void); static void verify_dir_is_empty_or_create(char *dirname); -static void progress_report(int tablespacenum, const char *filename); +static void progress_report(int tablespacenum, const char *filename, int force); static void ReceiveTarFile(PGconn *conn, PGresult *res, int rownum); static void ReceiveAndUnpackTarFile(PGconn *conn, PGresult *res, int rownum); @@ -399,12 +400,15 @@ verify_dir_is_empty_or_create(char *dirname) * is enabled, also print the current file name. */ static void -progress_report(int tablespacenum, const char *filename) +progress_report(int tablespacenum, const char *filename, int force) { int percent = (int) ((totaldone / 1024) * 100 / totalsize); chartotaldone_str[32]; chartotalsize_str[32]; + if(!showprogress || (time(NULL) == last_progress_report && !force)) return; /* Max once per second */ + last_progress_report = time(NULL); + /* * Avoid overflowing past 100% or the full size. This may make the total * size number change as we approach the end of the backup (the estimate @@ -850,9 +854,9 @@ ReceiveTarFile(PGconn *conn, PGresult *res, int rownum) } } totaldone += r; - if (showprogress) - progress_report(rownum, filename); + progress_report(rownum, filename, 0); } /* while (1) */ + progress_report(rownum, filename, 1); if (copybuf != NULL) PQfreemem(copybuf); @@ -1073,8 +1077,7 @@ ReceiveAndUnpackTarFile(PGconn *conn, PGresult *res, int rownum) disconnect_and_exit(1); } totaldone += r; - if (showprogress) - progress_report(rownum, filename); + progress_report(rownum, filename, 0); current_len_left -= r; if (current_len_left == 0 && current_padding == 0) @@ -1090,6 +1093,7 @@ ReceiveAndUnpackTarFile(PGconn *conn, PGresult *res, int rownum) } } /* continuing data in existing file */ } /* loop over all data blocks */ + progress_report(rownum, filename, 1); if (file != NULL) { @@ -1450,8 +1454,7 @@ BaseBackup(void) tablespacecount = PQntuples(res); for (i = 0; i < PQntuples(res); i++) { - if (showprogress) - totalsize += atol(PQgetvalue(res, i, 2)); + totalsize += atol(PQgetvalue(res, i, 2)); /* * Verify tablespace directories are empty. Don't bother with the @@ -1498,7 +1501,7 @@ BaseBackup(void) if (showprogress) { - progress_report(PQntuples(res), NULL); + progress_report(PQntuples(res), NULL, 1); fprintf(stderr, "\n"); /* Need to move to next line */ } PQclear(res); -- 1.8.4.2 -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PATCH] pg_receivexlog: fixed to work with logical segno > 0
On Nov 4, 2013, at 11:06, Heikki Linnakangas wrote: > On 01.11.2013 11:42, Mika Eloranta wrote: >> pg_receivexlog calculated the xlog segment number incorrectly >> when started after the previous instance was interrupted. >> >> Resuming streaming only worked when the physical wal segment >> counter was zero, i.e. for the first 256 segments or so. > > Oops. Fixed, thanks for the report! > > It's a bit scary that this bug went unnoticed for this long; it was > introduced quite early in the 9.3 development cycle. Seems that I did all the > testing of streaming timeline changes with pg_receivexlog later in 9.3 cycle > with segment numbers < 256, and no-one else have done long-running tests with > pg_receivexlog either. Thanks for the fix, Heikki! It sounds like either PostgreSQL 9.3.x and/or pg_receivexlog is not yet used in a lot of places. Otherwise this probably would have been found earlier. Affected versions: $ git tag --contains dfda6eba REL9_3_0 REL9_3_1 REL9_3_BETA1 REL9_3_BETA2 REL9_3_RC1 What makes this a really sneaky and severe problem is the way it stays dormant for a period of time after a fresh db init or pg_upgrade. Here's how I bumped into it: 1. Old postgresql 9.2 db running, pg_receivexlog streaming extra backups to a remote box. 2. pg_upgrade to 9.3.1. 3. pg_receivexlog from the upgraded DB still works ok and handles restarts fine, because the xlog indexes were reset back to zero at pg_upgrade. 4. xlog history eventually grows over 256 * 16MB. 5. pg_receivexlog gets interrupted for whatever reason (gets stopped, killed, crashes, host is restarted). 6. A new pg_receivexlog instance fails to resume streaming and there is no easy workaround that would maintain an uninterrupted, gapless xlog history. Initially, before I had analysed the problem any further, I had to stash the xlogs, restart pg_receivexlog and after that trigger new pg_basebackups. Regardless of this bug, I find that pg_receivexlog (and pg_basebackup) are excellent tools and people should use them more! PS. something like "pg_receivexlog --start-pos=2D/15000000" might be nice for overriding the streaming start position. -- Mika Eloranta Ohmu Ltd. http://www.ohmu.fi/ -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] [PATCH] pg_receivexlog: fixed to work with logical segno > 0
pg_receivexlog calculated the xlog segment number incorrectly when started after the previous instance was interrupted. Resuming streaming only worked when the physical wal segment counter was zero, i.e. for the first 256 segments or so. --- src/bin/pg_basebackup/pg_receivexlog.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/src/bin/pg_basebackup/pg_receivexlog.c b/src/bin/pg_basebackup/pg_receivexlog.c index 031ec1a..6f9fcf4 100644 --- a/src/bin/pg_basebackup/pg_receivexlog.c +++ b/src/bin/pg_basebackup/pg_receivexlog.c @@ -171,7 +171,7 @@ FindStreamingStart(uint32 *tli) progname, dirent->d_name); disconnect_and_exit(1); } - segno = ((uint64) log) << 32 | seg; + segno = (((uint64) log) << 8) | seg; /* * Check that the segment has the right size, if it's supposed to be -- 1.8.0.1 -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] PL/Python SQL error code pass-through
Hi all, Here's a little SQL snippet that exposes an apparent regression in the 9.1.x PL/Python behavior: ---clip--- # cat foo.sql \set VERBOSITY 'verbose' CREATE table bar (a INTEGER CONSTRAINT hello CHECK (a > 1)); CREATE OR REPLACE FUNCTION foo () RETURNS integer AS $$ plpy.execute("INSERT INTO bar (a) VALUES (2)") plpy.execute("INSERT INTO bar (a) VALUES (1)") return 123 $$ LANGUAGE plpythonu; SELECT * FROM foo(); ---clip--- PostgreSQL 9.0 behavior: ---clip--- # psql < foo.sql CREATE TABLE CREATE FUNCTION WARNING: 01000: PL/Python: plpy.SPIError: unrecognized error in PLy_spi_execute_query CONTEXT: PL/Python function "foo" LOCATION: PLy_elog, plpython.c:3532 ERROR: 23514: new row for relation "bar" violates check constraint "hello" CONTEXT: SQL statement "INSERT INTO bar (a) VALUES (1)" PL/Python function "foo" LOCATION: ExecConstraints, execMain.c:1330 ---clip--- Note the proper 23514 error code. PostgreSQL 9.1.1 behavior: ---clip--- # psql < foo.sql ERROR: 42P07: relation "bar" already exists LOCATION: heap_create_with_catalog, heap.c:1011 CREATE FUNCTION ERROR: XX000: spiexceptions.CheckViolation: new row for relation "bar" violates check constraint "hello" CONTEXT: Traceback (most recent call last): PL/Python function "foo", line 3, in plpy.execute("INSERT INTO bar (a) VALUES (1)") PL/Python function "foo" LOCATION: PLy_elog, plpython.c:4502 ---clip--- In fact, all SQL error that occur within PL/Python seem to be returned with the "XX000" error code. This is a bit of a problem for client-side logic that detects e.g. constraint violations based on the SQL error code. A small patch that includes passing thru the SQL error code is attached. Test run with PostgreSQL 9.1.1 + patch: ---clip--- # psql < foo.sql ERROR: 42P07: relation "bar" already exists LOCATION: heap_create_with_catalog, heap.c:1011 CREATE FUNCTION ERROR: 23514: spiexceptions.CheckViolation: new row for relation "bar" violates check constraint "hello" CONTEXT: Traceback (most recent call last): PL/Python function "foo", line 4, in plpy.execute("INSERT INTO bar (a) VALUES (1)") PL/Python function "foo" LOCATION: PLy_elog, plpython.c:4504 ---clip--- Cheers! - Mika 0001-PL-Python-SQL-error-code-pass-through.patch Description: Binary data -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers