Re: Optimizing nbtree ScalarArrayOp execution, allowing multi-column ordered scans, skip scan
Hi, Peter!

On 20.01.2024 01:41, Peter Geoghegan wrote: It is quite likely that there are exactly zero affected out-of-core index AMs. I don't count pgroonga as a counterexample (I don't think that it actually fulfills the definition of a ). Basically, "amcanorder" index AMs more or less promise to be compatible with nbtree, down to having the same strategy numbers. So the idea that I'm going to upset amsearcharray+amcanorder index AM authors is a completely theoretical problem. The planner code evolved with nbtree, hand-in-glove. From the 5bf748b86bc commit message: There is a theoretical risk that removing restrictions on SAOP index paths from the planner will break compatibility with amcanorder-based index AMs maintained as extensions. Such an index AM could have the same limitations around ordered SAOP scans as nbtree had up until now. Adding a pro forma incompatibility item about the issue to the Postgres 17 release notes seems like a good idea.

It seems this commit broke our posted knn_btree patch [1]. If the point from which ORDER BY orders by distance is greater than the elements of the ScalarArrayOp, then the knn_btree algorithm gives only the first tuple. It sorts the elements of the ScalarArrayOp in descending order, starts searching from smaller to larger, and always expects that there will be a separate scan for each element of the ScalarArrayOp. And now that no longer works. The reproduction is described in [2].

It seems impossible to solve this problem from the knn-btree patch side alone. Could you advise any way to deal with this? I would be very grateful.

With the best wishes,

-- 
Anton A. Melnikov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

[1] https://commitfest.postgresql.org/48/4871/
[2] https://www.postgresql.org/message-id/47adb0b0-6e65-4b40-8d93-20dcecc21395%40postgrespro.ru
Re: Maybe don't process multi xmax in FreezeMultiXactId() if it is already marked as invalid?
On 19.06.2024 21:06, Peter Geoghegan wrote: On Wed, Jun 19, 2024 at 1:39 PM Alvaro Herrera wrote: FWIW I don't think HEAP_XMAX_INVALID as purely a hint. HEAP_XMAX_COMMITTED is a hint, for sure, as is HEAP_XMIN_COMMITTED on its own; but as far as I recall, the INVALID flags must persist once set. Seems we disagree on some pretty fundamental things in this area, then.

To resolve this situation, it seems necessary to agree on what a "hint bit" exactly means and how to use it. For example, in this way:

1) Definition. A "hint bit", if it is set, represents the presence of some property of some object (e.g. a tuple). The value of a hint bit can be derived again at any time, so it is acceptable for a hint bit to be lost during some operations.

2) Purpose. (It has already been defined by Yura Sokolov in one of the previous letters.) Some work (e.g. CPU + memory usage) must be done to check the property of some object. Checking the hint bit, if it is set, saves this work. So the purpose of a hint bit is optimization.

3) Use. By default, code that needs to check some property of an object must first check the corresponding hint bit. If the hint is set, conclude that the property is present. If the hint is not set, do the work to check this property of the object and set the hint bit if the property is present. Non-standard behavior is also allowed, where the hint bit is ignored and the property check is performed unconditionally for some reason. In this case the code must contain a comment explaining that reason.

And maybe, for clarity, explicitly say that some bit is a hint right in its definition? For instance, use HEAP_XMIN_COMMITTED_HINT instead of HEAP_XMIN_COMMITTED.

Remarks and concerns are gratefully welcome.

With the best regards,

-- 
Anton A. Melnikov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
Re: psql: fix variable existence tab completion
On 19.07.2024 01:10, Tom Lane wrote: Actually, I think we ought to just reject this change. Debian 10 will be two years past EOL before PG 17 ships. So I don't see a reason to support it in the tests anymore. One of the points of such testing is to expose broken platforms, not mask them. Obviously, if anyone can point to a still-in-support platform with the same bug, that calculus might change.

The bug where a broken version of libedit wants to backslash some symbols (e.g. double quotes, curly braces, the question mark) I have encountered only on Debian 10 (buster). If anyone has encountered a similar error on some other system, please share that information.

With respect to the other hacks Alexander mentions, maybe we could clean some of those out too? I don't recall what platform we had in mind there, but we've moved our goalposts on what we support pretty far in the last couple years.

Agreed that there is no reason to keep workarounds for unsupported systems. Here is a patch that removes the fixes for the Buster bug mentioned above.

With the best regards,

-- 
Anton A. Melnikov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

From 971184c49cf2c5932bb682ee977a6fef9ba4eba5 Mon Sep 17 00:00:00 2001
From: "Anton A. Melnikov"
Date: Sun, 21 Jul 2024 01:37:03 +0300
Subject: [PATCH] Remove workarounds for Debian 10 (Buster) with broken
 libedit from the 010_tab_completion.pl test.

---
 src/bin/psql/t/010_tab_completion.pl | 27 ---
 1 file changed, 8 insertions(+), 19 deletions(-)

diff --git a/src/bin/psql/t/010_tab_completion.pl b/src/bin/psql/t/010_tab_completion.pl
index b45c39f0f5..678a0f7a2c 100644
--- a/src/bin/psql/t/010_tab_completion.pl
+++ b/src/bin/psql/t/010_tab_completion.pl
@@ -167,26 +167,18 @@ check_completion(
 	qr/"mytab123" +"mytab246"/,
 	"offer multiple quoted table choices");

-# note: broken versions of libedit want to backslash the closing quote;
-# not much we can do about that
-check_completion("2\t", qr/246\\?" /,
+check_completion("2\t", qr/246" /,
 	"finish completion of one of multiple quoted table choices");

-# note: broken versions of libedit may leave us in a state where psql
-# thinks there's an unclosed double quote, so that we have to use
-# clear_line not clear_query here
-clear_line();
+clear_query();

 # check handling of mixed-case names
-# note: broken versions of libedit want to backslash the closing quote;
-# not much we can do about that
 check_completion(
 	"select * from \"mi\t",
-	qr/"mixedName\\?" /,
+	qr/"mixedName" /,
 	"complete a mixed-case name");

-# as above, must use clear_line not clear_query here
-clear_line();
+clear_query();

 # check case folding
 check_completion("select * from TAB\t", qr/tab1 /, "automatically fold case");
@@ -198,8 +190,7 @@ clear_query();
 # differently, so just check that the replacement comes out correctly
 check_completion("\\DRD\t", qr/drds /, "complete \\DRD to \\drds");

-# broken versions of libedit require clear_line not clear_query here
-clear_line();
+clear_query();

 # check completion of a schema-qualified name
 check_completion("select * from pub\t",
@@ -261,18 +252,16 @@ check_completion(
 	qr|tab_comp_dir/af\a?ile|,
 	"filename completion with multiple possibilities");

-# broken versions of libedit require clear_line not clear_query here
+# here we are inside a string literal 'afile*' and wait for a sub-choice or
+# a second TAB, so use clear_line()
 clear_line();

 # COPY requires quoting
-# note: broken versions of libedit want to backslash the closing quote;
-# not much we can do about that
 check_completion(
 	"COPY foo FROM tab_comp_dir/some\t",
-	qr|'tab_comp_dir/somefile\\?' |,
+	qr|'tab_comp_dir/somefile' |,
 	"quoted filename completion with one possibility");

-clear_line();
+clear_query();

 check_completion(
 	"COPY foo FROM tab_comp_dir/af\t",
-- 
2.45.2
Maybe don't process multi xmax in FreezeMultiXactId() if it is already marked as invalid?
On 14.06.2024 10:45, Anton A. Melnikov wrote: The src/backend/access/heap/README.tuplock says about the HEAP_XMAX_INVALID bit that "Any tuple with this bit set does not have a valid value stored in XMAX." I found that FreezeMultiXactId() tries to process such an invalid multi xmax and may look for an update xid in pg_multixact for it. Maybe we should not do this work in FreezeMultiXactId() and exit immediately if the HEAP_XMAX_INVALID bit is already set?

It seems important to keep the check that the multi xmax is not behind relminmxid, so I kept it and corrected README.tuplock accordingly. I would be glad if someone took a look at the attached patch.

With the best regards,

-- 
Anton A. Melnikov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

From 7608078bddf35904427cb1c04497ab0c98e9d111 Mon Sep 17 00:00:00 2001
From: "Anton A. Melnikov"
Date: Fri, 14 Jun 2024 10:42:55 +0300
Subject: [PATCH] Don't process multi xmax in the FreezeMultiXactId() if the
 HEAP_XMAX_INVALID bit was already set.

---
 src/backend/access/heap/README.tuplock | 1 +
 src/backend/access/heap/heapam.c       | 8 
 2 files changed, 9 insertions(+)

diff --git a/src/backend/access/heap/README.tuplock b/src/backend/access/heap/README.tuplock
index 6441e8baf0..e198942647 100644
--- a/src/backend/access/heap/README.tuplock
+++ b/src/backend/access/heap/README.tuplock
@@ -124,6 +124,7 @@ The following infomask bits are applicable:

 - HEAP_XMAX_INVALID
   Any tuple with this bit set does not have a valid value stored in XMAX.
+  Although if it's a multi xmax it must follow relminmxid.

 - HEAP_XMAX_IS_MULTI
   This bit is set if the tuple's Xmax is a MultiXactId (as opposed to a

diff --git a/src/backend/access/heap/heapam.c b/src/backend/access/heap/heapam.c
index 82bb9cb33b..d1cba6970d 100644
--- a/src/backend/access/heap/heapam.c
+++ b/src/backend/access/heap/heapam.c
@@ -6227,6 +6227,14 @@ FreezeMultiXactId(MultiXactId multi, uint16 t_infomask,
 				(errcode(ERRCODE_DATA_CORRUPTED),
 				 errmsg_internal("found multixact %u from before relminmxid %u",
 								 multi, cutoffs->relminmxid)));
+	else if (MultiXactIdIsValid(multi) &&
+			 (t_infomask & HEAP_XMAX_INVALID))
+	{
+		/* Xmax is already marked as invalid */
+		*flags |= FRM_INVALIDATE_XMAX;
+		pagefrz->freeze_required = true;
+		return InvalidTransactionId;
+	}
 	else if (MultiXactIdPrecedes(multi, cutoffs->OldestMxact))
 	{
 		TransactionId update_xact;
-- 
2.45.2
Don't process multi xmax in FreezeMultiXactId() if it is already marked as invalid.
Hello!

The src/backend/access/heap/README.tuplock says about the HEAP_XMAX_INVALID bit that "Any tuple with this bit set does not have a valid value stored in XMAX."

I found that FreezeMultiXactId() tries to process such an invalid multi xmax and may look for an update xid in pg_multixact for it. Maybe we should not do this work in FreezeMultiXactId() and exit immediately if the HEAP_XMAX_INVALID bit is already set? For instance, like that:

master
@@ -6215,6 +6215,15 @@ FreezeMultiXactId(MultiXactId multi, uint16 t_infomask,

 	/* We should only be called in Multis */
 	Assert(t_infomask & HEAP_XMAX_IS_MULTI);

+	/* Xmax is already marked as invalid */
+	if (MultiXactIdIsValid(multi) &&
+		(t_infomask & HEAP_XMAX_INVALID))
+	{
+		*flags |= FRM_INVALIDATE_XMAX;
+		pagefrz->freeze_required = true;
+		return InvalidTransactionId;
+	}
+
 	if (!MultiXactIdIsValid(multi) ||
 		HEAP_LOCKED_UPGRADED(t_infomask))

With the best regards,

-- 
Anton A. Melnikov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
Re: psql: fix variable existence tab completion
Hi, Alexander!

On 06.05.2024 13:19, Alexander Korotkov wrote: The patch attached fix the 010_tab_completion.pl test in the same way like [1]. Thank you for the fix. As I get, the fix teaches 010_tab_completion.pl to tolerate the invalid result of tab completion. Do you think we could fix it another way to make the result of tab completion correct?

Right now I don't see any straightforward way to fix this to get the correct tab completion. There are several similar cases in this test. E.g., for such commands:

CREATE TABLE "mixedName" (f1 int, f2 text);
select * from "mi ;

Debian 10 gives:

postgres=# select * from \"mixedName\" ;

resulting in an error. Now there is a similar workaround in the 010_tab_completion.pl with the regex:

qr/"mixedName\\?" /

I think if there were or will be complaints from users about this behavior on Debian 10, then it makes sense to look for more complex solutions that would fix the backslash substitutions. If there are no such complaints, then it is better to keep a workaround in the test.

With the best wishes,

-- 
Anton A. Melnikov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
Re: psql: fix variable existence tab completion
Hello!

On 14.03.2024 17:57, Alexander Korotkov wrote: On Sun, Mar 3, 2024 at 5:37 PM Erik Wienhold wrote: On 2024-03-03 03:00 +0100, Steve Chavez wrote: psql has the :{?name} syntax for testing a psql variable existence. But currently doing \echo :{?VERB doesn't trigger tab completion. This patch fixes it. I've also included a TAP test. Thanks. The code looks good, all tests pass, and the tab completion works as expected when testing manually.

I'm not sure whether Debian 10 is still relevant for the current master. But if it is, I suggest a patch, since the test will not work under this OS. The thing is that Debian 10 will backslash curly braces and the question mark, and TAB completion will lead to a string like:

\echo :\{\?VERBOSITY\}

instead of the expected:

\echo :{?VERBOSITY}

The attached patch fixes the 010_tab_completion.pl test in the same way as [1].

With the best regards,

-- 
Anton A. Melnikov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

[1] https://www.postgresql.org/message-id/960764.1643751...@sss.pgh.pa.us

From 455bec0d785b0f5057fc7e91a5fede458cf8fd36 Mon Sep 17 00:00:00 2001
From: "Anton A. Melnikov"
Date: Mon, 6 May 2024 08:40:45 +0300
Subject: [PATCH] Fix variable tab completion for broken libedit

---
 src/bin/psql/t/010_tab_completion.pl | 7 +--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/src/bin/psql/t/010_tab_completion.pl b/src/bin/psql/t/010_tab_completion.pl
index b45c39f0f5..358478e935 100644
--- a/src/bin/psql/t/010_tab_completion.pl
+++ b/src/bin/psql/t/010_tab_completion.pl
@@ -414,12 +414,15 @@ check_completion(
 clear_query();

 # check completion for psql variable test
+# note: broken versions of libedit want to backslash curly braces and the question mark;
+# not much we can do about that
 check_completion(
 	"\\echo :{?VERB\t",
-	qr/:\{\?VERBOSITY} /,
+	qr/:\\?\{\\?\?VERBOSITY\\?} /,
 	"complete a psql variable test");

-clear_query();
+# broken versions of libedit require clear_line not clear_query here
+clear_line();

 # check no-completions code path
 check_completion("blarg \t\t", qr//, "check completion failure path");
-- 
2.43.2
Re: Refactoring backend fork+exec code
On 28.04.2024 22:36, Heikki Linnakangas wrote: Peter E noticed and Michael fixed them in commit 768ceeeaa1 already.

I didn't check that this was already fixed in the current master. Sorry! Thanks for pointing it out!

With the best wishes,

-- 
Anton A. Melnikov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
Re: Refactoring backend fork+exec code
Hello!

Maybe add PGDLLIMPORT to the

extern bool LoadedSSL;

and

extern struct ClientSocket *MyClientSocket;

definitions in src/include/postmaster/postmaster.h?

With the best regards,

-- 
Anton A. Melnikov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
Re: Use XLOG_CONTROL_FILE macro everywhere?
On 24.04.2024 12:19, Daniel Gustafsson wrote: On 24 Apr 2024, at 11:13, Anton A. Melnikov wrote: On 24.04.2024 12:02, Peter Eisentraut wrote: On 19.04.24 05:50, Anton A. Melnikov wrote: May be better use this macro everywhere in C code? I don't know. I don't find XLOG_CONTROL_FILE to be a very intuitive proxy for "pg_control". Maybe, but inconsistent usage is also unintuitive. Then maybe replace XLOG_CONTROL_FILE with PG_CONTROL_FILE? The PG_CONTROL_FILE_SIZE macro is already in the code. With the best regards,

XLOG_CONTROL_FILE is close to two decades old so there might be extensions using it (though the risk might be slim); perhaps offering it as an alias for backwards compatibility is warranted should we introduce a new name?

To ensure backward compatibility we can keep the old macro like this:

#define XLOG_CONTROL_FILE "global/pg_control"
#define PG_CONTROL_FILE XLOG_CONTROL_FILE

With the best wishes,

-- 
Anton A. Melnikov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
Re: Use XLOG_CONTROL_FILE macro everywhere?
On 24.04.2024 12:02, Peter Eisentraut wrote: On 19.04.24 05:50, Anton A. Melnikov wrote: May be better use this macro everywhere in C code? I don't know. I don't find XLOG_CONTROL_FILE to be a very intuitive proxy for "pg_control". Then maybe replace XLOG_CONTROL_FILE with PG_CONTROL_FILE? The PG_CONTROL_FILE_SIZE macro is already in the code. With the best regards, -- Anton A. Melnikov Postgres Professional: http://www.postgrespro.com The Russian Postgres Company
Re: Use XLOG_CONTROL_FILE macro everywhere?
On 20.04.2024 09:36, Daniel Gustafsson wrote: Anton: please register this patch in the Open commitfest to ensure it's not forgotten about. Done. Daniel and Michael thank you! -- Anton A. Melnikov Postgres Professional: http://www.postgrespro.com The Russian Postgres Company
Use XLOG_CONTROL_FILE macro everywhere?
Hello!

There is a macro XLOG_CONTROL_FILE for the control file name defined in access/xlog_internal.h. And there are some places in the code where this macro is used, like here
https://github.com/postgres/postgres/blob/84db9a0eb10dd1dbee6db509c0e427fa237177dc/src/bin/pg_resetwal/pg_resetwal.c#L588
or here
https://github.com/postgres/postgres/blob/84db9a0eb10dd1dbee6db509c0e427fa237177dc/src/common/controldata_utils.c#L214
But there are some other places where the control file name is used as a text string directly.

Maybe it would be better to use this macro everywhere in C code? The attached patch tries to do this. I would be glad if you could take a look at it.

With the best regards,

-- 
Anton A. Melnikov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

From 2692683142c98572114c32f696b55c6a642dd3e9 Mon Sep 17 00:00:00 2001
From: "Anton A. Melnikov"
Date: Fri, 19 Apr 2024 06:12:44 +0300
Subject: [PATCH] Use XLOG_CONTROL_FILE macro everywhere in C code

---
 src/backend/backup/basebackup.c             | 2 +-
 src/backend/postmaster/postmaster.c         | 3 ++-
 src/bin/pg_combinebackup/pg_combinebackup.c | 5 +++--
 src/bin/pg_controldata/pg_controldata.c     | 2 +-
 src/bin/pg_rewind/filemap.c                 | 3 ++-
 src/bin/pg_rewind/pg_rewind.c               | 8 
 src/bin/pg_upgrade/controldata.c            | 11 ++-
 src/bin/pg_verifybackup/pg_verifybackup.c   | 3 ++-
 src/common/controldata_utils.c              | 2 +-
 9 files changed, 22 insertions(+), 17 deletions(-)

diff --git a/src/backend/backup/basebackup.c b/src/backend/backup/basebackup.c
index 9a2bf59e84..c5a1e18104 100644
--- a/src/backend/backup/basebackup.c
+++ b/src/backend/backup/basebackup.c
@@ -1347,7 +1347,7 @@ sendDir(bbsink *sink, const char *path, int basepathlen, bool sizeonly,
 		snprintf(pathbuf, sizeof(pathbuf), "%s/%s", path, de->d_name);

 		/* Skip pg_control here to back up it last */
-		if (strcmp(pathbuf, "./global/pg_control") == 0)
+		if (strcmp(pathbuf, "./" XLOG_CONTROL_FILE) == 0)
 			continue;

 		if (lstat(pathbuf, ) != 0)

diff --git a/src/backend/postmaster/postmaster.c b/src/backend/postmaster/postmaster.c
index 7f3170a8f0..02b36caea7 100644
--- a/src/backend/postmaster/postmaster.c
+++ b/src/backend/postmaster/postmaster.c
@@ -90,6 +90,7 @@
 #endif

 #include "access/xlog.h"
+#include "access/xlog_internal.h"
 #include "access/xlogrecovery.h"
 #include "common/file_perm.h"
 #include "common/file_utils.h"
@@ -1489,7 +1490,7 @@ checkControlFile(void)
 	char		path[MAXPGPATH];
 	FILE	   *fp;

-	snprintf(path, sizeof(path), "%s/global/pg_control", DataDir);
+	snprintf(path, sizeof(path), "%s/%s", DataDir, XLOG_CONTROL_FILE);
 	fp = AllocateFile(path, PG_BINARY_R);
 	if (fp == NULL)

diff --git a/src/bin/pg_combinebackup/pg_combinebackup.c b/src/bin/pg_combinebackup/pg_combinebackup.c
index b26c532445..16448e78b8 100644
--- a/src/bin/pg_combinebackup/pg_combinebackup.c
+++ b/src/bin/pg_combinebackup/pg_combinebackup.c
@@ -16,6 +16,7 @@
 #include
 #include

+#include "access/xlog_internal.h"
 #include "backup_label.h"
 #include "common/blkreftable.h"
 #include "common/checksum_helper.h"
@@ -284,7 +285,7 @@ main(int argc, char *argv[])
 		{
 			char	   *controlpath;

-			controlpath = psprintf("%s/%s", prior_backup_dirs[i], "global/pg_control");
+			controlpath = psprintf("%s/%s", prior_backup_dirs[i], XLOG_CONTROL_FILE);

 			pg_fatal("%s: manifest system identifier is %llu, but control file has %llu",
 					 controlpath,
@@ -591,7 +592,7 @@ check_control_files(int n_backups, char **backup_dirs)
 		bool		crc_ok;
 		char	   *controlpath;

-		controlpath = psprintf("%s/%s", backup_dirs[i], "global/pg_control");
+		controlpath = psprintf("%s/%s", backup_dirs[i], XLOG_CONTROL_FILE);
 		pg_log_debug("reading \"%s\"", controlpath);
 		control_file = get_controlfile_by_exact_path(controlpath, _ok);

diff --git a/src/bin/pg_controldata/pg_controldata.c b/src/bin/pg_controldata/pg_controldata.c
index 93e0837947..1fe6618e7f 100644
--- a/src/bin/pg_controldata/pg_controldata.c
+++ b/src/bin/pg_controldata/pg_controldata.c
@@ -1,7 +1,7 @@
 /*
  * pg_controldata
  *
- * reads the data from $PGDATA/global/pg_control
+ * reads the data from $PGDATA/
  *
  * copyright (c) Oliver Elphick , 2001;
  * license: BSD

diff --git a/src/bin/pg_rewind/filemap.c b/src/bin/pg_rewind/filemap.c
index 4458324c9d..6224e94d09 100644
--- a/src/bin/pg_rewind/filemap.c
+++ b/src/bin/pg_rewind/filemap.c
@@ -26,6 +26,7 @@
 #include
 #include

+#include "access/xlog_internal.h"
 #include "catalog/pg_tablespace_d.h"
 #include "common/file_utils.h"
 #include "common/hashfn_unstable.h"
@@ -643,7 +644,7 @@ decide_file_action(file_entry_t *en
Re: effective_multixact_freeze_max_age issue
Hi, Peter!

Sorry! For some time I forgot about this thread and forgot to thank you for your answer. It is now clear to me that this patch allows the autovacuum to win some time between OldestXmin and nextXID that could not be used before. I think it may be especially useful for high-load applications.

Digging deeper, I found some inconsistency between the current docs and the real behavior and would like to bring this to your attention. Now the doc says that an aggressive vacuum scan will occur for any table whose multixact-age is greater than autovacuum_multixact_freeze_max_age. But really vacuum_get_cutoffs() will return true when multixact-age is greater than or equal to autovacuum_multixact_freeze_max_age if relminmxid is not equal to one. If relminmxid is equal to one, then vacuum_get_cutoffs() returns true even when multixact-age is less by one than autovacuum_multixact_freeze_max_age. For instance, if relminmxid = 1 and multixact_freeze_table_age = 100, vacuum will start to be aggressive from the age of 99 (when ReadNextMultiXactId() = 100).

The attached patch attempts to fix this and tries to use the semantics from the doc. A similar fix was made for ordinary xacts too. An additional check of relminmxid allows disabling the aggressive scan entirely if it is invalid, but I'm not sure whether such a check is needed here. Please take it into account.

With kind regards,

-- 
Anton A. Melnikov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

commit 40ea108a5be00659fe6799897035e1f9abf122a8
Author: Anton A. Melnikov
Date: Fri Apr 5 08:08:56 2024 +0300

    Make vacuum scan aggressive for any table whose xact/multixact-age is
    strictly greater than xact/multixact_freeze_table_age.
diff --git a/src/backend/commands/vacuum.c b/src/backend/commands/vacuum.c
index b589279d49f..e397a394a88 100644
--- a/src/backend/commands/vacuum.c
+++ b/src/backend/commands/vacuum.c
@@ -1083,6 +1083,8 @@ vacuum_get_cutoffs(Relation rel, const VacuumParams *params,
 	MultiXactId nextMXID,
 				safeOldestMxact,
 				aggressiveMXIDCutoff;
+	int32		xact_age;
+	int32		multixact_age;

 	/* Use mutable copies of freeze age parameters */
 	freeze_min_age = params->freeze_min_age;
@@ -1093,6 +1095,10 @@ vacuum_get_cutoffs(Relation rel, const VacuumParams *params,
 	/* Set pg_class fields in cutoffs */
 	cutoffs->relfrozenxid = rel->rd_rel->relfrozenxid;
 	cutoffs->relminmxid = rel->rd_rel->relminmxid;
+	/* ??? Should we do this check here? */
+	/* Avoid aggressive scan if relminmxid is invalid. */
+	if (!MultiXactIdIsValid(cutoffs->relminmxid))
+		cutoffs->relminmxid = INT_MAX;

 	/*
 	 * Acquire OldestXmin.
@@ -1200,8 +1206,8 @@ vacuum_get_cutoffs(Relation rel, const VacuumParams *params,
 	aggressiveXIDCutoff = nextXID - freeze_table_age;
 	if (!TransactionIdIsNormal(aggressiveXIDCutoff))
 		aggressiveXIDCutoff = FirstNormalTransactionId;
-	if (TransactionIdPrecedesOrEquals(rel->rd_rel->relfrozenxid,
-									  aggressiveXIDCutoff))
+	xact_age = (int32) (nextMXID - cutoffs->relfrozenxid);
+	if (xact_age > freeze_table_age)
 		return true;

 	/*
@@ -1221,8 +1227,9 @@ vacuum_get_cutoffs(Relation rel, const VacuumParams *params,
 	aggressiveMXIDCutoff = nextMXID - multixact_freeze_table_age;
 	if (aggressiveMXIDCutoff < FirstMultiXactId)
 		aggressiveMXIDCutoff = FirstMultiXactId;
-	if (MultiXactIdPrecedesOrEquals(rel->rd_rel->relminmxid,
-									aggressiveMXIDCutoff))
+
+	multixact_age = (int32) (nextMXID - cutoffs->relminmxid);
+	if (multixact_age > multixact_freeze_table_age)
 		return true;

 	/* Non-aggressive VACUUM */
Re: [PATCH] kNN for btree
Hi, Andrey!

On 31.03.2024 12:22, Andrey M. Borodin wrote: On 15 Jan 2024, at 13:11, Anton A. Melnikov wrote: If there are any ideas pro and contra would be glad to discuss them. Hi, Anton! This is kind of ancient thread. I've marked CF entry [0] as "Needs review" and you as an author (seems like you are going to be a point of correspondence on this feature).

That's right, I would like to bring the work on this feature to a positive result.

First of all, let me share a simple test that allows one to estimate the effect of applying this patch and, I hope, can be considered as a criterion for future versions. For all the tests below, one should set the following settings:

set enable_seqscan to false;
set enable_indexscan to true;
set enable_bitmapscan to false;
set enable_indexonlyscan to true;
set max_parallel_workers_per_gather = 0;

To carry out the test, one can use the table "events" mentioned in the first message of this thread, linked here [1]:

psql -f events.dump

Then perform a query like this:

explain (costs off, analyze on) SELECT date FROM events ORDER BY date <-> '1957-10-04'::date ASC LIMIT 10;

When using the existing btree_gist extension and executing the preliminary commands:

create extension btree_gist;
CREATE INDEX event_date_idx ON events USING gist (date);

the test query gives:

postgres=# explain (costs off, analyze on) SELECT date FROM events ORDER BY date <-> '1957-10-04'::date ASC LIMIT 10;
                                          QUERY PLAN
--
 Limit (actual time=0.759..102.992 rows=10 loops=1)
   -> Index Only Scan using event_date_idx on events (actual time=0.757..97.021 rows=10 loops=1)
         Order By: (date <-> '1957-10-04'::date)
         Heap Fetches: 0
 Planning Time: 0.504 ms
 Execution Time: 108.311 ms

The average value on my PC was 107+-1 ms.
When using the existing patch from [1] and creating a btree index:

CREATE INDEX event_date_idx ON events USING btree (date);

instead of the btree_gist one, the test query gives:

postgres=# explain (costs off, analyze on) SELECT date FROM events ORDER BY date <-> '1957-10-04'::date ASC LIMIT 10;
                                          QUERY PLAN
--
 Limit (actual time=0.120..48.817 rows=10 loops=1)
   -> Index Only Scan using event_date_idx on events (actual time=0.117..42.610 rows=10 loops=1)
         Order By: (date <-> '1957-10-04'::date)
         Heap Fetches: 0
 Planning Time: 0.487 ms
 Execution Time: 54.463 ms

55+-1 ms on average. The execution time is reduced by ~2 times. So the effect is obvious, but the implementation concerns are reasonable too.

On 15.01.2024 11:11, Anton A. Melnikov wrote: On 16.03.2020 16:17, Alexander Korotkov wrote: After another try to polish this patch I figured out that the way it's implemented is unnatural. I see the two reasonable ways to implement knn for B-tree, but current implementation matches none of them.

1) Implement knn as two simultaneous scans over B-tree: forward and backward. It's similar to what current patchset does. But putting this logic to B-tree seems artificial. What B-tree does here is still unidirectional scan. On top of that we merge results of two unidirectional scans. The appropriate place to do this high-level work is IndexScan node or even Optimizer/Executor (Merge node over to IndexScan nodes), but not B-tree itself.

2) Implement arbitrary scans in B-tree using priority queue like GiST and SP-GiST do. That would lead to much better support for KNN. We can end up in supporting interesting cases like "ORDER BY col1 DESC, col2 <> val1, col2 ASC" or something. But that's requires way more hacking in B-tree core.

At first I'm going to implement p.1). I think it's preferable for now because it seems easier and faster to get a working version.

I was wrong here.
Firstly, this variant turned out to be not so easy and fast, and secondly, when I received the desired query plan, I was not happy with the results: in the case of btree_gist, splitting the query into two scans at the optimizer level and adding a MergeAppend on top of it resulted in a ~13% slowdown in query execution. The average time became ~121 ms.

 Limit (actual time=1.205..117.689 rows=10 loops=1)
   -> Merge Append (actual time=1.202..112.260 rows=10 loops=1)
         Sort Key: ((events.date <-> '1957-10-04'::date))
         -> Index Only Scan using event_date_idx on events (actual time=0.713..43.372 rows=42585 loops=1)
               Index Cond: (date < '1957-10-04'::date)
               Order By: (date <-> '1957-10-04'::date)
Re: Bug in nbtree optimization to skip > operator comparisons (or < comparisons in backwards scans)
On 22.03.2024 11:02, Pavel Borisov wrote: Attached is a fix. Thanks! -- Anton A. Melnikov Postgres Professional: http://www.postgrespro.com The Russian Postgres Company
Re: Bug in nbtree optimization to skip > operator comparisons (or < comparisons in backwards scans)
Hi!

Maybe _bt_readpage(scan, dir, start, true) is needed at this line:
https://github.com/postgres/postgres/blob/b4080fa3dcf6c6359e542169e0e81a0662c53ba8/src/backend/access/nbtree/nbtsearch.c#L2501
?

Do we really need to try prechecking the continuescan flag here? Also, the current "false" in the last argument does not match the previous code before 06b10f80ba and the current comment above.

I would be very grateful for clarification.

With the best regards!

-- 
Anton A. Melnikov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
Re: Add the ability to limit the amount of memory that can be allocated to backends.
On 13.03.2024 10:41, Anton A. Melnikov wrote: Here is a version updated for the current master.

While updating the patch I mistakenly added double counting of deallocated blocks; that is why the tests in the patch tester failed. I have fixed it and squashed fix 0002 with 0001. Here is the fixed version.

With the best wishes!

-- 
Anton A. Melnikov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

From 7a0925a38acfb9a945087318a5d91fae4680db0e Mon Sep 17 00:00:00 2001
From: Tomas Vondra
Date: Tue, 26 Dec 2023 17:54:40 +0100
Subject: [PATCH 1/2] Add tracking of backend memory allocated

Add tracking of backend memory allocated in total and by allocation type (aset, dsm, generation, slab) by process.

allocated_bytes tracks the current bytes of memory allocated to the backend process. aset_allocated_bytes, dsm_allocated_bytes, generation_allocated_bytes and slab_allocated_bytes track the allocation by type for the backend process. They are updated for the process as memory is malloc'd/freed. Memory allocated to items on the freelist is included. Dynamic shared memory allocations are included only in the value displayed for the backend that created them, they are not included in the value for backends that are attached to them to avoid double counting. DSM allocations that are not destroyed by the creating process prior to its exit are considered long lived and are tracked in a global counter global_dsm_allocated_bytes. We limit the floor of allocation counters to zero.

Created views pg_stat_global_memory_allocation and pg_stat_memory_allocation for access to these trackers.
--- doc/src/sgml/monitoring.sgml| 246 src/backend/catalog/system_views.sql| 34 +++ src/backend/storage/ipc/dsm.c | 11 +- src/backend/storage/ipc/dsm_impl.c | 78 +++ src/backend/storage/lmgr/proc.c | 1 + src/backend/utils/activity/backend_status.c | 114 + src/backend/utils/adt/pgstatfuncs.c | 84 +++ src/backend/utils/init/miscinit.c | 3 + src/backend/utils/mmgr/aset.c | 17 ++ src/backend/utils/mmgr/generation.c | 13 ++ src/backend/utils/mmgr/slab.c | 22 ++ src/include/catalog/pg_proc.dat | 17 ++ src/include/storage/proc.h | 2 + src/include/utils/backend_status.h | 144 +++- src/test/regress/expected/rules.out | 27 +++ src/test/regress/expected/stats.out | 36 +++ src/test/regress/sql/stats.sql | 20 ++ 17 files changed, 867 insertions(+), 2 deletions(-) diff --git a/doc/src/sgml/monitoring.sgml b/doc/src/sgml/monitoring.sgml index 8736eac284..41d788be45 100644 --- a/doc/src/sgml/monitoring.sgml +++ b/doc/src/sgml/monitoring.sgml @@ -4599,6 +4599,252 @@ description | Waiting for a newly initialized WAL file to reach durable storage + + pg_stat_memory_allocation + + + pg_stat_memory_allocation + + + + The pg_stat_memory_allocation view will have one + row per server process, showing information related to the current memory + allocation of that process in total and by allocator type. Due to the + dynamic nature of memory allocations the allocated bytes values may not be + exact but should be sufficient for the intended purposes. Dynamic shared + memory allocations are included only in the value displayed for the backend + that created them, they are not included in the value for backends that are + attached to them to avoid double counting. Use + pg_size_pretty described in +to make these values more easily + readable. 
+ + + + pg_stat_memory_allocation View + + + + + Column Type + + + Description + + + + + + + + datid oid + + + OID of the database this backend is connected to + + + + + + pid integer + + + Process ID of this backend + + + + + + allocated_bytes bigint + + + Memory currently allocated to this backend in bytes. This is the balance + of bytes allocated and freed by this backend. Dynamic shared memory + allocations are included only in the value displayed for the backend that + created them, they are not included in the value for backends that are + attached to them to avoid double counting. + + + + + + aset_allocated_bytes bigint + + + Memory currently allocated to this backend in bytes via the allocation + set allocator. + + + + + + dsm_allocated_bytes bigint + + + Memory currently allocated to this backend in bytes via the dynamic + shared memory allocator. Upon process exit, dsm allocations that have + not been freed are considered long lived and added to + global_dsm_allocated_bytes
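The counter discipline the commit message describes (per-allocator counters plus a total, updated as memory is malloc'd/freed, with the floor clamped at zero) can be sketched standalone. All names below are illustrative, not the patch's actual symbols:

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative per-backend counters: one per allocator type plus a total. */
typedef struct BackendAllocCounters
{
    uint64_t allocated_bytes;        /* balance of all allocations */
    uint64_t aset_allocated_bytes;   /* allocation-set allocator */
    uint64_t dsm_allocated_bytes;    /* dynamic shared memory */
} BackendAllocCounters;

/* On allocation, bump both the per-type counter and the total. */
static void
track_alloc(BackendAllocCounters *c, uint64_t *type_counter, uint64_t size)
{
    *type_counter += size;
    c->allocated_bytes += size;
}

/* On free, clamp at zero rather than underflowing, as the patch describes. */
static void
track_free(BackendAllocCounters *c, uint64_t *type_counter, uint64_t size)
{
    *type_counter = (*type_counter >= size) ? *type_counter - size : 0;
    c->allocated_bytes = (c->allocated_bytes >= size) ? c->allocated_bytes - size : 0;
}
```

The zero floor means a free that was never counted (e.g. memory inherited before tracking started) cannot drive a counter negative, at the cost of a possible slight undercount afterwards.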
Re: May be BUG. Periodic burst growth of the checkpoint_req counter on replica.
On 14.03.2024 03:19, Alexander Korotkov wrote: Pushed. Thanks! -- Anton A. Melnikov Postgres Professional: http://www.postgrespro.com The Russian Postgres Company
Re: Add the ability to limit the amount of memory that can be allocated to backends.
On 12.03.2024 16:30, Aleksander Alekseev wrote: Just wanted to let you know that v20231226 doesn't apply. The patch needs love from somebody interested in it. Thanks for pointing this out! Here is a version updated for the current master. With the best regards, -- Anton A. Melnikov Postgres Professional: http://www.postgrespro.com The Russian Postgres Company

From e31012217fb63f57fe16d7232fec27bb6cd45597 Mon Sep 17 00:00:00 2001 From: Tomas Vondra Date: Tue, 26 Dec 2023 17:54:40 +0100 Subject: [PATCH 1/3] Add tracking of backend memory allocated Add tracking of backend memory allocated in total and by allocation type (aset, dsm, generation, slab) by process. allocated_bytes tracks the current bytes of memory allocated to the backend process. aset_allocated_bytes, dsm_allocated_bytes, generation_allocated_bytes and slab_allocated_bytes track the allocation by type for the backend process. They are updated for the process as memory is malloc'd/freed. Memory allocated to items on the freelist is included. Dynamic shared memory allocations are included only in the value displayed for the backend that created them, they are not included in the value for backends that are attached to them to avoid double counting. DSM allocations that are not destroyed by the creating process prior to its exit are considered long lived and are tracked in a global counter global_dsm_allocated_bytes. We limit the floor of allocation counters to zero. Created views pg_stat_global_memory_allocation and pg_stat_memory_allocation for access to these trackers. 
--- doc/src/sgml/monitoring.sgml| 246 src/backend/catalog/system_views.sql| 34 +++ src/backend/storage/ipc/dsm.c | 11 +- src/backend/storage/ipc/dsm_impl.c | 78 +++ src/backend/storage/lmgr/proc.c | 1 + src/backend/utils/activity/backend_status.c | 114 + src/backend/utils/adt/pgstatfuncs.c | 84 +++ src/backend/utils/init/miscinit.c | 3 + src/backend/utils/mmgr/aset.c | 17 ++ src/backend/utils/mmgr/generation.c | 14 ++ src/backend/utils/mmgr/slab.c | 22 ++ src/include/catalog/pg_proc.dat | 17 ++ src/include/storage/proc.h | 2 + src/include/utils/backend_status.h | 144 +++- src/test/regress/expected/rules.out | 27 +++ src/test/regress/expected/stats.out | 36 +++ src/test/regress/sql/stats.sql | 20 ++ 17 files changed, 868 insertions(+), 2 deletions(-) diff --git a/doc/src/sgml/monitoring.sgml b/doc/src/sgml/monitoring.sgml index 8aca08140e..5f827fe0b0 100644 --- a/doc/src/sgml/monitoring.sgml +++ b/doc/src/sgml/monitoring.sgml @@ -4598,6 +4598,252 @@ description | Waiting for a newly initialized WAL file to reach durable storage + + pg_stat_memory_allocation + + + pg_stat_memory_allocation + + + + The pg_stat_memory_allocation view will have one + row per server process, showing information related to the current memory + allocation of that process in total and by allocator type. Due to the + dynamic nature of memory allocations the allocated bytes values may not be + exact but should be sufficient for the intended purposes. Dynamic shared + memory allocations are included only in the value displayed for the backend + that created them, they are not included in the value for backends that are + attached to them to avoid double counting. Use + pg_size_pretty described in +to make these values more easily + readable. 
+ + + + pg_stat_memory_allocation View + + + + + Column Type + + + Description + + + + + + + + datid oid + + + OID of the database this backend is connected to + + + + + + pid integer + + + Process ID of this backend + + + + + + allocated_bytes bigint + + + Memory currently allocated to this backend in bytes. This is the balance + of bytes allocated and freed by this backend. Dynamic shared memory + allocations are included only in the value displayed for the backend that + created them, they are not included in the value for backends that are + attached to them to avoid double counting. + + + + + + aset_allocated_bytes bigint + + + Memory currently allocated to this backend in bytes via the allocation + set allocator. + + + + + + dsm_allocated_bytes bigint + + + Memory currently allocated to this backend in bytes via the dynamic + shared memory allocator. Upon process exit, dsm allocations that have + not been freed are considered long lived and added to + global_dsm_allocated_bytes found
Re: May be BUG. Periodic burst growth of the checkpoint_req counter on replica.
On 11.03.2024 03:39, Alexander Korotkov wrote: Now that we distinguish stats for checkpoints and restartpoints, we need to update the docs. Please, check the patch attached. Maybe bring the pg_stat_get_checkpointer_buffers_written() description into consistency with these changes? Like this: --- a/src/include/catalog/pg_proc.dat +++ b/src/include/catalog/pg_proc.dat @@ -5740 +5740 @@ - descr => 'statistics: number of buffers written by the checkpointer', + descr => 'statistics: number of buffers written during checkpoints and restartpoints', And after I took a fresh look at this patch, I noticed a simple way to extract the write_time and sync_time counters for restartpoints too. What do you think, does it make sense to do this? With the best wishes, -- Anton A. Melnikov Postgres Professional: http://www.postgrespro.com The Russian Postgres Company
Re: ResourceOwner refactoring
On 16.01.2024 14:54, Heikki Linnakangas wrote: Fixed, thanks. mark_pgdllimport.pl also highlighted two new variables in walsummarizer.h, fixed those too. Thanks! Have a nice day! -- Anton A. Melnikov Postgres Professional: http://www.postgrespro.com The Russian Postgres Company
Re: ResourceOwner refactoring
Hello! Found a possibly missing PGDLLIMPORT in the definitions of extern const ResourceOwnerDesc buffer_io_resowner_desc; and extern const ResourceOwnerDesc buffer_pin_resowner_desc; in src/include/storage/buf_internals.h. Please take a look at it. With the best regards, -- Anton A. Melnikov Postgres Professional: http://www.postgrespro.com The Russian Postgres Company
Re: May be BUG. Periodic burst growth of the checkpoint_req counter on replica.
On 25.12.2023 02:38, Alexander Korotkov wrote: Pushed! Thanks a lot! With the best regards! -- Anton A. Melnikov Postgres Professional: http://www.postgrespro.com The Russian Postgres Company
Re: May be BUG. Periodic burst growth of the checkpoint_req counter on replica.
Thanks for the remarks! On 28.11.2023 21:34, Alexander Korotkov wrote: After examining the second patch ("v2-0001-Add-restartpoint-stats.patch"), it appears that adding additional statistics as outlined in the patch is the most suitable approach to address the concerns raised. This solution provides more visibility into the system's behavior without altering its core mechanics. Agreed. I left only this variant of the patch and reworked it due to commit 96f05261. So the new counters are in the pg_stat_checkpointer view now. Please see the v3-0001-add-restartpoints-stats.patch attached. However, it's essential that this additional functionality is accompanied by comprehensive documentation to ensure clear understanding and ease of use by the PostgreSQL community. Please consider expanding the documentation to include detailed explanations of the new statistics and their implications in various scenarios. In the separate v3-0002-doc-for-restartpoints-stats.patch I added the definitions of the new counters into the "28.2.15. pg_stat_checkpointer" section and explanations of them with examples into the "30.5. WAL Configuration" one. Would be glad for any comments and concerns. With the best wishes, -- Anton A. Melnikov Postgres Professional: http://www.postgrespro.com The Russian Postgres Company

commit 5711c09dbfbe02586a247a98e0ae41cd71a221a3 Author: Anton A. 
Melnikov Date: Sun Dec 3 12:49:11 2023 +0300 Add statistic about restartpoints into pg_stat_checkpointer diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql index 11d18ed9dd..058fc47c91 100644 --- a/src/backend/catalog/system_views.sql +++ b/src/backend/catalog/system_views.sql @@ -1141,6 +1141,9 @@ CREATE VIEW pg_stat_checkpointer AS SELECT pg_stat_get_checkpointer_num_timed() AS num_timed, pg_stat_get_checkpointer_num_requested() AS num_requested, +pg_stat_get_checkpointer_restartpoints_timed() AS restartpoints_timed, +pg_stat_get_checkpointer_restartpoints_requested() AS restartpoints_req, +pg_stat_get_checkpointer_restartpoints_performed() AS restartpoints_done, pg_stat_get_checkpointer_write_time() AS write_time, pg_stat_get_checkpointer_sync_time() AS sync_time, pg_stat_get_checkpointer_buffers_written() AS buffers_written, diff --git a/src/backend/postmaster/checkpointer.c b/src/backend/postmaster/checkpointer.c index dc2da5a2cd..67ecb177e7 100644 --- a/src/backend/postmaster/checkpointer.c +++ b/src/backend/postmaster/checkpointer.c @@ -340,6 +340,8 @@ CheckpointerMain(void) pg_time_t now; int elapsed_secs; int cur_timeout; + bool chkpt_or_rstpt_requested = false; + bool chkpt_or_rstpt_timed = false; /* Clear any already-pending wakeups */ ResetLatch(MyLatch); @@ -358,7 +360,7 @@ CheckpointerMain(void) if (((volatile CheckpointerShmemStruct *) CheckpointerShmem)->ckpt_flags) { do_checkpoint = true; - PendingCheckpointerStats.num_requested++; + chkpt_or_rstpt_requested = true; } /* @@ -372,7 +374,7 @@ CheckpointerMain(void) if (elapsed_secs >= CheckPointTimeout) { if (!do_checkpoint) -PendingCheckpointerStats.num_timed++; +chkpt_or_rstpt_timed = true; do_checkpoint = true; flags |= CHECKPOINT_CAUSE_TIME; } @@ -408,6 +410,24 @@ CheckpointerMain(void) if (flags & CHECKPOINT_END_OF_RECOVERY) do_restartpoint = false; + if (chkpt_or_rstpt_timed) + { +chkpt_or_rstpt_timed = false; +if (do_restartpoint) + 
PendingCheckpointerStats.restartpoints_timed++; +else + PendingCheckpointerStats.num_timed++; + } + + if (chkpt_or_rstpt_requested) + { +chkpt_or_rstpt_requested = false; +if (do_restartpoint) + PendingCheckpointerStats.restartpoints_requested++; +else + PendingCheckpointerStats.num_requested++; + } + /* * We will warn if (a) too soon since last checkpoint (whatever * caused it) and (b) somebody set the CHECKPOINT_CAUSE_XLOG flag @@ -471,6 +491,9 @@ CheckpointerMain(void) * checkpoints happen at a predictable spacing. */ last_checkpoint_time = now; + +if (do_restartpoint) + PendingCheckpointerStats.restartpoints_performed++; } else { diff --git a/src/backend/utils/activity/pgstat_checkpointer.c b/src/backend/utils/activity/pgstat_checkpointer.c index 301a0bc7bd..6ee258f240 100644 --- a/src/backend/utils/activity/pgstat_checkpointer.c +++ b/src/backend/utils/activity/pgstat_checkpointer.c @@ -49,6 +49,9 @@ pgstat_report_checkpointer(void) #define CHECKPOINTER_ACC(fld) stats_shmem->stats.fld += PendingCheckpointerStats.fld CHECKPOINTER_ACC(num_timed); CHECKPOINTER_ACC(num_requested); + CHECKPOINTER_ACC(restartpoints_timed); + CHECKPOINTER_ACC(restartpoints_requested); + CHECKPOINTER_ACC(restartpoints_performed); CHECKPOINTER_ACC(write_time); CHECKPOINTER_ACC(sync_time); CHECKPOI
Re: Should timezone be inherited from template database?
On 26.11.2023 18:53, David G. Johnston wrote: https://www.postgresql.org/docs/current/sql-createdatabase.html <https://www.postgresql.org/docs/current/sql-createdatabase.html> Database-level configuration parameters (set via ALTER DATABASE) and database-level permissions (set via GRANT) are not copied from the template database. Clear. Thank you very much! With the best wishes, -- Anton A. Melnikov Postgres Professional: http://www.postgrespro.com The Russian Postgres Company
Should timezone be inherited from template database?
Hello! Found that if I set a specific time zone for a template database, it will not be inherited by a database created from that template.

psql (17devel)
Type "help" for help.
postgres=# select now();
now
---
2023-11-26 17:24:58.242086+03
(1 row)
postgres=# ALTER DATABASE template1 SET TimeZone = 'UTC';
ALTER DATABASE
postgres=# \c template1
You are now connected to database "template1" as user "postgres".
template1=# select now();
now
---
2023-11-26 14:26:09.291082+00
(1 row)
template1=# CREATE DATABASE test;
CREATE DATABASE
template1=# \c test
You are now connected to database "test" as user "postgres".
test=# select now();
now
---
2023-11-26 17:29:05.487984+03
(1 row)

Could you please clarify: is this normal, expected behavior? I would be very grateful! -- Anton A. Melnikov Postgres Professional: http://www.postgrespro.com The Russian Postgres Company
Re: Some performance degradation in REL_16 vs REL_15
On 30.10.2023 22:51, Andres Freund wrote: There's really no point in comparing performance with assertions enabled (leaving aside assertions that cause extreme performance difference, making development harder). We very well might have added assertions making things more expensive, without affecting performance in optimized/non-assert builds. Thanks for the advice! I repeated the measurements on my PC without asserts and with CFLAGS="-O2". Also I reduced the number of clients to -c6, to leave a reserve of two cores of my 8-core CPU, and used -j6 accordingly. The results were similar: on my PC REL_10_STABLE (c18c12c9) was faster than REL_16_STABLE (07494a0d), but the effect became weaker: REL_10_STABLE gives ~965+-15 TPS (+-2%) while REL_16_STABLE gives ~920+-30 TPS (+-3%) in the test: pgbench -s8 -c6 -T20 -j6 So 10 is faster than 16 by ~5%. (see raw-my-pc.txt attached for the raw data) Then, thanks to my colleagues, I carried out similar measurements on a more powerful 24-core standalone server. There REL_10_STABLE gives 8260+-100 TPS (+-1%) while REL_16_STABLE gives 8580+-90 TPS (+-1%) in the same test: pgbench -s8 -c6 -T20 -j6 The test gave the opposite result! On that server 16 is faster than 10 by ~4%. When I scaled the test on the server to get the same reserve of two cores, the results became: REL_10_STABLE gives ~16000+-300 TPS (+-2%) while REL_16_STABLE gives ~18500+-200 TPS (+-1%) in the scaled test: pgbench -s24 -c22 -T20 -j22 Here the difference is more noticeable: 16 is faster than 10 by ~15%. 
(raw-server.txt) The configure options and test script on my PC and the server were the same:

export CFLAGS="-O2"
./configure --enable-debug --with-perl --with-icu --enable-depend --enable-tap-tests
#reinstall
#reinitdb
#create database bench
for ((i=0; i<100; i++)); do
  pgbench -U postgres -i -s8 bench > /dev/null 2>&1;
  psql -U postgres -d bench -c "checkpoint";
  RES=$(pgbench -U postgres -c6 -T20 -j6 bench;

Configurations:
my PC: 8-core AMD Ryzen 7 4700U @ 1.4GHz, 64GB RAM, NVMe M.2 SSD drive. Linux 5.15.0-88-generic #98~20.04.1-Ubuntu SMP Mon Oct 9 16:43:45 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux
server: 2x 12-hyperthreaded cores Intel(R) Xeon(R) CPU X5675 @ 3.07GHz, 24GB RAM, RAID from SSD drives. Linux 5.10.0-21-amd64 #1 SMP Debian 5.10.162-1 (2023-01-21) x86_64 GNU/Linux

I can't understand why I get opposite results on my PC and on the server. It is clear that the absolute TPS values will differ for various configurations; that is normal. But why do the differences point in opposite directions? It seems likely that some kind of reference configuration is needed to accurately measure the performance difference. Probably something is wrong with my PC, but I cannot figure out what. Would be very grateful for any advice or comments to clarify this problem. With the best wishes! -- Anton A. 
Melnikov Postgres Professional: http://www.postgrespro.com The Russian Postgres Company REL_10_STABLE c18c12c983a84d55e58b176969782c7ffac3272b pgbench -s8 -c6 -T20 -j6 940.47198 954.621585 902.319686 965.387517 959.44536 970.882218 922.012141 969.642272 964.549628 935.639076 958.835093 976.912892 975.618375 960.599515 981.900039 973.34447 964.563699 960.321335 962.643262 975.631214 971.78315 965.226256 961.106572 968.520002 973.825485 978.49579 963.863368 973.906058 966.676175 965.186708 954.572371 977.620229 981.419347 962.751969 963.676599 967.966257 974.68403 955.342462 957.832817 984.065968 972.364891 977.489316 957.352265 969.463156 974.320994 949.679765 969.081674 963.190554 962.394456 966.84177 975.840044 954.471689 977.764019 968.67597 963.203923 964.752374 965.957151 979.17749 915.412491 975.120789 962.105916 980.343235 957.180492 953.552183 979.783099 967.906392 966.926945 962.962301 965.53471 971.030289 954.21045 961.266889 973.367193 956.736464 980.317352 911.188865 979.274233 980.267316 982.029926 977.731543 937.327052 978.161778 978.575841 962.661776 914.896072 966.902901 973.539272 980.418576 966.073472 963.196341 962.718863 977.062467 958.303102 959.937588 959.52382 934.876632 971.899844 979.71 964.154208 960.051284 REL_16_STABLE 07494a0df9a66219e5f1029de47ecabc34c9cb55 pgbench -s8 -c6 -T20 -j6 952.061203 905.964458 921.009294 921.970342 924.810464 935.988344 917.110599 925.62075 933.423024 923.445651 932.740532 927.994569 913.773152 922.955946 917.680486 923.145074 925.133017 922.36253 907.656249 927.980182 924.249294 933.355461 923.359649 919.694726 923.178731 929.250348 921.643735 916.546247 930.960814 913.333819 773.157522 945.293028 924.839608 932.228796 912.846096 924.01411 924.341422 909.823505 882.105606 920.337305 930.297982 909.238148 880.839643 939.582115 927.263785 927.921499 932.897521 931.77316 943.261293 853.433421 921.813303 916.463003 919.652647 914.662188 912.137913 923.279822 922.967526 936.344334 946.281347 801.718759 950.571673 
928.845848 888.181388 885.603875 939.763546 896.841216 934.904546 929.369005 884.065433 874.953048 933.411683
Re: 003_check_guc.pl crashes if some extensions were loaded.
On 02.11.2023 01:53, Michael Paquier wrote: On Thu, Nov 02, 2023 at 12:28:05AM +0300, Anton A. Melnikov wrote: Found that src/test/modules/test_misc/t/003_check_guc.pl will crash if an extension that adds its own GUCs was loaded into memory. So it is now impossible to run a check-world with loaded extension libraries. Right. That's annoying, so let's fix it. Thanks! On 02.11.2023 02:29, Tom Lane wrote: Michael Paquier writes: Wouldn't it be better to add a qual as of "category <> 'Customized Options'"? +1, seems like a cleaner answer. Also agreed. That is a better variant! -- Anton A. Melnikov Postgres Professional: http://www.postgrespro.com The Russian Postgres Company
003_check_guc.pl crashes if some extensions were loaded.
Hello! Found that src/test/modules/test_misc/t/003_check_guc.pl will crash if an extension that adds its own GUCs was loaded into memory. So it is now impossible to run a check-world with loaded extension libraries. Reproduction:

cd src/test/modules/test_misc
export EXTRA_INSTALL="contrib/pg_stat_statements"
export TEMP_CONFIG=$(pwd)/pg_stat_statements_temp.conf
echo -e "shared_preload_libraries = 'pg_stat_statements'" > $TEMP_CONFIG
echo "compute_query_id = 'regress'" >> $TEMP_CONFIG
make check PROVE_TESTS='t/003_check_guc.pl'
# +++ tap check in src/test/modules/test_misc +++
t/003_check_guc.pl .. 1/? # Failed test 'no parameters missing from postgresql.conf.sample'
# at t/003_check_guc.pl line 81.
# got: '5'
# expected: '0'
# Looks like you failed 1 test of 3.

Maybe exclude such GUCs from this test? For instance, like this: --- a/src/test/modules/test_misc/t/003_check_guc.pl +++ b/src/test/modules/test_misc/t/003_check_guc.pl @@ -19,7 +19,7 @@ my $all_params = $node->safe_psql( "SELECT name FROM pg_settings WHERE NOT 'NOT_IN_SAMPLE' = ANY (pg_settings_get_flags(name)) AND - name <> 'config_file' + name <> 'config_file' AND name NOT LIKE '%.%' ORDER BY 1"); With the best wishes, -- Anton A. Melnikov Postgres Professional: http://www.postgrespro.com The Russian Postgres Company

commit 91c84d50ba4b1c75749e4c160e1d4a25ca684fda Author: Anton A. Melnikov Date: Wed Nov 1 23:58:19 2023 +0300 Exclude extensions' GUCs from the 003_check_guc.pl diff --git a/src/test/modules/test_misc/t/003_check_guc.pl b/src/test/modules/test_misc/t/003_check_guc.pl index 4fd6d03b9e..63fcd754de 100644 --- a/src/test/modules/test_misc/t/003_check_guc.pl +++ b/src/test/modules/test_misc/t/003_check_guc.pl @@ -19,7 +19,7 @@ my $all_params = $node->safe_psql( "SELECT name FROM pg_settings WHERE NOT 'NOT_IN_SAMPLE' = ANY (pg_settings_get_flags(name)) AND - name <> 'config_file' + name <> 'config_file' AND name NOT LIKE '%.%' ORDER BY 1"); # Note the lower-case conversion, for consistency. 
my @all_params_array = split("\n", lc($all_params));
Re: remaining sql/json patches
On 17.10.2023 07:02, Amit Langote wrote: One thing jian he missed during the debugging is that ExecEvalJsonExprCoercion() receives the EMPTY ARRAY value via *op->resvalue/resnull, set by ExecEvalJsonExprBehavior(), because that's the ON EMPTY behavior specified in the constraint. The bug was that the code in ExecEvalJsonExprCoercion() failed to set val_string to that value ("[]") before passing to InputFunctionCallSafe(), so the latter would assume the input is NULL. Thanks a lot for this remark! I tried to dig into transformJsonOutput() to fix it earlier, at the analyze stage, but that looks like a rather hard way. Maybe, in accordance with your note, simply remove the second condition from this line: if (jb && JB_ROOT_IS_SCALAR(jb)) ? Here is a simplified reproduction before such a fix: postgres=# select JSON_QUERY(jsonb '[]', '$' RETURNING char(5) OMIT QUOTES EMPTY ON EMPTY); server closed the connection unexpectedly This probably means the server terminated abnormally And after such a fix: postgres=# select JSON_QUERY(jsonb '[]', '$' RETURNING char(5) OMIT QUOTES EMPTY ON EMPTY); json_query [] (1 row) And at the moment I haven't found any side effects of that fix. Please point it out if I'm missing something. With the best wishes! -- Anton A. Melnikov Postgres Professional: http://www.postgrespro.com The Russian Postgres Company
Re: remaining sql/json patches
Hello! On 16.10.2023 15:49, jian he wrote: add the following code after ExecEvalJsonExprCoercion if (!InputFunctionCallSafe(...) works, but seems like a hack. if (!val_string) { *op->resnull = true; *op->resvalue = (Datum) 0; } It seems the constraint should work here. After CREATE TABLE test ( js text, i int, x jsonb DEFAULT JSON_QUERY(jsonb '[1,2]', '$[*]' WITH WRAPPER) CONSTRAINT test_constraint CHECK (JSON_QUERY(js::jsonb, '$.a' RETURNING char(5) OMIT QUOTES EMPTY ARRAY ON EMPTY) > 'a') ); INSERT INTO test_jsonb_constraints VALUES ('[]'); one would expect to see an error like this: ERROR: new row for relation "test" violates check constraint "test_constraint" DETAIL: Failing row contains ([], null, [1, 2]). rather than "INSERT 0 1". With best regards, -- Anton A. Melnikov Postgres Professional: http://www.postgrespro.com The Russian Postgres Company
Re: Some performance degradation in REL_16 vs REL_15
On 13.10.2023 05:05, Andres Freund wrote: Could you provide a bit more details about how you ran the benchmark? The reason I am asking is that ~330 TPS is pretty slow for -c20. Even on spinning rust and using the default settings, I get considerably higher results. Oh - I do get results closer to yours if I use pgbench scale 1, causing a lot of row level contention. What scale did you use? I use the default scale of 1 and run the command sequence:

$ pgbench -i bench
$ sleep 1
$ pgbench -c20 -T10 -j8

in a loop, to get similar initial conditions for every "pgbench -c20 -T10 -j8" run. Thanks for your interest! With the best wishes, -- Anton A. Melnikov Postgres Professional: http://www.postgrespro.com The Russian Postgres Company
Some performance degradation in REL_16 vs REL_15
Greetings! Found that the simple test pgbench -c20 -T20 -j8 gives approximately 336+-1 TPS for REL_15_STABLE at 5143f76 and 324+-1 TPS for REL_16_STABLE at 4ac7635f. The performance drop is approximately 3.5%, while the corrected standard deviation is only 0.3%. See raw_data.txt attached. What do you think, is there any cause for concern here? And is it worth spending time bisecting for the commit where this degradation may have occurred? Would be glad for any comments and concerns. With the best regards, -- Anton A. Melnikov Postgres Professional: http://www.postgrespro.com The Russian Postgres Company

REL_15_STABLE at 5143f76, TPS
336.639765 334.376801 334.963121 336.23666 335.698673

REL_16_STABLE at 4ac7635f, TPS
324.373695 323.168622 323.728652 325.799901 324.81759
Re: odd buildfarm failure - "pg_ctl: control file appears to be corrupt"
Sorry, attached the wrong version of the file. Here is the right one. Sincerely yours, -- Anton A. Melnikov Postgres Professional: http://www.postgrespro.com The Russian Postgres Company alg_level_up.pdf Description: Adobe PDF document
Re: odd buildfarm failure - "pg_ctl: control file appears to be corrupt"
Hello! On 26.07.2023 07:06, Thomas Munro wrote: New patches attached. Are they getting better? It seems to me that it is worth focusing efforts on the second part of the patch, as the most in-demand one, and trying to commit it first. And it seems there is a way to simplify it by adding a parameter to get_controlfile() that returns the calculated CRC and moving the repetition logic a level up. There is a proposed algorithm in alg_level_up.pdf attached. [Excuse me, for at least three days I will be in a place with no Internet access, so I will be able to read this thread no earlier than the evening of August 2.] With the best wishes, -- Anton A. Melnikov Postgres Professional: http://www.postgrespro.com The Russian Postgres Company alg_level_up.pdf Description: Adobe PDF document
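The idea of returning the calculated CRC from the read routine and retrying at the caller level can be illustrated with a self-contained sketch. All names, the toy checksum, and the simulated torn read below are hypothetical, not the actual get_controlfile() API or the algorithm in the attached PDF:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy checksum over the payload; stands in for the real CRC-32C. */
static uint32_t
compute_crc(uint32_t payload)
{
    return payload ^ 0xA5A5A5A5u;
}

typedef struct
{
    uint32_t payload;  /* stands in for the control file contents */
    uint32_t crc;      /* CRC stored in the file */
} ControlFile;

/*
 * Simulates the low-level read (cf. get_controlfile()): the first attempt
 * observes a torn concurrent write, later attempts see a consistent file.
 */
static ControlFile
read_control(int attempt)
{
    ControlFile cf = { .payload = 42, .crc = compute_crc(42) };

    if (attempt == 0)
        cf.crc ^= 1;            /* simulate a torn read */
    return cf;
}

/* Retry loop at the caller level: re-read until the CRC matches. */
static bool
read_control_checked(ControlFile *out, int max_retries)
{
    for (int i = 0; i <= max_retries; i++)
    {
        ControlFile cf = read_control(i);

        if (cf.crc == compute_crc(cf.payload))
        {
            *out = cf;
            return true;
        }
    }
    return false;
}
```

Keeping the retry loop in the caller means the low-level read routine stays a single, simple read that merely reports both the stored and the recomputed CRC, which is the simplification the message argues for.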
Re: [BUG] Crash on pgbench initialization.
On 25.07.2023 06:24, Andres Freund wrote: Thanks Anton / Victoria for the report and fix. Pushed. Thanks! Have a nice day! -- Anton A. Melnikov Postgres Professional: http://www.postgrespro.com The Russian Postgres Company
[BUG] Crash on pgbench initialization.
Hello! My colleague Victoria Shepard reported that pgbench might crash during initialization with some values of shared_buffers and max_worker_processes in conf. After some research, I found this happens when LimitAdditionalPins() returns exactly zero. In the current master, this will happen e.g. if shared_buffers = 10MB and max_worker_processes = 40. Then the command "pgbench --initialize postgres" will lead to a crash. See the backtrace attached. There is a fix in the attached patch. Please take a look at it. With the best regards, -- Anton A. Melnikov Postgres Professional: http://www.postgrespro.com The Russian Postgres Company

#1 0x7f9169557859 in __GI_abort () at abort.c:79 save_stage = 1 act = {__sigaction_handler = {sa_handler = 0x7f91696eabf7, sa_sigaction = 0x7f91696eabf7}, sa_mask = {__val = {1, 140262515851294, 3, 140727328224084, 12, 140262515851298, 2, 3487533463194566656, 7292515507211941683, 94490218952736, 7291953849184874368, 0, 6147461878018348800, 140727328224176, 94490235583952, 140727328225056}}, sa_flags = 938440736, sa_restorer = 0x7ffda268ef80} sigs = {__val = {32, 0 }} #2 0x55f03865f3a8 in ExceptionalCondition (conditionName=0x55f038846b8c "nblocks > 0", fileName=0x55f038846997 "md.c", lineNumber=534) at assert.c:66 No locals. #3 0x55f038479e41 in mdzeroextend (reln=0x55f038ed6e38, forknum=MAIN_FORKNUM, blocknum=0, nblocks=0, skipFsync=false) at md.c:534 v = 0x55f038d96300 curblocknum = 0 remblocks = 0 __func__ = "mdzeroextend" #4 0x55f03847c747 in smgrzeroextend (reln=0x55f038ed6e38, forknum=MAIN_FORKNUM, blocknum=0, nblocks=0, skipFsync=false) at smgr.c:525 No locals. 
#5 0x55f03842fc72 in ExtendBufferedRelShared (eb=..., fork=MAIN_FORKNUM, strategy=0x55f038e1d7a8, flags=8, extend_by=0, extend_upto=4294967295, buffers=0x7ffda268ba30, extended_by=0x7ffda268b8fc) at bufmgr.c:2057 first_block = 0 io_context = IOCONTEXT_BULKWRITE io_start = {ticks = 0} __func__ = "ExtendBufferedRelShared" #6 0x55f03842f512 in ExtendBufferedRelCommon (eb=..., fork=MAIN_FORKNUM, strategy=0x55f038e1d7a8, flags=8, extend_by=17, extend_upto=4294967295, buffers=0x7ffda268ba30, extended_by=0x7ffda268b9dc) at bufmgr.c:1805 first_block = 22000 #7 0x55f03842de78 in ExtendBufferedRelBy (eb=..., fork=MAIN_FORKNUM, strategy=0x55f038e1d7a8, flags=8, extend_by=17, buffers=0x7ffda268ba30, extended_by=0x7ffda268b9dc) at bufmgr.c:862 No locals. #8 0x55f037f773fa in RelationAddBlocks (relation=0x7f91655d97b8, bistate=0x55f038e1d778, num_pages=17, use_fsm=false, did_unlock=0x7ffda268bb8d) at hio.c:324 victim_buffers = {1, 0, 953770752, 22000, -1570194736, 0, 955084344, 22000, 955072168, 22000, 0, 0, -1570194800, 32765, 944220747, 22000, 16384, 0, 955084344, 22000, 0, 0, 953770752, 22000, -1570194752, 32765, 944228643, 22000, -1570194752, 0, 955084344, 22000, 0, 0, 1700632504, 0, -1570194704, 32765, 939296560, 22000, -1570194624, 0, 1700632504, 32657, 16384, 0, 0, 0, -1570194672, 32765, 943901980, 22000, 8000, 0, 1700632504, 32657, -1570194624, 32765, 943923688, 22000, -1570194512, 0, 1700632504, 32657} first_block = 4294967295 last_block = 4294967295 extend_by_pages = 17 not_in_fsm_pages = 17 buffer = 22000 page = 0xa268ba20 __func__ = "RelationAddBlocks" #9 0x55f037f77d01 in RelationGetBufferForTuple (relation=0x7f91655d97b8, len=128, otherBuffer=0, options=6, bistate=0x55f038e1d778, vmbuffer=0x7ffda268bc2c, vmbuffer_other=0x0, num_pages=17) at hio.c:749 use_fsm = false buffer = 0 page = 0x6a268bc34 nearlyEmptyFreeSpace = 8016 pageFreeSpace = 0 saveFreeSpace = 0 targetFreeSpace = 128 targetBlock = 4294967295 otherBlock = 4294967295 unlockedTargetBuffer = 127 
recheckVmPins = false __func__ = "RelationGetBufferForTuple" #10 0x55f037f5e5e2 in heap_multi_insert (relation=0x7f91655d97b8, slots=0x55f038e37b08, ntuples=1000, cid=15, options=6, bistate=0x55f038e1d778) at heapam.c:2193 buffer = 32657 all_visible_cleared = false all_frozen_set = false nthispage = -1570194336 xid = 734 heaptuples = 0x55f038ed1e98 i = 1000 ndone = 0 scratch = {data = "мh\242\001\000\000\000\204\352}e\221\177\000\000\340\274h\242\375\177\000\000\v|F8\360U\000\000\020\275h\242\001\000\000\000\204\352}e\221\177\000\000\020\275h\242\375\177\000\000\211\233F8\360U\000\000\020\275h\001\000\000\224\223\200\352}e\221\177\000\000`\267\241\000\000\000\000 \000\000\000\000\001\000\000\000\260\275h\242\375\177\000\000\242\352B8\004\000\000\000P\275h\242\375\177\000\000\342\333J8\360U\000\000\000\000\000\000\002\000\000\000\000\000\000\000\004\000\000\000\000\000\000\000p\1
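The crash path above (extend_by=0 reaching the storage layer, which asserts nblocks > 0) can be modeled with a toy clamp. The function name and the exact guard are assumptions for illustration only, not the committed fix:

```c
#include <assert.h>

/*
 * Toy model of the failure mode: the buffer manager computes how many
 * pages to extend a relation by, a pin limit of zero clamps that down
 * to zero pages, and mdzeroextend() then hits its nblocks > 0 assert.
 * One hypothetical guard is to never let the clamp reduce the
 * extension below one page.
 */
static unsigned
clamp_extend_by(unsigned requested_pages, unsigned additional_pins_limit)
{
    unsigned extend_by = requested_pages;

    if (extend_by > additional_pins_limit)
        extend_by = additional_pins_limit;

    /* guard: a zero pin limit must not produce a zero-page extension */
    if (extend_by == 0)
        extend_by = 1;

    return extend_by;
}
```

With such a guard, the num_pages=17 request in the backtrace would be clamped to 1 instead of 0 when the pin limit is exhausted.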
Re: Making Vars outer-join aware
On 08.06.2023 19:58, Tom Lane wrote:
> I think the right thing here is not either of your patches, but to tweak adjust_relid_set() to not fail on negative oldrelid. I'll go make it so.

Thanks! This fully solves the problem with ChangeVarNodes() that I wrote about above.

> Hmm. That implies that you're changing plan data structures around after setrefs.c, which doesn't seem like a great design to me --- IMO that ought to happen in PlanCustomPath, which will still see the original varnos.

My further research showed that it is possible to set the necessary varnos immediately during custom_scan->scan.plan.targetlist creation and leave custom_scan->custom_scan_tlist = NIL, rather than changing them later using ChangeVarNodes(). This resulted in a noticeable code simplification. Thanks a lot for pointing it out!

Sincerely yours,

--
Anton A. Melnikov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
Re: Making Vars outer-join aware
Hello, and sorry for the big delay in replying!

On 04.05.2023 15:22, Tom Lane wrote:
> AFAICS the change you propose would serve only to mask bugs.

Yes, it's a bad decision.

> Under what circumstances would you be trying to inject INDEX_VAR into a nullingrel set? Only outer-join relids should ever appear there.

The thing is that I don't try to push INDEX_VAR into a nullingrel set at all; I just try to replace the existing rt_index that equals INDEX_VAR in Var nodes with defined positive indexes by using a ChangeVarNodes_walker() function call. It checks whether the nullingrel contains the existing rt_index and whether it needs to be updated too. It calls bms_is_member() for that, but the latter immediately throws an error if the value to be checked is negative, like INDEX_VAR. But we are not trying to corrupt the existing nullingrel with this bad index, so it doesn't seem like a serious error. And this index certainly cannot be a member of the Bitmapset. Therefore it also seems better and more logical to me to immediately return false in the case of an index that cannot possibly be a member of the Bitmapset. Here is a patch like that.

With the best regards,

--
Anton A. Melnikov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

commit cc1724759b898efc703867a83d38173e4b2794b5
Author: Anton A. Melnikov
Date:   Mon May 29 13:52:42 2023 +0300

    Return false from bms_is_member() if checked value is negative.

diff --git a/src/backend/nodes/bitmapset.c b/src/backend/nodes/bitmapset.c
index 7ba3cf635b..3e1db5fda2 100644
--- a/src/backend/nodes/bitmapset.c
+++ b/src/backend/nodes/bitmapset.c
@@ -446,9 +446,9 @@ bms_is_member(int x, const Bitmapset *a)
 	int			wordnum,
 				bitnum;
 
-	/* XXX better to just return false for x<0 ? */
+	/* bitmapset member cannot be negative */
 	if (x < 0)
-		elog(ERROR, "negative bitmapset member not allowed");
+		return false;
 	if (a == NULL)
 		return false;
 	wordnum = WORDNUM(x);
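The semantics the patch above proposes can be shown with a tiny single-word model of bms_is_member(); this sketch is for illustration only (real bitmapsets are multi-word, and INDEX_VAR's concrete value is defined in PostgreSQL headers):

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Toy single-word model of the proposed bms_is_member() behavior: a
 * negative value such as INDEX_VAR can never be a set member, so the
 * patched function returns false instead of raising an error.
 */
static bool
toy_bms_is_member(int x, unsigned long long word)
{
    if (x < 0)
        return false;           /* was: elog(ERROR, ...) */
    if (x >= 64)
        return false;           /* past the only word we model */
    return (word >> x) & 1;
}
```

The point of the change is exactly the first branch: a caller probing with a negative pseudo-varno gets a harmless "not a member" answer instead of an error.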
Re: Making Vars outer-join aware
Hello! I'm having doubts about this fix, but most likely I don't understand something. Could you help me figure it out, please?

The thing is that for custom scan nodes, as the README says: "INDEX_VAR is abused to signify references to columns of a custom scan tuple type". But INDEX_VAR has a negative value, so it cannot be used in a varnullingrels bitmapset. Therefore this improvement seems like it will not work with custom scan nodes and some extensions that use such nodes. If I'm wrong in my doubts and the bitmapset for varnullingrels is OK, maybe add a check before the adjust_relid_set() call, like this:

@@ -569,9 +569,10 @@ ChangeVarNodes_walker(Node *node, ChangeVarNodes_context *context)
 	{
 		if (var->varno == context->rt_index)
 			var->varno = context->new_index;
-		var->varnullingrels = adjust_relid_set(var->varnullingrels,
-											   context->rt_index,
-											   context->new_index);
+		if (context->rt_index >= 0 && context->new_index >= 0)
+			var->varnullingrels = adjust_relid_set(var->varnullingrels,
+												   context->rt_index,
+

With the best wishes,

--
Anton A. Melnikov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
Re: [BUG] Logical replica crash if there was an error in a function.
On 05.04.2023 17:35, Tom Lane wrote:
> "Anton A. Melnikov" writes:
>> On 03.04.2023 21:49, Tom Lane wrote:
>>> I did not think this case was worth memorializing in a test before, and I still do not. I'm inclined to reject this patch.
>> Could you help me to figure out, please.
> The problem was an Assert that was speculative when it went in, and which we eventually found was wrong in the context of logical replication. We removed the Assert. I don't think we need a test case to keep us from putting back the Assert. That line of thinking leads to test suites that run for fourteen hours and are near useless because developers can't run them easily.
>
> regards, tom lane

Ok, I understand! Thanks a lot for the clarification. A rather important point; I'll take it into account for the future. Let's do that. Revoked the patch.

With the best wishes!

--
Anton A. Melnikov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
Re: [BUG] Logical replica crash if there was an error in a function.
Hello!

On 03.04.2023 21:49, Tom Lane wrote:
> "Anton A. Melnikov" writes:
>> Now there are no pending questions, so moved it to RFC.
> I did not think this case was worth memorializing in a test before, and I still do not. I'm inclined to reject this patch.

Earlier, when discussing this test, there was a suggestion like this: "If we were just adding a query or two to an existing scenario, that could be okay." The current version of the test seems to satisfy this condition: the queries added do not affect the total test time within the measurement error. This case is rare, of course, but it really took place in practice. So either there are some more reasons, which I do not understand, why this test should not be accepted, or I misunderstood something above. Could you help me figure it out, please? I would be very grateful.

Sincerely yours,

--
Anton A. Melnikov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
Re: [BUG] Logical replica crash if there was an error in a function.
Hello!

On 15.03.2023 21:29, Gregory Stark (as CFM) wrote:
> These patches that are "Needs Review" and have received no comments at all since before March 1st are these. If your patch is amongst this list I would suggest any of: 1) Move it yourself to the next CF (or withdraw it) 2) Post to the list with any pending questions asking for specific feedback -- it's much more likely to get feedback than just a generic "here's a patch plz review"... 3) Mark it Ready for Committer and possibly post explaining the resolution to any earlier questions to make it easier for a committer to understand the state

There were some remarks:
1) Very poor use of test cycles (by Tom Lane). Solved in v2 by adding a few extra queries to an existing test without any additional syncing.
2) The patch-tester fails on all platforms (by Andres Freund). Fixed in v3.
3) A warning about the "my" variable $result and a suggestion to correct a comment (by vignesh C). Both fixed in v4.

Now there are no pending questions, so moved it to RFC.

With the best regards!

--
Anton A. Melnikov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

commit 16beb574708e0e2e993eb2fa1b093be97b8e53cf
Author: Anton A. Melnikov
Date:   Sun Dec 11 06:26:03 2022 +0300

    Add test for syntax error in the function in a logical replication worker.
    
    See dea834938.

diff --git a/src/test/subscription/t/100_bugs.pl b/src/test/subscription/t/100_bugs.pl
index 036576acab..b241a7d498 100644
--- a/src/test/subscription/t/100_bugs.pl
+++ b/src/test/subscription/t/100_bugs.pl
@@ -69,6 +69,51 @@ $node_publisher->wait_for_catchup('sub1');
 
 pass('index predicates do not cause crash');
 
+# https://www.postgresql.org/message-id/flat/adf0452f-8c6b-7def-d35e-ab516c80088e%40inbox.ru
+
+# The bug was that when a syntax error occurred in a SQL-language or PL/pgSQL-language
+# CREATE FUNCTION or DO command executed in a logical replication worker,
+# we'd suffer a null pointer dereference or assertion failure.
+
+$node_subscriber->safe_psql('postgres', q{
+	create or replace procedure rebuild_test(
+	) as
+	$body$
+	declare
+		l_code text:= E'create or replace function public.test_selector(
+	) returns setof public.tab1 as
+	\$body\$
+		select * from error_name
+	\$body\$ language sql;';
+	begin
+		execute l_code;
+		perform public.test_selector();
+	end
+	$body$ language plpgsql;
+	create or replace function test_trg()
+		returns trigger as
+		$body$
+		declare
+		begin
+			call public.rebuild_test();
+			return NULL;
+		end
+		$body$ language plpgsql;
+	create trigger test_trigger after insert on tab1 for each row
+		execute function test_trg();
+	alter table tab1 enable replica trigger test_trigger;
+});
+
+# This command will cause a crash on the replica if that bug hasn't been fixed
+$node_publisher->safe_psql('postgres', "INSERT INTO tab1 VALUES (3, 4)");
+
+my $result = $node_subscriber->wait_for_log(
+	"ERROR: relation \"error_name\" does not exist at character"
+);
+
+ok($result,
+	"ERROR: Logical decoding doesn't fail on function error");
+
 # We'll re-use these nodes below, so drop their replication state.
 $node_subscriber->safe_psql('postgres', "DROP SUBSCRIPTION sub1");
 $node_publisher->safe_psql('postgres', "DROP PUBLICATION pub1");
@@ -352,7 +397,7 @@ $node_subscriber->wait_for_subscription_sync($node_publisher, 'tap_sub_sch');
 
 # Subscriber's sch1.t1 should receive the row inserted into the new sch1.t1,
 # but not the row inserted into the old sch1.t1 post-rename.
-my $result = $node_subscriber->safe_psql('postgres', "SELECT * FROM sch1.t1");
+$result = $node_subscriber->safe_psql('postgres', "SELECT * FROM sch1.t1");
 is( $result, qq(1
2),
 	'check data in subscriber sch1.t1 after schema rename');
Re: May be BUG. Periodic burst growth of the checkpoint_req counter on replica.
Hello!

On 15.03.2023 21:29, Gregory Stark (as CFM) wrote:
> These patches that are "Needs Review" and have received no comments at all since before March 1st are these. If your patch is amongst this list I would suggest any of: 1) Move it yourself to the next CF (or withdraw it) 2) Post to the list with any pending questions asking for specific feedback -- it's much more likely to get feedback than just a generic "here's a patch plz review"... 3) Mark it Ready for Committer and possibly post explaining the resolution to any earlier questions to make it easier for a committer to understand the state

There are two different patch variants, and some discussion is expected, so moved them to the next CF.

With the best wishes!

--
Anton A. Melnikov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

commit 6f3ab7b6b5562aaf68ccfdd6dde17c27c2d2869e
Author: Anton A. Melnikov
Date:   Wed Dec 7 10:10:58 2022 +0300

    Remove burst growth of the checkpoint_req on replica.
    
    With corrections according to comment:
    https://www.postgresql.org/message-id/20220907.163946.2214338139097492905.horikyota.ntt%40gmail.com

diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index a31fbbff78..02d86bf370 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -8935,3 +8935,20 @@ SetWalWriterSleeping(bool sleeping)
 	XLogCtl->WalWriterSleeping = sleeping;
 	SpinLockRelease(&XLogCtl->info_lck);
 }
+
+/*
+ * Check if a new checkpoint WAL record has been received since the
+ * last restartpoint. So it is possible to create a new one.
+ */
+bool
+RestartPointAvailable(void)
+{
+	bool		result = false;
+
+	SpinLockAcquire(&XLogCtl->info_lck);
+	if (!XLogRecPtrIsInvalid(XLogCtl->lastCheckPointRecPtr)
+		&& XLogCtl->lastCheckPoint.redo > XLogCtl->RedoRecPtr)
+		result = true;
+	SpinLockRelease(&XLogCtl->info_lck);
+
+	return result;
+}

diff --git a/src/backend/access/transam/xlogrecovery.c b/src/backend/access/transam/xlogrecovery.c
index 97b882564f..a88c716197 100644
--- a/src/backend/access/transam/xlogrecovery.c
+++ b/src/backend/access/transam/xlogrecovery.c
@@ -3196,7 +3196,15 @@ XLogPageRead(XLogReaderState *xlogreader, XLogRecPtr targetPagePtr, int reqLen,
 		{
 			(void) GetRedoRecPtr();
 			if (XLogCheckpointNeeded(readSegNo))
-				RequestCheckpoint(CHECKPOINT_CAUSE_XLOG);
+			{
+				/*
+				 * If there is no new checkpoint WAL records since the
+				 * last restartpoint the creation of new one
+				 * will certainly fail, so skip it.
+				 */
+				if (RestartPointAvailable())
+					RequestCheckpoint(CHECKPOINT_CAUSE_XLOG);
+			}
 		}
 	}

diff --git a/src/include/access/xlogrecovery.h b/src/include/access/xlogrecovery.h
index f3398425d8..dcc2e64ca7 100644
--- a/src/include/access/xlogrecovery.h
+++ b/src/include/access/xlogrecovery.h
@@ -83,6 +83,7 @@ extern void InitWalRecovery(ControlFileData *ControlFile,
 							bool *wasShutdown_ptr, bool *haveBackupLabel_ptr,
 							bool *haveTblspcMap_ptr);
 extern void PerformWalRecovery(void);
+extern bool RestartPointAvailable(void);
 
 /*
  * FinishWalRecovery() returns this. It contains information about the point

commit 9ec1f90bbe407da94f8e940084e8815b4f8e874b
Author: Anton A. Melnikov
Date:   Wed Dec 7 10:49:19 2022 +0300

    Add statistic about restartpoint into pg_stat_bgwriter

diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index 2d8104b090..658cafdc03 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -1107,6 +1107,9 @@ CREATE VIEW pg_stat_bgwriter AS
     SELECT
         pg_stat_get_bgwriter_timed_checkpoints() AS checkpoints_timed,
         pg_stat_get_bgwriter_requested_checkpoints() AS checkpoints_req,
+        pg_stat_get_bgwriter_timed_restartpoints() AS restartpoints_timed,
+        pg_stat_get_bgwriter_requested_restartpoints() AS restartpoints_req,
+        pg_stat_get_bgwriter_performed_restartpoints() AS restartpoints_done,
         pg_stat_get_checkpoint_write_time() AS checkpoint_write_time,
         pg_stat_get_checkpoint_sync_time() AS checkpoint_sync_time,
         pg_stat_get_bgwriter_buf_written_checkpoints() AS buffers_checkpoint,

diff --git a/src/backend/postmaster/checkpointer.c b/src/backend/postmaster/checkpointer.c
index 5fc076fc14..2296701ddf 100644
--- a/src/backend/postmaster/checkpointer.c
+++ b/src/backend/postmaster/checkpointer.c
@@ -350,6 +350,8 @@ CheckpointerMain(void)
 		pg_time_t	now;
 		int			elapsed_secs;
 		int			cur_timeout;
+		bool		chkpt_or_rstpt_requested = false;
+		bool		chkpt_or_rstpt_timed = false;
 
 		/* Clear any already-pending wakeups */
 		ResetLatch(MyLatch);
@@ -368,7 +370,7 @@ CheckpointerMain(void)
 		if (((volatile CheckpointerShmemStruct *) CheckpointerShmem)->ckpt_flags)
 		{
 			do_checkpoint = true;
-			PendingCheckpointerStats.requested_checkpoints++;
+			chkpt_o
Re: odd buildfarm failure - "pg_ctl: control file appears to be corrupt"
On 08.03.2023 07:28, Thomas Munro wrote: Sorry, I was confused; please ignore that part. We don't have a copy of the control file anywhere else. (Perhaps we should, but that could be a separate topic.) That’s all right! Fully agreed that this is a possible separate topic. Sincerely yours, -- Anton A. Melnikov Postgres Professional: http://www.postgrespro.com The Russian Postgres Company
Re: odd buildfarm failure - "pg_ctl: control file appears to be corrupt"
pg_backup_stop() and specify in the docs that one needs to copy it from there. I suppose we have the right to ask the user to perform some manual manipulations here, as with backup_label.

> Could we make better use of the safe copy that we have in the log? Then the pg_backup_start() subproblem would disappear. Conceptually, that'd be just like the way we use FPI for data pages copied during a backup. I'm not sure about any of that, though, it's just an idea, not tested.

Sorry, I didn't understand the question about the log. Could you please explain what kind of log you meant, and where I can look at this safe-copy creation in the code?

With the best wishes,

--
Anton A. Melnikov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
Re: odd buildfarm failure - "pg_ctl: control file appears to be corrupt"
a mandatory lock)

As for external pg_backup_start()-based tools: if somebody wants to take the lock while copying pg_control, I suppose it's a normal case. They may have to wait a bit until we release the lock, as in lock_controlfile(). Moreover, this is a very desirable property, as it guarantees the integrity of pg_control, provided F_SETLKW is used rather than the non-waiting F_SETLK.

Backing up locked files is commonplace on Windows, and all standard backup tools handle it well via VSS (Volume Shadow Copy), including the built-in Windows backup tool. Just in case, I tested it experimentally. During the torture test, the first try to copy pg_control predictably caught:

Error: Cannot read C:\pgbins\master\data\global\pg_control! "The process cannot access the file because another process has locked a portion of the file."

But a copy using the built-in Windows backup utility successfully copied it with VSS.

> One cute hack I thought of to make the file lock effectively advisory on Windows is to lock a byte range *past the end* of the file (the documentation says you can do that). That shouldn't stop programs that want to read the file without locking and don't know/care about our scheme (that is, pre-existing backup tools that haven't considered this problem and remain oblivious or accept the (low) risk of torn reads), but will block other participants in our scheme.

A very interesting idea. It makes sense when someone is using external backup tools that cannot work with VSS. But the use of such tools under Windows is highly doubtful, I guess: they would not be able to back up many other applications or the Windows system itself. Let me second your suggestion that it'd be good to hear from backup gurus what they think about that.

> or copy garbage on Linux (as they can today, I assume), with non-zero probability -- at least when copying files from a hot standby.
> Or backup tools might want to get the file contents through some entirely different mechanism that does the right interlocking (whatever that might be, maybe inside the server). Perhaps this is not so much the localised systems programming curiosity I thought it was, and has implications that'd need to be part of the documented low-level backup steps. It makes me like the idea a bit less. It'd be good to hear from backup gurus what they think about that. If we went back to the keep-rereading-until-it-stops-changing model, then an external backup tool would need to be prepared to do that too, in theory at least. Maybe some already do something like that? Or maybe the problem is/was too theoretical before...

As far as I understand, this problem has always existed, but its probability is extremely small in practice, which is directly noted in the docs [4]: "So while it is theoretically a weak spot, pg_control does not seem to be a problem in practice."

> Here's a patch like that.

In this patch, the problem is solved for the live database and may remain for some possible cases of an external backup. On the whole, I think it can be stated that this is a sensible step forward. Just like last time, I ran a long stress test under Windows with the current patch. There were no errors for more than 3 days, even with fsync=off.

[1] https://lwn.net/Articles/789600/
[2] https://github.com/ut-osa/txfs
[3] https://en.wikipedia.org/wiki/Transactional_NTFS
[4] https://www.postgresql.org/docs/devel/wal-internals.html

With the best wishes!

--
Anton A. Melnikov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
Re: odd buildfarm failure - "pg_ctl: control file appears to be corrupt"
Hi, Thomas!

On 14.02.2023 06:38, Anton A. Melnikov wrote:
> Also i did several experiments with fsync=on and found more appropriate behavior: The stress test with sizeof(ControlFileData) = 512+8 = 520 bytes failed in a 4,5 hours, but the other one with ordinary sizeof(ControlFileData) = 296 not crashed in more than 12 hours.

Nonetheless it crashed after 18 hours:

2023-02-13 18:07:21.476 MSK [7640] LOG: starting PostgreSQL 16devel, compiled by Visual C++ build 1929, 64-bit
2023-02-13 18:07:21.483 MSK [7640] LOG: listening on IPv6 address "::1", port 5432
2023-02-13 18:07:21.483 MSK [7640] LOG: listening on IPv4 address "127.0.0.1", port 5432
2023-02-13 18:07:21.556 MSK [1940] LOG: database system was shut down at 2023-02-13 18:07:12 MSK
2023-02-13 18:07:21.590 MSK [7640] LOG: database system is ready to accept connections
@ sizeof(ControlFileData) = 296
2023-02-13 18:12:21.545 MSK [9532] LOG: checkpoint starting: time
2023-02-13 18:12:21.583 MSK [9532] LOG: checkpoint complete: wrote 3 buffers (0.0%); 0 WAL file(s) added, 0 removed, 0 recycled; write=0.003 s, sync=0.009 s, total=0.038 s; sync files=2, longest=0.005 s, average=0.005 s; distance=0 kB, estimate=0 kB; lsn=0/17AC388, redo lsn=0/17AC350
2023-02-14 12:12:21.738 MSK [8676] ERROR: calculated CRC checksum does not match value stored in file
2023-02-14 12:12:21.738 MSK [8676] CONTEXT: SQL statement "SELECT pg_control_system()" PL/pgSQL function inline_code_block line 1 at PERFORM
2023-02-14 12:12:21.738 MSK [8676] STATEMENT: do $$ begin loop perform pg_control_system(); end loop; end; $$;

So all of the following is incorrect:

> Seems in that case the behavior corresponds to msdn. So if it is possible to use fsync() under windows when the GUC fsync is off it maybe a solution for this problem. If so there is no need to lock the pg_control file under windows at all.

and cannot be a solution.

Sincerely yours,

--
Anton A. Melnikov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
Re: odd buildfarm failure - "pg_ctl: control file appears to be corrupt"
Hi, Thomas!

Thanks for your rapid answer, and sorry for my delayed reply.

On 01.02.2023 09:45, Thomas Munro wrote:
>> Might add a custom error message for EDEADLK since it absent in errcode_for_file_access()?
> Ah, good thought. I think it shouldn't happen™, so it's OK that errcode_for_file_access() would classify it as ERRCODE_INTERNAL_ERROR.

Yes, I also think that is impossible, since the lock is taken on the entire file, so ERRCODE_INTERNAL_ERROR will be right here.

> Other interesting errors are: ENOLCK: system limits exceeded; PANIC seems reasonable EOPNOTSUPP: this file doesn't support locking (seen on FreeBSD man pages, not on POSIX)

Agreed that ENOLCK is a PANIC, or at least FATAL. Maybe it's even better to make it FATAL, to allow other backends to survive? As for EOPNOTSUPP, maybe fall back to the workaround from the first variant of the patch? (In my previous letter I forgot the pause after break;, of course.)

> I don't know if Windows suffers from this type of problem. Unfortunately its equivalent functionality LockFile() looks non-ideal for this purpose: if your program crashes, the documentation is very vague on when exactly it is released by the OS, but it's not immediately on process exit. That seems non-ideal for a control file you might want to access again very soon after a crash, to be able to recover.

Unfortunately I've not had time to reproduce the problem and apply this patch on Windows yet, but I'm going to do it soon on Windows 10. If a CRC error occurs there, then we might use the workaround from the first variant of the patch.

> Thank you for investigating. I am afraid to read your results.

At first it seemed to me that this is not a problem at all, since MSDN guarantees sector-by-sector atomicity: "Physical Sector: The unit for which read and write operations to the device are completed in a single operation. This is the unit of atomic write..."
https://learn.microsoft.com/en-us/windows/win32/fileio/file-buffering
https://learn.microsoft.com/en-us/windows/win32/api/fileapi/nf-fileapi-writefile
(Of course, only if the 512 bytes lie at the beginning of the file with zero offset, but this is our case. The current size of ControlFileData is 296 bytes at offset 0.)

I tried to verify this fact experimentally and was very surprised. Unfortunately, it crashed in an hour during the torture test:

2023-02-13 15:10:52.675 MSK [5704] LOG: starting PostgreSQL 16devel, compiled by Visual C++ build 1929, 64-bit
2023-02-13 15:10:52.768 MSK [5704] LOG: database system is ready to accept connections
@@ sizeof(ControlFileData) = 296
.
2023-02-13 16:10:41.997 MSK [9380] ERROR: calculated CRC checksum does not match value stored in file

But fortunately, this only happens when fsync=off.

Also I did several experiments with fsync=on and found more appropriate behavior: the stress test with sizeof(ControlFileData) = 512+8 = 520 bytes failed in 4.5 hours, but the other one with the ordinary sizeof(ControlFileData) = 296 did not crash in more than 12 hours. It seems in that case the behavior corresponds to MSDN. So if it is possible to use fsync() under Windows when the GUC fsync is off, it may be a solution for this problem. If so, there would be no need to lock the pg_control file under Windows at all.

>> May be choose it in accordance with GUC wal_sync_method?
> Here's a patch like that. I don't know if it's a good idea for wal_sync_method to affect other kinds of files or not, but, then, it already does (fsync_writethough changes this behaviour).

+1.
Looks like it needs a little fix:

+++ b/src/common/controldata_utils.c
@@ -316,7 +316,7 @@ update_controlfile(const char *DataDir,
 		if (pg_fsync(fd) != 0)
 			ereport(PANIC,
 					(errcode_for_file_access(),
-					 errmsg("could not fdatasync file \"%s\": %m",
+					 errmsg("could not fsync file \"%s\": %m",
 							ControlFilePath)));

And it may be combined with 0001-Lock-pg_control-while-reading-or-writing.patch.

> I would only want to consider this if we also stop choosing "open_datasync" as a default on at least Windows.

I didn't quite understand this point. Could you clarify it for me, please?

> If the performance of UpdateMinRecoveryPoint() wasn't a problem we could just use fsync on all platforms.

Sincerely yours,

--
Anton A. Melnikov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
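The little fix above also hints at the underlying idea: the message in update_controlfile() should name whichever sync primitive the wal_sync_method setting selects. A hedged toy mapping (the enum and function names are assumptions for illustration; the real code dispatches through pg_fsync()/pg_fdatasync() and supports more methods):

```c
#include <assert.h>
#include <string.h>

/* Simplified stand-in for the wal_sync_method setting. */
typedef enum
{
    TOY_SYNC_FSYNC,
    TOY_SYNC_FDATASYNC,
    TOY_SYNC_OPEN_DATASYNC
} ToySyncMethod;

/*
 * Return the name of the syscall a control-file update would use for a
 * given method, so the error message can match the actual call made.
 */
static const char *
control_file_sync_call(ToySyncMethod method)
{
    switch (method)
    {
        case TOY_SYNC_FDATASYNC:
            return "fdatasync";
        default:
            /* fall back to a full fsync for anything else */
            return "fsync";
    }
}
```

The point of the message fix is simply to keep `%m` diagnostics truthful: if the code path calls fsync(), the error text should say "fsync", not "fdatasync".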
Re: odd buildfarm failure - "pg_ctl: control file appears to be corrupt"
Hi, Thomas!

There are two variants of the patch now.

1) As for the first workaround:

On 31.01.2023 07:09, Thomas Munro wrote:
> Maybe it's unlikely that two samples will match while running that torture test, because it's overwriting the file as fast as it can. But the idea is that a real system isn't writing the control file most of the time. Yeah, I was thinking that we should also put a limit on the loop, just to be cautious.

At first I didn't understand that the equality condition between the previously calculated CRC and the current one on the second and later attempts was intended for the case when the pg_control file is really corrupted. Indeed, after adding a few debugging variables and running the torture test, I found that there were ~4000 CRC errors (~0.0003%) in ~125 million reads from the file, and there was no case when the CRC error appeared twice in a row. So the second, and all the more the third, successive occurrence of a CRC error can be neglected, and for this workaround a simpler and maybe clearer algorithm seems possible. For instance:

for (try = 0; try < 3; try++)
{
	open, read from and close pg_control;
	calculate crc;
	*crc_ok_p = EQ_CRC32C(crc, ControlFile->crc);
	if (*crc_ok_p)
		break;
}

2) As for the second variant of the patch, with POSIX locks:

On 31.01.2023 14:38, Thomas Munro wrote:
> On Tue, Jan 31, 2023 at 5:09 PM Thomas Munro wrote:
>> Clearly there is an element of speculation or superstition here. I don't know what else to do if both PostgreSQL and ext4 decided not to add interlocking. Maybe we should rethink that. How bad would it really be if control file access used POSIX file locking? I mean, the writer is going to *fsync* the file, so it's not like one more wafer thin system call is going to hurt too much.
> Here's an experimental patch for that alternative. I wonder if someone would want to be able to turn it off for some reason -- maybe some NFS problem? It's less back-patchable, but maybe more principled?

It looks very strange to me that there may be cases where the cluster data is stored on NFS; I can't figure out how this can be useful. I think this variant of the patch is a proper solution to the problem, not a workaround. Found no problems on Linux. +1 for this variant.

Might we add a custom error message for EDEADLK, since it is absent from errcode_for_file_access()?

> I don't know if Windows suffers from this type of problem. Unfortunately its equivalent functionality LockFile() looks non-ideal for this purpose: if your program crashes, the documentation is very vague on when exactly it is released by the OS, but it's not immediately on process exit. That seems non-ideal for a control file you might want to access again very soon after a crash, to be able to recover.

Unfortunately I've not had time to reproduce the problem and apply this patch on Windows yet, but I'm going to do it soon on Windows 10. If a CRC error occurs there, then we might use the workaround from the first variant of the patch.

> A thought in passing: if UpdateMinRecoveryPoint() performance is an issue, maybe we should figure out how to use fdatasync() instead of fsync().

Maybe choose it in accordance with the GUC wal_sync_method?

Sincerely yours,

--
Anton A. Melnikov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
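The bounded three-try loop sketched in point 1) of this message could look roughly like the following; the helper here is a stand-in that simulates one torn read followed by consistent reads (the real code would open, read, and close pg_control on each iteration and compute a CRC32C):

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define STORED_CRC 0xDEADBEEFu      /* pretend value stored in pg_control */

static int reads_done = 0;

/* Stand-in for one open/read/close/compute-CRC pass over pg_control. */
static uint32_t
simulated_read_and_crc(void)
{
    reads_done++;
    /* simulate a torn first read, then stable correct reads */
    return (reads_done == 1) ? 0u : STORED_CRC;
}

/*
 * Re-read until the computed CRC matches the stored one, giving up
 * after three attempts so a genuinely corrupt file cannot cause an
 * endless loop.
 */
static bool
read_controlfile_with_retries(void)
{
    bool crc_ok = false;

    for (int attempt = 0; attempt < 3 && !crc_ok; attempt++)
        crc_ok = (simulated_read_and_crc() == STORED_CRC);

    return crc_ok;
}
```

This mirrors the measurement quoted above: a torn read essentially never repeats, so a second read almost always succeeds, and the cap of three bounds the worst case.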
Re: odd buildfarm failure - "pg_ctl: control file appears to be corrupt"
Hello! On 24.11.2022 04:02, Thomas Munro wrote: On Thu, Nov 24, 2022 at 11:05 AM Tom Lane wrote: Thomas Munro writes: ERROR: calculated CRC checksum does not match value stored in file The attached draft patch fixes it. Tried to catch this error on my PC, but failed to do it within a reasonable time. The 1s interval is probably too long for me. It seems there are more durable way to reproduce this bug with 0001 patch applied: At the first backend: do $$ begin loop perform pg_update_control_file(); end loop; end; $$; At the second one: do $$ begin loop perform pg_control_system(); end loop; end; $$; It will fails almost immediately with: "ERROR: calculated CRC checksum does not match value stored in file" both with fsync = off and fsync = on. Checked it out for master and REL_11_STABLE. Also checked for a few hours that the patch 0002 fixes this error, but there are some questions to its logical structure. The equality between the previous and newly calculated crc is checked only if the last crc calculation was wrong, i.e not equal to the value stored in the file. It is very unlikely that in this case the previous and new crc can match, so, in fact, the loop will spin until crc is calculated correctly. In the other words, this algorithm is practically equivalent to an infinite loop of reading from a file and calculating crc while(EQ_CRC32C(crc, ControlFile->crc) != true). But then it can be simplified significantly by removing checksums equality checks, bool fist_try and by limiting the maximum number of iterations with some constant in the e.g. for loop to avoid theoretically possible freeze. Or maybe use the variant suggested by Tom Lane, i.e, as far as i understand, repeat the file_open-read-close-calculate_crc sequence twice without a pause between them and check the both calculated crcs for the equality. 
If they match, exit and return the bool result of comparing the last calculation with the value from the file; if not, take a pause and repeat everything from the beginning. In this case, there is no need to check *crc_ok_p inside get_controlfile() as in the present version; I think that's more logically correct, since this variable is intended for top-level functions and the logic inside get_controlfile() should not depend on its value. Also found a warning in the 0001 patch for the master branch. On my PC gcc gives: xlog.c:2507:1: warning: no previous prototype for ‘pg_update_control_file’ [-Wmissing-prototypes] 2507 | pg_update_control_file() Fixed it by adding #include "utils/fmgrprotos.h" to xlog.c and PG_FUNCTION_ARGS to pg_update_control_file(). With the best wishes, -- Anton A. Melnikov Postgres Professional: http://www.postgrespro.com The Russian Postgres Company
Re: [BUG] Logical replica crash if there was an error in a function.
Thanks for your remarks. On 07.01.2023 15:27, vignesh C wrote: Few suggestions: 1) There is a warning: +# This would crash on the subscriber if not fixed +$node_publisher->safe_psql('postgres', "INSERT INTO tab1 VALUES (3, 4)"); + +my $result = $node_subscriber->wait_for_log( + "ERROR: relation \"error_name\" does not exist at character" +); "my" variable $result masks earlier declaration in same scope at t/100_bugs.pl line 400. You can change: my $result = $node_subscriber->safe_psql('postgres', "SELECT * FROM sch1.t1"); to $result = $node_subscriber->safe_psql('postgres', "SELECT * FROM sch1.t1"); The reason is that the patch fell behind the master. Fixed in v4 together with rebasing on the current master. 2) Now that the crash is fixed, you could change it to a better message: +# This would crash on the subscriber if not fixed +$node_publisher->safe_psql('postgres', "INSERT INTO tab1 VALUES (3, 4)"); + +my $result = $node_subscriber->wait_for_log( + "ERROR: relation \"error_name\" does not exist at character" +); Tried to make this comment clearer. Best wishes for the new year! -- Anton A. Melnikov Postgres Professional: http://www.postgrespro.com The Russian Postgres Company

commit 16beb574708e0e2e993eb2fa1b093be97b8e53cf Author: Anton A. Melnikov Date: Sun Dec 11 06:26:03 2022 +0300 Add test for a syntax error in a function in a logical replication worker. See dea834938. 
diff --git a/src/test/subscription/t/100_bugs.pl b/src/test/subscription/t/100_bugs.pl index 036576acab..b241a7d498 100644 --- a/src/test/subscription/t/100_bugs.pl +++ b/src/test/subscription/t/100_bugs.pl @@ -69,6 +69,51 @@ $node_publisher->wait_for_catchup('sub1'); pass('index predicates do not cause crash'); +# https://www.postgresql.org/message-id/flat/adf0452f-8c6b-7def-d35e-ab516c80088e%40inbox.ru + +# The bug was that when a syntax error occurred in a SQL-language or PL/pgSQL-language +# CREATE FUNCTION or DO command executed in a logical replication worker, +# we'd suffer a null pointer dereference or assertion failure. + +$node_subscriber->safe_psql('postgres', q{ + create or replace procedure rebuild_test( + ) as + $body$ + declare + l_code text:= E'create or replace function public.test_selector( + ) returns setof public.tab1 as + \$body\$ + select * from error_name + \$body\$ language sql;'; + begin + execute l_code; + perform public.test_selector(); + end + $body$ language plpgsql; + create or replace function test_trg() + returns trigger as + $body$ + declare + begin + call public.rebuild_test(); + return NULL; + end + $body$ language plpgsql; + create trigger test_trigger after insert on tab1 for each row +execute function test_trg(); + alter table tab1 enable replica trigger test_trigger; +}); + +# This command will cause a crash on the replica if that bug hasn't been fixed +$node_publisher->safe_psql('postgres', "INSERT INTO tab1 VALUES (3, 4)"); + +my $result = $node_subscriber->wait_for_log( + "ERROR: relation \"error_name\" does not exist at character" +); + +ok($result, + "ERROR: Logical decoding doesn't fail on function error"); + # We'll re-use these nodes below, so drop their replication state. 
$node_subscriber->safe_psql('postgres', "DROP SUBSCRIPTION sub1"); $node_publisher->safe_psql('postgres', "DROP PUBLICATION pub1"); @@ -352,7 +397,7 @@ $node_subscriber->wait_for_subscription_sync($node_publisher, 'tap_sub_sch'); # Subscriber's sch1.t1 should receive the row inserted into the new sch1.t1, # but not the row inserted into the old sch1.t1 post-rename. -my $result = $node_subscriber->safe_psql('postgres', "SELECT * FROM sch1.t1"); +$result = $node_subscriber->safe_psql('postgres', "SELECT * FROM sch1.t1"); is( $result, qq(1 2), 'check data in subscriber sch1.t1 after schema rename');
Re: [BUG] pg_upgrade test fails from older versions.
On 27.12.2022 16:50, Michael Paquier wrote: If there are no other considerations, could you close the corresponding record on the January CF, please? Indeed, now marked as committed. Thanks a lot! Merry Christmas! -- Anton A. Melnikov Postgres Professional: http://www.postgrespro.com The Russian Postgres Company
Re: [BUG] pg_upgrade test fails from older versions.
Hello! On 27.12.2022 08:44, Michael Paquier wrote: It is worth noting that perlcritic was complaining here, as eval is getting used with a string. I have spent a few days looking at that, and I really want a maximum of flexibility in the rules that can be applied, so I have put a "no critic" rule, which is fine by me as this extra file is something owned by the user and it would apply only to cross-version upgrades. I think it's a very smart decision. Thank you very much! So it looks like we are now done here.. With all these pieces in place in the tests, I don't see why it would not be possible to automate the cross-version tests of pg_upgrade. I've checked the cross-upgrade test from 9.5+ to the current master and have found no problems, up to the accuracy of dump filtering. For cross-version test automation one has to write additional filtering rules in the external files. I would like to try to realize this, better in a separate thread. If there are no other considerations, could you close the corresponding record on the January CF, please? With the best wishes! -- Anton A. Melnikov Postgres Professional: http://www.postgrespro.com The Russian Postgres Company
Re: [BUG] pg_upgrade test fails from older versions.
Hello! On 23.12.2022 05:42, Michael Paquier wrote: On Thu, Dec 22, 2022 at 09:59:18AM +0300, Anton A. Melnikov wrote: 2) v2-0002-Additional-dumps-filtering.patch + # Replace specific privileges with ALL + $dump_contents =~ s/^(GRANT\s|REVOKE\s)(\S*)\s/$1ALL /mgx; This should not be in 0002, I guess.. Made a separate patch for it: v3-0001-Fix-dumps-filtering.patch On 26.12.2022 05:52, Michael Paquier wrote: On Fri, Dec 23, 2022 at 12:43:00PM +0300, Anton A. Melnikov wrote: I was looking at 0002 to add a callback to provide custom filtering rules. + my @ext_filter = split('\/', $_); Are you sure that enforcing a separation with a slash is a good idea? What if the filters include directory paths or characters that are escaped, for example? Rather than introducing a filter.regex, I would tend to just document that in the README with a small example. I have been considering a few alternatives while making this useful in most cases, still my mind always comes back to the simplest thing: just read each line of the file, chomp it and apply the pattern to the log file.. Thanks for your attention! Yes, indeed. It will really be simpler. Made it so in v3-0002-Add-external-dumps-filtering.patch With the best wishes, -- Anton A. Melnikov Postgres Professional: http://www.postgrespro.com The Russian Postgres Company

commit 602627daf32a6d33ae96ce8fbae918c9f00f0633 Author: Anton A. Melnikov Date: Mon Dec 26 06:21:32 2022 +0300 Fix dumps filtering in pg_upgrade test from older versions. Replace specific privileges in dumps with ALL due to b5d63824 and 60684dd8. diff --git a/src/bin/pg_upgrade/t/002_pg_upgrade.pl b/src/bin/pg_upgrade/t/002_pg_upgrade.pl index 4cc1469306..d23c4b2253 100644 --- a/src/bin/pg_upgrade/t/002_pg_upgrade.pl +++ b/src/bin/pg_upgrade/t/002_pg_upgrade.pl @@ -44,6 +44,8 @@ sub filter_dump $dump_contents =~ s/^\-\-.*//mgx; # Remove empty lines. 
$dump_contents =~ s/^\n//mgx; + # Replace specific privileges with ALL + $dump_contents =~ s/^(GRANT\s|REVOKE\s)(\S*)\s/$1ALL /mgx; my $dump_file_filtered = "${dump_file}_filtered"; open(my $dh, '>', $dump_file_filtered)

commit 3870fb8263dc7e3fa240753b4eb89945b8d229be Author: Anton A. Melnikov Date: Thu Dec 22 08:01:40 2022 +0300 Add external dumps filtering during upgrade test. diff --git a/src/bin/pg_upgrade/TESTING b/src/bin/pg_upgrade/TESTING index 127dc30bbb..b8cce2bae1 100644 --- a/src/bin/pg_upgrade/TESTING +++ b/src/bin/pg_upgrade/TESTING @@ -56,3 +56,12 @@ Once the dump is created, it can be repeatedly used with $olddump and the dump out of the new database and the comparison of the dumps between the old and new databases. The contents of the dumps can also be manually compared. + +You can use additional dump filtering. To do this, you need to define the +'filter' environment variable and specify the path to the file with +filtering rules in it. Here are example contents of such a file: +# examples: +# Remove all CREATE POLICY statements +s/^CREATE\sPOLICY.*//mgx +# Replace REFRESH with DROP for materialized views +s/^REFRESH\s(MATERIALIZED\sVIEW)/DROP $1/mgx diff --git a/src/bin/pg_upgrade/t/002_pg_upgrade.pl b/src/bin/pg_upgrade/t/002_pg_upgrade.pl index 4cc1469306..e094fd08c3 100644 --- a/src/bin/pg_upgrade/t/002_pg_upgrade.pl +++ b/src/bin/pg_upgrade/t/002_pg_upgrade.pl @@ -45,6 +45,30 @@ sub filter_dump # Remove empty lines. $dump_contents =~ s/^\n//mgx; + if (defined($ENV{filter})) + { + # Use the external filter + my $ext_filter_file = $ENV{filter}; + die "no file with external filter found!" 
unless -e $ext_filter_file; + + open my $ext_filter_handle, '<', $ext_filter_file; + chomp(my @ext_filter_lines = <$ext_filter_handle>); + close $ext_filter_handle; + + foreach (@ext_filter_lines) + { + # omit lines with comments + if (substr($_, 0, 1) eq '#') + { +next; + } + + # apply lines with filters + my $filter = "\$dump_contents =~ $_"; + eval $filter; + } + } + my $dump_file_filtered = "${dump_file}_filtered"; open(my $dh, '>', $dump_file_filtered) || die "opening $dump_file_filtered";
Re: [BUG] pg_upgrade test fails from older versions.
Sorry, didn't get to see the last letter! On 23.12.2022 11:51, Michael Paquier wrote: FWIW, I find the use of a FOR loop with a DO block much cleaner to follow in this context, so something like the attached would be able to group the two queries and address your point on O(N^2). Do you like that? -- Michael The query: DO $$ DECLARE rec record; BEGIN FOR rec in SELECT oid::regclass::text as rel, attname as col FROM pg_class c, pg_attribute a WHERE c.relname !~ '^pg_' AND c.relkind IN ('r') AND a.attrelid = c.oid AND a.atttypid = 'aclitem'::regtype ORDER BY 1 LOOP EXECUTE 'ALTER TABLE ' || quote_ident(rec.rel) || ' ALTER COLUMN ' || quote_ident(rec.col) || ' SET DATA TYPE text'; END LOOP; END; $$; gives an average time of 36 ms under the same conditions. With the best wishes! -- Anton A. Melnikov Postgres Professional: http://www.postgrespro.com The Russian Postgres Company
Re: [BUG] pg_upgrade test fails from older versions.
Hello! On 23.12.2022 06:27, Justin Pryzby wrote: This would do a single seqscan: SELECT format('ALTER TABLE %I ALTER COLUMN %I TYPE TEXT', attrelid::regclass, attname) FROM pg_attribute WHERE atttypid='aclitem'::regtype; -- AND ... \gexec Touched a bit on how long it takes to execute different types of queries on my PC. At each measurement, the server was restarted with a freshly copied regression database. 1) DO $$ DECLARE change_aclitem_type TEXT; BEGIN FOR change_aclitem_type IN SELECT 'ALTER TABLE ' || table_schema || '.' || table_name || ' ALTER COLUMN ' || column_name || ' SET DATA TYPE text;' AS change_aclitem_type FROM information_schema.columns WHERE data_type = 'aclitem' and table_schema != 'pg_catalog' LOOP EXECUTE change_aclitem_type; END LOOP; END; $$; 2) DO $$ DECLARE rec text; col text; BEGIN FOR rec in SELECT oid::regclass::text FROM pg_class WHERE relname !~ '^pg_' AND relkind IN ('r') ORDER BY 1 LOOP FOR col in SELECT attname FROM pg_attribute WHERE attrelid::regclass::text = rec AND atttypid = 'aclitem'::regtype LOOP EXECUTE 'ALTER TABLE ' || quote_ident(rec) || ' ALTER COLUMN ' || quote_ident(col) || ' SET DATA TYPE text'; END LOOP; END LOOP; END; $$; 3) SELECT format('ALTER TABLE %I ALTER COLUMN %I TYPE TEXT', attrelid::regclass, attname) FROM pg_attribute WHERE atttypid='aclitem'::regtype; \gexec 4) The same as 3) but in a DO block: DO $$ DECLARE change_aclitem_type TEXT; BEGIN FOR change_aclitem_type IN SELECT 'ALTER TABLE ' || attrelid::regclass || ' ALTER COLUMN ' || attname || ' TYPE TEXT;' AS change_aclitem_type FROM pg_attribute WHERE atttypid = 'aclitem'::regtype LOOP EXECUTE change_aclitem_type; END LOOP; END; $$; Average execution time over three runs:

 N of query:   |  1 |    2 |  3 |  4
 Avg time, ms: | 58 | 1076 | 51 | 33

Raw results in timing.txt Best wishes, -- Anton A. 
Melnikov Postgres Professional: http://www.postgrespro.com The Russian Postgres Company 1) Time: 53,699 ms Time: 60,146 ms Time: 60,594 ms Avg: 58,1 ms 2) Time: 1020,832 ms Time: 1061,554 ms (00:01,062) Time: 1148,029 ms (00:01,148) Avg: 1076 ms 3) Time: 20,972 ms regression=# \gexec ALTER TABLE Time: 12,601 ms Time: 3,106 ms sum = 36,67 Time: 22,087 ms regression=# \gexec ALTER TABLE Time: 40,768 ms Time: 3,154 ms sum = 66,01 Time: 13,865 ms regression=# \gexec ALTER TABLE Time: 34,619 ms Time: 3,063 ms sum = 51,55 Avg: 51,4 ms 4) Time: 25,518 ms Time: 35,746 ms Time: 39,232 ms Avg: 33,4 ms
Re: [BUG] pg_upgrade test fails from older versions.
Hello! Divided the patch into two parts: the first refers to the modification of the old dump while the second relates to dump filtering. 1) v2-0001-Remove-aclitem-from-old-dump.patch On 19.12.2022 06:10, Michael Paquier wrote: This is forgetting about materialized views, which is something that pg_upgrade would also complain about when checking for relations with aclitems. As far as I can see, the only place in the main regression test suite where we have an aclitem attribute is tab_core_types for HEAD and the stable branches, so it would be enough to do this change. Anyway, wouldn't it be better to use the same conditions as the WITH OIDS queries a few lines above, at least for consistency? Note that check_for_data_types_usage() checks for tables, matviews and indexes. Found that 'ALTER ... ALTER COLUMN SET DATA TYPE text' is not applicable to materialized views and indexes, nor is DROP COLUMN. So I couldn't do anything better than drop them in the old dump if they contain at least one column of 'aclitem' type. I've tested this script with: CREATE TABLE acltable AS SELECT 1 AS int, 'postgres=a/postgres'::aclitem AS aclitem; CREATE MATERIALIZED VIEW aclmview AS SELECT 'postgres=a/postgres'::aclitem AS aclitem; CREATE INDEX aclindex on acltable (int) INCLUDE (aclitem); performed in the regression database before creating the old dump. The only thing is that I haven't been able to find a case where an 'aclitem' column would be preserved in an index after this type was replaced in the parent table. So passing relkind = 'i' is probably redundant. If it is possible to find such a case, it would be very interesting. Also made the replacement logic for 'aclitem' in tables closer to the script above that removes OID columns. Likewise, found in this script that ALTER TABLE ... SET WITHOUT OIDS is not applicable to materialized views, and ALTER MATERIALIZED VIEW doesn't support a WITHOUT OIDS clause. 
Besides, I couldn't find any legal way to create a materialized view with OIDs in versions 11 or lower. The 'CREATE MATERIALIZED VIEW' command doesn't support a WITH OIDS or (OIDS) clause, nor does ALTER MATERIALIZED VIEW, as mentioned above. Even with the GUC default_with_oids = true: CREATE TABLE oidtable AS SELECT 1 AS int; CREATE MATERIALIZED VIEW oidmv AS SELECT * FROM oidtable; gives: postgres=# SELECT oid::regclass::text FROM pg_class WHERE relname !~ '^pg_' AND relhasoids; oid -- oidtable (1 row) So I suggest excluding the check of materialized views from this DO block. Would be grateful for remarks if I missed some cases. 2) v2-0002-Additional-dumps-filtering.patch On 19.12.2022 06:16, Michael Paquier wrote: While thinking about that, an extra idea popped in my mind as it may be interesting to be able to filter out some of the diffs in some contexts. So what about adding in 002_pg_upgrade.pl a small-ish hook in the shape of a new environment variable pointing to a file that adds some custom filtering rules? Yes. Made a hook that allows processing an external text file with additional filtering rules, and an example of such a file. Please take a look at it. With the best wishes, -- Anton A. Melnikov Postgres Professional: http://www.postgrespro.com The Russian Postgres Company

commit f4a6f18f008ab4acd0d6923ddce7aa6d20bef08f Author: Anton A. Melnikov Date: Thu Dec 22 07:22:53 2022 +0300 Remove type 'aclitem' from old dump when testing upgrade from older versions. 
diff --git a/src/bin/pg_upgrade/upgrade_adapt.sql b/src/bin/pg_upgrade/upgrade_adapt.sql index 27c4c7fd01..84389f3e5b 100644 --- a/src/bin/pg_upgrade/upgrade_adapt.sql +++ b/src/bin/pg_upgrade/upgrade_adapt.sql @@ -19,7 +19,8 @@ SELECT ver <= 906 AS oldpgversion_le96, ver <= 1000 AS oldpgversion_le10, ver <= 1100 AS oldpgversion_le11, - ver <= 1300 AS oldpgversion_le13 + ver <= 1300 AS oldpgversion_le13, + ver <= 1500 AS oldpgversion_le15 FROM (SELECT current_setting('server_version_num')::int / 100 AS ver) AS v; \gset @@ -72,7 +73,7 @@ DO $stmt$ FROM pg_class WHERE relname !~ '^pg_' AND relhasoids - AND relkind in ('r','m') + AND relkind = 'r' ORDER BY 1 LOOP execute 'ALTER TABLE ' || rec || ' SET WITHOUT OIDS'; @@ -89,3 +90,58 @@ DROP OPERATOR public.#%# (pg_catalog.int8, NONE); DROP OPERATOR public.!=- (pg_catalog.int8, NONE); DROP OPERATOR public.#@%# (pg_catalog.int8, NONE); \endif + +-- The internal format of "aclitem" changed in PostgreSQL version 16 +-- so replace it with text type in tables and drop materialized views +-- and indexes that contain column(s) of "aclitem" type. +\if :oldpgversion_le15 +DO $$ + DECLARE +rec text; + col text; + BEGIN + FOR rec in +SELECT oid::regclass::text +FROM pg_class +WHERE relname !~ '^pg_' + AND relkind = 'r' +ORDER BY 1 + LOOP + FOR col in + SELECT attname + FROM pg_attribute + WHERE attrelid::regclass::text = rec + AND atttypid = 'aclit
Re: [PATCH] Backport perl tests for pg_upgrade from 322becb60
Hello! On 09.12.2022 08:19, Michael Paquier wrote: On Mon, Aug 01, 2022 at 01:02:21AM +0300, Anton A. Melnikov wrote: As far as I understand from this thread: https://www.postgresql.org/message-id/flat/Yox1ME99GhAemMq1%40paquier.xyz, the aim of the perl version of the pg_upgrade tests is to achieve equality of dumps in most cross-version cases. If so, this is a significant improvement, as previously with test.sh the resulting dumps remained unequal and the user was asked to eyeball them manually during cross upgrades between different major versions. So the backport of the perl tests also seems preferable to me. I don't really agree with that. These TAP tests are really new development, and it took a few tries to get them completely right (well, as much right as it holds for HEAD). If we were to backport any of this, there is a risk of introducing a bug in what we do with any of that, potentially hiding a critical issue related to pg_upgrade. That's not worth taking a risk for. Saying that, I agree that more needs to be done, but I would limit that only to HEAD and let it mature more into the tree in an incremental fashion. -- I have withdrawn the patch with the backport, but the question of whether we will fix the older test.sh tests seems to remain open. Will we fix it? Justin is not sure if anyone needs this: https://www.postgresql.org/message-id/67b6b447-e9cb-ebde-4a6b-127aea7ca268%40inbox.ru Also found that the test from older versions fails in the current master. Proposed a fix in a new thread: https://www.postgresql.org/message-id/49f389ba-95ce-8a9b-09ae-f60650c0e7c7%40inbox.ru Would be glad of any remarks. With the best wishes, -- Anton A. Melnikov Postgres Professional: http://www.postgrespro.com The Russian Postgres Company
[BUG] pg_upgrade test fails from older versions.
Hello! Found that the pg_upgrade test is broken for upgrades from older versions. This happened for two reasons. 1) In 7b378237a the format of "aclitem" changed, so an upgrade from <=15 fails with the error: "Your installation contains the "aclitem" data type in user tables. The internal format of "aclitem" changed in PostgreSQL version 16 so this cluster cannot currently be upgraded... " Tried to fix it by changing the column type in upgrade_adapt.sql. Please see the patch attached. 2) In 60684dd83 and b5d63824 there are two changes in the set of specific privileges. The thing is that in the privileges.sql test there is a REVOKE DELETE command which becomes a pair of REVOKE ALL and GRANT of all specific privileges except DELETE in the resulting dump. Therefore, any change in the set of specific privileges will lead to a non-zero dump diff. To avoid this, I propose to replace any specific GRANT and REVOKE in the resulting dumps with ALL. This is also done in the patch attached. Would be glad of any remarks. With best regards, -- Anton A. Melnikov Postgres Professional: http://www.postgrespro.com The Russian Postgres Company

commit e2c917694acc7ae20d6a9654de87a9fdb5863103 Author: Anton A. Melnikov Date: Sun Dec 18 16:23:24 2022 +0300 Replace aclitem with text type in pg_upgrade test due to 7b378237. Replace specific privileges in dumps with ALL due to b5d63824 and 60684dd8. diff --git a/src/bin/pg_upgrade/t/002_pg_upgrade.pl b/src/bin/pg_upgrade/t/002_pg_upgrade.pl index 4cc1469306..d23c4b2253 100644 --- a/src/bin/pg_upgrade/t/002_pg_upgrade.pl +++ b/src/bin/pg_upgrade/t/002_pg_upgrade.pl @@ -44,6 +44,8 @@ sub filter_dump $dump_contents =~ s/^\-\-.*//mgx; # Remove empty lines. 
$dump_contents =~ s/^\n//mgx; + # Replace specific privileges with ALL + $dump_contents =~ s/^(GRANT\s|REVOKE\s)(\S*)\s/$1ALL /mgx; my $dump_file_filtered = "${dump_file}_filtered"; open(my $dh, '>', $dump_file_filtered) diff --git a/src/bin/pg_upgrade/upgrade_adapt.sql b/src/bin/pg_upgrade/upgrade_adapt.sql index 27c4c7fd01..fac77ec968 100644 --- a/src/bin/pg_upgrade/upgrade_adapt.sql +++ b/src/bin/pg_upgrade/upgrade_adapt.sql @@ -19,7 +19,8 @@ SELECT ver <= 906 AS oldpgversion_le96, ver <= 1000 AS oldpgversion_le10, ver <= 1100 AS oldpgversion_le11, - ver <= 1300 AS oldpgversion_le13 + ver <= 1300 AS oldpgversion_le13, + ver <= 1500 AS oldpgversion_le15 FROM (SELECT current_setting('server_version_num')::int / 100 AS ver) AS v; \gset @@ -89,3 +90,24 @@ DROP OPERATOR public.#%# (pg_catalog.int8, NONE); DROP OPERATOR public.!=- (pg_catalog.int8, NONE); DROP OPERATOR public.#@%# (pg_catalog.int8, NONE); \endif + +-- The internal format of "aclitem" changed in PostgreSQL version 16 +-- so replace it with text type +\if :oldpgversion_le15 +DO $$ +DECLARE +change_aclitem_type TEXT; +BEGIN +FOR change_aclitem_type IN +SELECT 'ALTER TABLE ' || table_schema || '.' || +table_name || ' ALTER COLUMN ' || + column_name || ' SET DATA TYPE text;' +AS change_aclitem_type +FROM information_schema.columns +WHERE data_type = 'aclitem' and table_schema != 'pg_catalog' +LOOP +EXECUTE change_aclitem_type; +END LOOP; +END; +$$; +\endif
Re: [BUG] Logical replica crash if there was an error in a function.
On 07.12.2022 21:03, Andres Freund wrote: This CF entry causes tests to fail on all platforms: https://cirrus-ci.com/build/5755408111894528 E.g. https://api.cirrus-ci.com/v1/artifact/task/5298457144459264/testrun/build/testrun/subscription/100_bugs/log/regress_log_100_bugs Begin standard error psql::1: NOTICE: dropped replication slot "sub1" on publisher End standard error timed out waiting for match: ERROR: relation "error_name" does not exist at character at /tmp/cirrus-ci-build/src/test/subscription/t/100_bugs.pl line 115. Greetings, Andres Freund Thank you for reminding me! There was a conflict when applying v2 on the current master. Rebased v3 is attached. Best wishes! -- Anton A. Melnikov Postgres Professional: http://www.postgrespro.com The Russian Postgres Company

commit 07eaa674953ed700a53174410a6e1eb81151d7e8 Author: Anton A. Melnikov Date: Sun Dec 11 06:26:03 2022 +0300 Add test for a syntax error in a function in a logical replication worker. See dea834938. diff --git a/src/test/subscription/t/100_bugs.pl b/src/test/subscription/t/100_bugs.pl index 7b3cd66be5..0bf4fdfa47 100644 --- a/src/test/subscription/t/100_bugs.pl +++ b/src/test/subscription/t/100_bugs.pl @@ -69,6 +69,51 @@ $node_publisher->wait_for_catchup('sub1'); pass('index predicates do not cause crash'); +# https://www.postgresql.org/message-id/flat/adf0452f-8c6b-7def-d35e-ab516c80088e%40inbox.ru + +# The bug was that when a syntax error occurred in a SQL-language or PL/pgSQL-language +# CREATE FUNCTION or DO command executed in a logical replication worker, +# we'd suffer a null pointer dereference or assertion failure. 
+ +$node_subscriber->safe_psql('postgres', q{ + create or replace procedure rebuild_test( + ) as + $body$ + declare + l_code text:= E'create or replace function public.test_selector( + ) returns setof public.tab1 as + \$body\$ + select * from error_name + \$body\$ language sql;'; + begin + execute l_code; + perform public.test_selector(); + end + $body$ language plpgsql; + create or replace function test_trg() + returns trigger as + $body$ + declare + begin + call public.rebuild_test(); + return NULL; + end + $body$ language plpgsql; + create trigger test_trigger after insert on tab1 for each row +execute function test_trg(); + alter table tab1 enable replica trigger test_trigger; +}); + +# This would crash on the subscriber if not fixed +$node_publisher->safe_psql('postgres', "INSERT INTO tab1 VALUES (3, 4)"); + +my $result = $node_subscriber->wait_for_log( + "ERROR: relation \"error_name\" does not exist at character" +); + +ok($result, + "ERROR: Logical decoding doesn't fail on function error"); + # We'll re-use these nodes below, so drop their replication state. # We don't bother to drop the tables though. $node_subscriber->safe_psql('postgres', "DROP SUBSCRIPTION sub1");
Re: May be BUG. Periodic burst growth of the checkpoint_req counter on replica.
Hello! On 06.12.2022 21:44, Andres Freund wrote: Hi, On 2022-09-19 01:29:21 +0300, Anton A. Melnikov wrote: Corrected patch is attached (v2-0001-Fix-burst-checkpoint_req-growth.patch). This patch doesn't pass the main regression tests successfully: https://cirrus-ci.com/task/5502700019253248 diff -U3 /tmp/cirrus-ci-build/src/test/regress/expected/rules.out /tmp/cirrus-ci-build/build/testrun/regress/regress/results/rules.out --- /tmp/cirrus-ci-build/src/test/regress/expected/rules.out2022-12-06 05:49:53.687424000 + +++ /tmp/cirrus-ci-build/build/testrun/regress/regress/results/rules.out 2022-12-06 05:53:04.64269 + @@ -1816,6 +1816,9 @@ FROM pg_stat_get_archiver() s(archived_count, last_archived_wal, last_archived_time, failed_count, last_failed_wal, last_failed_time, stats_reset); pg_stat_bgwriter| SELECT pg_stat_get_bgwriter_timed_checkpoints() AS checkpoints_timed, pg_stat_get_bgwriter_requested_checkpoints() AS checkpoints_req, +pg_stat_get_bgwriter_timed_restartpoints() AS restartpoints_timed, +pg_stat_get_bgwriter_requested_restartpoints() AS restartpoints_req, +pg_stat_get_bgwriter_performed_restartpoints() AS restartpoints_done, pg_stat_get_checkpoint_write_time() AS checkpoint_write_time, pg_stat_get_checkpoint_sync_time() AS checkpoint_sync_time, pg_stat_get_bgwriter_buf_written_checkpoints() AS buffers_checkpoint, Greetings, Andres Freund Thank you for pointing this out! I didn't think the patch tester would apply both patch variants simultaneously, assuming that these are two different possible solutions of the problem. But it's even good, let it check both at once! There was an error in the second variant (Add-restartpoint-stats): I forgot to correct rules.out. So I fixed the second variant and rebased the first one (Fix-burst-checkpoint_req-growth) onto the current master. With the best wishes, -- Anton A. 
Melnikov Postgres Professional: http://www.postgrespro.com The Russian Postgres Company

commit 6f3ab7b6b5562aaf68ccfdd6dde17c27c2d2869e Author: Anton A. Melnikov Date: Wed Dec 7 10:10:58 2022 +0300 Remove burst growth of the checkpoint_req on replica, with corrections according to comment: https://www.postgresql.org/message-id/20220907.163946.2214338139097492905.horikyota.ntt%40gmail.com diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c index a31fbbff78..02d86bf370 100644 --- a/src/backend/access/transam/xlog.c +++ b/src/backend/access/transam/xlog.c @@ -8935,3 +8935,20 @@ SetWalWriterSleeping(bool sleeping) XLogCtl->WalWriterSleeping = sleeping; SpinLockRelease(&XLogCtl->info_lck); } + +/* + * Check if a new checkpoint WAL record has been received since the + * last restartpoint. So it is possible to create a new one. + */ +bool RestartPointAvailable(void) +{ + bool result = false; + + SpinLockAcquire(&XLogCtl->info_lck); + if (!XLogRecPtrIsInvalid(XLogCtl->lastCheckPointRecPtr) + && XLogCtl->lastCheckPoint.redo > XLogCtl->RedoRecPtr) +result = true; + SpinLockRelease(&XLogCtl->info_lck); + + return result; +} diff --git a/src/backend/access/transam/xlogrecovery.c b/src/backend/access/transam/xlogrecovery.c index 97b882564f..a88c716197 100644 --- a/src/backend/access/transam/xlogrecovery.c +++ b/src/backend/access/transam/xlogrecovery.c @@ -3196,7 +3196,15 @@ XLogPageRead(XLogReaderState *xlogreader, XLogRecPtr targetPagePtr, int reqLen, { (void) GetRedoRecPtr(); if (XLogCheckpointNeeded(readSegNo)) - RequestCheckpoint(CHECKPOINT_CAUSE_XLOG); +{ + /* + * If there are no new checkpoint WAL records since the + * last restartpoint, the creation of a new one + * will certainly fail, so skip it. 
+ */ + if (RestartPointAvailable()) + RequestCheckpoint(CHECKPOINT_CAUSE_XLOG); +} } } diff --git a/src/include/access/xlogrecovery.h b/src/include/access/xlogrecovery.h index f3398425d8..dcc2e64ca7 100644 --- a/src/include/access/xlogrecovery.h +++ b/src/include/access/xlogrecovery.h @@ -83,6 +83,7 @@ extern void InitWalRecovery(ControlFileData *ControlFile, bool *wasShutdown_ptr, bool *haveBackupLabel_ptr, bool *haveTblspcMap_ptr); extern void PerformWalRecovery(void); +extern bool RestartPointAvailable(void); /* * FinishWalRecovery() returns this. It contains information about the point commit 9ec1f90bbe407da94f8e940084e8815b4f8e874b Author: Anton A. Melnikov Date: Wed Dec 7 10:49:19 2022 +0300 Add statistic about restartpoint into pg_stat_bgwriter diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql index 2d8104b090..658cafdc03 100644 --- a/src/backend/catalog/system_views.sql +++ b/src/backend/catalog/system_views.sql @@ -1107,6 +1107,9 @@ CREATE VIEW pg_stat_bgwriter AS SELECT pg_stat_get_bgwriter_timed_checkpoints() AS checkpoints_timed, pg_stat_get_bgwriter_requested_
Re: [PATCH] Add peer authentication TAP test
On 25.11.2022 10:34, Michael Paquier wrote: On Fri, Nov 25, 2022 at 10:13:29AM +0300, Anton A. Melnikov wrote: The test fails almost at the beginning in reset_pg_hba call after modification pg_hba.conf and node reloading: #t/003_peer.pl .. Dubious, test returned 2 (wstat 512, 0x200) #No subtests run Logs regress_log_003_peer and 003_peer_node.log are attached. Yeah, that's failing exactly at the position I am pointing to. I'll go apply what you have.. -- Michael Thanks! With the best wishes, -- Anton A. Melnikov Postgres Professional: http://www.postgrespro.com The Russian Postgres Company
Re: [PATCH] Add peer authentication TAP test
Hello, thanks for the rapid answer!

On 25.11.2022 08:18, Michael Paquier wrote:
> You are not using MSVC but MinGW, are you? The buildfarm members with TAP
> tests enabled are drongo, fairywren, bowerbird and jacana. Even though none
> of them are running the tests from src/test/authentication/, this is
> running on a periodic basis in the CI, where we are able to skip the test
> in MSVC already:
> postgresql:authentication / authentication/003_peer SKIP 9.73s

There is MSVC on my PC. The project was built with
Microsoft (R) C/C++ Optimizing Compiler Version 19.29.30136 for x64

> So yes, it is plausible that we are missing more safeguards here. Your
> suggestion to skip under !$use_unix_sockets makes sense, as not having unix
> sockets is not going to work for peer and WIN32 needs SSPI to be secure
> with pg_regress. Where is your test failing? On the first
> $node->psql('postgres') at the beginning of the test? Just wondering..

The test fails almost at the beginning, in the reset_pg_hba call after
modifying pg_hba.conf and reloading the node:
# t/003_peer.pl .. Dubious, test returned 2 (wstat 512, 0x200)
# No subtests run
Logs regress_log_003_peer and 003_peer_node.log are attached.

With best regards,

--
Anton A.
Melnikov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

2022-11-25 09:55:49.731 MSK [7648] LOG:  starting PostgreSQL 16devel, compiled by Visual C++ build 1929, 64-bit
2022-11-25 09:55:49.735 MSK [7648] LOG:  listening on IPv4 address "127.0.0.1", port 60161
2022-11-25 09:55:49.773 MSK [2892] LOG:  database system was shut down at 2022-11-25 09:55:49 MSK
2022-11-25 09:55:49.800 MSK [7648] LOG:  database system is ready to accept connections
2022-11-25 09:55:49.890 MSK [7648] LOG:  received SIGHUP, reloading configuration files
2022-11-25 09:55:50.003 MSK [84] [unknown] LOG:  connection received: host=127.0.0.1 port=49822
2022-11-25 09:55:50.007 MSK [84] [unknown] FATAL:  no pg_hba.conf entry for host "127.0.0.1", user "postgres", database "postgres", no encryption
2022-11-25 09:55:50.114 MSK [3356] [unknown] LOG:  connection received: host=127.0.0.1 port=49823
2022-11-25 09:55:50.117 MSK [3356] [unknown] FATAL:  no pg_hba.conf entry for host "127.0.0.1", user "postgres", database "postgres", no encryption
2022-11-25 09:55:50.201 MSK [7648] LOG:  received immediate shutdown request
2022-11-25 09:55:50.212 MSK [7648] LOG:  database system is shut down

# Checking port 60161
# Found port 60161
Name: node
Data directory: c:/projects/1c-master-7397/postgrespro/src\test\authentication/tmp_check/t_003_peer_node_data/pgdata
Backup directory: c:/projects/1c-master-7397/postgrespro/src\test\authentication/tmp_check/t_003_peer_node_data/backup
Archive directory: c:/projects/1c-master-7397/postgrespro/src\test\authentication/tmp_check/t_003_peer_node_data/archives
Connection string: port=60161 host=127.0.0.1
Log file: c:/projects/1c-master-7397/postgrespro/src\test\authentication/tmp_check/log/003_peer_node.log
# Running: initdb -D c:/projects/1c-master-7397/postgrespro/src\test\authentication/tmp_check/t_003_peer_node_data/pgdata -A trust -N
The files belonging to this database system will be owned by user "postgres".
This user must also own the server process.

The database cluster will be initialized with locale "English_United States.1252".
The default database encoding has accordingly been set to "WIN1252".
The default text search configuration will be set to "english".

Data page checksums are disabled.

creating directory c:/projects/1c-master-7397/postgrespro/src/test/authentication/tmp_check/t_003_peer_node_data/pgdata ... ok
creating subdirectories ... ok
selecting dynamic shared memory implementation ... windows
selecting default max_connections ... 100
selecting default shared_buffers ... 128MB
selecting default time zone ... Europe/Moscow
creating configuration files ... ok
running bootstrap script ... ok
performing post-bootstrap initialization ... ok

Sync to disk skipped.
The data directory might become corrupt if the operating system crashes.

Success. You can now start the database server using:

    pg_ctl -D ^"c^:^/projects^/1c^-master^-7397^/postgrespro^/src^\test^\authentication^/tmp^_check^/t^_003^_peer^_node^_data^/pgdata^" -l logfile start

# Running: c:/projects/1c-master-7397/postgrespro/Release/pg_regress/pg_regress --config-auth c:/projects/1c-master-7397/postgrespro/src\test\authentication/tmp_check/t_003_peer_node_data/pgdata
### Starting node "node"
# Running: pg_ctl -w -D c:/projects/1c-master-7397/postgrespro/src\test\authentication/tmp_check/t_003_peer_node_data/pgdata -l c:/projects/1c-master-7397/postgrespro/src\test\authentication/tmp_check/log/003_peer_node.log -o --cluster-name=node start
waiting for server to start done
server started
# Postmaster PID for node "node" is 7648
### Reloading
Re: [PATCH] Add peer authentication TAP test
Hello!

On Windows this test fails with error:
# connection error: 'psql: error: connection to server at "127.0.0.1", port x failed:
# FATAL:  no pg_hba.conf entry for host "127.0.0.1", user "buildfarm", database "postgres", no encryption'
Maybe disable this test for Windows like in 001_password.pl and 002_saslprep.pl?

Best wishes,

--
Anton A. Melnikov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

commit e4491b91729b307d29ce0205b455936b3a538373
Author: Anton A. Melnikov
Date:   Fri Nov 25 07:40:11 2022 +0300

    Skip test 003_peer.pl for windows like 001_password.pl and 002_saslprep.pl

diff --git a/src/test/authentication/t/003_peer.pl b/src/test/authentication/t/003_peer.pl
index ce8408a4f8c..7454bf4a598 100644
--- a/src/test/authentication/t/003_peer.pl
+++ b/src/test/authentication/t/003_peer.pl
@@ -9,6 +9,11 @@ use warnings;
 use PostgreSQL::Test::Cluster;
 use PostgreSQL::Test::Utils;
 use Test::More;
+if (!$use_unix_sockets)
+{
+	plan skip_all =>
+	  "authentication tests cannot run without Unix-domain sockets";
+}
 
 # Delete pg_hba.conf from the given node, add a new entry to it
 # and then execute a reload to refresh it.
Re: [BUG] Logical replica crash if there was an error in a function.
Thanks a lot for the fast reply!

On 03.11.2022 18:29, Tom Lane wrote:
> If we were just adding a query or two to an existing scenario, that could
> be okay; but spinning up and syncing a whole new primary and standby
> database is *expensive* when you multiply it by the number of times
> developers and buildfarm animals are going to run the tests in the future.

On 15.11.2022 04:59, Tom Lane wrote:
> "Anton A. Melnikov" writes:
>> On 02.11.2022 21:02, Tom Lane wrote:
>>> I don't think the cost of that test case is justified by the tiny
>>> probability that it'd ever catch anything.
>> Seems it is possible to do a test without these remarks. The attached
>> test uses existing nodes and checks the specific error message.
> My opinion remains unchanged: this would be a very poor use of test cycles.

Sorry, I didn't fully understand what is required and added some functions to
the test that spend extra CPU time. But I found that it is possible to make a
test according to the previous remarks by adding only a few extra queries to
an existing test, without any additional syncing. Experimentally, I could not
observe any significant difference in CPU usage due to this test addition;
the difference in CPU usage was less than the standard error. See
100_bugs-CPU-time.txt attached.

>> Additionally I've tried to reduce the overall number of nodes previously
>> used in this test in a similar way.
> Optimizing existing tests isn't an answer to that. We could install those
> optimizations without adding a new test case.

Oh sure! Here is a separate patch for this optimization:
https://www.postgresql.org/message-id/eb7aa992-c2d7-6ce7-4942-0c784231a362%40inbox.ru
In contrast with the previous case, there the difference in CPU usage during
100_bugs.pl is essential; it decreases by approximately 30%.

With the best wishes!

--
Anton A. Melnikov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

commit 2449158ba576d7c6d97852d14f85dadb3aced262
Author: Anton A. Melnikov
Date:   Wed Nov 16 11:46:54 2022 +0300

    Add test for syntax error in the function in a logical replication
    worker. See dea834938.

diff --git a/src/test/subscription/t/100_bugs.pl b/src/test/subscription/t/100_bugs.pl
index 6247aa7730..eb4ab6d359 100644
--- a/src/test/subscription/t/100_bugs.pl
+++ b/src/test/subscription/t/100_bugs.pl
@@ -69,6 +69,51 @@ $node_publisher->wait_for_catchup('sub1');
 
 pass('index predicates do not cause crash');
 
+# https://www.postgresql.org/message-id/flat/adf0452f-8c6b-7def-d35e-ab516c80088e%40inbox.ru
+
+# The bug was that when a syntax error occurred in a SQL-language or PL/pgSQL-language
+# CREATE FUNCTION or DO command executed in a logical replication worker,
+# we'd suffer a null pointer dereference or assertion failure.
+
+$node_subscriber->safe_psql('postgres', q{
+	create or replace procedure rebuild_test(
+	) as
+	$body$
+	declare
+		l_code text:= E'create or replace function public.test_selector(
+	) returns setof public.tab1 as
+	\$body\$
+		select * from error_name
+	\$body\$ language sql;';
+	begin
+		execute l_code;
+		perform public.test_selector();
+	end
+	$body$ language plpgsql;
+	create or replace function test_trg()
+	returns trigger as
+	$body$
+	declare
+	begin
+		call public.rebuild_test();
+		return NULL;
+	end
+	$body$ language plpgsql;
+	create trigger test_trigger after insert on tab1 for each row
+		execute function test_trg();
+	alter table tab1 enable replica trigger test_trigger;
+});
+
+# This would crash on the subscriber if not fixed
+$node_publisher->safe_psql('postgres', "INSERT INTO tab1 VALUES (3, 4)");
+
+my $result = $node_subscriber->wait_for_log(
+	"ERROR:  relation \"error_name\" does not exist at character"
+);
+
+ok($result,
+	"ERROR: Logical decoding doesn't fail on function error");
+
 $node_publisher->stop('fast');
 $node_subscriber->stop('fast');

## src/test/subscription/100_bugs.pl

Before adding test:
Files=1, Tests=7, 14 wallclock secs ( 0.02 usr  0.00 sys +  7.96 cusr  2.24 csys = 10.22 CPU)
Files=1, Tests=7, 15 wallclock secs ( 0.02 usr  0.00 sys +  8.21 cusr  2.13 csys = 10.36 CPU)
Files=1, Tests=7, 14 wallclock secs ( 0.01 usr  0.01 sys +  8.69 cusr  2.17 csys = 10.88 CPU)
Files=1, Tests=7, 15 wallclock secs ( 0.01 usr  0.00 sys +  8.34 cusr  2.22 csys = 10.57 CPU)
Files=1, Tests=7, 15 wallclock secs ( 0.01 usr  0.00 sys +  8.64 cusr  2.19 csys = 10.84 CPU)
Files=1, Tests=7, 14 wallclock secs ( 0.02 usr  0.00 sys +  8.18 cusr  2.20 csys = 10.40 CPU)
Files=1, Tests=7, 13 wallclock secs ( 0.01 usr  0.00 sys +  8.02 cusr  2.23 csys = 10.26 CPU)
Files=1, Tests=7, 14 wallclock secs ( 0.01 usr  0.01 sys +  8.20 cusr  2.14 csys = 10.36 CPU)
Files=1, Tests=7, 13 wallclock secs ( 0.02 usr  0.00 sys +  8.04 cusr  2.19 csys = 10.25 CPU)
Files=1, Tests=7, 14 wallclock secs ( 0.01 usr  0.00 sys +  8.20 cusr  2.09 csys = 10.30 CPU)
Avera
Make a 100_bugs.pl test faster.
Hello!

The previous discussion was here:
https://www.postgresql.org/message-id/flat/b570c367-ba38-95f3-f62d-5f59b9808226%40inbox.ru

On 15.11.2022 04:59, Tom Lane wrote:
> "Anton A. Melnikov" writes:
>> Additionally I've tried to reduce the overall number of nodes previously
>> used in this test in a similar way.
> Optimizing existing tests isn't an answer to that. We could install those
> optimizations without adding a new test case.

Here is a separate patch for the node usage optimization mentioned above. It
decreases the CPU usage during 100_bugs.pl by about 30%. There are also some
experimental data: 100_bugs-CPU-usage.txt

Would be glad for any comments and concerns.

With best regards,

--
Anton A. Melnikov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

commit 72ea85b5f6a64f2c53e6a55455d134b228b6994b
Author: Anton A. Melnikov
Date:   Mon Nov 14 08:30:26 2022 +0300

    Reuse nodes in subscription/t/100_bugs.pl test to make it faster.

diff --git a/src/test/subscription/t/100_bugs.pl b/src/test/subscription/t/100_bugs.pl
index 6247aa7730..525584b2b6 100644
--- a/src/test/subscription/t/100_bugs.pl
+++ b/src/test/subscription/t/100_bugs.pl
@@ -81,9 +81,8 @@ $node_subscriber->stop('fast');
 # identity set before accepting updates.  If it did not it would cause
 # an error when an update was attempted.
 
-$node_publisher = PostgreSQL::Test::Cluster->new('publisher2');
-$node_publisher->init(allows_streaming => 'logical');
-$node_publisher->start;
+$node_publisher->rotate_logfile();
+$node_publisher->start();
 
 $node_publisher->safe_psql('postgres',
 	"CREATE PUBLICATION pub FOR ALL TABLES");
@@ -226,13 +225,13 @@ $node_sub->stop('fast');
 # target table's relcache was not being invalidated. This leads to skipping
 # UPDATE/DELETE operations during apply on the subscriber side as the columns
 # required to search corresponding rows won't get logged.
 
-$node_publisher = PostgreSQL::Test::Cluster->new('publisher3');
-$node_publisher->init(allows_streaming => 'logical');
-$node_publisher->start;
-$node_subscriber = PostgreSQL::Test::Cluster->new('subscriber3');
-$node_subscriber->init(allows_streaming => 'logical');
-$node_subscriber->start;
+$node_publisher->rotate_logfile();
+$node_publisher->start();
+
+$node_subscriber->rotate_logfile();
+$node_subscriber->rotate_logfile(); # call twice to synchronize lognames between master and replica
+$node_subscriber->start();
 
 $node_publisher->safe_psql('postgres',
 	"CREATE TABLE tab_replidentity_index(a int not null, b int not null)");

## src/test/subscription/100_bugs.pl

Before reuse nodes:
Files=1, Tests=7, 12 wallclock secs ( 0.02 usr  0.00 sys +  8.02 cusr  1.96 csys = 10.00 CPU)
Files=1, Tests=7, 14 wallclock secs ( 0.02 usr  0.00 sys +  7.99 cusr  2.23 csys = 10.24 CPU)
Files=1, Tests=7, 15 wallclock secs ( 0.01 usr  0.00 sys +  8.31 cusr  2.16 csys = 10.48 CPU)
Files=1, Tests=7, 14 wallclock secs ( 0.01 usr  0.00 sys +  8.29 cusr  2.08 csys = 10.38 CPU)
Files=1, Tests=7, 14 wallclock secs ( 0.01 usr  0.00 sys +  8.01 cusr  2.14 csys = 10.16 CPU)
Files=1, Tests=7, 13 wallclock secs ( 0.02 usr  0.00 sys +  8.13 cusr  2.04 csys = 10.19 CPU)
Files=1, Tests=7, 13 wallclock secs ( 0.02 usr  0.00 sys +  8.11 cusr  2.11 csys = 10.24 CPU)
Files=1, Tests=7, 14 wallclock secs ( 0.02 usr  0.00 sys +  8.04 cusr  2.19 csys = 10.25 CPU)
Files=1, Tests=7, 12 wallclock secs ( 0.01 usr  0.00 sys +  8.02 cusr  2.07 csys = 10.10 CPU)
Files=1, Tests=7, 13 wallclock secs ( 0.02 usr  0.00 sys +  8.05 cusr  2.09 csys = 10.16 CPU)
Average CPU usage = 10.22+-0.04s

After reuse nodes:
Files=1, Tests=7, 11 wallclock secs ( 0.01 usr  0.00 sys +  5.68 cusr  1.56 csys =  7.25 CPU)
Files=1, Tests=7, 11 wallclock secs ( 0.02 usr  0.00 sys +  5.61 cusr  1.60 csys =  7.23 CPU)
Files=1, Tests=7, 11 wallclock secs ( 0.02 usr  0.00 sys +  5.66 cusr  1.64 csys =  7.32 CPU)
Files=1, Tests=7, 12 wallclock secs ( 0.02 usr  0.00 sys +  5.76 cusr  1.63 csys =  7.41 CPU)
Files=1, Tests=7, 11 wallclock secs ( 0.02 usr  0.00 sys +  5.60 cusr  1.61 csys =  7.23 CPU)
Files=1, Tests=7, 10 wallclock secs ( 0.02 usr  0.00 sys +  5.63 cusr  1.65 csys =  7.30 CPU)
Files=1, Tests=7, 10 wallclock secs ( 0.02 usr  0.00 sys +  5.64 cusr  1.58 csys =  7.24 CPU)
Files=1, Tests=7, 12 wallclock secs ( 0.01 usr  0.00 sys +  5.59 cusr  1.75 csys =  7.35 CPU)
Files=1, Tests=7, 11 wallclock secs ( 0.01 usr  0.00 sys +  5.54 cusr  1.74 csys =  7.29 CPU)
Files=1, Tests=7, 10 wallclock secs ( 0.02 usr  0.00 sys +  5.62 cusr  1.57 csys =  7.21 CPU)
Average CPU usage = 7.28+-0.02s
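The quoted "+-" figures look like the mean plus or minus the standard error of the mean over the ten runs; a short Python check (assuming that interpretation) reproduces both numbers for the "after" series:

```python
from statistics import mean, stdev

# Total CPU times of the ten runs after the node-reuse patch (from the data above).
cpu = [7.25, 7.23, 7.32, 7.41, 7.23, 7.30, 7.24, 7.35, 7.29, 7.21]

avg = mean(cpu)
sem = stdev(cpu) / len(cpu) ** 0.5  # standard error of the mean

print(f"Average CPU usage = {avg:.2f}+-{sem:.2f}s")  # prints "Average CPU usage = 7.28+-0.02s"
```

The same computation on the "before" series yields the quoted 10.22+-0.04s.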
Re: [BUG] Logical replica crash if there was an error in a function.
Hello!

On 02.11.2022 21:02, Tom Lane wrote:
> So I'm now good with the idea of just not failing. I don't like the patch
> as presented though. First, the cfbot is quite rightly complaining about
> the "uninitialized variable" warning it draws. Second, I don't see a good
> reason to tie the change to logical replication in any way. Let's just
> change the Assert to an if(), as attached.

Fully agreed that is the most optimal solution for that case. Thanks!

Surely it's a very rare case, but there was a real segfault on a production
server. Someone came up with the idea to modify a function like
public.test_selector() in repcmd.sql (see above) on the fly, adding to it a
:last_modification: field from the current time and some other parameters
with the help of replace() inside the creation of the rebuild_test()
procedure.

On 03.11.2022 18:29, Tom Lane wrote:
> Amit Kapila writes:
>> LGTM. I don't know if it is a good idea to omit the test case for this
>> scenario. If required, we can reuse the test case from Sawada-San's
>> patch in the email [1].
> I don't think the cost of that test case is justified by the tiny
> probability that it'd ever catch anything. If we were just adding a query
> or two to an existing scenario, that could be okay; but spinning up and
> syncing a whole new primary and standby database is *expensive* when you
> multiply it by the number of times developers and buildfarm animals are
> going to run the tests in the future.
>
> There's also the little issue that I'm not sure it would actually detect
> a problem if we had one. The case is going to fail, and what we want to
> know is just how messily it fails, and I think the TAP infrastructure
> isn't very sensitive to that ... especially if the test isn't even
> checking for specific error messages.

Seems it is possible to do a test without these remarks. The attached test
uses existing nodes and checks the specific error message. Additionally I've
tried to reduce the overall number of nodes previously used in this test in
a similar way.

Would be glad for comments and remarks.

With best wishes,

--
Anton A. Melnikov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

commit 722fa6d8c629eb83b3d11ea88b49bab1f700b48d
Author: Anton A. Melnikov
Date:   Mon Nov 14 08:30:26 2022 +0300

    Add test for syntax error in the function in a logical replication
    worker and combine some test nodes.

diff --git a/src/test/subscription/t/100_bugs.pl b/src/test/subscription/t/100_bugs.pl
index 6247aa7730..76a6c12cae 100644
--- a/src/test/subscription/t/100_bugs.pl
+++ b/src/test/subscription/t/100_bugs.pl
@@ -69,6 +69,9 @@ $node_publisher->wait_for_catchup('sub1');
 
 pass('index predicates do not cause crash');
 
+$node_publisher->safe_psql('postgres',
+	"CHECKPOINT");
+
 $node_publisher->stop('fast');
 $node_subscriber->stop('fast');
 
@@ -81,9 +84,8 @@ $node_subscriber->stop('fast');
 # identity set before accepting updates.  If it did not it would cause
 # an error when an update was attempted.
 
-$node_publisher = PostgreSQL::Test::Cluster->new('publisher2');
-$node_publisher->init(allows_streaming => 'logical');
-$node_publisher->start;
+$node_publisher->rotate_logfile();
+$node_publisher->start();
 
 $node_publisher->safe_psql('postgres',
 	"CREATE PUBLICATION pub FOR ALL TABLES");
@@ -102,6 +104,9 @@ is( $node_publisher->psql(
 	'update to unlogged table without replica identity with FOR ALL TABLES publication'
 );
 
+$node_publisher->safe_psql('postgres',
+	"CHECKPOINT");
+
 $node_publisher->stop('fast');
 
 # Bug #16643 - https://postgr.es/m/16643-eaadeb2a1a58d...@postgresql.org
@@ -226,13 +231,13 @@ $node_sub->stop('fast');
 # target table's relcache was not being invalidated. This leads to skipping
 # UPDATE/DELETE operations during apply on the subscriber side as the columns
 # required to search corresponding rows won't get logged.
 
-$node_publisher = PostgreSQL::Test::Cluster->new('publisher3');
-$node_publisher->init(allows_streaming => 'logical');
-$node_publisher->start;
-$node_subscriber = PostgreSQL::Test::Cluster->new('subscriber3');
-$node_subscriber->init(allows_streaming => 'logical');
-$node_subscriber->start;
+$node_publisher->rotate_logfile();
+$node_publisher->start();
+
+$node_subscriber->rotate_logfile();
+$node_subscriber->rotate_logfile(); # call twice to synchronize lognames between master and replica
+$node_subscriber->start();
 
 $node_publisher->safe_psql('postgres',
 	"CREATE TABLE tab_replidentity_index(a int not null, b int not null)");
@@ -296,7 +301,89 @@ is( $node_subscriber->safe_psql(
 	qq(-1|1),
 	"update works with REPLICA IDENTITY");
 
+$node_publisher->safe_psql('postgres',
+	"CHECKPOINT");
+
 $node_publisher->stop('fast');
 $node_subscriber->stop('fast');
 
+# https://www.postgresql.org/message-id/flat/adf0452f-8c6b-7def-d35e-ab516c80
Re: [PATCH] Backport perl tests for pg_upgrade from 322becb60
Add backport to REL_14_STABLE. Unlike the 13th version's one, there are still
some differences in the final dumps, e.g. during the upgrade test 12->14.
Similar differences are present during the upgrade test 12->master. Achieving
zero dump diffs needs additional work, now in progress.

With best regards,

--
Anton A. Melnikov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

diff --git a/src/bin/pg_upgrade/.gitignore b/src/bin/pg_upgrade/.gitignore
index 2d3bfeaa50..05200a09f1 100644
--- a/src/bin/pg_upgrade/.gitignore
+++ b/src/bin/pg_upgrade/.gitignore
@@ -1,9 +1,4 @@
 /pg_upgrade
 # Generated by test suite
-/pg_upgrade_internal.log
-/delete_old_cluster.sh
-/delete_old_cluster.bat
-/reindex_hash.sql
-/loadable_libraries.txt
 /log/
 /tmp_check/
diff --git a/src/bin/pg_upgrade/Makefile b/src/bin/pg_upgrade/Makefile
index 44d06be5a6..105199f182 100644
--- a/src/bin/pg_upgrade/Makefile
+++ b/src/bin/pg_upgrade/Makefile
@@ -28,6 +28,10 @@ OBJS = \
 override CPPFLAGS := -DDLSUFFIX=\"$(DLSUFFIX)\" -I$(srcdir) -I$(libpq_srcdir) $(CPPFLAGS)
 LDFLAGS_INTERNAL += -L$(top_builddir)/src/fe_utils -lpgfeutils $(libpq_pgport)
 
+# required for 002_pg_upgrade.pl
+REGRESS_SHLIB=$(abs_top_builddir)/src/test/regress/regress$(DLSUFFIX)
+export REGRESS_SHLIB
+
 all: pg_upgrade
 
 pg_upgrade: $(OBJS) | submake-libpq submake-libpgport submake-libpgfeutils
@@ -44,22 +48,10 @@ uninstall:
 
 clean distclean maintainer-clean:
 	rm -f pg_upgrade$(X) $(OBJS)
-	rm -rf delete_old_cluster.sh log/ tmp_check/ \
-	       loadable_libraries.txt reindex_hash.sql \
-	       pg_upgrade_dump_globals.sql \
-	       pg_upgrade_dump_*.custom pg_upgrade_*.log
-
-# When $(MAKE) is present, make automatically infers that this is a
-# recursive make. which is not actually what we want here, as that
-# e.g. prevents output synchronization from working (as make thinks
-# that the subsidiary make knows how to deal with that itself, but
-# we're invoking a shell script that doesn't know). Referencing
-# $(MAKE) indirectly avoids that behaviour.
-# See https://www.gnu.org/software/make/manual/html_node/MAKE-Variable.html#MAKE-Variable
-NOTSUBMAKEMAKE=$(MAKE)
+	rm -rf log/ tmp_check/
 
-check: test.sh all temp-install
-	MAKE=$(NOTSUBMAKEMAKE) $(with_temp_install) bindir=$(abs_top_builddir)/tmp_install/$(bindir) EXTRA_REGRESS_OPTS="$(EXTRA_REGRESS_OPTS)" $(SHELL) $<
+check:
+	$(prove_check)
 
-# installcheck is not supported because there's no meaningful way to test
-# pg_upgrade against a single already-running server
+installcheck:
+	$(prove_installcheck)
diff --git a/src/bin/pg_upgrade/TESTING b/src/bin/pg_upgrade/TESTING
index e69874b42d..200ce9d92b 100644
--- a/src/bin/pg_upgrade/TESTING
+++ b/src/bin/pg_upgrade/TESTING
@@ -2,25 +2,22 @@ THE SHORT VERSION
 
 On non-Windows machines, you can execute the testing process
-described below by running
+described below by running the following command in this directory:
 	make check
-in this directory.  This will run the shell script test.sh, performing
-an upgrade from the version in this source tree to a new instance of
-the same version.
 
-To test an upgrade from a different version, you must have a built
-source tree for the old version as well as this version, and you
-must have done "make install" for both versions.  Then do:
+This will run the TAP tests to run pg_upgrade, performing an upgrade
+from the version in this source tree to a new instance of the same
+version.
 
-export oldsrc=...somewhere/postgresql	(old version's source tree)
-export oldbindir=...otherversion/bin	(old version's installed bin dir)
-export bindir=...thisversion/bin	(this version's installed bin dir)
-export libdir=...thisversion/lib	(this version's installed lib dir)
-sh test.sh
-
-In this case, you will have to manually eyeball the resulting dump
-diff for version-specific differences, as explained below.
+Testing an upgrade from a different version requires a dump to set up
+the contents of this instance, with its set of binaries.  The following
+variables are available to control the test (see DETAILS below about
+the creation of the dump):
+export olddump=...somewhere/dump.sql	(old version's dump)
+export oldinstall=...otherversion/	(old version's install base path)
 
+Finally, the tests can be done by running
+	make check
 
 DETAILS
 
@@ -29,61 +26,22 @@ The most effective way to test pg_upgrade, aside from testing on user
 data, is by upgrading the PostgreSQL regression database.
 
 This testing process first requires the creation of a valid regression
-database dump.  Such files contain most database features and are
-specific to each major version of Postgres.
+database dump that can be then used for $olddump.  Such files contain
+most database features and are specific to each major version of Postgres.
 
-Here are the steps needed to
Re: Question about "compound" queries.
Thanks a lot for the reply and timely help!

On 25.10.2022 01:36, David G. Johnston wrote:
> I suspect they came about out of simplicity - being able to simply take a
> text file with a bunch of SQL commands in a script and send them as-is to
> the server without any client-side parsing and let the server just deal
> with it. It works because the system needs to do those kinds of things
> anyway so, why not make it user-facing, even if most uses would find its
> restrictions makes it undesirable to use.
> David J.

All the best,

--
Anton A. Melnikov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
Question about "compound" queries.
Hello!

Please, could somebody explain what the "compound" queries were created for?
Maybe I'm calling them wrong. It's about queries like:

SELECT 1 + 2 \; SELECT 2.0 AS "float" \; SELECT 1;

Such queries can neither be prepared nor used in the extended protocol; both
attempts fail with:
ERROR: cannot insert multiple commands into a prepared statement

What are their advantages? And what is the proper name for such queries?
"Compound" or something else? Would be very grateful for clarification.

Best wishes,

--
Anton A. Melnikov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
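For illustration only (this is not PostgreSQL's actual lexer): in psql the \; separator keeps the statements in one string, which is shipped to the server over the simple query protocol; the server then splits the string into individual statements and runs them in turn. A naive top-level split sketches the idea, ignoring quoting, dollar-quoting, and comments, which the real parser of course handles:

```python
def split_compound_query(sql: str) -> list[str]:
    """Naive split of a multi-statement string into single statements.
    The real server lexes the string properly; this toy version does not
    handle semicolons inside quotes, dollar-quotes, or comments."""
    return [stmt.strip() for stmt in sql.split(';') if stmt.strip()]


stmts = split_compound_query('SELECT 1 + 2; SELECT 2.0 AS "float"; SELECT 1;')
assert stmts == ['SELECT 1 + 2', 'SELECT 2.0 AS "float"', 'SELECT 1']
```

Prepared statements and the extended protocol instead require exactly one statement per Parse message, which is why the multi-statement form is rejected there.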
Re: effective_multixact_freeze_max_age issue
Hello!

On 18.10.2022 20:56, Peter Geoghegan wrote:
> The term "removable cutoff" is how VACUUM VERBOSE reports OldestXmin. I
> think that it's good to use the same terminology here.

Thanks for the explanation! Firstly, exactly this term confused me. Sure,
the same terminology makes it all easier to understand.

>> Could you clarify this moment please? Would be very grateful.
> While this WARNING is triggered when a threshold controlled by
> autovacuum_freeze_max_age is crossed, it's not just a problem with
> freezing. It's convenient to use autovacuum_freeze_max_age to represent "a
> wildly excessive number of XIDs for OldestXmin to be held back by, no
> matter what". In practice it is usually already a big problem when
> OldestXmin is held back by far fewer XIDs than that, but it's hard to
> reason about when exactly we need to consider that a problem. However, we
> can at least be 100% sure of real problems when OldestXmin age reaches
> autovacuum_freeze_max_age. There is no longer any doubt that we need to
> warn the user, since antiwraparound autovacuum cannot work as designed at
> that point. But the WARNING is nevertheless not primarily (or not
> exclusively) about not being able to freeze. It's also about not being
> able to remove bloat.
>
> Freezing can be thought of as roughly the opposite process to removing
> dead tuples deleted by now committed transactions. OldestXmin is the
> cutoff both for removing dead tuples (which we want to get rid of
> immediately), and freezing live all-visible tuples (which we want to keep
> long term). FreezeLimit is usually 50 million XIDs before OldestXmin (the
> freeze_min_age default), but that's just how we implement lazy freezing,
> which is just an optimization.

That's clear. Thanks a lot!

>> As a variant, may be split these checks for the freeze cutoffs and the
>> oldest xmins for clarity? The patch attached tries to do this.
> I don't think that this is an improvement. For one thing the FreezeLimit
> cutoff will have been held back (relative to nextXID-wise table age) by
> more than the freeze_min_age setting for a long time before this WARNING
> is hit -- so we're not going to show the WARNING in most cases where
> freeze_min_age has been completely ignored (it must be ignored in extreme
> cases because FreezeLimit must always be <= OldestXmin). Also, the
> proposed new WARNING is only seen when we're bound to also see the
> existing OldestXmin WARNING already. Why have 2 WARNINGs about exactly
> the same problem?

I didn't understand this moment. If the FreezeLimit is always older than
OldestXmin or equal to it according to:

> FreezeLimit is usually 50 million XIDs before OldestXmin (the
> freeze_min_age default),

can't there be a situation like this?

 past <---FreezeLimit----safeOldestXmin----oldestXmin----nextXID---> future
          |<--------- freeze_min_age --------->|
                       |<---- autovacuum_freeze_max_age ---->|

In that case the existing OldestXmin WARNING will not fire while the new one
will. As the FreezeLimit is only an optimization, it's likely not a warning
but a notice message before the OldestXmin WARNING and possible real problems
in the future. Maybe it can be useful in such a form?

With best wishes,

--
Anton A. Melnikov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
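The scenario discussed above (FreezeLimit already older than safeOldestXmin while oldestXmin is still recent) can be checked with a toy linear-XID model. This is only an illustration on a linear axis: real XIDs wrap around and vacuum_set_xid_limits() takes more inputs, and the names below are illustrative:

```python
def vacuum_warnings(next_xid, oldest_xmin, freeze_limit,
                    autovacuum_freeze_max_age=200_000_000):
    """Return which 'far in the past' warnings would fire in a toy
    linear-XID model (wraparound ignored)."""
    assert freeze_limit <= oldest_xmin  # FreezeLimit must be <= OldestXmin
    safe_oldest_xmin = next_xid - autovacuum_freeze_max_age
    fired = []
    if oldest_xmin < safe_oldest_xmin:
        fired.append("oldest xmin is far in the past")
    if freeze_limit < safe_oldest_xmin:
        fired.append("cutoff for freezing tuples is far in the past")
    return fired


# The situation from the diagram: FreezeLimit < safeOldestXmin < oldestXmin.
# The existing OldestXmin warning does not fire, but the proposed one would.
fired = vacuum_warnings(next_xid=1_000_000_000,
                        oldest_xmin=900_000_000,
                        freeze_limit=700_000_000)
assert fired == ["cutoff for freezing tuples is far in the past"]
```

When oldest_xmin itself falls behind safe_oldest_xmin, both warnings fire, which is the overlap Peter points out.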
Re: effective_multixact_freeze_max_age issue
Hello!

On 31.08.2022 21:38, Peter Geoghegan wrote:
> On Tue, Aug 30, 2022 at 8:56 PM Nathan Bossart wrote:
>> LGTM
> Pushed, thanks.

In this commit
https://github.com/postgres/postgres/commit/c3ffa731a5f99c4361203015ce2219d209fea94c
there are checks whether oldestXmin and oldestMxact haven't become too far in
the past. But the corresponding error messages also say some different things
about the 'cutoff for freezing tuples', i.e. about checks for other
variables: freezeLimit and multiXactCutoff. See:
https://github.com/postgres/postgres/commit/c3ffa731a5f99c4361203015ce2219d209fea94c?diff=split#diff-795a3938e3bed9884d426bd010670fe505580732df7d7012fad9edeb9df54badR1075
and
https://github.com/postgres/postgres/commit/c3ffa731a5f99c4361203015ce2219d209fea94c?diff=split#diff-795a3938e3bed9884d426bd010670fe505580732df7d7012fad9edeb9df54badR1080

It's interesting that prior to this commit, checks were made for the freeze
limits, but the error messages were talking about oldestXmin and oldestMxact.
Could you clarify this moment please? Would be very grateful.

As a variant, may be split these checks for the freeze cutoffs and the
oldest xmins for clarity? The patch attached tries to do this.

--
Anton A. Melnikov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

commit f27ea4e61a7680452b56a7c11b1dcab1c0b81c6b
Author: Anton A. Melnikov
Date:   Fri Oct 14 11:57:22 2022 +0300

    Split 'too far in the past checks' in vacuum_set_xid_limits().

diff --git a/src/backend/commands/vacuum.c b/src/backend/commands/vacuum.c
index 7ccde07de9..0bbeafba49 100644
--- a/src/backend/commands/vacuum.c
+++ b/src/backend/commands/vacuum.c
@@ -1072,14 +1072,23 @@ vacuum_set_xid_limits(Relation rel,
 		safeOldestMxact = FirstMultiXactId;
 	if (TransactionIdPrecedes(*oldestXmin, safeOldestXmin))
 		ereport(WARNING,
-				(errmsg("cutoff for removing and freezing tuples is far in the past"),
+				(errmsg("oldest xmin is far in the past"),
 				 errhint("Close open transactions soon to avoid wraparound problems.\n"
 						 "You might also need to commit or roll back old prepared transactions, or drop stale replication slots.")));
 	if (MultiXactIdPrecedes(*oldestMxact, safeOldestMxact))
 		ereport(WARNING,
-				(errmsg("cutoff for freezing multixacts is far in the past"),
+				(errmsg("oldest multixact is far in the past"),
+				 errhint("Close open transactions with multixacts soon to avoid wraparound problems.\n")));
+
+	if (TransactionIdPrecedes(*freezeLimit, safeOldestXmin))
+		ereport(WARNING,
+				(errmsg("cutoff for freezing tuples is far in the past"),
 				 errhint("Close open transactions soon to avoid wraparound problems.\n"
 						 "You might also need to commit or roll back old prepared transactions, or drop stale replication slots.")));
+	if (MultiXactIdPrecedes(*multiXactCutoff, safeOldestMxact))
+		ereport(WARNING,
+				(errmsg("cutoff for freezing multixacts is far in the past"),
+				 errhint("Close open transactions with multixacts soon to avoid wraparound problems.\n")));
 
 	/*
 	 * Finally, figure out if caller needs to do an aggressive VACUUM or not.
Re: [BUG] Logical replica crash if there was an error in a function.
Hello! Thanks for the reply!

On 24.09.2022 20:27, Tom Lane wrote:
I think you're solving the problem in the wrong place. The real issue is why are we not setting up ActivePortal correctly when running user-defined code in a logrep worker?

During an ordinary query from a backend, ActivePortal becomes defined in PortalRun, and then AfterTriggerEndQuery starts with a non-NULL ActivePortal after ExecutorFinish. When the logrep worker is applying messages there are neither PortalStart nor PortalRun calls, and AfterTriggerEndQuery starts with an undefined ActivePortal after finish_edata(). Maybe it's normal behavior?

There is other code that expects that to be set, eg EnsurePortalSnapshotExists.

When the logrep worker is applying messages it doesn't have to use ActivePortal in EnsurePortalSnapshotExists, because an ActiveSnapshot is already installed: it is set at the beginning of each transaction in the begin_replication_step call.

On the other hand, function_parse_error_transpose() tries to get the original query text (INSERT INTO test VALUES ('1') in our case) from the ActivePortal to clarify the location of the error. But in the logrep worker there is no way to restore the original query text from the logrep message; there is only a 'zipped' query equivalent to the original. So any function_parse_error_transpose() call seems to be useless in the logrep worker, and it looks like we can simply omit the match_prosrc_to_query() call there. The attached patch does it.

Best wishes,

-- 
Anton A. Melnikov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

commit bfaa02ac7a7cbeb793747be71962a7799c60c21c
Author: Anton A. Melnikov
Date:   Sun Oct 9 11:56:20 2022 +0300

    Fix logical replica crash if there was an error in a user function.
diff --git a/src/backend/catalog/pg_proc.c b/src/backend/catalog/pg_proc.c
index a9fe45e347..1e8f1b1097 100644
--- a/src/backend/catalog/pg_proc.c
+++ b/src/backend/catalog/pg_proc.c
@@ -36,6 +36,7 @@
 #include "parser/parse_coerce.h"
 #include "parser/parse_type.h"
 #include "pgstat.h"
+#include "replication/logicalworker.h"
 #include "rewrite/rewriteHandler.h"
 #include "tcop/pquery.h"
 #include "tcop/tcopprot.h"
@@ -1034,12 +1035,20 @@ function_parse_error_transpose(const char *prosrc)
 		return false;
 	}
 
-	/* We can get the original query text from the active portal (hack...) */
-	Assert(ActivePortal && ActivePortal->status == PORTAL_ACTIVE);
-	queryText = ActivePortal->sourceText;
+	/*
+	 * In the logical replication worker there is no way to restore original
+	 * query text from the logical replication message. There is only 'zipped'
+	 * query equivalent to the original text.
+	 */
+	if (!IsLogicalWorker())
+	{
+		/* We can get the original query text from the active portal (hack...) */
+		Assert(ActivePortal && ActivePortal->status == PORTAL_ACTIVE);
+		queryText = ActivePortal->sourceText;
 
-	/* Try to locate the prosrc in the original text */
-	newerrposition = match_prosrc_to_query(prosrc, queryText, origerrposition);
+		/* Try to locate the prosrc in the original text */
+		newerrposition = match_prosrc_to_query(prosrc, queryText, origerrposition);
+	}
 
 	if (newerrposition > 0)
 	{
Re: May be BUG. Periodic burst growth of the checkpoint_req counter on replica.
Hello! Thank you very much for your feedback and essential remarks.

On 07.09.2022 10:39, Kyotaro Horiguchi wrote:
It lets XLogPageRead run the same check as CreateRestartPoint does, so it basically works (it is forgetting a lock, though; the reason for omitting the lock in CreateRestartPoint is that it knows the checkpointer is the only updater of the shared variable). I'm not sure I like the code duplication. I'm not sure we need to fix that, but if we do, I would implement IsNewCheckPointWALRecs() using XLogCtl->RedoRecPtr and XLogCtl->lastCheckPoint.redo instead, since they are protected by the same lock and work in a more correct way, that is, they can avoid restartpoint requests while the last checkpoint is running. And I would rename it to RestartPointAvailable() or something like that.

A corrected patch is attached (v2-0001-Fix-burst-checkpoint_req-growth.patch). The access to ControlFile was removed, so the lwlock seems unnecessary. Some logic duplication is still present, and I'm not quite sure it's possible to get rid of it. I would be glad of any suggestions.

Or I might want to add XLogRestartpointNeeded(readSegNo) to reduce the required number of info_lck acquisitions by reading the XLogCtl members at once.

If we place this check into XLogCheckpointNeeded(), this will lead to a double take of info_lck in XLogPageRead() when the restartpoint request is being formed. As it's done now, info_lck is taken less often. It seems I probably didn't understand your idea; please clarify it for me.

Depends on how we see the counter value. I think this can be annoying but not a bug. CreateRestartPoint already handles that case.

Yes! It is in fact annoying, as the docs say that checkpoints_req counts "the number of requested checkpoints that have been performed". But really checkpoints_req counts both checkpoint requests and restartpoint requests, which may not be performed, in which case no resources are spent.
The second frightening factor is the several times faster growth of the checkpoints_timed counter on the replica vs the primary, due to retries being scheduled every 15 seconds if an attempt to create a restartpoint failed.

Here is a patch that leaves all the logic as is, but adds stats about restartpoints (v1-0001-Add-restartpoint-stats.patch). For instance, for the same period on the primary with this patch:

# SELECT CURRENT_TIME; select * from pg_stat_bgwriter \gx
 current_time
 00:19:15.794561+03
(1 row)
-[ RECORD 1 ]-------+---
checkpoints_timed   | 4
checkpoints_req     | 10
restartpoints_timed | 0
restartpoints_req   | 0
restartpoints_done  | 0

On the replica:

# SELECT CURRENT_TIME; select * from pg_stat_bgwriter \gx
 current_time
 00:19:11.363009+03
(1 row)
-[ RECORD 1 ]-------+---
checkpoints_timed   | 0
checkpoints_req     | 0
restartpoints_timed | 42
restartpoints_req   | 67
restartpoints_done  | 10

Only the counters checkpoints_timed, checkpoints_req and restartpoints_done give an indication of resource-intensive operations. Without this patch, the user would see on the replica something like this:

checkpoints_timed | 42
checkpoints_req   | 109

With best regards,

-- 
Anton A. Melnikov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

commit d5a58d8585692be0d24db9414859088e3e7f7dad
Author: Anton A. Melnikov
Date:   Tue Sep 6 12:18:56 2022 +0300

    Remove burst growth of the checkpoint_req on replica.

    with corrections according to comment:
    https://www.postgresql.org/message-id/20220907.163946.2214338139097492905.horikyota.ntt%40gmail.com

diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index 81d339d57d..3ed1a87943 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -9054,3 +9054,20 @@ SetWalWriterSleeping(bool sleeping)
 	XLogCtl->WalWriterSleeping = sleeping;
 	SpinLockRelease(&XLogCtl->info_lck);
 }
+
+/*
+ * Check if a new checkpoint WAL record has been received since the
+ * last restartpoint.
+ * So it is possible to create a new one.
+ */
+bool
+RestartPointAvailable(void)
+{
+	bool		result = false;
+
+	SpinLockAcquire(&XLogCtl->info_lck);
+	if (!XLogRecPtrIsInvalid(XLogCtl->lastCheckPointRecPtr)
+		&& XLogCtl->lastCheckPoint.redo > XLogCtl->RedoRecPtr)
+		result = true;
+	SpinLockRelease(&XLogCtl->info_lck);
+
+	return result;
+}
diff --git a/src/backend/access/transam/xlogrecovery.c b/src/backend/access/transam/xlogrecovery.c
index b41e682664..7236e0f0a4 100644
--- a/src/backend/access/transam/xlogrecovery.c
+++ b/src/backend/access/transam/xlogrecovery.c
@@ -3199,7 +3199,15 @@ XLogPageRead(XLogReaderState *xlogreader, XLogRecPtr targetPagePtr, int reqLen,
 	{
 		(void) GetRedoRecPtr();
 		if (XLogCheckpointNeeded(readSegNo))
-			RequestCheckpoint(CHECKPOINT_CAUSE_XLOG);
+		{
+			/*
+			 *
Re: [BUG] Logical replica crash if there was an error in a function.
Hello! Added a TAP test for this case.

On 30.08.2022 10:09, Anton A. Melnikov wrote:
Hello! The patch was rebased on current master. And here is a simplified crash reproduction:
1) On primary with 'wal_level = logical' execute:
CREATE TABLE public.test (id int NOT NULL, val integer);
CREATE PUBLICATION test_pub FOR TABLE test;
2) On the replica, substitute the primary's port in the attached repcmd.sql and execute it: psql -f repcmd.sql
3) On master execute the command: INSERT INTO test VALUES ('1');

With best regards,

-- 
Anton A. Melnikov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

commit d539e1e36ef7af33e1a89eaee00183739c716797
Author: Anton A. Melnikov
Date:   Sun Aug 21 18:27:44 2022 +0300

    Fix logical replica crash if there was an error in a function.

diff --git a/src/backend/catalog/pg_proc.c b/src/backend/catalog/pg_proc.c
index a9fe45e347..1381fae575 100644
--- a/src/backend/catalog/pg_proc.c
+++ b/src/backend/catalog/pg_proc.c
@@ -1007,8 +1007,9 @@ sql_function_parse_error_callback(void *arg)
  * anonymous-block handler, not only for SQL-language functions.
  * It is assumed that the syntax error position is initially relative to the
  * function body string (as passed in). If possible, we adjust the position
- * to reference the original command text; if we can't manage that, we set
- * up an "internal query" syntax error instead.
+ * to reference the original command text; if we can't manage that or
+ * can't get the original command text when ActivePortal is not defined,
+ * we set up an "internal query" syntax error instead.
  *
  * Returns true if a syntax error was processed, false if not.
  */
@@ -1016,7 +1017,7 @@ bool
 function_parse_error_transpose(const char *prosrc)
 {
 	int			origerrposition;
-	int			newerrposition;
+	int			newerrposition = 0;
 	const char *queryText;
 
 	/*
@@ -1034,12 +1035,15 @@ function_parse_error_transpose(const char *prosrc)
 		return false;
 	}
 
-	/* We can get the original query text from the active portal (hack...)
 */
-	Assert(ActivePortal && ActivePortal->status == PORTAL_ACTIVE);
-	queryText = ActivePortal->sourceText;
+	if (ActivePortal)
+	{
+		/* We can get the original query text from the active portal (hack...) */
+		Assert(ActivePortal->status == PORTAL_ACTIVE);
+		queryText = ActivePortal->sourceText;
 
-	/* Try to locate the prosrc in the original text */
-	newerrposition = match_prosrc_to_query(prosrc, queryText, origerrposition);
+		/* Try to locate the prosrc in the original text */
+		newerrposition = match_prosrc_to_query(prosrc, queryText, origerrposition);
+	}
 
 	if (newerrposition > 0)
 	{
@@ -1052,7 +1056,8 @@ function_parse_error_transpose(const char *prosrc)
 	else
 	{
 		/*
-		 * If unsuccessful, convert the position to an internal position
+		 * If unsuccessful or ActivePortal not defined and original command
+		 * text is unreachable, convert the position to an internal position
 		 * marker and give the function text as the internal query.
 		 */
 		errposition(0);
diff --git a/src/test/recovery/t/034_logical_replica_on_error.pl b/src/test/recovery/t/034_logical_replica_on_error.pl
new file mode 100644
index 00..380ad74590
--- /dev/null
+++ b/src/test/recovery/t/034_logical_replica_on_error.pl
@@ -0,0 +1,105 @@
+# Copyright (c) 2022, Postgres Professional
+
+# There was an assertion if function text contains an error. See PGPRO-7025
+# This file has a prefix (120_) to prevent prefix collision with
+# upstream test files.
+
+use strict;
+use warnings;
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use Test::More tests => 2;
+
+# Initialize primary node
+my $node_primary = PostgreSQL::Test::Cluster->new('primary');
+$node_primary->init(allows_streaming => 1);
+$node_primary->append_conf(
+	'postgresql.conf', qq(
+wal_level = logical
+));
+$node_primary->start;
+
+$node_primary->safe_psql('postgres',
+	'CREATE TABLE public.test (id int NOT NULL, val integer);');
+
+$node_primary->safe_psql('postgres',
+	'CREATE PUBLICATION test_pub FOR TABLE test;');
+
+# Initialize logical replica node
+my $node_replica = PostgreSQL::Test::Cluster->new('replica');
+$node_replica->init(has_streaming => 1,
+	has_restoring => 1);
+$node_replica->start;
+
+$node_replica->safe_psql('postgres',
+	'CREATE TABLE test (id int NOT NULL, val integer);');
+
+$node_replica->safe_psql('postgres', q{
+	create or replace procedure rebuild_test(
+	) as
+	$body$
+	declare
+		l_code text:= E'create or replace function public.test_selector(
+	) returns setof public.test as
+	\$body\$
+		select * from error_name
+	\$body\$ language sql;';
+	begin
+		execute l_code;
+		perform public.test_selector();
+	end
+	$body$ language plpgsql;
+});
+
+$node_replica->safe_psql('postgres', '
+	create or replace function test_trg()
+	returns trigger as
+	$body$
+	declare
+	begin
+		call public.rebuild_test();
+
May be BUG. Periodic burst growth of the checkpoint_req counter on replica.
Hello!

I found periodic burst growth of the checkpoint_req counter on a replica by 20-30 units after a large insert (~350 MB) on the master. Reproduction on master and replica with the default conf:
1) Execute the command "insert into test values (generate_series(1,1E7));". This leads to the table growing by about 350 MB during about 15 secs (on my PC).
2) The WAL records start coming to the replica, and when their number exceeds a certain limit, a request is emitted to the checkpointer process to create a restartpoint on the replica, and checkpoint_req is incremented. With default settings, this limit is 42 segments.
3) Restartpoint creation fails, because a new restartpoint can only be created if the replica has received new WAL records about a checkpoint since the moment of the previous restartpoint. But there were no such records.
4) When the next WAL segment is received by the replica, the next request to create a restartpoint is generated, and so on.
5) Finally, a WAL record about the checkpoint arrives on the replica, the restartpoint is created and the growth of checkpoint_req stops.

The described process can be observed in the log with additional debugging; see insert_1E7_once.log attached. This log is for v13, but master has the same behavior. Can we treat such behavior as a bug? If so, it seems possible to check whether creating a restartpoint is obviously impossible before sending a request, and not send it at all if so. The attached patch tries to fix it.

With best regards,

-- 
Anton A.
Melnikov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

2022-09-04 21:09:45.160 MSK [2424110] LOG: Start CFS version 0.54 supported compression algorithms pglz,zlib encryption disabled GC enabled
2022-09-04 21:09:45.168 MSK [2424110] LOG: starting PostgreSQL 13.6 on x86_64-pc-linux-gnu, compiled by gcc (Ubuntu 9.4.0-1ubuntu1~20.04.1) 9.4.0, 64-bit
2022-09-04 21:09:45.168 MSK [2424110] LOG: listening on IPv4 address "0.0.0.0", port 54131
2022-09-04 21:09:45.168 MSK [2424110] LOG: listening on IPv6 address "::", port 54131
2022-09-04 21:09:45.177 MSK [2424110] LOG: listening on Unix socket "/tmp/.s.PGSQL.54131"
2022-09-04 21:09:45.187 MSK [2424150] LOG: database system was interrupted; last known up at 2022-09-04 21:09:44 MSK
2022-09-04 21:09:45.282 MSK [2424150] LOG: entering standby mode
2022-09-04 21:09:45.292 MSK [2424150] LOG: redo starts at 0/228
2022-09-04 21:09:45.296 MSK [2424150] LOG: consistent recovery state reached at 0/2000198
2022-09-04 21:09:45.296 MSK [2424110] LOG: database system is ready to accept read only connections
2022-09-04 21:09:45.307 MSK [2424233] LOG: started streaming WAL from primary at 0/300 on timeline 1
2022-09-04 21:09:45.341 MSK [2424259] LOG: Start 1 background garbage collection workers for CFS
2022-09-04 21:10:26.653 MSK [2424150] LOG: &&&&&&&&&&&&&&&&!!!Request a restartpoint if we've replayed too much xlog!!! readSegNo: 43, old_segno: 2, CheckPointSegments: 42
2022-09-04 21:10:26.653 MSK [2424224] LOG: Checkpoint requested by wal. Checkpoints already started: 0, done: 0, failed: 0. Now: 41. Last_chkpt_time: 0. Elapsed secs: 41. CheckPointTimeout: 300. CheckPointWarning: 30
2022-09-04 21:10:26.653 MSK [2424224] LOG: !!!This is restartpoint!!!
2022-09-04 21:10:26.669 MSK [2424224] LOG: !!!Restartpoint NOT performed!!!
2022-09-04 21:10:28.926 MSK [2424150] LOG: &&&&&&&&&&&&&&&&!!!Request a restartpoint if we've replayed too much xlog!!! readSegNo: 44, old_segno: 2, CheckPointSegments: 42
2022-09-04 21:10:28.926 MSK [2424224] LOG: Checkpoint requested by wal. Checkpoints already started: 1, done: 1, failed: 0. Now: 43. Last_chkpt_time: -244. Elapsed secs: 287. CheckPointTimeout: 300. CheckPointWarning: 30
2022-09-04 21:10:28.926 MSK [2424224] LOG: !!!This is restartpoint!!!
2022-09-04 21:10:29.176 MSK [2424224] LOG: !!!Restartpoint NOT performed!!!
2022-09-04 21:10:30.014 MSK [2424150] LOG: &&&&&&&&&&&&&&&&!!!Request a restartpoint if we've replayed too much xlog!!! readSegNo: 45, old_segno: 2, CheckPointSegments: 42
2022-09-04 21:10:30.014 MSK [2424224] LOG: Checkpoint requested by wal. Checkpoints already started: 2, done: 2, failed: 0. Now: 45. Last_chkpt_time: -242. Elapsed secs: 287. CheckPointTimeout: 300. CheckPointWarning: 30
2022-09-04 21:10:30.014 MSK [2424224] LOG: !!!This is restartpoint!!!
2022-09-04 21:10:30.058 MSK [2424224] LOG: !!!Restartpoint NOT performed!!!
2022-09-04 21:10:45.072 MSK [2424224] LOG: Checkpoint requested by time! Now: 60. Last_chkpt_time: -240. Elapsed secs: 300. CheckPointTimeout: 300. CheckPointWarning: 30
2022-09-04 21:10:45.072 MSK [2424224] LOG: !!!This is restartpoint!!!
2022-09-04 21:10:45.077
Re: [BUG] Logical replica crash if there was an error in a function.
Hello! The patch was rebased on current master. And here is a simplified crash reproduction:
1) On primary with 'wal_level = logical' execute:
CREATE TABLE public.test (id int NOT NULL, val integer);
CREATE PUBLICATION test_pub FOR TABLE test;
2) On the replica, substitute the primary's port in the attached repcmd.sql and execute it: psql -f repcmd.sql
3) On master execute the command: INSERT INTO test VALUES ('1');

With best regards,

-- 
Anton A. Melnikov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

commit 4de66e1b1ffaeaacdd72f3e72789ca05b114476b
Author: Anton A. Melnikov
Date:   Sun Aug 21 18:27:44 2022 +0300

    Fix logical replica crash if there was an error in a function.

diff --git a/src/backend/catalog/pg_proc.c b/src/backend/catalog/pg_proc.c
index a9fe45e347..1381fae575 100644
--- a/src/backend/catalog/pg_proc.c
+++ b/src/backend/catalog/pg_proc.c
@@ -1007,8 +1007,9 @@ sql_function_parse_error_callback(void *arg)
  * anonymous-block handler, not only for SQL-language functions.
  * It is assumed that the syntax error position is initially relative to the
  * function body string (as passed in). If possible, we adjust the position
- * to reference the original command text; if we can't manage that, we set
- * up an "internal query" syntax error instead.
+ * to reference the original command text; if we can't manage that or
+ * can't get the original command text when ActivePortal is not defined,
+ * we set up an "internal query" syntax error instead.
  *
  * Returns true if a syntax error was processed, false if not.
  */
@@ -1016,7 +1017,7 @@ bool
 function_parse_error_transpose(const char *prosrc)
 {
 	int			origerrposition;
-	int			newerrposition;
+	int			newerrposition = 0;
 	const char *queryText;
 
 	/*
@@ -1034,12 +1035,15 @@ function_parse_error_transpose(const char *prosrc)
 		return false;
 	}
 
-	/* We can get the original query text from the active portal (hack...)
 */
-	Assert(ActivePortal && ActivePortal->status == PORTAL_ACTIVE);
-	queryText = ActivePortal->sourceText;
+	if (ActivePortal)
+	{
+		/* We can get the original query text from the active portal (hack...) */
+		Assert(ActivePortal->status == PORTAL_ACTIVE);
+		queryText = ActivePortal->sourceText;
 
-	/* Try to locate the prosrc in the original text */
-	newerrposition = match_prosrc_to_query(prosrc, queryText, origerrposition);
+		/* Try to locate the prosrc in the original text */
+		newerrposition = match_prosrc_to_query(prosrc, queryText, origerrposition);
+	}
 
 	if (newerrposition > 0)
 	{
@@ -1052,7 +1056,8 @@ function_parse_error_transpose(const char *prosrc)
 	else
 	{
 		/*
-		 * If unsuccessful, convert the position to an internal position
+		 * If unsuccessful or ActivePortal not defined and original command
+		 * text is unreachable, convert the position to an internal position
 		 * marker and give the function text as the internal query.
 		 */
 		errposition(0);

repcmd.sql Description: application/sql
Re: [BUG] Logical replica crash if there was an error in a function.
Hello!

On 21.08.2022 17:33, Anton A. Melnikov wrote:
Hello! Here is a fix for the bug first described in: https://www.postgresql.org/message-id/flat/adf0452f-8c6b-7def-d35e-ab516c80088e%40inbox.ru

Sorry, there was a wrong patch in the first letter. Here is the right version.

With best regards,

-- 
Anton A. Melnikov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

commit 9bff84bda455609820c259bdc47d200bebcba7ab
Author: Anton A. Melnikov
Date:   Sun Aug 21 18:27:44 2022 +0300

    Fix logical replica crash if there was an error in a function.

diff --git a/src/backend/catalog/pg_proc.c b/src/backend/catalog/pg_proc.c
index a9fe45e347..1381fae575 100644
--- a/src/backend/catalog/pg_proc.c
+++ b/src/backend/catalog/pg_proc.c
@@ -1007,8 +1007,9 @@ sql_function_parse_error_callback(void *arg)
  * anonymous-block handler, not only for SQL-language functions.
  * It is assumed that the syntax error position is initially relative to the
  * function body string (as passed in). If possible, we adjust the position
- * to reference the original command text; if we can't manage that, we set
- * up an "internal query" syntax error instead.
+ * to reference the original command text; if we can't manage that or
+ * can't get the original command text when ActivePortal is not defined,
+ * we set up an "internal query" syntax error instead.
  *
  * Returns true if a syntax error was processed, false if not.
  */
@@ -1016,7 +1017,7 @@ bool
 function_parse_error_transpose(const char *prosrc)
 {
 	int			origerrposition;
-	int			newerrposition;
+	int			newerrposition = 0;
 	const char *queryText;
 
 	/*
@@ -1034,12 +1035,15 @@ function_parse_error_transpose(const char *prosrc)
 		return false;
 	}
 
-	/* We can get the original query text from the active portal (hack...) */
-	Assert(ActivePortal && ActivePortal->status == PORTAL_ACTIVE);
-	queryText = ActivePortal->sourceText;
+	if (ActivePortal)
+	{
+		/* We can get the original query text from the active portal (hack...)
 */
+		Assert(ActivePortal->status == PORTAL_ACTIVE);
+		queryText = ActivePortal->sourceText;
 
-	/* Try to locate the prosrc in the original text */
-	newerrposition = match_prosrc_to_query(prosrc, queryText, origerrposition);
+		/* Try to locate the prosrc in the original text */
+		newerrposition = match_prosrc_to_query(prosrc, queryText, origerrposition);
+	}
 
 	if (newerrposition > 0)
 	{
@@ -1052,7 +1056,8 @@ function_parse_error_transpose(const char *prosrc)
 	else
 	{
 		/*
-		 * If unsuccessful, convert the position to an internal position
+		 * If unsuccessful or ActivePortal not defined and original command
+		 * text is unreachable, convert the position to an internal position
 		 * marker and give the function text as the internal query.
 		 */
 		errposition(0);
[BUG] Logical replica crash if there was an error in a function.
Hello!

Here is a fix for the bug first described in: https://www.postgresql.org/message-id/flat/adf0452f-8c6b-7def-d35e-ab516c80088e%40inbox.ru

Reproduction:
1) On the master with 'wal_level = logical', execute the attached mascmd.sql.
2) On the replica, substitute the correct port in repcmd.sql and execute it.
3) On the master execute the command: INSERT INTO rul_rule_set VALUES ('1', 'name','1','age','true');

The replica will crash with:
FailedAssertion("ActivePortal && ActivePortal->status == PORTAL_ACTIVE", File: "pg_proc.c", Line: 1038, PID: 42894)
in an infinite loop.

After applying this patch the replica will give the correct error message instead of the assertion failure:

2022-08-21 17:08:39.935 MSK [143171] ERROR: relation "rul_rule_set" does not exist at character 172
2022-08-21 17:08:39.935 MSK [143171] QUERY: -- Last modified: 2022-08-21 17:08:39.930842+03 with parameters as ( <<--- skipped by me --- >>> )
2022-08-21 17:08:39.935 MSK [143171] CONTEXT: SQL statement "create or replace function public.rule_set_selector( <<--- skipped by me --- >>> SQL statement "call public.rebuild_rule_set_selector()"
PL/pgSQL function public.rul_rule_set_trg() line 4 at CALL
processing remote data for replication origin "pg_16401" during "INSERT" for replication target relation "public.rul_rule_set" in transaction 741 finished at 0/17BE180

With best regards,

-- 
Anton A. Melnikov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

mascmd.sql Description: application/sql
repcmd.sql Description: application/sql

commit 585d0cd944d952f08f7649d02f1d6b6644e93611
Author: Peter Eisentraut
Date:   Sat Aug 20 20:48:47 2022 +0200

    Remove dummyret definition

    This hasn't been used in a while (last use removed by 50d22de932, and
    before that 84b6d5f359), and since we are now preferring inline
    functions over complex macros, it's unlikely to be needed again.
    Reviewed-by: Daniel Gustafsson
    Discussion: https://www.postgresql.org/message-id/flat/7110ab37-8ddd-437f-905c-6aa6205c6185%40enterprisedb.com

diff --git a/src/include/c.h b/src/include/c.h
index 65e91a6b89..dfc366b026 100644
--- a/src/include/c.h
+++ b/src/include/c.h
@@ -333,16 +333,6 @@
 	_61,_62,_63, N, ...) \
 	(N)
 
-/*
- * dummyret is used to set return values in macros that use ?: to make
- * assignments. gcc wants these to be void, other compilers like char
- */
-#ifdef __GNUC__					/* GNU cc */
-#define dummyret	void
-#else
-#define dummyret	char
-#endif
-
 /*
  * Generic function pointer. This can be used in the rare cases where it's
  * necessary to cast a function pointer to a seemingly incompatible function
Re: [PATCH] Fix pg_upgrade test from v10
Hello!

On 06.07.2022 08:58, Michael Paquier wrote:
That's the kind of things I already proposed on this thread, aimed at improving the coverage, and this takes care of more issues than what's proposed here: https://www.postgresql.org/message-id/flat/Yox1ME99GhAemMq1(at)paquier(dot)xyz
I'll rebase my patch to include fixes for --wal-segsize and --allow-group-access when using versions older than v11.
-- Michael

Thanks! I looked at this thread and tried to apply some changes from it in practice, and found one strange error, which I described in a comment here:
https://www.postgresql.org/message-id/cc7e961a-d5ad-8c6d-574b-478aacc11cf7%40inbox.ru
It would be interesting to know if it occurs on my PC only or somewhere else.

On 05.07.2022 22:08, Justin Pryzby wrote:
..since it tries to apply all the *.patch files to the master branch, one after another. For branches other than master, I suggest to name the patches *.txt or similar. Or, just focus for now on allowing upgrades *to* master. I'm not sure if anyone is interested in patching test.sh in backbranches. I'm not sure, but there may be more interest to backpatch the conversion to TAP (322becb60).

Yes, the backport idea seems interesting. I wrote more about this in a new thread:
https://www.postgresql.org/message-id/e2b1f3a0-4fda-ba72-5535-2d0395b9e68f%40inbox.ru
as the current topic has nothing to do with the backport of TAP tests.

With best regards,

-- 
Anton A. Melnikov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
[PATCH] Backport perl tests for pg_upgrade from 322becb60
Hello!

In a previous discussion (https://www.postgresql.org/message-id/flat/6b05291c-f252-4fae-317d-b50dba69c311%40inbox.ru)

On 05.07.2022 22:08, Justin Pryzby wrote:
I'm not sure if anyone is interested in patching test.sh in backbranches. I'm not sure, but there may be more interest to backpatch the conversion to TAP (322becb60).

As far as I understand from this thread: https://www.postgresql.org/message-id/flat/Yox1ME99GhAemMq1%40paquier.xyz, the aim of the perl version of the pg_upgrade tests is to achieve equality of dumps in most cross-version cases. If so, this is a significant improvement, as previously with test.sh the resulting dumps remained unequal and the user was asked to eyeball them manually during cross upgrades between different major versions. So the backport of the perl tests also seems preferable to me.

The attached patch is a backport to REL_13_STABLE. It has been tested from 9.2+ and gives zero dump diffs from 10+. Also I've backported b34ca595, ba15f161, 95c3a195, 4c4eaf3d and b3983888 to reduce changes in 002_pg_upgrade.pl, and b33259e2 to fix an error when upgrading from 9.6. Dump filtering and some other changes were backported from the thread https://www.postgresql.org/message-id/flat/Yox1ME99GhAemMq1%40paquier.xyz too. I would be very grateful for comments and suggestions before trying to do this for other versions.

I have a question concerning the patch tester. As Justin said, it fails on non-master patches since it tries to apply all the *.patch files to the master branch, one after another; for branches other than master, he suggested naming the patches *.txt or similar. So I gave the patch a .txt extension, but I would really like to run the patch tester on it. Is there any way to do this?

With best regards,

-- 
Anton A.
Melnikov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

diff --git a/src/bin/pg_upgrade/.gitignore b/src/bin/pg_upgrade/.gitignore
index 9edea5c98f..05200a09f1 100644
--- a/src/bin/pg_upgrade/.gitignore
+++ b/src/bin/pg_upgrade/.gitignore
@@ -1,11 +1,4 @@
 /pg_upgrade
 # Generated by test suite
-/pg_upgrade_internal.log
-/analyze_new_cluster.sh
-/delete_old_cluster.sh
-/analyze_new_cluster.bat
-/delete_old_cluster.bat
-/reindex_hash.sql
-/loadable_libraries.txt
 /log/
 /tmp_check/
diff --git a/src/bin/pg_upgrade/Makefile b/src/bin/pg_upgrade/Makefile
index 0360c37bf9..105199f182 100644
--- a/src/bin/pg_upgrade/Makefile
+++ b/src/bin/pg_upgrade/Makefile
@@ -28,6 +28,10 @@ OBJS = \
 override CPPFLAGS := -DDLSUFFIX=\"$(DLSUFFIX)\" -I$(srcdir) -I$(libpq_srcdir) $(CPPFLAGS)
 LDFLAGS_INTERNAL += -L$(top_builddir)/src/fe_utils -lpgfeutils $(libpq_pgport)
 
+# required for 002_pg_upgrade.pl
+REGRESS_SHLIB=$(abs_top_builddir)/src/test/regress/regress$(DLSUFFIX)
+export REGRESS_SHLIB
+
 all: pg_upgrade
 
 pg_upgrade: $(OBJS) | submake-libpq submake-libpgport submake-libpgfeutils
@@ -44,22 +48,10 @@ uninstall:
 
 clean distclean maintainer-clean:
 	rm -f pg_upgrade$(X) $(OBJS)
-	rm -rf analyze_new_cluster.sh delete_old_cluster.sh log/ tmp_check/ \
-	       loadable_libraries.txt reindex_hash.sql \
-	       pg_upgrade_dump_globals.sql \
-	       pg_upgrade_dump_*.custom pg_upgrade_*.log
-
-# When $(MAKE) is present, make automatically infers that this is a
-# recursive make. which is not actually what we want here, as that
-# e.g. prevents output synchronization from working (as make thinks
-# that the subsidiary make knows how to deal with that itself, but
-# we're invoking a shell script that doesn't know). Referencing
-# $(MAKE) indirectly avoids that behaviour.
-# See https://www.gnu.org/software/make/manual/html_node/MAKE-Variable.html#MAKE-Variable
-NOTSUBMAKEMAKE=$(MAKE)
+	rm -rf log/ tmp_check/
 
-check: test.sh all temp-install
-	MAKE=$(NOTSUBMAKEMAKE) $(with_temp_install) bindir=$(abs_top_builddir)/tmp_install/$(bindir) EXTRA_REGRESS_OPTS="$(EXTRA_REGRESS_OPTS)" $(SHELL) $<
+check:
+	$(prove_check)
 
-# installcheck is not supported because there's no meaningful way to test
-# pg_upgrade against a single already-running server
+installcheck:
+	$(prove_installcheck)
diff --git a/src/bin/pg_upgrade/TESTING b/src/bin/pg_upgrade/TESTING
index e69874b42d..200ce9d92b 100644
--- a/src/bin/pg_upgrade/TESTING
+++ b/src/bin/pg_upgrade/TESTING
@@ -2,25 +2,22 @@ THE SHORT VERSION
-
 On non-Windows machines, you can execute the testing process
-described below by running
+described below by running the following command in this directory:
 	make check
-in this directory. This will run the shell script test.sh, performing
-an upgrade from the version in this source tree to a new instance of
-the same version.
-To test an upgrade from a different version, you must have a built
-source tree for the old version as well as this version, and you
-must have done "make install" for both versions. Then do:
+T
Re: Improve TAP tests of pg_upgrade for cross-version tests
Hello!

On 30.07.2022 10:29, Michael Paquier wrote:
> [
> -    'psql', '-X',
> +    "$newbindir/psql", '-X',

I found that adding $newbindir to psql gives an error when upgrading from versions 14 and below to master, when the test tries to run the upgrade_adapt.sql script:

t/002_pg_upgrade.pl .. 1/?
#   Failed test 'ran adapt script'
#   at t/002_pg_upgrade.pl line 141.

in regress_log_002_pg_upgrade:
# Running: <$newbindir>/psql -X -f <$srcdir>/src/bin/pg_upgrade/upgrade_adapt.sql regression
<$newbindir>/psql: symbol lookup error: <$newbindir>/psql: undefined symbol: PQmblenBounded

Tests upgrading from 15-stable and from master itself work as expected.

A question about a similar error was raised here: https://www.postgresql.org/message-id/flat/BN0PR20MB3912AA107FA6E90FB6B0A034FD9F9%40BN0PR20MB3912.namprd20.prod.outlook.com

With best regards, -- Anton A. Melnikov Postgres Professional: http://www.postgrespro.com The Russian Postgres Company
Re: Possible fails in pg_stat_statements test
On 06.07.2022 20:11, Robert Haas wrote:
> On Thu, Mar 31, 2022 at 12:00 PM Julien Rouhaud wrote:
>>> Indeed. Then there is a very simple solution for this particular case,
>>> as the wal_records counter may only sometimes become greater, but never
>>> less. The corresponding patch is attached.
>> +1 for this approach, and the patch looks good to me.
> Committed.

Thanks a lot!

-- Anton A. Melnikov Postgres Professional: http://www.postgrespro.com The Russian Postgres Company
Re: [PATCH] Fix pg_upgrade test from v10
Hello!

On 01.07.2022 20:07, Justin Pryzby wrote:
> Would you add this to the (next) CF?

Yes, I've put it in the September CF.

> It's silly to say that v9.2 will be supported potentially for a handful
> more years, but that the upgrade-testing script itself doesn't support
> that, so developers each have to reinvent its fixups.

I've tested the attached patch in all variants from v9.5..15 to the supported versions 10..master. The script test.sh works fine for the 9.5->10 and 9.6->10 upgrades without any patch. In 9.4 there is a regress test, largeobject, that would have to be patched to allow the upgrade test from that version, so I've stopped at 9.5.

It is clear that we limit the destination version for the upgrade test to the supported versions only, in our case destination versions starting from the 10th inclusively. But is there a limit on the source version to upgrade-test from?

> See also 20220122183749.go23...@telsasoft.com, where I proposed some of
> the same things.

Thanks a lot; I've added some code for 14+ from https://www.postgresql.org/message-id/flat/20220122183749.GO23027%40telsasoft.com to the attached patch.

With best regards, -- Anton A. Melnikov Postgres Professional: http://www.postgrespro.com The Russian Postgres Company

commit 8cba3ca4a68a0d41ff8ac4cd7c92546f093f8c4d
Author: Anton A. Melnikov
Date:   Fri Jun 3 23:50:14 2022 +0300

    Fix test for pg_upgrade from 10x and earlier versions.

diff --git a/src/bin/pg_upgrade/t/002_pg_upgrade.pl b/src/bin/pg_upgrade/t/002_pg_upgrade.pl
index 2f9b13bf0a..1fd1b6f028 100644
--- a/src/bin/pg_upgrade/t/002_pg_upgrade.pl
+++ b/src/bin/pg_upgrade/t/002_pg_upgrade.pl
@@ -60,7 +60,14 @@ my $oldnode =
 # To increase coverage of non-standard segment size and group access without
 # increasing test runtime, run these tests with a custom setting.
 # --allow-group-access and --wal-segsize have been added in v11.
-$oldnode->init(extra => [ '--wal-segsize', '1', '--allow-group-access' ]);
+my $ver_with_newopts = 11;
+my $oldver = $oldnode->{_pg_version};
+
+$oldnode->init(extra => [ '--wal-segsize', '1', '--allow-group-access' ])
+	if $oldver >= $ver_with_newopts;
+$oldnode->init()
+	if $oldver < $ver_with_newopts;
+
 $oldnode->start;

 # The default location of the source code is the root of this directory.
@@ -147,7 +154,10 @@ if (defined($ENV{oldinstall}))

 # Initialize a new node for the upgrade.
 my $newnode = PostgreSQL::Test::Cluster->new('new_node');
-$newnode->init(extra => [ '--wal-segsize', '1', '--allow-group-access' ]);
+$newnode->init(extra => [ '--wal-segsize', '1', '--allow-group-access' ])
+	if $oldver >= $ver_with_newopts;
+$newnode->init()
+	if $oldver < $ver_with_newopts;

 my $newbindir = $newnode->config_data('--bindir');
 my $oldbindir = $oldnode->config_data('--bindir');
diff --git a/src/bin/pg_upgrade/upgrade_adapt.sql b/src/bin/pg_upgrade/upgrade_adapt.sql
index 27c4c7fd01..d47d2075f5 100644
--- a/src/bin/pg_upgrade/upgrade_adapt.sql
+++ b/src/bin/pg_upgrade/upgrade_adapt.sql
@@ -84,8 +84,8 @@ DO $stmt$
 \if :oldpgversion_le13
 -- Until v10, operators could only be dropped one at a time, so be careful
 -- to stick with one command for each drop here.
-DROP OPERATOR public.#@# (pg_catalog.int8, NONE);
-DROP OPERATOR public.#%# (pg_catalog.int8, NONE);
-DROP OPERATOR public.!=- (pg_catalog.int8, NONE);
-DROP OPERATOR public.#@%# (pg_catalog.int8, NONE);
+DROP OPERATOR IF EXISTS public.#@# (pg_catalog.int8, NONE);
+DROP OPERATOR IF EXISTS public.#%# (pg_catalog.int8, NONE);
+DROP OPERATOR IF EXISTS public.!=- (pg_catalog.int8, NONE);
+DROP OPERATOR IF EXISTS public.#@%# (pg_catalog.int8, NONE);
 \endif

commit bc69d06d6f5bdc31b60452d3af340e6af3faba31
Author: Anton A. Melnikov
Date:   Sat Jun 4 12:49:55 2022 +0300

    Fix test for pg_upgrade from 10x versions.

diff --git a/src/bin/pg_upgrade/test.sh b/src/bin/pg_upgrade/test.sh
index d4c4320a04..cc710633fb 100644
--- a/src/bin/pg_upgrade/test.sh
+++ b/src/bin/pg_upgrade/test.sh
@@ -23,7 +23,14 @@ standard_initdb() {
 	# To increase coverage of non-standard segment size and group access
 	# without increasing test runtime, run these tests with a custom setting.
 	# Also, specify "-A trust" explicitly to suppress initdb's warning.
-	"$1" -N --wal-segsize 1 -g -A trust
+	# --allow-group-access and --wal-segsize have been added in v11.
+	initdbopt="-N -A trust"
+	if [ $OLD_PG_VERSION_NUM -ge 11 ]; then
+		initdbopt="$initdbopt --wal-segsize 1 --allow-group-access"
+	fi
+
+	"$1" $initdbopt
+
 	if [ -n "$TEMP_CONFIG" -a -r "$TEMP_CONFIG" ]
 	then
 		cat "$TEMP_CONFIG" >> "$PGDATA/postgresql.conf"
@@ -145,6 +152,7 @@ PGHOSTADDR="";unset PGHOSTADDR

 # Select a non-conflicting port number, similarly to pg_regress.c
 PG_VERSION_NUM=`grep '#define PG_VERSION_NUM' "$newsrc"/src/include/pg_config.h | awk '{print $3}'`
Re: [PATCH] Fix pg_upgrade test from v10
Hello!

On 02.06.2022 23:57, Andrew Dunstan wrote:
> 1. There is no mention of why there's a change w.r.t. Cygwin and
> permissions checks. Maybe it's ok, but it seems off topic and is
> certainly not referred to in the patch submission.

Thanks for the comments! It was my error to change w.r.t. Cygwin; I've fixed it in the second version of the patch. But the change in the permissions check is correct. If we fix the error with the initdb options, we get the next one while testing an upgrade from v10: "files in PGDATA with permission != 640", and test.sh ends immediately. The thing is that the default permissions changed in v11+ due to this commit: https://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=c37b3d08ca6873f9d4eaf24c72a90a550970cbb8. The changes to the permissions checks in test.sh fix this error.

> On 2022-06-01 We 21:37, Michael Paquier wrote:
>> A node's pg_version is assigned via _set_pg_version() when creating it
>> using PostgreSQL::Test::Cluster::new(). In order to make the
>> difference with the set of initdb options to use when setting up the
>> old node, it would be simpler to rely on that, no? Version.pm is able
>> to handle integer as well as string comparisons for the version
>> strings.
> 2. As Michael says, we should not be using perl's version module, we
> should be using the version object built into each
> PostgreSQL::Test::Cluster instance.

Sure, a very valuable note. I fixed it in the 2nd version of the attached patch. I also found that I had forgotten to adjust the initdb options for the new node in v15, so there was an error due to a wal-segsize mismatch. Fixed that in the 2nd version too, and added patches for the other versions.

> The client buildfarm does not make use of the in-core facility, as it
> has its own module and logic to check after the case of cross-version
> upgrades (see PGBuild/Modules/TestUpgradeXversion.pm)..

Michael, thanks a lot for your 2c.

With best regards, -- Anton A. Melnikov Postgres Professional: http://www.postgrespro.com The Russian Postgres Company

commit 2c8c78faba37f66c2ef88392f58ce8e241772300
Author: Anton A. Melnikov
Date:   Fri Jun 3 23:50:14 2022 +0300

    Fix test for pg_upgrade from 10x versions.

diff --git a/src/bin/pg_upgrade/t/002_pg_upgrade.pl b/src/bin/pg_upgrade/t/002_pg_upgrade.pl
index 80437e93b7..d9d97b1b3d 100644
--- a/src/bin/pg_upgrade/t/002_pg_upgrade.pl
+++ b/src/bin/pg_upgrade/t/002_pg_upgrade.pl
@@ -58,7 +58,14 @@ my $oldnode =
 # To increase coverage of non-standard segment size and group access without
 # increasing test runtime, run these tests with a custom setting.
 # --allow-group-access and --wal-segsize have been added in v11.
-$oldnode->init(extra => [ '--wal-segsize', '1', '--allow-group-access' ]);
+my $ver_with_newopts = 11;
+my $oldver = $oldnode->{_pg_version};
+
+$oldnode->init(extra => [ '--wal-segsize', '1', '--allow-group-access' ])
+	if $oldver >= $ver_with_newopts;
+$oldnode->init()
+	if $oldver < $ver_with_newopts;
+
 $oldnode->start;

 # The default location of the source code is the root of this directory.
@@ -145,7 +152,10 @@ if (defined($ENV{oldinstall}))

 # Initialize a new node for the upgrade.
 my $newnode = PostgreSQL::Test::Cluster->new('new_node');
-$newnode->init(extra => [ '--wal-segsize', '1', '--allow-group-access' ]);
+$newnode->init(extra => [ '--wal-segsize', '1', '--allow-group-access' ])
+	if $oldver >= $ver_with_newopts;
+$newnode->init()
+	if $oldver < $ver_with_newopts;

 my $newbindir = $newnode->config_data('--bindir');
 my $oldbindir = $oldnode->config_data('--bindir');

commit dd62bd663167918365ce92577a19d208961a2f2a
Author: Anton A. Melnikov
Date:   Sat Jun 4 11:58:01 2022 +0300

    Fix test for pg_upgrade from 10x versions.

diff --git a/src/bin/pg_upgrade/test.sh b/src/bin/pg_upgrade/test.sh
index f353e565b5..ac9fc15646 100644
--- a/src/bin/pg_upgrade/test.sh
+++ b/src/bin/pg_upgrade/test.sh
@@ -24,7 +24,13 @@ standard_initdb() {
 	# without increasing test runtime, run these tests with a custom setting.
 	# Also, specify "-A trust" explicitly to suppress initdb's warning.
 	# --allow-group-access and --wal-segsize have been added in v11.
-	"$1" -N --wal-segsize 1 --allow-group-access -A trust
+	initdbopt="-N -A trust"
+	if [ $OLD_PG_VERSION_NUM -ge 11 ]; then
+		initdbopt="$initdbopt --wal-segsize 1 --allow-group-access"
+	fi
+
+	"$1" $initdbopt
+
 	if [ -n "$TEMP_CONFIG" -a -r "$TEMP_CONFIG" ]
 	then
 		cat "$TEMP_CONFIG" >> "$PGDATA/postgresql.conf"
@@ -136,6 +142,7 @@ PGHOSTADDR="";unset PGHOSTADDR

 # Select a non-conflicting port number, similarly to pg_regress.c
 PG_VERSION_NUM=`grep '#define PG_VERSION_NUM' "$newsrc"/src/include/pg_config.h | awk '{print $3}'`
+OLD_PG_VERSION_NUM=`grep '#define PG_VERSION_NUM' "
[PATCH] Fix pg_upgrade test from v10
Hello!

I found that the test for pg_upgrade (test.sh for 11-14 and 002_pg_upgrade.pl for 15+) doesn't work from 10th versions to higher ones, due to incompatible initdb options and default PGDATA permissions. Here are the patches that may solve this problem. I would be glad to hear your comments and concerns.

With best regards, -- Anton A. Melnikov Postgres Professional: http://www.postgrespro.com The Russian Postgres Company

commit aa96ef028482fc026850363c776145687cc60fc4
Author: Anton A. Melnikov
Date:   Thu Jun 2 03:35:26 2022 +0300

    Fix test for pg_upgrade from 10x versions.

diff --git a/src/bin/pg_upgrade/t/002_pg_upgrade.pl b/src/bin/pg_upgrade/t/002_pg_upgrade.pl
index 75ac768a96..e9b4977fee 100644
--- a/src/bin/pg_upgrade/t/002_pg_upgrade.pl
+++ b/src/bin/pg_upgrade/t/002_pg_upgrade.pl
@@ -1,5 +1,6 @@
 # Set of tests for pg_upgrade, including cross-version checks.
 use strict;
+use version;
 use warnings;

 use Cwd qw(abs_path getcwd);
@@ -56,7 +57,13 @@ my $oldnode =
 # To increase coverage of non-standard segment size and group access without
 # increasing test runtime, run these tests with a custom setting.
 # --allow-group-access and --wal-segsize have been added in v11.
-$oldnode->init(extra => [ '--wal-segsize', '1', '--allow-group-access' ]);
+my ($oldverstr) = `$ENV{oldinstall}/bin/pg_ctl --version` =~ /(\d+\.\d+)/;
+my ($oldver) = (version->parse(${oldverstr}));
+$oldnode->init(extra => [ '--wal-segsize', '1', '--allow-group-access' ])
+	if $oldver >= version->parse('11.0');
+$oldnode->init()
+	if $oldver < version->parse('11.0');
+
 $oldnode->start;

 # The default location of the source code is the root of this directory.

commit c62e484e73e0071bc00dffe3b2333fd702108ec6
Author: Anton A. Melnikov
Date:   Thu Jun 2 03:40:09 2022 +0300

    Fix test for pg_upgrade from 10x versions.

diff --git a/src/bin/pg_upgrade/test.sh b/src/bin/pg_upgrade/test.sh
index f353e565b5..2f5ef1bb9f 100644
--- a/src/bin/pg_upgrade/test.sh
+++ b/src/bin/pg_upgrade/test.sh
@@ -24,7 +24,13 @@ standard_initdb() {
 	# without increasing test runtime, run these tests with a custom setting.
 	# Also, specify "-A trust" explicitly to suppress initdb's warning.
 	# --allow-group-access and --wal-segsize have been added in v11.
-	"$1" -N --wal-segsize 1 --allow-group-access -A trust
+	initdbopt="-N -A trust"
+	if [ $OLD_PG_VERSION_NUM -ge 11 ]; then
+		initdbopt="$initdbopt --wal-segsize 1 --allow-group-access"
+	fi
+
+	"$1" $initdbopt
+
 	if [ -n "$TEMP_CONFIG" -a -r "$TEMP_CONFIG" ]
 	then
 		cat "$TEMP_CONFIG" >> "$PGDATA/postgresql.conf"
@@ -136,6 +142,7 @@ PGHOSTADDR="";unset PGHOSTADDR

 # Select a non-conflicting port number, similarly to pg_regress.c
 PG_VERSION_NUM=`grep '#define PG_VERSION_NUM' "$newsrc"/src/include/pg_config.h | awk '{print $3}'`
+OLD_PG_VERSION_NUM=`grep '#define PG_VERSION_NUM' "$oldsrc"/src/include/pg_config.h | awk '{print $3}'`

 PGPORT=`expr $PG_VERSION_NUM % 16384 + 49152`
 export PGPORT
@@ -240,18 +247,26 @@ pg_upgrade $PG_UPGRADE_OPTS -d "${PGDATA}.old" -D "$PGDATA" -b "$oldbindir" -p "

 # make sure all directories and files have group permissions, on Unix hosts
 # Windows hosts don't support Unix-y permissions.
+if [ $OLD_PG_VERSION_NUM -lt 11 ]; then
+	NEW_DIR_PERM=700
+	NEW_FILE_PERM=600
+else
+	NEW_DIR_PERM=750
+	NEW_FILE_PERM=640
+fi
+
 case $testhost in
-	MINGW*|CYGWIN*) ;;
-	*) if [ `find "$PGDATA" -type f ! -perm 640 | wc -l` -ne 0 ]; then
-		echo "files in PGDATA with permission != 640";
+	MINGW*) ;;
+	*) if [ `find "$PGDATA" -type f ! -perm $NEW_FILE_PERM | wc -l` -ne 0 ]; then
+		echo "files in PGDATA with permission != $NEW_FILE_PERM";
 		exit 1;
 	fi ;;
 esac

 case $testhost in
-	MINGW*|CYGWIN*) ;;
-	*) if [ `find "$PGDATA" -type d ! -perm 750 | wc -l` -ne 0 ]; then
-		echo "directories in PGDATA with permission != 750";
+	MINGW*) ;;
+	*) if [ `find "$PGDATA" -type d ! -perm $NEW_DIR_PERM | wc -l` -ne 0 ]; then
+		echo "directories in PGDATA with permission != $NEW_DIR_PERM";
 		exit 1;
 	fi ;;
 esac
Re: make MaxBackends available in _PG_init
On 12.05.2022 00:01, Nathan Bossart wrote:
> On Wed, May 11, 2022 at 04:18:10PM -0400, Robert Haas wrote:
>> OK, I have committed 0001 now.
> Thanks!

Maybe remove the first part from the patchset? Right now the Patch Tester gives an apply error for the first part and can't test the others: http://cfbot.cputube.org/patch_38_3614.log

With best regards, -- Anton A. Melnikov Postgres Professional: http://www.postgrespro.com The Russian Postgres Company
Re: make MaxBackends available in _PG_init
Hello!

On 19.04.2022 18:46, Nathan Bossart wrote:
> Okay, I did it this way in v5.

Recently I ran into a problem: in our extended pg_stat_statements it would be preferable to use MaxBackends in _PG_init(), but it is equal to zero there. So I was very happy to find this patch and this thread. It was not only very interesting and informative for me, but it also solves the current problem.

The patch applies cleanly on current master. Tests under Linux and Windows were successful. I've tried this patch with our extended pg_stat_statements and checked that MaxBackends has the correct value during the extra shmem allocation. Thanks a lot! +1 for this patch.

I have a little doubt about the comment in postmaster.c: "Now that loadable modules have had their chance to alter any GUCs". Perhaps this comment is too general; I am not sure that it is possible to change any arbitrary GUC here. And it might not be bad to add Assert(size > 0) in RequestAddinShmemSpace() to catch cases where the size calculation in a contrib module is wrong.

With best regards, -- Anton A. Melnikov Postgres Professional: http://www.postgrespro.com The Russian Postgres Company
Re: Possible fails in pg_stat_statements test
Hello!

On 30.03.2022 22:36, Robert Haas wrote:
> I don't think that the idea of "extra" WAL records is very principled.
> It's pretty vague what "extra" means, and your definition seems to be
> basically "whatever would be needed to make this test case pass." I
> think the problem is basically with the test case's idea that # of WAL
> records and # of table rows ought to be equal. I think that's just
> false. In general, we'd also have to worry about index insertions,
> which would provoke variable numbers of WAL records depending on
> whether they cause a page split. And we'd have to worry about TOAST
> table insertions, which could produce different numbers of records
> depending on the size of the data, the configured block size and TOAST
> threshold, and whether the TOAST table index incurs a page split.

Thank you very much for this information. I really didn't take it into account.

> If it's true that this test case sometimes randomly fails, then we
> ought to fix that somehow, maybe by just removing this particular check
> from the test case, or changing it to >=, or something like that. But I
> don't think adding a new counter is the right idea.

Indeed. Then there is a very simple solution for this particular case, as the wal_records counter may only sometimes become greater, but never less. The corresponding patch is attached.

With best regards, -- Anton A. Melnikov Postgres Professional: http://www.postgrespro.com The Russian Postgres Company

commit 742d16413ebfe4f556e0f676a3a8785b638d245a
Author: Anton A. Melnikov
Date:   Thu Mar 31 18:00:07 2022 +0300

    Fix possible fails in pg_stat_statements test via test rework.

diff --git a/contrib/pg_stat_statements/expected/pg_stat_statements.out b/contrib/pg_stat_statements/expected/pg_stat_statements.out
index e0abe34bb6..8706409739 100644
--- a/contrib/pg_stat_statements/expected/pg_stat_statements.out
+++ b/contrib/pg_stat_statements/expected/pg_stat_statements.out
@@ -234,18 +234,18 @@ SET pg_stat_statements.track_utility = FALSE;
 SELECT query, calls, rows,
 wal_bytes > 0 as wal_bytes_generated,
 wal_records > 0 as wal_records_generated,
-wal_records = rows as wal_records_as_rows
+wal_records >= rows as wal_records_ge_rows
 FROM pg_stat_statements ORDER BY query COLLATE "C";
- query | calls | rows | wal_bytes_generated | wal_records_generated | wal_records_as_rows
+ query | calls | rows | wal_bytes_generated | wal_records_generated | wal_records_ge_rows
 ---+---+--+-+---+-
 DELETE FROM pgss_test WHERE a > $1| 1 |1 | t | t | t
- DROP TABLE pgss_test | 1 |0 | t | t | f
+ DROP TABLE pgss_test | 1 |0 | t | t | t
 INSERT INTO pgss_test VALUES(generate_series($1, $2), $3) | 1 | 10 | t | t | t
 SELECT pg_stat_statements_reset() | 1 |1 | f | f | f
 SELECT query, calls, rows, +| 0 |0 | f | f | t
 wal_bytes > $1 as wal_bytes_generated, +| | | | |
 wal_records > $2 as wal_records_generated, +| | | | |
- wal_records = rows as wal_records_as_rows+| | | | |
+ wal_records >= rows as wal_records_ge_rows +| | | | |
 FROM pg_stat_statements ORDER BY query COLLATE "C"| | | | |
 SET pg_stat_statements.track_utility = FALSE | 1 |0 | f | f | t
 UPDATE pgss_test SET b = $1 WHERE a > $2 | 1 |3 | t | t | t
diff --git a/contrib/pg_stat_statements/sql/pg_stat_statements.sql b/contrib/pg_stat_statements/sql/pg_stat_statements.sql
index dffd2c8c18..a01f183727 100644
--- a/contrib/pg_stat_statements/sql/pg_stat_statements.sql
+++ b/contrib/pg_stat_statements/sql/pg_stat_statements.sql
@@ -122,7 +122,7 @@ SET pg_stat_statements.track_utility = FALSE;
 SELECT query, calls, rows,
 wal_bytes > 0 as wal_bytes_generated,
 wal_records > 0 as wal_records_generated,
-wal_records = rows as wal_records_as_rows
+wal_records >= rows as wal_records_ge_rows
 FROM pg_stat_statements ORDER BY query COLLATE "C";
 --
Re: Possible fails in pg_stat_statements test
Hello, thank you very much for your attention and your thoughts.

On 20.03.2022 20:36, Andres Freund wrote:
>> This patch introduces an additional counter of wal records not related
>> to the query being executed.
> They're not unrelated though.

Yes, I put that badly. Indeed there is a relation, but it seems to be of some other kind. It would be nice to clarify the terminology. Maybe divide WAL records into two kinds:
1) WAL records whose number depends on the given query itself (say, a strong relation);
2) WAL records whose number depends on the given query and on the previous query history (say, a weak relation).
So the wal_records counter modified in the patch will belong to the first kind, while the number of WAL records due to on-access pruning and new clog page generation belongs to the second.

> -many. For read-only queries the generated WAL due to on-access pruning
> can be a significant factor in performance. Removing that information
> makes pg_stat_statements *less* useful.

A separate counter for the second kind of records, say extra_wal_records, will not only remove this disadvantage but, on the contrary, provide additional information. The next version of the patch with the additional counter is attached. Really, now it is clearly seen that WAL due to on-access pruning can sometimes be a significant factor! After pgbench -c10 -t300:

postgres=# SELECT substring(query for 30), wal_records, extra_wal_records FROM pg_stat_statements WHERE extra_wal_records != 0;
 substring | wal_records | extra_wal_records
 +-+---
 UPDATE pgbench_tellers SET tba | 4557 | 15
 create table pgbench_history(t | 48 | 1
 create table pgbench_branches( | 40 | 1
 UPDATE pgbench_accounts SET ab | 5868 | 1567
 drop table if exists pgbench_a | 94 | 1
 UPDATE pgbench_branches SET bb | 5993 | 14
 SELECT abalance FROM pgbench_a | 0 | 7
(7 rows)

> Can the test failures be encountered without such an elaborate setup?
> If not, then I don't really see why we need to do anything here?

There was a real bug report from our test department. They run long repetitive tests and sometimes met this failure. So I suppose there is a non-zero probability that such an error can occur in a one-shot test as well. The sequence given in the first letter helps to catch this failure quickly.

With best regards, -- Anton A. Melnikov Postgres Professional: http://www.postgrespro.com The Russian Postgres Company

commit 2cc4234754fc815528ed299b64c1a3f1e1991617
Author: Anton A. Melnikov
Date:   Wed Mar 30 08:54:51 2022 +0300

    Fix possible fails in pg_stat_statements test via taking into account
    WAL records due to on-access pruning and new clog page generation.

diff --git a/contrib/pg_stat_statements/Makefile b/contrib/pg_stat_statements/Makefile
index 7fabd96f38..145b2617d7 100644
--- a/contrib/pg_stat_statements/Makefile
+++ b/contrib/pg_stat_statements/Makefile
@@ -6,11 +6,12 @@ OBJS = \
 	pg_stat_statements.o

 EXTENSION = pg_stat_statements
-DATA = pg_stat_statements--1.4.sql pg_stat_statements--1.8--1.9.sql \
-	pg_stat_statements--1.7--1.8.sql pg_stat_statements--1.6--1.7.sql \
-	pg_stat_statements--1.5--1.6.sql pg_stat_statements--1.4--1.5.sql \
-	pg_stat_statements--1.3--1.4.sql pg_stat_statements--1.2--1.3.sql \
-	pg_stat_statements--1.1--1.2.sql pg_stat_statements--1.0--1.1.sql
+DATA = pg_stat_statements--1.4.sql pg_stat_statements--1.9--1.10.sql \
+	pg_stat_statements--1.8--1.9.sql pg_stat_statements--1.7--1.8.sql \
+	pg_stat_statements--1.6--1.7.sql pg_stat_statements--1.5--1.6.sql \
+	pg_stat_statements--1.4--1.5.sql pg_stat_statements--1.3--1.4.sql \
+	pg_stat_statements--1.2--1.3.sql pg_stat_statements--1.1--1.2.sql \
+	pg_stat_statements--1.0--1.1.sql
 PGFILEDESC = "pg_stat_statements - execution statistics of SQL statements"

 LDFLAGS_SL += $(filter -lm, $(LIBS))
diff --git a/contrib/pg_stat_statements/expected/pg_stat_statements.out b/contrib/pg_stat_statements/expected/pg_stat_statements.out
index e0abe34bb6..9a375d796f 100644
--- a/contrib/pg_stat_statements/expected/pg_stat_statements.out
+++ b/contrib/pg_stat_statements/expected/pg_stat_statements.out
@@ -234,21 +234,23 @@ SET pg_stat_statements.track_utility = FALSE;
 SELECT query, calls, rows,
 wal_bytes > 0 as wal_bytes_generated,
 wal_records > 0 as wal_records_generated,
-wal_records = rows as wal_records_as_rows
+wal_records = rows as wal_records_as_rows,
+extra_wal_records IS NOT NULL as extra_wal_records_supported
 FROM pg_stat_statements ORDER BY query COLLATE "C";
- query | calls | rows | wal_bytes_generated | wal_records_gener
Re: Possible fails in pg_stat_statements test
Hello!

Here is the second version of the patch, rebased onto the current master. No logical changes. All other files attached to the previous letter are still valid.

With best regards, -- Anton A. Melnikov Postgres Professional: http://www.postgrespro.com The Russian Postgres Company

commit 04ce779eb25fec3364c216202b7d7dbd3ed79819
Author: Anton A. Melnikov
Date:   Sun Mar 20 19:34:58 2022 +0300

    Fix possible fails in pg_stat_statements test via taking into account
    non-query wal records.

diff --git a/contrib/pg_stat_statements/pg_stat_statements.c b/contrib/pg_stat_statements/pg_stat_statements.c
index 9e525a6ad3..56ed7e0fde 100644
--- a/contrib/pg_stat_statements/pg_stat_statements.c
+++ b/contrib/pg_stat_statements/pg_stat_statements.c
@@ -1370,7 +1370,7 @@ pgss_store(const char *query, uint64 queryId,
 		e->counters.blk_read_time += INSTR_TIME_GET_MILLISEC(bufusage->blk_read_time);
 		e->counters.blk_write_time += INSTR_TIME_GET_MILLISEC(bufusage->blk_write_time);
 		e->counters.usage += USAGE_EXEC(total_time);
-		e->counters.wal_records += walusage->wal_records;
+		e->counters.wal_records += (walusage->wal_records - walusage->non_query_wal_recs);
 		e->counters.wal_fpi += walusage->wal_fpi;
 		e->counters.wal_bytes += walusage->wal_bytes;
diff --git a/src/backend/access/heap/pruneheap.c b/src/backend/access/heap/pruneheap.c
index 4656f1b3db..f09abba04e 100644
--- a/src/backend/access/heap/pruneheap.c
+++ b/src/backend/access/heap/pruneheap.c
@@ -21,6 +21,7 @@
 #include "access/xlog.h"
 #include "access/xloginsert.h"
 #include "catalog/catalog.h"
+#include "executor/instrument.h"
 #include "miscadmin.h"
 #include "pgstat.h"
 #include "storage/bufmgr.h"
@@ -209,6 +210,11 @@ heap_page_prune_opt(Relation relation, Buffer buffer)
 		ndeleted = heap_page_prune(relation, buffer, vistest, limited_xmin,
 								   limited_ts, , NULL);

+		/* Take into account that heap_page_prune() just generated a new
+		 * wal record with zero xl_xid that is not related to current query.
+		 */
+		pgWalUsage.non_query_wal_recs++;
+
 		/*
 		 * Report the number of tuples reclaimed to pgstats. This is
 		 * ndeleted minus the number of newly-LP_DEAD-set items.
diff --git a/src/backend/access/transam/clog.c b/src/backend/access/transam/clog.c
index 3d9088a704..60bc6c7542 100644
--- a/src/backend/access/transam/clog.c
+++ b/src/backend/access/transam/clog.c
@@ -38,6 +38,7 @@
 #include "access/xlog.h"
 #include "access/xloginsert.h"
 #include "access/xlogutils.h"
+#include "executor/instrument.h"
 #include "miscadmin.h"
 #include "pg_trace.h"
 #include "pgstat.h"
@@ -956,6 +957,12 @@ WriteZeroPageXlogRec(int pageno)
 	XLogBeginInsert();
 	XLogRegisterData((char *) (), sizeof(int));
 	(void) XLogInsert(RM_CLOG_ID, CLOG_ZEROPAGE);
+
+	/*
+	 * Consider that a new unrelated to current query wal record
+	 * with zero xl_xid has just been created.
+	 */
+	pgWalUsage.non_query_wal_recs++;
 }

 /*
diff --git a/src/backend/executor/instrument.c b/src/backend/executor/instrument.c
index c5ff02a842..214fb3cc45 100644
--- a/src/backend/executor/instrument.c
+++ b/src/backend/executor/instrument.c
@@ -268,6 +268,7 @@ WalUsageAdd(WalUsage *dst, WalUsage *add)
 	dst->wal_bytes += add->wal_bytes;
 	dst->wal_records += add->wal_records;
 	dst->wal_fpi += add->wal_fpi;
+	dst->non_query_wal_recs += add->non_query_wal_recs;
 }

 void
@@ -276,4 +277,5 @@ WalUsageAccumDiff(WalUsage *dst, const WalUsage *add, const WalUsage *sub)
 	dst->wal_bytes += add->wal_bytes - sub->wal_bytes;
 	dst->wal_records += add->wal_records - sub->wal_records;
 	dst->wal_fpi += add->wal_fpi - sub->wal_fpi;
+	dst->non_query_wal_recs += add->non_query_wal_recs - sub->non_query_wal_recs;
 }
diff --git a/src/include/executor/instrument.h b/src/include/executor/instrument.h
index 1b7157bdd1..0d83f37a3c 100644
--- a/src/include/executor/instrument.h
+++ b/src/include/executor/instrument.h
@@ -49,6 +49,11 @@ typedef struct WalUsage
 	int64 wal_records; /* # of WAL records produced */
 	int64 wal_fpi; /* # of WAL full page images produced */
 	uint64 wal_bytes; /* size of WAL records produced */
+	/*
+	 * Number of WAL records unrelated to current query. In particular due to
+	 * heap_page_prune_opt() or WriteZeroPageXlogRec().
+	 */
+	int64 non_query_wal_recs;
 } WalUsage;

 /* Flag bits included in InstrAlloc's instrument_options bitmask */
Re: [PATCH] Tracking statements entry timestamp in pg_stat_statements
Hello!

On 26.01.2022 16:43, Andrei Zubkov wrote:
>> If you're worried about some external table having a NOT NULL
>> constraint for those fields, how about returning NaN instead? That's a
>> valid value for a double precision data type.
>
> I don't know for sure what we can expect to be wrong here. My opinion
> is to use NULLs, as they seem more suitable here. But this can be done
> only if we are not expecting significant side effects.

Let me suggest for your consideration an additional reset request flag that can be used to synchronize the reset in a way similar to interrupt handling. An external reset would set this flag immediately. Then pg_stat_statements would wait for the moment when the required query hits the statistics, and only at that moment really reset the aux statistics, write a new timestamp and clear the flag. At the time of the real reset, total_time is known, so pg_stat_statements can immediately initialize min and max correctly. From the reset until the next execution of the query, the aux view will give the old, still correct values, so neither NaNs nor NULLs will be required. Also, we can expose the value of the reset request flag in the aux view to give feedback to the external application that a reset was requested.

With best regards, -- Anton A. Melnikov Postgres Professional: http://www.postgrespro.com The Russian Postgres Company
Re: [PATCH] Tracking statements entry timestamp in pg_stat_statements
Hi, Andrey!

I've checked the 5th version of the patch and have some remarks.

> I've created a new view named pg_stat_statements_aux. But for now both
> views are using the same function pg_stat_statements which returns all
> fields. It seems reasonable to me - if sampling solution will need all
> values it can query the function.

Agreed, it might be useful in some cases.

> But it seems "stats_reset" term is not quite correct in this case. The
> "stats_since" field holds the timestamp of hashtable entry, but not the
> reset time. The same applies to aux_stats_since - for new statement it
> holds its entry time, but in case of reset it will hold the reset time.

Thanks for the clarification. Indeed, if we take the word 'reset' to mean the removal of all the hashtable entries during pg_stat_statements_reset(), then we shouldn't use it for the first-occurrence timestamp in the struct pgssEntry. So with the stats_since field everything is clear. On the other hand, the aux_stats_since field can be updated for two reasons:
1) the same as for stats_since, due to the first occurrence of the entry in the hashtable; in that case it will be equal to the stats_since timestamp;
2) due to an external reset from an upper-level sampler.
I think it's not very important how to name this field, but it would be better to mention both these reasons in the comment.

As for more important things: if the aux_min_time initial value is zero, like now, then the if condition on line 1385 of pg_stat_statements.c will never be true and aux_min_time will remain zero. Initializing aux_min_time with INT_MAX can solve this problem.

It is possible to reduce the size of the entry_reset_aux() function by removing the if condition on line 2606 and the entire third branch at line 2626; thanks to the check on line 2612 this will work in all cases. Also it would be nice to move lines 2582-2588, which repeat several times, into a separate function. I think this can make entry_reset_aux() shorter and clearer.
In general, the 5th patch applies with no problems, make check-world and CI give no errors, and the patch seems close to becoming RFC.

With best regards,

--
Anton A. Melnikov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
Possible fails in pg_stat_statements test
Hello,

There are some places in the pg_stat_statements regression test where the boolean result of a comparison between the number of rows obtained and the number of wal_records generated by a query should be displayed. Currently pg_stat_statements counts the number of wal_records for a query by taking the difference of the global pgWalUsage.wal_records counter. During query execution, extra wal records may appear that are not related to the query. There are two reasons why this might happen:
1) A hit into pruning of some page in the optional pruning function (heap_page_prune_opt()).
2) When a new page is required for a new xid in clog and WriteZeroPageXlogRec() was called.
In both cases an extra wal record with zero xl_xid is generated, so the wal_records counter gives an incremented value for this query and the pg_stat_statements test will fail.

This patch introduces an additional counter of wal records not related to the query being executed. With this counter pg_stat_statements finds out the number of wal records that are not relevant to the query and does not include them in the per-query statistics. This removes the possibility of the error described above.

There is a way to reproduce this error when the patch is not applied:
1) start the server with "shared_preload_libraries = 'pg_stat_statements'" in postgresql.conf;
2) replace the makefile in contrib/pg_stat_statements with the attached one;
3) replace the test file contrib/pg_stat_statements/sql/pg_stat_statements.sql and the expected results contrib/pg_stat_statements/expected/pg_stat_statements.out with the shorter versions from the attached files;
4) copy test.sh to contrib/pg_stat_statements and make sure that PGHOME points to your server;
5) cd to contrib/pg_stat_statements and execute:
export ITER=1 && while ./start.sh || break; export ITER=$(($ITER+1)); do :; done
Usually 100-200 iterations will be enough.
To catch the error faster, one can add the wal_records column to the SELECT in line 26 of contrib/pg_stat_statements/sql/pg_stat_statements.sql as follows:

SELECT query, calls, rows, wal_records,

and replace contrib/pg_stat_statements/expected/pg_stat_statements.out with the attached pg_stat_statements-fast.out

With best regards,

--
Anton A. Melnikov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

commit 3f4659a8d8a390bb24fbc6f82a6add7949fbebe2
Author: Anton A. Melnikov
Date:   Fri Jan 14 10:54:35 2022 +0300

    Fix possible fails in pg_stat_statements test via taking into account
    non-query wal records.

diff --git a/contrib/pg_stat_statements/pg_stat_statements.c b/contrib/pg_stat_statements/pg_stat_statements.c
index 082bfa8f77..bd437aefc3 100644
--- a/contrib/pg_stat_statements/pg_stat_statements.c
+++ b/contrib/pg_stat_statements/pg_stat_statements.c
@@ -1370,7 +1370,7 @@ pgss_store(const char *query, uint64 queryId,
 		e->counters.blk_read_time += INSTR_TIME_GET_MILLISEC(bufusage->blk_read_time);
 		e->counters.blk_write_time += INSTR_TIME_GET_MILLISEC(bufusage->blk_write_time);
 		e->counters.usage += USAGE_EXEC(total_time);
-		e->counters.wal_records += walusage->wal_records;
+		e->counters.wal_records += (walusage->wal_records - walusage->non_query_wal_recs);
 		e->counters.wal_fpi += walusage->wal_fpi;
 		e->counters.wal_bytes += walusage->wal_bytes;

diff --git a/src/backend/access/heap/pruneheap.c b/src/backend/access/heap/pruneheap.c
index 3201fcc52b..41f17ab97c 100644
--- a/src/backend/access/heap/pruneheap.c
+++ b/src/backend/access/heap/pruneheap.c
@@ -20,6 +20,7 @@
 #include "access/transam.h"
 #include "access/xlog.h"
 #include "catalog/catalog.h"
+#include "executor/instrument.h"
 #include "miscadmin.h"
 #include "pgstat.h"
 #include "storage/bufmgr.h"
@@ -208,6 +209,11 @@ heap_page_prune_opt(Relation relation, Buffer buffer)
 		ndeleted = heap_page_prune(relation, buffer, vistest, limited_xmin,
 								   limited_ts, , NULL);

+		/*
+		 * Take into account that heap_page_prune() just generated a new
+		 * wal record with zero xl_xid that is not related to current query.
+		 */
+		pgWalUsage.non_query_wal_recs++;
+
 		/*
 		 * Report the number of tuples reclaimed to pgstats. This is
 		 * ndeleted minus the number of newly-LP_DEAD-set items.

diff --git a/src/backend/access/transam/clog.c b/src/backend/access/transam/clog.c
index de787c3d37..e944fc3b1a 100644
--- a/src/backend/access/transam/clog.c
+++ b/src/backend/access/transam/clog.c
@@ -38,6 +38,7 @@
 #include "access/xlog.h"
 #include "access/xloginsert.h"
 #include "access/xlogutils.h"
+#include "executor/instrument.h"
 #include "miscadmin.h"
 #include "pg_trace.h"
 #include "pgstat.h"
@@ -955,6 +956,12 @@ WriteZeroPageXlogRec(int pageno)
 	XLogBeginInsert();
 	XLogRegisterData((char *) (), sizeof(int));
 	(void) XLogInsert(RM_CLOG_I
Re: [PATCH] Tracking statements entry timestamp in pg_stat_statements
On 03.12.2021 19:55, Andrei Zubkov wrote:
> On Fri, 2021-12-03 at 17:03 +0300, Andrei Zubkov wrote:
> ...
> What if we'll create a new view for such resettable fields? It will make
> description of views and reset functions in the docs much more clear.

Hi, Andrey!

I completely agree that creating a separate view for these new fields is the most correct thing to do.

As for variable names, the term 'global' is already used for global statistics, in particular in the struct pgssGlobalStats. The considered timestamps refer to the per-statement level, as pointed out in the struct pgssEntry's comment. Therefore, I think it's better to rename gstats_since to just stats_reset in the same way. Also it might be better to use the term 'auxiliary' and take the same approach as for existing similar variables. So internally it might look something like this:

double aux_min_time[PGSS_NUMKIND];
double aux_max_time[PGSS_NUMKIND];
TimestampTz aux_stats_reset;

And at the view level:

aux_min_plan_time float8,
aux_max_plan_time float8,
aux_min_exec_time float8,
aux_max_exec_time float8,
aux_stats_reset timestamp with time zone

Function names might be pg_stat_statements_aux() and pg_stat_statements_aux_reset().

The top-level application may find out the period over which the aux extrema were collected by determining which reset was closer, as follows:

data_collect_period = aux_stats_reset > stats_reset ?
    now - aux_stats_reset : now - stats_reset;

and decide not to trust this data if the period was too small.

For correct work, aux_stats_reset must be updated and the aux extrema values must be reset simultaneously with the update of stats_reset, so some synchronization is needed to avoid a race with a possible external reset.

I've tested the patch v4 and didn't find any evident problems. Contrib installcheck said:

test pg_stat_statements ... ok 163 ms
test oldextversions ... ok 46 ms

With best regards,

--
Anton A. Melnikov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
Re: [PATCH] Tracking statements entry timestamp in pg_stat_statements
On 07.10.2021 15:31, Andrei Zubkov wrote:
> There is an issue with this patch. It's main purpose is the ability to
> calculate values of pg_stat_statements view
> [...]
> Does addition of resettable min/max metrics to the
> pg_stat_statemets view seems reasonable here?

Hello, Andrey!

I think it depends on what the slow top-level sampler wants. Let's define the current values in pg_stat_statements for some query as gmin and gmax. There seem to be two main variants:

1) If the top-level sampler wants to know, for some query, the min and max values over the entire watch time, then the existing gmin and gmax in pg_stat_statements are sufficient. The top-level sampler can clarify top_min and top_max at every slow sample as follows:

top_max = gmax > top_max ? gmax : top_max;
top_min = gmin < top_min ? gmin : top_min;

This should work regardless of whether there was a reset between samples or not.

2) The second case happens when the top-level sampler wants to know the min and max values for the sampling period. In that case I think one shouldn't use gmin and gmax, and especially shouldn't reset them asynchronously from outside, because their lifetime and the sampling period are independent, and moreover someone else might need gmin and gmax as is. So I suppose that additional variables loc_min and loc_max are the right way to do it. If so, at every top sample one needs to replace, not clarify, the new top values as follows:

top_max = loc_max;
loc_max = 0;
top_min = loc_min;
loc_min = INT_MAX;

And one more thing: if there was a reset of stats between two samples, then I think it is best to ignore the new values, since they were obtained for an incomplete period. This patch, thanks to the saved timestamp, makes it possible to detect a reset between samples, and there should be no problem implementing such an algorithm.

The patch now applies normally; all my tests were successful.
The only thing I would suggest is a small cosmetic edit: replace the numeric value in the #define on line 1429 of pg_stat_statements.c with one of the constants defined above.

Best regards,

Anton Melnikov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company