Re: Estimating HugePages Requirements?
On Sat, Aug 28, 2021 at 11:00:11AM +0900, Michael Paquier wrote:
> On Fri, Aug 27, 2021 at 08:16:40PM +, Bossart, Nathan wrote:
> > On 8/27/21, 12:39 PM, "Andres Freund" wrote:
> >> One thing I wonder is if this wouldn't better be dealt with in a more
> >> generic way. While this is the most problematic runtime-computed GUC,
> >> it's not the only one. What if we introduced a new shared_memory_size
> >> GUC, and made --describe-config output it? Perhaps adding
> >> --describe-config=guc-name?
> >>
> >> I also wonder if we should output the number of hugepages needed instead
> >> of the "raw" bytes of shared memory. The whole business about figuring
> >> out the huge page size, dividing the shared memory size by that and then
> >> rounding up could be removed in that case. Due to huge_page_size it's
> >> not even immediately obvious which huge page size one should use...
> >
> > I like both of these ideas.
>
> That pretty much looks like -C in concept, doesn't it? Except that you
> cannot get the actual total shared memory value, because we'd do this
> operation before loading shared_preload_libraries and so miss any amount
> asked for by extensions. There is a similar problem when attempting to do
> postgres -C data_checksums, for example, which would output an incorrect
> value even if the cluster has data checksums enabled.

Since we don't want to try to allocate the huge pages, and we also don't
want to compute based on shared_buffers alone, did anyone consider whether
pg_controldata is the right place to put this?  It already includes a lot
of related settings:

max_connections setting:          100
max_worker_processes setting:     8    (added in 2013: 6bc8ef0b7f1f1df3998745a66e1790e27424aa0c)
max_wal_senders setting:          10
max_prepared_xacts setting:       2
max_locks_per_xact setting:       64

I'm not sure if there's any reason these aren't also shown (?):

autovacuum_max_workers - added in 2007: e2a186b03
max_predicate_locks_per_xact - added in 2011: dafaa3efb
max_logical_replication_workers
max_replication_slots

-- 
Justin
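For reference, the computation Andres describes -- divide the shared memory
size by the huge page size and round up -- amounts to the following sketch
(not from any posted patch; both inputs are assumed to come from elsewhere,
e.g. the huge_page_size GUC or the Hugepagesize line in /proc/meminfo):

    /* Sketch only: number of huge pages needed to back shmem_bytes of
     * shared memory, given a huge page size in bytes.  Rounds up so a
     * partially used page is still counted. */
    static long
    huge_pages_needed(size_t shmem_bytes, size_t huge_page_size)
    {
        return (long) ((shmem_bytes + huge_page_size - 1) / huge_page_size);
    }

For example, 180 MB of shared memory with 2 MB huge pages yields 90 pages;
181 MB yields 91.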
Re: Remove Value node struct
On Wed, Aug 25, 2021 at 9:49 PM Robert Haas wrote:
>
> On Wed, Aug 25, 2021 at 9:20 AM Peter Eisentraut wrote:
> > This change removes the Value struct and node type and replaces them
> > with separate Integer, Float, String, and BitString node types that are
> > proper node types and structs of their own and behave mostly like
> > normal node types.
>
> +1. I noticed this years ago and never thought of doing anything about
> it. I'm glad you did think of it...

+1, it also bothered me in the past.
Re: Estimating HugePages Requirements?
On Fri, Aug 27, 2021 at 08:16:40PM +, Bossart, Nathan wrote:
> On 8/27/21, 12:39 PM, "Andres Freund" wrote:
>> One thing I wonder is if this wouldn't better be dealt with in a more
>> generic way. While this is the most problematic runtime-computed GUC,
>> it's not the only one. What if we introduced a new shared_memory_size
>> GUC, and made --describe-config output it? Perhaps adding
>> --describe-config=guc-name?
>>
>> I also wonder if we should output the number of hugepages needed instead
>> of the "raw" bytes of shared memory. The whole business about figuring
>> out the huge page size, dividing the shared memory size by that and then
>> rounding up could be removed in that case. Due to huge_page_size it's
>> not even immediately obvious which huge page size one should use...
>
> I like both of these ideas.

That pretty much looks like -C in concept, doesn't it? Except that you
cannot get the actual total shared memory value, because we'd do this
operation before loading shared_preload_libraries and so miss any amount
asked for by extensions. There is a similar problem when attempting to do
postgres -C data_checksums, for example, which would output an incorrect
value even if the cluster has data checksums enabled.

-- 
Michael
Re: [PATCH] Disable bgworkers during servers start in pg_upgrade
On Sat, Aug 28, 2021 at 9:40 AM Michael Paquier wrote:
>
> On Fri, Aug 27, 2021 at 12:28:42PM -0700, Andres Freund wrote:
> > Isn't that just going to end up with extension code erroring out and/or
> > blocking waiting for a bgworker to start?
>
> Well, that's the point: to block things during an upgrade. Do you have
> a list of requirements you'd like to see satisfied here? POWA would
> be fine with blocking the execution of bgworkers AFAIK (Julien, feel
> free to correct me here if necessary).

Yes, no problem at all, whether the bgworker isn't registered or is never
launched. The bgworker hasn't even been mandatory for a few years now, as
we introduced an external daemon to collect metrics on a distant database.
Re: [PATCH] Disable bgworkers during servers start in pg_upgrade
On Sat, Aug 28, 2021 at 3:28 AM Andres Freund wrote:
>
> > On Fri, Aug 27, 2021 at 7:31 AM Michael Paquier wrote:
> > >
> > > Wouldn't it be better to block things at the source, as of
> > > RegisterBackgroundWorker()? And that would keep track of the control
> > > we have on bgworkers in a single place. I also think that we'd better
> > > document something about that either in bgworker.sgml or pg_upgrade's
> > > page.
>
> Isn't that just going to end up with extension code erroring out and/or
> blocking waiting for a bgworker to start?

But there's no API to wait for the start of a non-dynamic bgworker, or to
check for it, right? So I don't see how the extension code could wait or
error out. As far as I know, the only thing you can do is call
RegisterBackgroundWorker() in your _PG_init() code and hope that the
server is correctly configured.

The only thing that third-party code could do, I think, is try to check
whether the bgworker was successfully registered or not, as I mentioned
in [1]. Maybe extra-paranoid code could add such a check in every executor
hook, but the overhead would be so terrible that no one would use such an
extension anyway.

[1] https://www.postgresql.org/message-id/CAOBaU_ZtR88x3Si6XwprqGo8UZNLncAQrD_-wc67sC=acO3g=w...@mail.gmail.com
Re: [PATCH] Disable bgworkers during servers start in pg_upgrade
On Fri, Aug 27, 2021 at 12:28:42PM -0700, Andres Freund wrote:
> Isn't that just going to end up with extension code erroring out and/or
> blocking waiting for a bgworker to start?

Well, that's the point: to block things during an upgrade. Do you have
a list of requirements you'd like to see satisfied here? POWA would
be fine with blocking the execution of bgworkers AFAIK (Julien, feel
free to correct me here if necessary).

It could be that preventing extension code from executing this way would
prevent hazards as well. The idea from upthread to prevent any writes
and/or WAL activity is not really different, as extensions may still
generate an error because of the pg_upgrade safety measures we'd put in,
no?

-- 
Michael
Re: Async-unsafe functions in signal handlers
> On 28 Aug 2021, at 07:05, Andres Freund wrote:
>
> However, we have a
> bandaid that deals with possible hangs, by SIGKILLing when processes don't
> shut down (at that point things have already gone quite south, so that's
> not an issue).

Thanks for the explanation. I can see that the child-process SIGKILL
machinery was introduced by commit 82233ce7ea42d6ba519aaec63008aff49da6c7af
to fix a malloc() deadlock in the quickdie() signal handler. As a result,
all child processes that take too long to die are killed in ServerLoop()
with SIGKILL. But bgworker_die() is a problem, as we initialize bgworkers
right before ServerLoop(). So we can face a malloc() deadlock on
postmaster startup (before ServerLoop() has started). Maybe we should
simply use write() and exit() instead of ereport() for bgworker_die()?

Best regards,
Denis Smirnov | Developer
s...@arenadata.io
Arenadata | Godovikova 9-17, Moscow 129085 Russia
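A minimal sketch of that idea, using only async-signal-safe calls; the
function name, message text, and exit code here are illustrative, not from
any posted patch:

    #include <unistd.h>

    /* Hypothetical async-signal-safe replacement for bgworker_die():
     * write() and _exit() are both async-signal-safe, unlike ereport(),
     * which may end up in malloc(). */
    static void
    bgworker_die_safe(int signo)
    {
        static const char msg[] =
            "terminating background worker due to administrator command\n";

        (void) write(STDERR_FILENO, msg, sizeof(msg) - 1);
        _exit(1);
    }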
Re: automatic analyze: readahead - add "IO read time" log message
Greetings,

* Stephen Frost (sfr...@snowman.net) wrote:
> * Egor Rogov (e.ro...@postgrespro.ru) wrote:
> > On 11.02.2021 01:10, Stephen Frost wrote:
> > > * Heikki Linnakangas (hlinn...@iki.fi) wrote:
> > > > On 05/02/2021 23:22, Stephen Frost wrote:
> > > > > Unless there's anything else on this, I'll commit these sometime
> > > > > next week.
> > > > One more thing: Instead of using 'effective_io_concurrency' GUC
> > > > directly, should call get_tablespace_maintenance_io_concurrency().
> > > Ah, yeah, of course.
> > >
> > > Updated patch attached.
> >
> > I happened to notice that get_tablespace_io_concurrency() is called
> > instead of get_tablespace_maintenance_io_concurrency(). It doesn't look
> > right, no?
>
> Hah, indeed you're right. Will fix.

Found this under a bit of a pile in my todo list. :)  Fix pushed.

Thanks!

Stephen
Re: Delegating superuser tasks to new security roles (Was: Granting control of SUSET gucs to non-superusers)
> On Aug 23, 2021, at 1:46 PM, Stephen Frost wrote:
>
> I'd much rather we go down the path that Robert had suggested where we
> find a way to make a connection between the tenant role and everything
> that they create, and leave everything that is outside of that box on
> the other side of the 'wall'.

I am coming around to this way of thinking.

The main difficulty here stems (as you know) from how CREATEROLE was
implemented. You and Tom had conversations about that back in 2005 [1],
and Tom even suggested that perhaps roles should have owners:

> * Tom Lane (tgl(at)sss(dot)pgh(dot)pa(dot)us) wrote:
> Possibly for 8.2 we could invent a notion of roles having owners.
> Offhand I don't see any harm in letting non-CREATEROLE users create
> non-login roles, and manipulate the membership of roles they have
> created (or that have been assigned to them by a superuser). On the
> other hand, it could be that the WITH ADMIN OPTION feature is already
> sufficient for this. This really needs some thought ...

Making roles owners of roles they create, and giving them the power to
manipulate objects which belong to roles they own (recursively), seems to
solve most of the problems we have been discussing.

The remaining problem is that roles without createrole or superuser cannot
create other roles. We don't want tenants to need either of those things,
at least not as they are currently defined. We could either modify the
createrole privilege to be far less powerful, or create a new privilege.

If role owners can alter and drop roles they own (and ones those roles
own, etc.), then we could redefine CREATEROLE to really just mean the
ability to create new roles. The ability to alter or drop roles would not
stem from having CREATEROLE, but rather from owning the role. For setups
where one admin role has CREATEROLE and creates all other roles (except
the superuser which created the admin), nothing changes. In setups with
multiple admins, where none own the others, each admin would have its own
fiefdom, managing everything downstream from itself, but having no special
privilege over the other fiefdoms. I think that setup wasn't implemented
for 8.1 more for lack of time than because it was unwanted.

Alternately, we could just create a new privilege parallel to CREATEROLE,
but that seems more confusing than helpful.

Thoughts?

[1] https://www.postgresql.org/message-id/17554.1120258001%40sss.pgh.pa.us

—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Re: pgsql: Deduplicate choice of horizon for a relation procarray.c.
Greetings,

* Andres Freund (and...@anarazel.de) wrote:
> As the code in question was only introduced in dc7420c2c92 it seems worth
> backpatching this change as well, otherwise 14 will look different from
> all other branches.

Interestingly, these patches ended up actually introducing a difference
between 14 and master in the form of:

 GlobalVisTestFor(Relation rel)
-	GlobalVisState *state;
+	GlobalVisState *state = NULL;

being done on master but not in the 14 stable branch, leading to, at least
for me:

.../src/backend/storage/ipc/procarray.c: In function ‘GlobalVisTestFor’:
.../src/backend/storage/ipc/procarray.c:4054:9: warning: ‘state’ may be used uninitialized in this function [-Wmaybe-uninitialized]
 4054 |         return state;
      |         ^

Seems like we should include that change in 14 too, to get rid of the
above warning and to make that bit of code the same too..?

Thanks!

Stephen
Re: New predefined roles- 'pg_read/write_all_data'
Greetings,

* Michael Banck (michael.ba...@credativ.de) wrote:
> On Thu, Apr 01, 2021 at 04:00:06PM -0400, Stephen Frost wrote:
> > diff --git a/doc/src/sgml/user-manag.sgml b/doc/src/sgml/user-manag.sgml
> > index d171b13236..fe0bdb7599 100644
> > --- a/doc/src/sgml/user-manag.sgml
> > +++ b/doc/src/sgml/user-manag.sgml
> > @@ -518,6 +518,24 @@ DROP ROLE doomed_role;
> >
> > +
> > + pg_read_all_data
> > + Read all data (tables, views, sequences), as if having SELECT
> > + rights on those objects, and USAGE rights on all schemas, even
> > + without having it explicitly. This role does not have the role
> > + attribute BYPASSRLS set. If RLS is being used, an administrator
> > + may wish to set BYPASSRLS on roles which this role is
> > + GRANTed to.
> > +
> > +
> > + pg_write_all_data
> > + Write all data (tables, views, sequences), as if having INSERT,
> > + UPDATE, and DELETE rights on those objects, and USAGE rights on all
> > + schemas, even without having it explicitly. This role does not have
> > + the role attribute BYPASSRLS set. If RLS is being used,
> > + an administrator may wish to set BYPASSRLS on roles
> > + which this role is GRANTed to.
> > +
>
> Shouldn't those "SELECT", "INSERT" etc. be wrapped in tags?

Yeah, good point, fixed.

Thanks!

Stephen
Re: Async-unsafe functions in signal handlers
Hi,

On 2021-08-27 23:51:27 +0300, Denis Smirnov wrote:
> > On 26 Aug 2021, at 23:39, Tom Lane wrote:
> >
> > (BTW, I think it's pretty silly to imagine that adding backtrace()
> > calls inside ereport is making things any more dangerous. ereport
> > has pretty much always carried a likelihood of calling malloc(),
> > for example.)
>
> I have taken a look through the signal handlers and found out that many
> of them use malloc() via ereport() and elog(). Here is the list:
>
> SIGUSR1
> - procsignal_sigusr1_handler(): autoprewarm, autovacuum, bgworker,
>   bgwriter, checkpointer, pgarch, startup, walwriter, walreceiver,
>   walsender

There shouldn't be meaningful uses of elog/ereport() inside
procsignal_sigusr1_handler(). The exception I found was an elog(FATAL)
for unreachable code.

> - sigusr1_handler(): postmaster
> SIGHUP:
> - SIGHUP_handler(): postmaster
> SIGCHLD:
> - reaper(): postmaster

I think these run in a very controlled set of circumstances, because most
of postmaster runs with signals masked.

> SIGFPE:
> - FloatExceptionHandler(): autovacuum, bgworker, postgres, plperl

Yep, although as discussed this might not be a "real" problem, because it
should only run during an instruction triggering an FPE.

> SIGQUIT:
> - quickdie(): postgres

Yes, this is an issue. I've previously argued for handling this via
write() and _exit(), instead of the full ereport() machinery. However, we
have a bandaid that deals with possible hangs, by SIGKILLing when
processes don't shut down (at that point things have already gone quite
south, so that's not an issue).

> SIGTERM:
> - bgworker_die(): bgworker

Bad.

> SIGALRM:
> - handle_sig_alarm(): autovacuum, bgworker, postmaster, startup,
>   walsender, postgres

I don't think there are reachable elogs in there. I'm not concerned about
e.g.

	elog(FATAL, "timeout index %d out of range 0..%d", index,
		 num_active_timeouts - 1);

because that's not something that should ever be reachable in a production
scenario. If it is, there are bigger problems.

Perhaps we ought to have a version of Assert() that's enabled in
production builds as well, and that outputs the error messages via
write() and then _exit()s?

Greetings,

Andres Freund
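A rough sketch of what such an always-on assertion could look like, under
the constraint that it must stay async-signal-safe; the macro name and exit
code are illustrative:

    #include <unistd.h>

    /* Illustrative only: report via write() and terminate via _exit(),
     * both async-signal-safe; no elog(), no malloc(). */
    #define ProductionAssert(condition) \
        do { \
            if (!(condition)) \
            { \
                static const char m_[] = \
                    "assertion failed: " #condition "\n"; \
                (void) write(STDERR_FILENO, m_, sizeof(m_) - 1); \
                _exit(1); \
            } \
        } while (0)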
Re: Async-unsafe functions in signal handlers
> On 26 Aug 2021, at 23:39, Tom Lane wrote:
>
> (BTW, I think it's pretty silly to imagine that adding backtrace()
> calls inside ereport is making things any more dangerous. ereport
> has pretty much always carried a likelihood of calling malloc(),
> for example.)

I have taken a look through the signal handlers and found out that many of
them use malloc() via ereport() and elog(). Here is the list:

SIGUSR1:
- procsignal_sigusr1_handler(): autoprewarm, autovacuum, bgworker,
  bgwriter, checkpointer, pgarch, startup, walwriter, walreceiver,
  walsender
- sigusr1_handler(): postmaster
SIGFPE:
- FloatExceptionHandler(): autovacuum, bgworker, postgres, plperl
SIGHUP:
- SIGHUP_handler(): postmaster
SIGCHLD:
- reaper(): postmaster
SIGQUIT:
- quickdie(): postgres
SIGTERM:
- bgworker_die(): bgworker
SIGALRM:
- handle_sig_alarm(): autovacuum, bgworker, postmaster, startup,
  walsender, postgres

I suspect there are lots of potential ways to deadlock on malloc() inside
any of these handlers. An interesting question is why there is still no
evidence of such deadlocks.

Best regards,
Denis Smirnov | Developer
s...@arenadata.io
Arenadata | Godovikova 9-17, Moscow 129085 Russia
Re: log_autovacuum in Postgres 14 -- ordering issue
On Fri, Aug 27, 2021 at 12:55 PM Stephen Frost wrote:
> > Okay. Plan is now to push these two patches together, later on. The
> > second patch concerns this separate track_io_timing issue. It's pretty
> > straightforward.
> >
> > (No change to the first patch in the series, relative to the v2 from
> > earlier.)
>
> Looks alright to me.

Pushed both patches -- thanks.

-- 
Peter Geoghegan
Re: Queries that should be canceled will get stuck on secure_write function
Hi,

On Fri, Aug 27, 2021, at 13:07, Robert Haas wrote:
> On Fri, Aug 27, 2021 at 3:24 PM Andres Freund wrote:
> > I wonder if we could improve the situation somewhat by using the
> > non-blocking pqcomm functions in a few select places. E.g. if elog.c's
> > send_message_to_frontend() sent its message via a new
> > pq_endmessage_noblock() (which'd use the existing
> > pq_putmessage_noblock()) and used pq_flush_if_writable() instead of
> > pq_flush(), we'd a) not block sending to the client before
> > AbortCurrentTransaction(), b) be able to queue further error messages
> > safely.
>
> pq_flush_if_writable() could succeed in sending only part of the data,
> so I don't see how this works.

All the data is buffered though, so I don't see what problem that causes?

Andres
Re: Queries that should be canceled will get stuck on secure_write function
On Fri, Aug 27, 2021 at 3:24 PM Andres Freund wrote:
> I wonder if we could improve the situation somewhat by using the
> non-blocking pqcomm functions in a few select places. E.g. if elog.c's
> send_message_to_frontend() sent its message via a new
> pq_endmessage_noblock() (which'd use the existing pq_putmessage_noblock())
> and used pq_flush_if_writable() instead of pq_flush(), we'd a) not block
> sending to the client before AbortCurrentTransaction(), b) be able to
> queue further error messages safely.

pq_flush_if_writable() could succeed in sending only part of the data,
so I don't see how this works.

-- 
Robert Haas
EDB: http://www.enterprisedb.com
Re: log_autovacuum in Postgres 14 -- ordering issue
Greetings,

* Peter Geoghegan (p...@bowt.ie) wrote:
> On Fri, Aug 27, 2021 at 12:30 PM Stephen Frost wrote:
> > I don't particularly care for that explain rule, ultimately, but it's
> > been around longer than I have and so I guess it wins. I'm fine with
> > always showing the read/write for VACUUM and ANALYZE.
> >
> > Including 'ms' and lower-casing 'Timings' to 'timings' still strikes me
> > as something that should be consistent for all of these, but that's
> > independent of this and I'm not going to stress over it, particularly
> > since that's pre-existing.
>
> Okay. Plan is now to push these two patches together, later on. The
> second patch concerns this separate track_io_timing issue. It's pretty
> straightforward.
>
> (No change to the first patch in the series, relative to the v2 from
> earlier.)

Looks alright to me.

Thanks,

Stephen
Re: log_autovacuum in Postgres 14 -- ordering issue
On Fri, Aug 27, 2021 at 12:30 PM Stephen Frost wrote:
> I don't particularly care for that explain rule, ultimately, but it's
> been around longer than I have and so I guess it wins. I'm fine with
> always showing the read/write for VACUUM and ANALYZE.
>
> Including 'ms' and lower-casing 'Timings' to 'timings' still strikes me
> as something that should be consistent for all of these, but that's
> independent of this and I'm not going to stress over it, particularly
> since that's pre-existing.

Okay. Plan is now to push these two patches together, later on. The
second patch concerns this separate track_io_timing issue. It's pretty
straightforward.

(No change to the first patch in the series, relative to the v2 from
earlier.)

-- 
Peter Geoghegan

v3-0002-Show-zero-values-in-track_io_timing-log-output.patch
Description: Binary data

v3-0001-Reorder-log_autovacuum_min_duration-log-output.patch
Description: Binary data
Re: RFC: Improve CPU cache locality of syscache searches
Hi,

On 2021-08-19 19:10:37 -0400, John Naylor wrote:
> I've made a small step in this direction in the attached. It uses a
> template approach to generate type-specific SearchCatCache* functions, for
> now only the 4-key ones. Since there's only a few of those, it's
> manageable to invoke the template and change the call sites by hand. To
> get this to scale up to all syscaches would require some scripting, but
> this much is enough to get feedback on the approach. One possible concern
> in starting down this path is that hundreds of call sites would have to
> be changed. (If there's a good way around that, it hasn't occurred to me.)

I was thinking of avoiding the need for that via a macro. I.e. have
SearchSysCache4(AMOPSTRATEGY, ...) be mapped to
SearchSysCache4AMOPSTRATEGY(...). That way callers wouldn't need to care,
and we could continue to evolve the internals without continually doing
large-scale code changes?

> diff --git a/src/include/utils/catcache_search_template.h b/src/include/utils/catcache_search_template.h
> new file mode 100644
> index 00..6f5dc2573c
> --- /dev/null
> +++ b/src/include/utils/catcache_search_template.h
> @@ -0,0 +1,176 @@
> +/*-
> + * catcache_search_template.h
> + *
> + * A template for type-specialized SearchCatCache functions
> + *
> + * Usage notes:
> + *
> + * To generate functions specialized for a set of catcache keys,
> + * the following parameter macros should be #define'd before this
> + * file is included.
> + *
> + * - CC_SEARCH - the name of the search function to be generated
> + * - CC_NKEYS - the number of search keys for the tuple
> + * - FASTEQFUNCx (x in 1,2,3,4) - type-specific equality function(s)
> + * - HASHFUNCx (x in 1,2,3,4) - type-specific hash function(s)

It's not clear to me we need such a complicated interface here. Can't we
just have a pg_attribute_always_inline function with a bunch more
parameters?

Greetings,

Andres Freund
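A sketch of the macro-based dispatch described above; the name
SearchSysCache4AMOPSTRATEGY stands in for whatever the template machinery
would emit:

    /* Paste the cache id onto the specialized function's name, so existing
     * call sites like SearchSysCache4(AMOPSTRATEGY, k1, k2, k3, k4) keep
     * compiling unchanged while resolving to the generated function. */
    #define SearchSysCache4(cacheId, key1, key2, key3, key4) \
        SearchSysCache4##cacheId(key1, key2, key3, key4)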
Re: Estimating HugePages Requirements?
Hi,

On 2021-08-27 19:27:18 +, Bossart, Nathan wrote:
> +
> +  --output-shmem
> +
> +
> +    Prints the amount of shared memory required for the given
> +    configuration and exits. This can be used on a running server, but
> +    the return value reflects the amount of shared memory needed based
> +    on the current invocation. It does not return the amount of shared
> +    memory in use by the running server. This must be the first
> +    argument on the command line.
> +
> +
> +    This option is useful for determining the number of huge pages
> +    needed for the server. For more information, see
> +    .
> +
> +

One thing I wonder is if this wouldn't better be dealt with in a more
generic way. While this is the most problematic runtime-computed GUC, it's
not the only one. What if we introduced a new shared_memory_size GUC, and
made --describe-config output it? Perhaps adding
--describe-config=guc-name?

I also wonder if we should output the number of hugepages needed instead
of the "raw" bytes of shared memory. The whole business about figuring out
the huge page size, dividing the shared memory size by that and then
rounding up could be removed in that case. Due to huge_page_size it's not
even immediately obvious which huge page size one should use...

> diff --git a/src/backend/main/main.c b/src/backend/main/main.c
> index 3a2a0d598c..c141ae3d1c 100644
> --- a/src/backend/main/main.c
> +++ b/src/backend/main/main.c
> @@ -182,9 +182,11 @@ main(int argc, char *argv[])
>  	 */
>
>  	if (argc > 1 && strcmp(argv[1], "--check") == 0)
> -		BootstrapModeMain(argc, argv, true);
> +		BootstrapModeMain(argc, argv, true, false);
> +	else if (argc > 1 && strcmp(argv[1], "--output-shmem") == 0)
> +		BootstrapModeMain(argc, argv, false, true);
>  	else if (argc > 1 && strcmp(argv[1], "--boot") == 0)
> -		BootstrapModeMain(argc, argv, false);
> +		BootstrapModeMain(argc, argv, false, false);
>  #ifdef EXEC_BACKEND
>  	else if (argc > 1 && strncmp(argv[1], "--fork", 6) == 0)
>  		SubPostmasterMain(argc, argv);

help() needs an update too.

> diff --git a/src/backend/storage/ipc/ipci.c b/src/backend/storage/ipc/ipci.c
> index 3e4ec53a97..b225b1ee70 100644
> --- a/src/backend/storage/ipc/ipci.c
> +++ b/src/backend/storage/ipc/ipci.c
> @@ -75,6 +75,87 @@ RequestAddinShmemSpace(Size size)
>  	total_addin_request = add_size(total_addin_request, size);
>  }
>
> +/*
> + * CalculateShmemSize
> + *		Calculates the amount of shared memory and number of semaphores needed.
> + *
> + * If num_semaphores is not NULL, it will be set to the number of semaphores
> + * required.
> + *
> + * Note that this function freezes the additional shared memory request size
> + * from loadable modules.
> + */
> +Size
> +CalculateShmemSize(int *num_semaphores)
> +{

Can you split this into a separate commit? It feels fairly
uncontroversial to me, so I think we could just apply it soon?

Greetings,

Andres Freund
\dP and \dX use ::regclass without "pg_catalog."
I noticed that for \dP+ since 1c5d9270e, regclass is written without
"pg_catalog."  (Alvaro and I failed to notice it in 421a2c483, too).

+       if (showNested || pattern)
+               appendPQExpBuffer(&buf,
+                                 ",\n c3.oid::regclass as \"%s\"",
+                                 gettext_noop("Parent name"));
+
+       if (showIndexes)
+               appendPQExpBuffer(&buf,
+                                 ",\n c2.oid::regclass as \"%s\"",
+                                 gettext_noop("On table"));

\dX is new in v14, and introduced the same issue in ad600bba0 (a4d75c86
modified it but did not fix it).

I searched for issues like this, which finds all 4 errors with 1 false
positive in psql/describe.c:

|time grep -wF "$(grep -oE 'pg_catalog\.[_[:alpha:]]+' src/bin/psql/describe.c |sed -r 's/^pg_catalog\.//; /^(char|oid|text|trigger)$/d' )" src/bin/psql/describe.c |grep -Ev 'pg_catalog\.|^ *[/ ]\*'
|#include "catalog/pg_am.h"
|                                 ",\n inh.inhparent::regclass as \"%s\"",
|                                 ",\n c2.oid::regclass as \"%s\"",
|                                 " es.stxrelid::regclass) AS \"%s\"",
|                                 "es.stxrelid::regclass) AS \"%s\"",

Tom informs me that this is not considered to be interesting as a security
patch.

-- 
Justin

>From 70eb6d65084104fa54965a349474cda8db6ed90d Mon Sep 17 00:00:00 2001
From: Justin Pryzby
Date: Fri, 27 Aug 2021 13:52:39 -0500
Subject: [PATCH 1/2] psql \dX: reference regclass with "pg_catalog." prefix

Should backpatch to v14
See similar issue fixed at 1d21ba8a9b8cb784f927a2c9c5818f8ff6779c0b.
---
 src/bin/psql/describe.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/src/bin/psql/describe.c b/src/bin/psql/describe.c
index 8333558bda..772d881c2f 100644
--- a/src/bin/psql/describe.c
+++ b/src/bin/psql/describe.c
@@ -4735,7 +4735,7 @@ listExtendedStats(const char *pattern)
 		appendPQExpBuffer(&buf,
 						  "pg_catalog.format('%%s FROM %%s', \n"
 						  "  pg_get_statisticsobjdef_columns(es.oid), \n"
-						  "  es.stxrelid::regclass) AS \"%s\"",
+						  "  es.stxrelid::pg_catalog.regclass) AS \"%s\"",
 						  gettext_noop("Definition"));
 	else
 		appendPQExpBuffer(&buf,
@@ -4746,7 +4746,7 @@ listExtendedStats(const char *pattern)
 						  "   ON (es.stxrelid = a.attrelid \n"
 						  "   AND a.attnum = s.attnum \n"
 						  "   AND NOT a.attisdropped)), \n"
-						  "es.stxrelid::regclass) AS \"%s\"",
+						  "es.stxrelid::pg_catalog.regclass) AS \"%s\"",
 						  gettext_noop("Definition"));

 	appendPQExpBuffer(&buf,
-- 
2.17.0

>From 9e243f51a095ec2ca5949d711483be2546e46d0e Mon Sep 17 00:00:00 2001
From: Justin Pryzby
Date: Fri, 27 Aug 2021 13:52:15 -0500
Subject: [PATCH 2/2] psql \dP: reference regclass with "pg_catalog." prefix

Should backpatch to v12
---
 src/bin/psql/describe.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/src/bin/psql/describe.c b/src/bin/psql/describe.c
index 772d881c2f..30fb17123e 100644
--- a/src/bin/psql/describe.c
+++ b/src/bin/psql/describe.c
@@ -4275,12 +4275,12 @@ listPartitionedTables(const char *reltypes, const char *pattern, bool verbose)

 	if (showNested || pattern)
 		appendPQExpBuffer(&buf,
-						  ",\n inh.inhparent::regclass as \"%s\"",
+						  ",\n inh.inhparent::pg_catalog.regclass as \"%s\"",
 						  gettext_noop("Parent name"));

 	if (showIndexes)
 		appendPQExpBuffer(&buf,
-						  ",\n c2.oid::regclass as \"%s\"",
+						  ",\n c2.oid::pg_catalog.regclass as \"%s\"",
 						  gettext_noop("Table"));

 	if (verbose)
-- 
2.17.0
Re: log_autovacuum in Postgres 14 -- ordering issue
Greetings,

* Peter Geoghegan (p...@bowt.ie) wrote:
> On Fri, Aug 27, 2021 at 11:35 AM Stephen Frost wrote:
> > > BTW, I noticed one thing about the track_io_time stuff. Sometimes it
> > > looks like this:
> > >
> > > I/O timings:
> > >
> > > i.e., it doesn't show anything at all after the colon.
> > Reporting zeros seems
> > valuable to me in that it shows that we did actually track the io timing
> > and there simply wasn't any time spent doing that- if we didn't include
> > the line at all then it wouldn't be clear if there wasn't any time spent
> > in i/o or if track io timing wasn't enabled.
>
> The principle that we don't show things that are all-zeroes is unique
> to text-format EXPLAIN output -- any other EXPLAIN format doesn't
> treat all-zeroes as a special case. And so the most consistent and
> correct thing seems to be this: show both all-zero "read:" and
> "write:" (both in vacuumlazy.c and in analyze.c), without making any
> other changes (i.e., no changes to EXPLAIN output are needed).

I suppose.

> You seem to be almost sold on that plan anyway. But this text format
> EXPLAIN rule seems like it decides the question for us.

I don't particularly care for that explain rule, ultimately, but it's
been around longer than I have and so I guess it wins. I'm fine with
always showing the read/write for VACUUM and ANALYZE.

Including 'ms' and lower-casing 'Timings' to 'timings' still strikes me
as something that should be consistent for all of these, but that's
independent of this and I'm not going to stress over it, particularly
since that's pre-existing.

Thanks,

Stephen
Re: [PATCH] Disable bgworkers during servers start in pg_upgrade
Hi,

On 2021-08-27 09:34:24 +0800, Julien Rouhaud wrote:
> On Fri, Aug 27, 2021 at 7:31 AM Michael Paquier wrote:
> >
> > Indeed, there is some history here with autovacuum. I have not been
> > careful enough to check that. Still, putting a check on
> > IsBinaryUpgrade in bgworker_should_start_now() would mean that we
> > still keep track of the set of bgworkers registered in shared memory.
> > That shouldn't lead to any problem right?
> >
> > Wouldn't it be better to block things at the source, as of
> > RegisterBackgroundWorker()? And that would keep track of the control
> > we have on bgworkers in a single place. I also think that we'd better
> > document something about that either in bgworker.sgml or pg_upgrade's
> > page.
>
> I'm fine with that approach too.

Isn't that just going to end up with extension code erroring out and/or
blocking waiting for a bgworker to start?

Greetings,

Andres Freund
Re: Queries that should be canceled will get stuck on secure_write function
Hi,

On 2021-08-27 08:27:38 -0400, Robert Haas wrote:
> On Tue, Aug 24, 2021 at 9:58 PM Fujii Masao wrote:
> > to report an error to a client, and then calls AbortCurrentTransaction()
> > to abort the transaction. If secure_write() called by EmitErrorReport()
> > gets stuck, a backend gets stuck inside the transaction block and the
> > locks keep being held unnecessarily. Isn't this problematic? Can we
> > change the order of them?
> ...
> More generally, I think it's hopeless to try to improve the situation
> very much by changing the place where secure_write() happens. It
> happens in so many different places, and it is clearly not going to be
> possible to make it so that in none of those places do we hold any
> important server resources. Therefore I think the only solution is to
> fix secure_write() itself ... and the trick is what to do there given
> that we have to be very careful not to do anything that might try to
> write another message while we are already in the middle of writing a
> message.

I wonder if we could improve the situation somewhat by using the
non-blocking pqcomm functions in a few select places. E.g. if elog.c's
send_message_to_frontend() sent its message via a new
pq_endmessage_noblock() (which'd use the existing pq_putmessage_noblock())
and used pq_flush_if_writable() instead of pq_flush(), we'd a) not block
sending to the client before AbortCurrentTransaction(), b) be able to
queue further error messages safely.

I think this'd not violate the goal of putting pq_flush() into
send_message_to_frontend():

	/*
	 * This flush is normally not necessary, since postgres.c will flush out
	 * waiting data when control returns to the main loop. But it seems best
	 * to leave it here, so that the client has some clue what happened if
	 * the backend dies before getting back to the main loop ... error/notice
	 * messages should not be a performance-critical path anyway, so an extra
	 * flush won't hurt much ...
	 */
	pq_flush();

because the only situations where we'd not send the data out immediately
would be when the socket buffer is already full. In which case the client
wouldn't get the error immediately anyway?

Greetings,

Andres Freund
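For concreteness, the proposed tail of send_message_to_frontend() would
look roughly like the sketch below; pq_endmessage_noblock() does not exist
today and is the hypothetical counterpart of pq_endmessage() built on the
existing pq_putmessage_noblock(), which grows the send buffer instead of
blocking:

    /* Sketch only: queue the complete message without blocking, then send
     * whatever the socket will accept right now.  A partial send is fine
     * because the whole message already sits in the (enlarged) buffer. */
    static void
    send_message_tail_sketch(StringInfo msgbuf)
    {
        pq_endmessage_noblock(msgbuf);
        pq_flush_if_writable();
    }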
Re: log_autovacuum in Postgres 14 -- ordering issue
On Fri, Aug 27, 2021 at 11:35 AM Stephen Frost wrote:
> > BTW, I noticed one thing about the track_io_time stuff. Sometimes it
> > looks like this:
> >
> > I/O timings:
> >
> > i.e., it doesn't show anything at all after the colon.
> Reporting zeros seems
> valuable to me in that it shows that we did actually track the io timing
> and there simply wasn't any time spent doing that- if we didn't include
> the line at all then it wouldn't be clear if there wasn't any time spent
> in i/o or if track io timing wasn't enabled.

The principle that we don't show things that are all-zeroes is unique
to text-format EXPLAIN output -- any other EXPLAIN format doesn't
treat all-zeroes as a special case. And so the most consistent and
correct thing seems to be this: show both all-zero "read:" and
"write:" (both in vacuumlazy.c and in analyze.c), without making any
other changes (i.e., no changes to EXPLAIN output are needed).

You seem to be almost sold on that plan anyway. But this text format
EXPLAIN rule seems like it decides the question for us.

-- 
Peter Geoghegan
Re: Queries that should be canceled will get stuck on secure_write function
Hi,

On 2021-08-23 10:13:03 -0400, Robert Haas wrote:
> On Mon, Aug 23, 2021 at 4:15 AM 蔡梦娟(玊于) wrote:
> > I want to know why the interrupt is only handled when ProcDiePending is
> > true; I think a query which is supposed to be canceled also should
> > respond to the signal.
>
> Well, if we're halfway through sending a message to the client and we
> abort the write, we have no way of re-establishing protocol sync,
> right? It's OK to die without sending any more data to the client,
> since then the connection is closed anyway and the loss of sync
> doesn't matter, but continuing the session doesn't seem workable.

Right.

> Your proposed patch actually seems to dodge this problem and I think
> perhaps we could consider something along those lines. It would be
> interesting to hear what Andres thinks about this idea, since I think
> he was the last one to rewrite that code.

I think it's a reasonable idea. I have some quibbles with the
implementation (new code should be in ProcessClientWriteInterrupt(), not
secure_write()), and I suspect we should escalate more quickly to killing
the connection, but those seem like details.

I think that if we want to tackle this, we need to do the same for
secure_read() as well. secure_read() will process interrupts normally
while DoingCommandRead, but if we're already in a command, we'd just as
well be prevented from processing a !ProcDiePending interrupt.

Greetings,

Andres Freund
Re: log_autovacuum in Postgres 14 -- ordering issue
On Thu, Aug 26, 2021 at 10:28 PM Peter Geoghegan wrote:
> I'll commit this in a day or two, backpatching to 14. Barring any
> objections.

Actually, we also need to make the corresponding lines for ANALYZE follow
the same convention -- those really should be consistent. As in the
attached revision.

I haven't tried to address the issue with "I/O timings:" that I just
brought to Stephen's attention. We can handle that question separately.

-- 
Peter Geoghegan

v2-0001-Reorder-log_autovacuum_min_duration-log-output.patch
Description: Binary data
Re: Estimating HugePages Requirements?
On 2021-08-27 16:40:27 +0200, Magnus Hagander wrote:
> On Fri, Aug 27, 2021 at 8:46 AM Michael Paquier wrote:
> I'd say a lot more than just handy. I don't think the workaround is
> really all that useful.

+1
Re: log_autovacuum in Postgres 14 -- ordering issue
Greetings,

* Peter Geoghegan (p...@bowt.ie) wrote:
> On Wed, Aug 25, 2021 at 2:07 PM Stephen Frost wrote:
> > I generally like the idea though I'm not sure about changing things in
> > v13 as there's likely code out there that's already parsing that data
> > and it might suddenly break if this was changed.
>
> Agreed -- I won't backpatch anything to v13.

Ok.

> > Given that such code would need to be adjusted for v14 anyway, I don't
> > really see changing it in v14 as being an issue (nor do I feel that
> > it's even a big concern at this point in the release cycle, though
> > perhaps others feel differently).
>
> BTW, I noticed one thing about the track_io_time stuff. Sometimes it
> looks like this:
>
> I/O timings:
>
> i.e., it doesn't show anything at all after the colon.

Hrmpf. That's an interesting point.

> I now wonder if we should just unconditionally report both things
> (both "read:" and "write:"), without regard for whether or not they're
> non-zero. (We'd do the same thing with ANALYZE's equivalent code too,
> if we actually did this -- same issue there.)

So, it was done that way to match how we report I/O Timings from explain
analyze, around src/backend/commands/explain.c:3574 (which I note is now
slightly different from what VACUUM/ANALYZE do due to f4f4a64...). The
intent was to be consistent in all of these places and I generally still
feel that's a worthwhile thing to strive for.

I don't have any particular problem with just always reporting it. Sure
looks odd to have the line there w/o anything after it. Perhaps we should
also address that in the explain analyze case though, and make the same
changes there that were done in f4f4a64?

Reporting zeros seems valuable to me in that it shows that we did actually
track the io timing and there simply wasn't any time spent doing that --
if we didn't include the line at all then it wouldn't be clear if there
wasn't any time spent in i/o or if track io timing wasn't enabled.

Thanks,

Stephen
Re: [PATCH] Proof of concept for GUC improvements
On Fri, Aug 27, 2021 at 1:19 AM Michael Paquier wrote:
>
> On Thu, Aug 19, 2021 at 03:58:57PM -0700, David G. Johnston wrote:
> > I'm at -0.5 as to whether such a patch would actually be an improvement
> > or whether the added possibilities would just be confusing and, because
> > it is all optional, indefinitely so.
>
> FWIW, I find this proposition of introducing a set of optional
> synonyms to map with some special-case values we have in the
> configurations a bit confusing, as that's basically introducing
> enum-like options into GUCs that already have a type assigned.

It does use enum-like mappings, but that is just because I needed to tie
together name + value and just reused the already-similar data structure.
That could be changed if the code itself is less understandable based on
the struct names.

> The patch, with its set of options like special_disabled0,
> special_disabled_all is not really easy to parse either, so that's just
> a recipe to make the set of synonyms grow on a per-GUC basis.

Yes, when I started out on this, I expected maybe 2-3 different
interpretations at most, with more common overlap. I am not tied to
making *every* GUC support this; maybe we support special_disabled or
special_disabled0 with differing names.

> What I am wondering, though, is if there are cases in the existing
> GUCs, with their existing types, where the situation of a default or
> disabled value could be improved, though, to make the overall picture
> more consistent.

I think this would be possible, although the benefit of what I've written
is that it doesn't change the interpretation of the value anywhere else,
just in GUC parsing (and optionally GUC display). The parsing was where I
felt this improved understanding; I'm less tied to outputting in the
"canonical" way.
Re: log_autovacuum in Postgres 14 -- ordering issue
On Wed, Aug 25, 2021 at 2:07 PM Stephen Frost wrote:
> I generally like the idea though I'm not sure about changing things in
> v13 as there's likely code out there that's already parsing that data
> and it might suddenly break if this was changed.

Agreed -- I won't backpatch anything to v13.

> Given that such code would need to be adjusted for v14 anyway, I don't
> really see changing it in v14 as being an issue (nor do I feel that
> it's even a big concern at this point in the release cycle, though
> perhaps others feel differently).

BTW, I noticed one thing about the track_io_time stuff. Sometimes it
looks like this:

I/O timings:

i.e., it doesn't show anything at all after the colon. This happens
because the instrumentation indicates that no time was spent on either
read I/O or write I/O.

I now wonder if we should just unconditionally report both things (both
"read:" and "write:"), without regard for whether or not they're non-zero.
(We'd do the same thing with ANALYZE's equivalent code too, if we actually
did this -- same issue there.)

-- 
Peter Geoghegan
Re: Fwd: Big Performance drop of Exceptions in UDFs between V11.2 and 13.4
Andrew Dunstan writes:
> First, this apparently only affects builds done with NLS. And in fact
> even on release 11 the performance is much better when run on a non-NLS
> build. So there's lots of work to do here.

Wow ... it would not have occurred to me to check that. Testing that
angle using HEAD on Linux (RHEL8), here are the times I see for the OP's
slow query:

Non-NLS build, C locale:
Time: 12452.062 ms (00:12.452)

NLS build, en_US.utf8 locale:
Time: 13596.114 ms (00:13.596)

NLS build, after SET lc_messages TO 'es_ES.utf8':
Time: 15190.689 ms (00:15.191)

So there is a cost for translating the error messages on Linux too, but
it's not nearly as awful as on Windows. I wonder if this boils down to a
performance bug in the particular gettext version you're using?

			regards, tom lane
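For context on where the per-exception cost comes from: each raised
error's message strings are translated at ereport() time, along the lines
of this simplification of elog.c's err_gettext() (the message-catalog
domain name below is illustrative):

    #include <libintl.h>

    /* Simplified view: every exception pays a message-catalog lookup,
     * so a tight PL/pgSQL exception loop multiplies the gettext cost. */
    static const char *
    err_gettext_sketch(const char *msgid)
    {
        return dgettext("postgres-14", msgid);
    }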
Re: [PATCH] pgbench: add multiconnect option
> >> Good. I was thinking of adding such capability, possibly for handling
> >> connection errors and reconnecting…
> >
> > round-robin and random make sense. I am wondering how round-robin
> > would work with -C, though? Would you just reuse the same connection
> > string as the one chosen at the starting point.
>
> Well, not necessarily, but this is debatable.

My expectation for such a behavior would be that it would reconnect to a
random connstring each time; otherwise what's the point of using this with
-C? If we needed to forbid some option combinations, that is also an
option.

> >> I was thinking of allowing a list of conninfo strings with repeated
> >> options, eg --conninfo "foo" --conninfo "bla"…
> >
> > That was my first thought when reading the subject of this thread:
> > create a list of connection strings and pass one of them to
> > doConnect() to grab the properties looked for. That's a bit confusing
> > though as pgbench does not support directly connection strings,
>
> They are supported because libpq silently assumes that "dbname" can be a
> full connection string.
>
> > and we should be careful to keep fallback_application_name intact.
>
> Hmmm. See attached patch; ISTM that it does the right thing.

I guess the multiple --conninfo approach is fine; I personally liked
having the list come from a file, as you could benchmark different
groups/clusters based on a file much more easily than by constructing
multiple pgbench invocations. I can see an argument for both approaches.

The PGSERVICEFILE was an idea I'd had to store easily indexed groups of
connection information in a way that I didn't need to know all the
details, could easily parse, and could later pass in the ENV so libpq
could just pull out the information.

> >> Your approach using PGSERVICEFILE also makes sense!
> >
> > I am not sure that's actually needed here, as it is possible to pass
> > down a service name within a connection string. I think that you'd
> > better leave libpq do all the work related to a service file, if
> > specified. pgbench does not need to know any of that.
>
> Yes, this is an inconvenience with this approach: part of the libpq
> machinery is more or less replicated in pgbench, which is quite
> annoying, and less powerful.

There is some small fraction reproduced here just to pull out the named
sections; no other parsing should be done though.

> Attached is my work-in-progress version, with a few open issues (eg
> probably not thread safe), but comments about the provided feature are
> welcome.
>
> I borrowed the "strategy" option, renamed policy, from the initial
> patch. Pgbench just accepts several connection strings as parameters,
> eg:
>
>    pgbench ... "service=db1" "service=db2" "service=db3"
>
> The next stage is to map scripts to connection types and connections
> to connection types, so that pgbench could run W transactions against a
> primary and R transactions against a hot standby, for instance. I have
> some design for that, but nothing is implemented.
>
> There is also the combination with the error handling patch to consider:
> if a connection fails, a connection to a replica could be issued
> instead.

I'll see if I can take a look at your latest patch.

I was also wondering about how we should handle `pgbench -i` with multiple
connection strings; currently it would only initialize with the first DSN
it gets, but it probably makes sense to run initialize against all of the
databases (or at least attempt to). Maybe this is one argument for the
multiple --conninfo handling, since you could explicitly pass the
databases you want. (Not that it is hard to just loop over connection
info and `pgbench -i` with ENV, or any other number of ways to accomplish
the same thing.)

Best,

David
Fwd: Big Performance drop of Exceptions in UDFs between V11.2 and 13.4
Somehow -hackers got left off the cc:

On 8/22/21 6:11 PM, Andrew Dunstan wrote:
> On 8/22/21 5:59 PM, l...@laurent-hasson.com wrote:
> > -Original Message-
> > From: Andrew Dunstan
> > Sent: Sunday, August 22, 2021 17:27
> > To: Tom Lane ; l...@laurent-hasson.com
> > Cc: Justin Pryzby ; Ranier Vilela ; pgsql-performa...@postgresql.org
> > Subject: Re: Big Performance drop of Exceptions in UDFs between V11.2
> > and 13.4
> >
> > On 8/22/21 4:11 PM, Tom Lane wrote:
> > > "l...@laurent-hasson.com" writes:
> > > > I do have a Linux install of 13.3, and things work beautifully, so
> > > > this is definitely a Windows thing here that started in V12.
> > > It's good to have a box around it, but that's still a pretty large
> > > box :-(.
> > >
> > > I'm hoping that one of our Windows-using developers will see if they
> > > can reproduce this, and if so, try to bisect where it started.
> > > Not sure how to make further progress without that.
> >
> > Can do. Assuming the assertion that it started in release 12 is
> > correct, I should be able to find it by bisecting between the branch
> > point for 12 and the tip of that branch. That's a little over 20
> > probes by my calculation.
> >
> > cheers
> >
> > andrew
> >
> > --
> > Andrew Dunstan
> > EDB: https://www.enterprisedb.com
> >
> > I tried it on 11.13 and 12.3. Is there a place where I could download
> > 12.1 and 12.2 and test that? Is it worth it, or do you think you have
> > all you need?
>
> I think I have everything I need.
>
> Step one will be to verify that the difference exists between the branch
> point and the tip of release 12. Once that's done it will be a matter of
> probing until the commit at fault is identified.

OK, here's what we know. First, this apparently only affects builds done
with NLS. And in fact, even on release 11 the performance is much better
when run on a non-NLS build. So there's lots of work to do here.

I can't yet pinpoint the place where it got disastrously bad, because I
can't build with VS2017 back past commit a169155453 on the REL_13_STABLE
branch. That commit fixed an issue with VS2015 and newer. The machine that
runs bowerbird has some older VS installations, and choco has VS2013
packages, so there are opportunities to explore further. I'll get back to
this in a couple of days.

Thanks to my EDB colleagues Sandeep Thakkar and Tushar Ahuja for helping
to identify the cause of the issue.

cheers

andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com
Re: [PATCH] pgbench: add multiconnect option
Bonjour Michaël,

>>> Good. I was thinking of adding such capability, possibly for handling
>>> connection errors and reconnecting…
>>
>> round-robin and random make sense. I am wondering how round-robin
>> would work with -C, though? Would you just reuse the same connection
>> string as the one chosen at the starting point.

Well, not necessarily, but this is debatable.

>>> I was thinking of allowing a list of conninfo strings with repeated
>>> options, eg --conninfo "foo" --conninfo "bla"…
>>
>> That was my first thought when reading the subject of this thread:
>> create a list of connection strings and pass one of them to
>> doConnect() to grab the properties looked for. That's a bit confusing
>> though as pgbench does not support directly connection strings,

They are supported because libpq silently assumes that "dbname" can be a
full connection string.

>> and we should be careful to keep fallback_application_name intact.

Hmmm. See attached patch; ISTM that it does the right thing.

>>> Your approach using PGSERVICEFILE also makes sense!
>>
>> I am not sure that's actually needed here, as it is possible to pass
>> down a service name within a connection string. I think that you'd
>> better leave libpq do all the work related to a service file, if
>> specified. pgbench does not need to know any of that.

Yes, this is an inconvenience with this approach: part of the libpq
machinery is more or less replicated in pgbench, which is quite annoying,
and less powerful.

Attached is my work-in-progress version, with a few open issues (eg
probably not thread safe), but comments about the provided feature are
welcome.

I borrowed the "strategy" option, renamed policy, from the initial patch.
Pgbench just accepts several connection strings as parameters, eg:

   pgbench ... "service=db1" "service=db2" "service=db3"

The next stage is to map scripts to connection types and connections to
connection types, so that pgbench could run W transactions against a
primary and R transactions against a hot standby, for instance. I have
some design for that, but nothing is implemented.

There is also the combination with the error handling patch to consider:
if a connection fails, a connection to a replica could be issued instead.

-- 
Fabien.

diff --git a/doc/src/sgml/ref/pgbench.sgml b/doc/src/sgml/ref/pgbench.sgml
index 0c60077e1f..7b99344c90 100644
--- a/doc/src/sgml/ref/pgbench.sgml
+++ b/doc/src/sgml/ref/pgbench.sgml
@@ -29,7 +29,7 @@ PostgreSQL documentation
    pgbench
    option
-   dbname
+   dbname or conninfo

@@ -160,6 +160,9 @@ pgbench options d
 not specified, the environment variable
 PGDATABASE is used. If that is not set, the user name
 specified for the connection is used.
+Alternatively, the dbname can be
+a standard connection information string.
+Several connections can be provided.

@@ -840,6 +843,21 @@ pgbench options d

+
+ --connection-policy=policy
+
+
+Set the connection policy when multiple connections are provided.
+Default is round-robin (ro).
+Possible values are:
+first (f),
+random (ra),
+round-robin (ro),
+working (w).
+
+
+
 
  -h hostname
  --host=hostname

diff --git a/src/bin/pgbench/pgbench.c b/src/bin/pgbench/pgbench.c
index b0e20c46ae..95e58f0573 100644
--- a/src/bin/pgbench/pgbench.c
+++ b/src/bin/pgbench/pgbench.c
@@ -277,13 +277,39 @@
 bool		is_connect;			/* establish connection for each transaction */
 bool		report_per_command; /* report per-command latencies */
 int			main_pid;			/* main process id used in log filename */
+char	   *logfile_prefix = NULL;
+
+/* main connection definition */
 const char *pghost = NULL;
 const char *pgport = NULL;
 const char *username = NULL;
-const char *dbName = NULL;
-char	   *logfile_prefix = NULL;
 const char *progname;

+/* multi connections */
+typedef enum mc_policy_t
+{
+	MC_UNKNOWN = 0,
+	MC_FIRST,
+	MC_RANDOM,
+	MC_ROUND_ROBIN,
+	MC_WORKING
+} mc_policy_t;
+
+/* connection info list */
+typedef struct connection_t
+{
+	const char *connection;		/* conninfo or dbname */
+	int			errors;			/* number of connection errors */
+} connection_t;
+
+static int	n_connections = 0;
+static connection_t *connections = NULL;
+static mc_policy_t mc_policy = MC_ROUND_ROBIN;
+
+/* last used connection */
+// FIXME per thread?
+static int	current_connection = 0;
+
 #define WSEP '@'				/* weight separator */

 volatile bool timer_exceeded = false;	/* flag from signal handler */
@@ -701,7 +727,7 @@ usage(void)
 {
 	printf("%s is a benchmarking tool for PostgreSQL.\n\n"
 		   "Usage:\n"
-		   "  %s [OPTION]... [DBNAME]\n"
+		   "  %s [OPTION]... [DBNAME or CONNINFO ...]\n"
 		   "\nInitialization options:\n"
 		   "  -i, --initialize         invokes initialization mode\n"
 		   "  -I, --init-steps=[" ALL_INIT_STEPS "]+ (default \"" DEFAULT_INIT_STEPS "\")\n"
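For what it's worth, the round-robin policy over that list reduces to
something like this sketch, reusing the variables from the posted WIP
patch (single-threaded, matching its "probably not thread safe" caveat):

    /* Pick the next conninfo in round-robin order over connections[]. */
    static const char *
    next_connection(void)
    {
        const char *conn = connections[current_connection].connection;

        current_connection = (current_connection + 1) % n_connections;
        return conn;
    }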
Re: Avoid repeated PQfnumber() in pg_dump
On 8/27/21, 7:27 AM, "Daniel Gustafsson" wrote:
> I took another look at this today and pushed it after verifying with a
> pgindent run. Thanks!

Thank you!

Nathan
Re: Partition Check not updated when insert into a partition
I have reviewed the patch and it looks good to me. However, I have one
comment.

+	foreach(l, attachrel_children)
+	{
+		Oid			partOid = lfirst_oid(l);
+
+		CacheInvalidateRelcacheByRelid(partOid);
+	}

Can we avoid using the extra variable 'partOid' and directly pass
'lfirst_oid(l)' to CacheInvalidateRelcacheByRelid()?

Thanks & Regards,
Nitin Jadhav

On Fri, Aug 27, 2021 at 2:50 PM Amit Langote wrote:
>
> On Thu, Aug 5, 2021 at 11:32 AM Amit Langote wrote:
> > On Wed, Jul 14, 2021 at 11:16 AM houzj.f...@fujitsu.com wrote:
> > > On Tuesday, July 13, 2021 2:52 AM Alvaro Herrera wrote:
> > > > Did you have a misbehaving test for the ATTACH case?
> > >
> > > Thanks for the response.
> >
> > Thank you both.
> >
> > > Yes, I think the following example of ATTACH doesn't work as expected.
> >
> > Yeah, need the fix for the ATTACH case too.
> >
> > Couple more things:
> >
> > * We must invalidate not just the "direct" partitions of the table
> > being attached/detached, but also any indirect ones, because all of
> > their partition constraints would need to contain (or no longer
> > contain) the root parent's partition constraint.
> >
> > * I think we should lock the partitions before sending the
> > invalidation. The ATTACH code already locks the descendants for a
> > different purpose, but DETACH doesn't, so the latter needs to be fixed
> > to match.
> >
> > I've updated Alvaro's patch to address these points. Maybe, we should
> > also add these cases to the regression and isolation suites?
>
> Apparently, I had posted a version of the patch that didn't even compile.
>
> I have fixed that in the attached and also added regression tests.
>
> Adding this to the next CF.
>
> --
> Amit Langote
> EDB: http://www.enterprisedb.com
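For clarity, the suggested simplification is just this sketch:

    /* Drop the local variable and pass the OID straight through. */
    foreach(l, attachrel_children)
        CacheInvalidateRelcacheByRelid(lfirst_oid(l));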
Re: Estimating HugePages Requirements?
On Fri, Aug 27, 2021 at 8:46 AM Michael Paquier wrote:
>
> On Wed, Aug 11, 2021 at 11:23:52PM +, Bossart, Nathan wrote:
> > I think BootstrapModeMain() makes the most sense. It fits in nicely
> > with the --check logic that's already there. With v3, the following
> > command can be used to retrieve the amount of shared memory required.
> >
> > postgres --output-shmem -D dir
> >
> > While testing this new option, I noticed that you can achieve similar
> > results today with the following command, although this one will
> > actually try to create the shared memory, too.
>
> That may not be the best option.

I would say that can be a disastrous option. First of all, it would
probably not work if you already have something running -- especially
when using huge pages. And if it does work, in that or other scenarios,
it can potentially have a significant impact on a running cluster to
suddenly allocate many GB of additional memory...

> > IMO the new option is still handy, but I can see the argument that it
> > might not be necessary.
>
> A separate option looks handy. Wouldn't it be better to document it
> in postgres-ref.sgml then?

I'd say a lot more than just handy. I don't think the workaround is
really all that useful.

(I haven't looked at the actual patch yet, just commenting on the
principle.)

-- 
Magnus Hagander
Me: https://www.hagander.net/
Work: https://www.redpill-linpro.com/
Re: Avoid repeated PQfnumber() in pg_dump
> On 23 Jul 2021, at 01:39, Bossart, Nathan wrote: > The patch looks good to me. I am marking it as ready-for-committer. I took another look at this today and pushed it after verifying with a pgindent run. Thanks! -- Daniel Gustafsson https://vmware.com/
Re: [PATCH] Allow multiple recursive self-references
I tested the following query (from SQLite documentation):

CREATE TABLE edge(aa INT, bb INT);

WITH RECURSIVE nodes(x) AS (
    SELECT 59
    UNION
    SELECT aa FROM edge JOIN nodes ON bb=x
    UNION
    SELECT bb FROM edge JOIN nodes ON aa=x
)
SELECT x FROM nodes;

ERROR:  42P19: recursive reference to query "nodes" must not appear within its non-recursive term
LINE 4:     SELECT aa FROM edge JOIN nodes ON bb=x
                           ^
LOCATION:  checkWellFormedRecursionWalker, parse_cte.c:960

This well-formedness check apparently needs to be enhanced to allow for more than two branches in the union.
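For comparison, the same traversal can be written today with a single recursive reference (an illustrative workaround under the current restriction, not part of the patch):

WITH RECURSIVE nodes(x) AS (
    SELECT 59
    UNION
    SELECT v.x
    FROM edge e
         JOIN nodes n ON n.x IN (e.aa, e.bb)
         CROSS JOIN LATERAL (VALUES (e.aa), (e.bb)) AS v(x)
)
SELECT x FROM nodes;

The point of the patch is that users coming from SQLite should not have to perform this rewrite by hand.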
Re: [PATCH] Add OAUTH2_SCOPE variable for scope configuration
> On 27 Aug 2021, at 14:15, Nico Rikken wrote: > I haven't yet tested this, as I'm still in the process of setting up a local > development environment. I hope somebody else here can help me with the > quality > assurance. This is the mailing list for the core Postgres server; for pgAdmin development please see the below URL for an appropriate list: https://www.pgadmin.org/support/list/ I'm sure someone there will be able to help. -- Daniel Gustafsson https://vmware.com/
[PATCH] Add OAUTH2_SCOPE variable for scope configuration
In certain cases, like with OpenID Connect, a different scope is needed. This patch adds an additional variable `OAUTH2_SCOPE` that can be used to configure the appropriate scope for the deployment. There are already runtime checks to ensure that the email claim is included in the user profile, so there is no need for similar checks on the configuration. This commit does introduce a check in oauth2.py for whether a value for OAUTH2_SCOPE is set, to prevent a breaking change.

Related issue: https://redmine.postgresql.org/issues/6627
OIDC docs: https://openid.net/specs/openid-connect-core-1_0.html#ScopeClaims

I haven't yet tested this, as I'm still in the process of setting up a local development environment. I hope somebody else here can help me with the quality assurance.

Signed-off-by: Nico Rikken
---
 docs/en_US/oauth2.rst                                 | 1 +
 web/config.py                                         | 3 +++
 web/pgadmin/authenticate/oauth2.py                    | 6 +-
 web/pgadmin/browser/tests/test_oauth2_with_mocking.py | 1 +
 4 files changed, 10 insertions(+), 1 deletion(-)

diff --git a/docs/en_US/oauth2.rst b/docs/en_US/oauth2.rst
index 8947b509e..4cc2628f5 100644
--- a/docs/en_US/oauth2.rst
+++ b/docs/en_US/oauth2.rst
@@ -30,6 +30,7 @@ and modify the values for the following parameters:
     "OAUTH2_AUTHORIZATION_URL", "Endpoint for user authorization"
     "OAUTH2_API_BASE_URL", "Oauth2 base URL endpoint to make requests simple, ex: *https://api.github.com/*"
     "OAUTH2_USERINFO_ENDPOINT", "User Endpoint, ex: *user* (for github) and *userinfo* (for google)"
+    "OAUTH2_SCOPE", "Oauth scope, ex: 'openid email profile'. Note that an 'email' claim is required in the resulting profile."
     "OAUTH2_ICON", "The Font-awesome icon to be placed on the oauth2 button, ex: fa-github"
     "OAUTH2_BUTTON_COLOR", "Oauth2 button color"
     "OAUTH2_AUTO_CREATE_USER", "Set the value to *True* if you want to automatically
diff --git a/web/config.py b/web/config.py
index d797e26f7..e932d17fc 100644
--- a/web/config.py
+++ b/web/config.py
@@ -711,6 +711,9 @@ OAUTH2_CONFIG = [
     # Name of the Endpoint, ex: user
     'OAUTH2_USERINFO_ENDPOINT': None,
     # Font-awesome icon, ex: fa-github
+    'OAUTH2_SCOPE': None,
+    # Oauth scope, ex: 'openid email profile'
+    # Note that an 'email' claim is required in the resulting profile
     'OAUTH2_ICON': None,
     # UI button colour, ex: #ff
     'OAUTH2_BUTTON_COLOR': None,
diff --git a/web/pgadmin/authenticate/oauth2.py b/web/pgadmin/authenticate/oauth2.py
index 91903165a..5e60d35dd 100644
--- a/web/pgadmin/authenticate/oauth2.py
+++ b/web/pgadmin/authenticate/oauth2.py
@@ -104,7 +104,11 @@ class OAuth2Authentication(BaseAuthentication):
             access_token_url=oauth2_config['OAUTH2_TOKEN_URL'],
             authorize_url=oauth2_config['OAUTH2_AUTHORIZATION_URL'],
             api_base_url=oauth2_config['OAUTH2_API_BASE_URL'],
-            client_kwargs={'scope': 'email profile'}
+            # Resort to previously hardcoded scope 'email profile' in case
+            # no OAUTH2_SCOPE is provided. This prevents a breaking change.
+            client_kwargs={'scope':
+                           oauth2_config.get('OAUTH2_SCOPE',
+                                             'email profile')}
         )

     def get_source_name(self):
diff --git a/web/pgadmin/browser/tests/test_oauth2_with_mocking.py b/web/pgadmin/browser/tests/test_oauth2_with_mocking.py
index b170720a8..71706ebe6 100644
--- a/web/pgadmin/browser/tests/test_oauth2_with_mocking.py
+++ b/web/pgadmin/browser/tests/test_oauth2_with_mocking.py
@@ -58,6 +58,7 @@ class Oauth2LoginMockTestCase(BaseTestGenerator):
                 'https://github.com/login/oauth/authorize',
             'OAUTH2_API_BASE_URL': 'https://api.github.com/',
             'OAUTH2_USERINFO_ENDPOINT': 'user',
+            'OAUTH2_SCOPE': 'email profile',
             'OAUTH2_ICON': 'fa-github',
             'OAUTH2_BUTTON_COLOR': '#3253a8',
         }
--
2.25.1
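By way of illustration, a deployment would then set something like the following in config_local.py (everything below is a made-up example provider; only OAUTH2_SCOPE is the new knob from this patch):

OAUTH2_CONFIG = [{
    'OAUTH2_NAME': 'example-oidc',
    'OAUTH2_DISPLAY_NAME': 'Example OIDC',
    'OAUTH2_CLIENT_ID': 'pgadmin',
    'OAUTH2_CLIENT_SECRET': 'change-me',
    'OAUTH2_TOKEN_URL': 'https://idp.example.com/token',
    'OAUTH2_AUTHORIZATION_URL': 'https://idp.example.com/authorize',
    'OAUTH2_API_BASE_URL': 'https://idp.example.com/',
    'OAUTH2_USERINFO_ENDPOINT': 'userinfo',
    'OAUTH2_SCOPE': 'openid email profile',  # new in this patch
}]

Omitting OAUTH2_SCOPE would keep the old hardcoded 'email profile' behavior, per the .get() fallback above.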
Re: Supply restore_command to pg_rewind via CLI argument
On Fri, Aug 27, 2021 at 10:05 AM Andrey Borodin wrote:
> There is a small bug
> +	/*
> +	 * Take restore_command from the postgresql.conf only if it is not already
> +	 * provided as a command line option.
> +	 */
> +	if (!restore_wal && restore_command == NULL)
> 		return;
>
> I think we should use condition (!restore_wal || restore_command != NULL).

Yes, you are right, thanks. Attached is a fixed version. Tests were passing since PostgresNode->enable_restoring is adding restore_command to the postgresql.conf anyway.

> Besides this patch looks good and is ready for committer IMV.

--
Alexey Kondratov

v2-0001-Allow-providing-restore_command-as-a-command-line.patch
Description: Binary data
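Spelled out, the corrected early return reads as follows (a sketch; the surrounding pg_rewind function context is assumed):

	/*
	 * Look up restore_command in postgresql.conf only when WAL restore was
	 * requested and no command was given on the command line.
	 */
	if (!restore_wal || restore_command != NULL)
		return;

That way -C/--target-restore-command takes precedence over any value found in the configuration file.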
Re: UNIQUE null treatment option
On Fri, Aug 27, 2021 at 3:38 PM Peter Eisentraut < peter.eisentr...@enterprisedb.com> wrote: > In the SQL:202x draft, this > has been formalized by making this implementation-defined and adding > an option on unique constraint definitions UNIQUE [ NULLS [NOT] > DISTINCT ] to choose a behavior explicitly. > > The CREATE UNIQUE INDEX syntax extension is not from the standard, > it's my own invention. > For the unique index syntax, should this be selectable per column/expression, rather than for the entire index as a whole? .m
Re: Added schema level support for publication.
On Fri, Aug 27, 2021 at 4:57 PM Amit Kapila wrote: > > On Fri, Aug 27, 2021 at 11:43 AM vignesh C wrote: > > > > On Wed, Aug 25, 2021 at 3:07 PM vignesh C wrote: > > > > I have implemented this in the 0003 patch, I have kept it separate to > > reduce the testing effort and also it will be easier if someone > > disagrees with the syntax. I will merge it to the main patch later > > based on the feedback. Attached v22 patch has the changes for the > > same. > > > > Few comments on v22-0001-Added-schema-level-support-for-publication: > > 1. Why in publication_add_schema(), you are registering invalidation > for all schema relations? It seems this is to allow rebuilding the > publication info for decoding sessions. But that is not required as > you are registering rel_sync_cache_publication_cb for the > publication_schema relation. In rel_sync_cache_publication_cb, we are > marking replicate_valid as false for each entry, which will allow the > publication info to be rebuilt in get_rel_sync_entry. > > I see that it is done for a single relation in the current code in > function publication_add_relation, but I guess that is also not > required. You can test this once to confirm. If you still think it is > required, can you please explain to me why, and then we can add some > comments in the patch accordingly.

I felt this is required for handling the following concurrency scenario:

create schema sch1;
create table sch1.t1(c1 int);
insert into sch1.t1 values(10);
update sch1.t1 set c1 = 11;
# The update will be successful, and the relation cache will record the
# publication actions based on the current state, i.e. no publication.

create publication pub1 for all tables in schema sch1;
# After the above publication is created, the relations present in this
# schema should be invalidated so that the next update fails. If the
# relations are not invalidated, the updates will be successful based on
# the previous publication actions.

update sch1.t1 set c1 = 11;

I will add comments to mention the above details. Thoughts?

> Peter E., Sawada-San, can you please let me know if I am missing > something in this regard? In the code, I see a comment "/* Invalidate > relcache so that publication info is rebuilt. */" in function > publication_add_relation(), but given my explanation above, I don't > understand why it is required.
>
> 2.
> + * Publication object type
> + */
> +typedef enum PublicationObjSpecType
> +{
> +	PUBLICATIONOBJ_TABLE,		/* Table type */
> +	PUBLICATIONOBJ_SCHEMA,		/* Schema type */
> +	PUBLICATIONOBJ_SEQUENCE,	/* Sequence type */
>
> Why add anything related to the sequence in this patch?

I will handle this in my next version.

> 3.
> +get_object_address_publication_schema(List *object, bool missing_ok)
> +{
> +	ObjectAddress address;
> +	char	   *pubname;
> +	Publication *pub;
> +	char	   *schemaname;
> +	Oid			schemaoid;
> +
> +	ObjectAddressSet(address, PublicationSchemaRelationId, InvalidOid);
> +
> +	/* Fetch schema name and publication name from input list */
> +	schemaname = strVal(linitial(object));
> +	pubname = strVal(lsecond(object));
> +
> +	schemaoid = get_namespace_oid(schemaname, false);
> +
> +	/* Now look up the pg_publication tuple */
> +	pub = GetPublicationByName(pubname, missing_ok);
> +	if (!pub)
> +		return address;
>
> Why is the schema name handling different from the publication name > handling? Why can't we pass missing_ok to the schema API and handle it > similarly to the publication API?

I will handle this in my next version.

> 4. In getPublicationSchemaInfo(), why is the missing_ok flag not used > in get_publication_name() whereas it is used for all other syscache > searches in that function?

I will handle this in my next version.

> 5. Don't we need to expose a view for publication schemas similar to > pg_publication_tables?

pg_publication_tables is a common view for "FOR TABLE", "FOR ALL TABLES" and "FOR ALL TABLES IN SCHEMA"; this view will internally access pg_publication_rel and pg_publication_schema to get the corresponding tables. I felt we don't need a separate view for publication schemas. Thoughts?

> 6.
> publication_add_schema()
> {
> ..
> +	/* Can't be system namespace */
> +	if (IsCatalogNamespace(schemaoid) || IsToastNamespace(schemaoid))
> +		ereport(ERROR,
> +				(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
> +				 errmsg("\"%s\" is a system schema",
> +						get_namespace_name(schemaoid)),
> +				 errdetail("System schemas cannot be added to publications.")));
> +
> +	/* Can't be temporary namespace */
> +	if (isAnyTempNamespace(schemaoid))
> +		ereport(ERROR,
> +				(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
> +				 errmsg("\"%s\" is a temporary schema",
> +						get_namespace_name(schemaoid)),
> +				 errdetail("Temporary schemas cannot be added to publications.")));
> ..
> }
>
> Can we change the first detail message to: "This operation is not > supported for system schemas." and the second detail message to: > "Temporary schemas cannot be replicated."? This is to make these > messages similar to c
UNIQUE null treatment option
The SQL standard has been ambiguous about whether null values in unique constraints should be considered equal or not. Different implementations have different behaviors. In the SQL:202x draft, this has been formalized by making this implementation-defined and adding an option on unique constraint definitions UNIQUE [ NULLS [NOT] DISTINCT ] to choose a behavior explicitly. This patch adds this option to PostgreSQL. The default behavior remains UNIQUE NULLS DISTINCT. Making this happen in the btree code is apparently pretty easy; most of the patch is just to carry the flag around to all the places that need it. The CREATE UNIQUE INDEX syntax extension is not from the standard, it's my own invention. (I named all the internal flags, catalog columns, etc. in the negative ("nulls not distinct") so that the default PostgreSQL behavior is the default if the flag is false. But perhaps the double negatives make some code harder to read.) From 14bd23b4f164c4298262e7fbfec1a49292d16e27 Mon Sep 17 00:00:00 2001 From: Peter Eisentraut Date: Fri, 27 Aug 2021 14:31:46 +0200 Subject: [PATCH] Add UNIQUE null treatment option The SQL standard has been ambiguous about whether null values in unique constraints should be considered equal or not. Different implementations have different behaviors. In the SQL:202x draft, this has been formalized by making this implementation-defined and adding an option on unique constraint definitions UNIQUE [ NULLS [NOT] DISTINCT ] to choose a behavior explicitly. This patch adds this option to PostgreSQL. The default behavior remains UNIQUE NULLS DISTINCT. Making this happen in the btree code is pretty easy; most of the patch is just to carry the flag around to all the places that need it. The CREATE UNIQUE INDEX syntax extension is not from the standard, it's my own invention. XXX I named all the internal flags, catalog columns, etc. in the negative ("nulls not distinct") so that the default PostgreSQL behavior is the default if the flag is false. But perhaps the double negatives make some code harder to read. 
--- doc/src/sgml/catalogs.sgml | 13 ++ doc/src/sgml/ddl.sgml | 29 +--- doc/src/sgml/information_schema.sgml | 12 + doc/src/sgml/ref/alter_table.sgml | 4 +- doc/src/sgml/ref/create_index.sgml | 13 ++ doc/src/sgml/ref/create_table.sgml | 11 ++--- src/backend/access/nbtree/nbtinsert.c | 10 ++--- src/backend/access/nbtree/nbtsort.c| 15 ++- src/backend/catalog/index.c| 7 +++ src/backend/catalog/information_schema.sql | 9 +++- src/backend/catalog/sql_features.txt | 1 + src/backend/catalog/toasting.c | 1 + src/backend/commands/indexcmds.c | 3 +- src/backend/nodes/copyfuncs.c | 2 + src/backend/nodes/equalfuncs.c | 2 + src/backend/nodes/makefuncs.c | 3 +- src/backend/nodes/outfuncs.c | 2 + src/backend/parser/gram.y | 47 src/backend/parser/parse_utilcmd.c | 3 ++ src/backend/utils/adt/ruleutils.c | 23 +++--- src/backend/utils/cache/relcache.c | 1 + src/backend/utils/sort/tuplesort.c | 8 +++- src/bin/pg_dump/pg_dump.c | 9 +++- src/bin/pg_dump/pg_dump.h | 1 + src/bin/psql/describe.c| 30 ++--- src/include/catalog/pg_index.h | 1 + src/include/nodes/execnodes.h | 1 + src/include/nodes/makefuncs.h | 2 +- src/include/nodes/parsenodes.h | 2 + src/include/utils/tuplesort.h | 1 + src/test/regress/expected/create_index.out | 51 ++ src/test/regress/input/constraints.source | 14 ++ src/test/regress/output/constraints.source | 23 ++ src/test/regress/sql/create_index.sql | 36 +++ 34 files changed, 332 insertions(+), 58 deletions(-) diff --git a/doc/src/sgml/catalogs.sgml b/doc/src/sgml/catalogs.sgml index 2b2c70a26e..fd49738d4f 100644 --- a/doc/src/sgml/catalogs.sgml +++ b/doc/src/sgml/catalogs.sgml @@ -4256,6 +4256,19 @@ pg_index Columns + + + indnullsnotdistinct bool + + + This value is only used for unique indexes. If false, this unique + index will consider null values distinct (so the index can contain + multiple null values in a column, the default PostgreSQL behavior). If + it is true, it will consider null values to be equal (so the index can + only contain one null value in a column). + + + indisprimary bool diff --git a/doc/src/sgml/ddl.sgml b/doc/src/sgml/ddl.sgml index e0ffb020bf..815a2e23f9 100644 --- a/doc/src/sgml/ddl.sgml +++ b/doc/src/sgml/ddl.sgml @@ -759,14 +759,33 @@ Unique Constraints In general, a unique constraint is violated if there is more than on
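To make the proposed behavior concrete, a session under the patch might look like this (a sketch using the syntax described above, not output copied from the patch's regression tests):

-- New: treat nulls as equal in the unique constraint
CREATE TABLE t (a int UNIQUE NULLS NOT DISTINCT);
INSERT INTO t VALUES (NULL);   -- ok
INSERT INTO t VALUES (NULL);   -- now a unique violation

-- The default remains the historical behavior
CREATE TABLE u (a int UNIQUE);           -- i.e. UNIQUE NULLS DISTINCT
INSERT INTO u VALUES (NULL), (NULL);     -- both rows accepted

-- The (non-standard) index-level spelling
CREATE TABLE v (a int);
CREATE UNIQUE INDEX v_a_idx ON v (a) NULLS NOT DISTINCT;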
Re: Queries that should be canceled will get stuck on secure_write function
On Tue, Aug 24, 2021 at 9:58 PM Fujii Masao wrote: > I was thinking that it's valid even if secure_write() doesn't react to > pg_cancel_backend() because it's basically called outside transaction block, > i.e., there is no longer transaction to cancel in that case. But there can be > some cases where secure_write() is called inside transaction block, > for example, when the query generates NOTICE message. In these cases, > secure_write() might need to react to the cancel request. Yeah. I think we could also be sending tuple data. > BTW, when an error happens, I found that a backend calls EmitErrorReport() > to report an error to a client, and then calls AbortCurrentTransaction() > to abort the transaction. If secure_write() called by EmitErrorReport() > gets stuck, a backend gets stuck inside transaction block and the locks > keep being held unnecessarily. Isn't this problematic? Can we change > the order of them? I think there might be problems with that, like perhaps the ErrorData object can have pointers into the memory contexts that we'd be destroying in AbortCurrentTransaction(). More generally, I think it's hopeless to try to improve the situation very much by changing the place where secure_write() happens. It happens in so many different places, and it is clearly not going to be possible to make it so that in none of those places do we hold any important server resources. Therefore I think the only solution is to fix secure_write() itself ... and the trick is what to do there given that we have to be very careful not to do anything that might try to write another message while we are already in the middle of writing a message. -- Robert Haas EDB: http://www.enterprisedb.com
Re: Added schema level support for publication.
On Fri, Aug 27, 2021 at 11:43 AM vignesh C wrote: > > On Wed, Aug 25, 2021 at 3:07 PM vignesh C wrote: > > I have implemented this in the 0003 patch, I have kept it separate to > reduce the testing effort and also it will be easier if someone > disagrees with the syntax. I will merge it to the main patch later > based on the feedback. Attached v22 patch has the changes for the > same. >

Few comments on v22-0001-Added-schema-level-support-for-publication:

1. Why in publication_add_schema(), you are registering invalidation for all schema relations? It seems this is to allow rebuilding the publication info for decoding sessions. But that is not required as you are registering rel_sync_cache_publication_cb for the publication_schema relation. In rel_sync_cache_publication_cb, we are marking replicate_valid as false for each entry, which will allow the publication info to be rebuilt in get_rel_sync_entry.

I see that it is done for a single relation in the current code in function publication_add_relation, but I guess that is also not required. You can test this once to confirm. If you still think it is required, can you please explain to me why, and then we can add some comments in the patch accordingly.

Peter E., Sawada-San, can you please let me know if I am missing something in this regard? In the code, I see a comment "/* Invalidate relcache so that publication info is rebuilt. */" in function publication_add_relation(), but given my explanation above, I don't understand why it is required.

2.
+ * Publication object type
+ */
+typedef enum PublicationObjSpecType
+{
+	PUBLICATIONOBJ_TABLE,		/* Table type */
+	PUBLICATIONOBJ_SCHEMA,		/* Schema type */
+	PUBLICATIONOBJ_SEQUENCE,	/* Sequence type */

Why add anything related to the sequence in this patch?

3.
+get_object_address_publication_schema(List *object, bool missing_ok)
+{
+	ObjectAddress address;
+	char	   *pubname;
+	Publication *pub;
+	char	   *schemaname;
+	Oid			schemaoid;
+
+	ObjectAddressSet(address, PublicationSchemaRelationId, InvalidOid);
+
+	/* Fetch schema name and publication name from input list */
+	schemaname = strVal(linitial(object));
+	pubname = strVal(lsecond(object));
+
+	schemaoid = get_namespace_oid(schemaname, false);
+
+	/* Now look up the pg_publication tuple */
+	pub = GetPublicationByName(pubname, missing_ok);
+	if (!pub)
+		return address;

Why is the schema name handling different from the publication name handling? Why can't we pass missing_ok to the schema API and handle it similarly to the publication API?

4. In getPublicationSchemaInfo(), why is the missing_ok flag not used in get_publication_name() whereas it is used for all other syscache searches in that function?

5. Don't we need to expose a view for publication schemas similar to pg_publication_tables?

6.
publication_add_schema()
{
..
+	/* Can't be system namespace */
+	if (IsCatalogNamespace(schemaoid) || IsToastNamespace(schemaoid))
+		ereport(ERROR,
+				(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+				 errmsg("\"%s\" is a system schema",
+						get_namespace_name(schemaoid)),
+				 errdetail("System schemas cannot be added to publications.")));
+
+	/* Can't be temporary namespace */
+	if (isAnyTempNamespace(schemaoid))
+		ereport(ERROR,
+				(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+				 errmsg("\"%s\" is a temporary schema",
+						get_namespace_name(schemaoid)),
+				 errdetail("Temporary schemas cannot be added to publications.")));
..
}

Can we change the first detail message to: "This operation is not supported for system schemas." and the second detail message to: "Temporary schemas cannot be replicated."? This is to make these messages similar to the corresponding messages for relations in function check_publication_add_relation(). Can we move these checks to a separate function?

--
With Regards,
Amit Kapila.
Re: Skipping logical replication transactions on subscriber side
On Fri, Aug 27, 2021 at 1:36 PM Amit Kapila wrote: > > On Thu, Aug 26, 2021 at 6:24 PM Masahiko Sawada wrote: > > > > On Thu, Aug 26, 2021 at 9:11 PM Amit Kapila wrote: > > > > > > > > > Okay, changed accordingly. Additionally, I have changed the code which > > > sets timestamp to (unset) when it is 0 so that it won't display the > > > timestamp in that case. I have made few other cosmetic changes in the > > > attached patch. See and let me know what you think of it? > > > > Thank you for the patch! > > > > Agreed with these changes. The patch looks good to me. > > > > Pushed, feel free to rebase and send the remaining patch set. Thanks! I'll post the updated version patch on Monday. Regards, -- Masahiko Sawada EDB: https://www.enterprisedb.com/
Re: Failure of subscription tests with topminnow
On Fri, Aug 27, 2021 at 3:29 PM Amit Kapila wrote: > > > I think the fix is correct but similar changes are required in > 022_twophase_cascade.pl as well (search for $oldpid in tests). I am > not completely sure but I think it is better to make this test change > in back branches as well to make it stable and reduce such random > build farm failures. I have made the changes in 022_twophase_cascade.pl for HEAD as well as patches for older branches. regards, Ajin Cherian Fujitsu Australia HEAD-v2-0001-fix-for-tap-test-failure-seen-in-001_rep_changes.patch Description: Binary data REL-13-STABLE-v1-0001-fix-for-tap-test-failure-seen-in-001_rep_changes.patch Description: Binary data REL-12-STABLE-v1-0001-fix-for-tap-test-failure-seen-in-001_rep_changes.patch Description: Binary data REL-10-STABLE-v1-0001-fix-for-tap-test-failure-seen-in-001_rep_changes.patch Description: Binary data REL-11-STABLE-v1-0001-fix-for-tap-test-failure-seen-in-001_rep_changes.patch Description: Binary data REL-14-STABLE-v1-0001-fix-for-tap-test-failure-seen-in-001_rep_changes.patch Description: Binary data
Re: Next Steps with Hash Indexes
On Fri, Aug 13, 2021 at 11:40 AM Dilip Kumar wrote: > > On Fri, Aug 13, 2021 at 9:31 AM Amit Kapila wrote: > > > > On Thu, Aug 12, 2021 at 8:30 PM Robert Haas wrote: > > > > > > On Thu, Aug 12, 2021 at 12:22 AM Amit Kapila > > > wrote: > > > > The design of the patch has changed since the initial proposal. It > > > > tries to perform unique inserts by holding a write lock on the bucket > > > > page to avoid duplicate inserts. > > > > > > Do you mean that you're holding a buffer lwlock while you search the > > > whole bucket? If so, that's surely not OK. > > > > > > > I think here you are worried that after holding lwlock we might > > perform reads of overflow pages which is not a good idea. I think > > there are fewer chances of having overflow pages for unique indexes so > > we don't expect such cases in common > > I think if we identify the bucket based on the hash value of all the > columns then there should be fewer overflow buckets, but IIUC, in > this patch the bucket is identified based on the hash value of the first > column only, so there could be a lot of duplicates on the first column.

IMHO, as discussed above, other databases also have the limitation that if you create a multi-column hash index, the index cannot be used unless all the key columns are used in the search condition. So my point is that users might be using the hash index with this limitation, and their use case might be that they want the best performance in this particular case; they might not be looking for the kind of flexibility we provide in btree. For reference:
https://dev.mysql.com/doc/refman/8.0/en/index-btree-hash.html
https://docs.microsoft.com/en-us/sql/relational-databases/in-memory-oltp/indexes-for-memory-optimized-tables?view=sql-server-ver15

We already know that performance will be better with a single hash for multiple columns, but I still wanted to check the performance difference in PG. This might help us decide the approach we need to go for. With a quick POC of both ideas, I have observed a major performance advantage with a single combined hash for multi-key columns.

Performance Test Details: (Used PGBENCH Tool)
Initialize cmd: "./pgbench -i -s 100 -d postgres"

postgres=# \d+ pgbench_accounts
                                Table "public.pgbench_accounts"
  Column  |     Type      | Collation | Nullable | Default | Storage  | Compression | Stats target | Description
----------+---------------+-----------+----------+---------+----------+-------------+--------------+-------------
 aid      | integer       |           | not null |         | plain    |             |              |
 bid      | integer       |           |          |         | plain    |             |              |
 abalance | integer       |           |          |         | plain    |             |              |
 filler   | character(84) |           |          |         | extended |             |              |
Indexes:
    "pgbench_accounts_idx" hash (aid, bid)
Access method: heap
Options: fillfactor=100

Test Command: "./pgbench -j 1 postgres -C -M prepared -S -T 300"

Performance Test Results:

Idea-1: Single hash value for multiple key columns
TPS = ~292

Idea-2: Separate hash values for each key column, using only the first one to search the bucket; the other hash values are used as payload to get to the matching tuple before going to the heap.
TPS = ~212

Note: We got nearly 25% better performance with the single combined hash approach, with only TWO key columns. If we go for separate hash values for all the key columns, there will be a performance dip, and storage will also be relatively higher as the number of key columns grows.

These are just separate POC patches done to get the performance results above; many other scenarios, like the split case, still need to be taken care of.
Attaching the POC patches here just for reference… Thanks & Regards SadhuPrasad EnterpriseDB: http://www.enterprisedb.com Single-Hash_MultiKeyColumns.patch Description: Binary data Separate-Hash_MultiKeyColumns.patch Description: Binary data
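For readers following along, the restriction being weighed here would look roughly like this (hypothetical planner behavior under the multi-column hash patch; stock PostgreSQL does not accept multi-column hash indexes at all):

-- Multi-column hash index from the test setup above (patch required)
CREATE INDEX pgbench_accounts_idx ON pgbench_accounts USING hash (aid, bid);

-- With the single-combined-hash design, only queries that constrain
-- every key column can use the index:
SELECT abalance FROM pgbench_accounts WHERE aid = 1 AND bid = 1;  -- index usable
SELECT abalance FROM pgbench_accounts WHERE aid = 1;              -- no index use

This mirrors the MySQL and SQL Server hash-index behavior linked above.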
Re: [PATCH] Tab completion for ALTER TABLE … ADD …
Michael Paquier writes: > On Tue, Aug 03, 2021 at 12:48:38PM +0100, Dagfinn Ilmari Mannsåker wrote: >> The other day I noticed that there's no tab completion after ALTER TABLE >> … ADD, so here's a patch. In addition to COLUMN and all the table >> constraint types, it also completes with the list of unique indexes on >> the table after ALTER TABLE … ADD … USING INDEX. > > I was reading this patch (not actually tested), and that's a clear > improvement. One extra thing that could be done here is to complete > with types for an ALTER TABLE ADD COLUMN foo. That was easy enough to add (just a bit of extra fiddling to handle COLUMN being optional), done in the attached v2 patch. > We could as well have a list of columns after UNIQUE or PRIMARY KEY, > but that feels like extra cream on top of the cake. Doing a list of arbitrarily many comma-separated names is more complicated, so that can be the subject for another patch. > In short I am fine with what you have here. Thanks for the review.

- ilmari

>From 0bdf91f2a66bf9393e6900db904bda1961c03350 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Dagfinn=20Ilmari=20Manns=C3=A5ker?=
Date: Tue, 3 Aug 2021 12:23:07 +0100
Subject: [PATCH v2] =?UTF-8?q?Add=20tab=20completion=20for=20ALTER=20TABLE?=
 =?UTF-8?q?=20=E2=80=A6=20ADD=20=E2=80=A6?=
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Complete with COLUMN plus the various table constraint types, list
unique indexes on the table after ADD (UNIQUE|PRIMARY KEY) USING INDEX,
and data types after ADD [COLUMN].
---
 src/bin/psql/tab-complete.c | 49 +
 1 file changed, 49 insertions(+)

diff --git a/src/bin/psql/tab-complete.c b/src/bin/psql/tab-complete.c
index 7cdfc7c637..bb7c3fc5cf 100644
--- a/src/bin/psql/tab-complete.c
+++ b/src/bin/psql/tab-complete.c
@@ -776,6 +776,10 @@ static const SchemaQuery Query_for_list_of_collations = {
 "   and pg_catalog.quote_ident(c1.relname)='%s'"\
 "   and pg_catalog.pg_table_is_visible(c2.oid)"

+#define Query_for_unique_index_of_table \
+Query_for_index_of_table \
+" and i.indisunique"
+
 /* the silly-looking length condition is just to eat up the current word */
 #define Query_for_constraint_of_table \
 "SELECT pg_catalog.quote_ident(conname) "\
@@ -2019,6 +2023,51 @@ psql_completion(const char *text, int start, int end)
 					  "OWNER TO", "SET", "VALIDATE CONSTRAINT", "REPLICA IDENTITY",
 					  "ATTACH PARTITION", "DETACH PARTITION", "FORCE ROW LEVEL SECURITY");
+	/* ALTER TABLE xxx ADD */
+	else if (Matches("ALTER", "TABLE", MatchAny, "ADD"))
+		COMPLETE_WITH("COLUMN", "CONSTRAINT", "CHECK", "UNIQUE", "PRIMARY KEY",
+					  "EXCLUDE", "FOREIGN KEY");
+	/* ALTER TABLE xxx ADD CONSTRAINT yyy */
+	else if (Matches("ALTER", "TABLE", MatchAny, "ADD", "CONSTRAINT", MatchAny))
+		COMPLETE_WITH("CHECK", "UNIQUE", "PRIMARY KEY", "EXCLUDE", "FOREIGN KEY");
+	/* ALTER TABLE xxx ADD (CONSTRAINT yyy)? FOREIGN|PRIMARY */
+	else if (Matches("ALTER", "TABLE", MatchAny, "ADD", "FOREIGN|PRIMARY") ||
+			 Matches("ALTER", "TABLE", MatchAny, "ADD", "CONSTRAINT", MatchAny, "FOREIGN|PRIMARY"))
+		COMPLETE_WITH("KEY");
+	/* ALTER TABLE xxx ADD (CONSTRAINT yyy)? (PRIMARY KEY|UNIQUE) */
+	else if (Matches("ALTER", "TABLE", MatchAny, "ADD", "PRIMARY", "KEY") ||
+			 Matches("ALTER", "TABLE", MatchAny, "ADD", "UNIQUE") ||
+			 Matches("ALTER", "TABLE", MatchAny, "ADD", "CONSTRAINT", MatchAny, "PRIMARY", "KEY") ||
+			 Matches("ALTER", "TABLE", MatchAny, "ADD", "CONSTRAINT", MatchAny, "UNIQUE"))
+		COMPLETE_WITH("(", "USING INDEX");
+	/* ALTER TABLE xxx ADD (CONSTRAINT yyy)? ... USING INDEX */
+	else if (Matches("ALTER", "TABLE", MatchAny, "ADD", "PRIMARY", "KEY", "USING", "INDEX"))
+	{
+		completion_info_charp = prev6_wd;
+		COMPLETE_WITH_QUERY(Query_for_unique_index_of_table);
+	}
+	else if (Matches("ALTER", "TABLE", MatchAny, "ADD", "UNIQUE", "USING", "INDEX"))
+	{
+		completion_info_charp = prev5_wd;
+		COMPLETE_WITH_QUERY(Query_for_unique_index_of_table);
+	}
+	else if (Matches("ALTER", "TABLE", MatchAny, "ADD", "CONSTRAINT", MatchAny,
+					 "PRIMARY", "KEY", "USING", "INDEX"))
+	{
+		completion_info_charp = prev8_wd;
+		COMPLETE_WITH_QUERY(Query_for_unique_index_of_table);
+	}
+	else if (Matches("ALTER", "TABLE", MatchAny, "ADD", "CONSTRAINT", MatchAny,
+					 "UNIQUE", "USING", "INDEX"))
+	{
+		completion_info_charp = prev7_wd;
+		COMPLETE_WITH_QUERY(Query_for_unique_index_of_table);
+	}
+	/* ADD column_name must be last of the ALTER TABLE xxx ADD matches */
+	else if (Matches("ALTER", "TABLE", MatchAny, "ADD", "COLUMN", MatchAny) ||
+			 (Matches("ALTER", "TABLE", MatchAny, "ADD", MatchAny) &&
+			  !Matches("ALTER", "TABLE", MatchAny, "ADD", "COLUMN")))
+		COMPLETE_WITH_SCHEMA_QUERY(Query_for_list_of_datatypes, NULL);
 	/* ALTER TABLE xxx ENABLE */
 	else if (Matches("ALTER", "TABLE", MatchAny, "ENABLE"))
 		COMPLETE_WITH("ALWAYS", "REPLICA", "ROW LEVEL SECURITY", "RULE",
--
2.30.2
Re: create table like: ACCESS METHOD
On 3/23/21 1:39 AM, Justin Pryzby wrote: > On Tue, Jan 19, 2021 at 03:03:31PM -0600, Justin Pryzby wrote: >> On Wed, Dec 30, 2020 at 12:33:56PM +, Simon Riggs wrote: >>> There are no tests for the new functionality, please could you add some? >> >> Did you look at the most recent patch? >> >> +CREATE ACCESS METHOD heapdup TYPE TABLE HANDLER heap_tableam_handler; >> +CREATE TABLE likeam() USING heapdup; >> +CREATE TABLE likeamlike(LIKE likeam INCLUDING ALL); >>
>> It seems like this should error to me:

CREATE ACCESS METHOD heapdup TYPE TABLE HANDLER heap_tableam_handler;
CREATE TABLE likeam1() USING heap;
CREATE TABLE likeam2() USING heapdup;

CREATE TABLE likeamlike(
    LIKE likeam1 INCLUDING ACCESS METHOD,
    LIKE likeam2 INCLUDING ACCESS METHOD
);

At the very least, the documentation should say that the last one wins.
--
Vik Fearing
Re: perlcritic: prohibit map and grep in void context
Michael Paquier writes: > On Wed, Jul 28, 2021 at 01:26:23PM +0200, Daniel Gustafsson wrote: >> I'm fine with increasing this policy, but I don't have strong feelings. If >> we >> feel the perlcritic policy change is too much, I would still prefer to go >> ahead >> with the map rewrite part of the patch though. > > I have no issue either about the rewrite part of the patch, so I'd > tend to just do this part and move on. Daniel, would you like to > apply that? Why the resistance to the perlcritic part? That one case is the only violation in the tree today, and it's a pattern we don't want to let back in (I will surely object every time I see it when reviewing patches), so why not automate it? - ilmari
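For the record, wiring the check in would presumably amount to something like this in src/tools/perlcheck/perlcriticrc (a sketch; the two policy names are real Perl::Critic core policies, but the severity bump mirroring how the tree promotes other policies is my assumption):

# prohibit map and grep in void context
[BuiltinFunctions::ProhibitVoidMap]
severity = 5

[BuiltinFunctions::ProhibitVoidGrep]
severity = 5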
Re: Mark all GUC variable as PGDLLIMPORT
On Fri, 27 Aug 2021 at 08:59, Julien Rouhaud wrote: > On Fri, Aug 27, 2021 at 3:42 AM Magnus Hagander > wrote: > > > > Yeah, but that does move the problem to the other side doesn't it? So > > if you (as a pure test of course) were to remove the variable > > completely from the included header and just declare it manually with > > PGDLLSPEC in your file, it should work? > > > > Ugly as it is, I wonder if there's a chance we could just process all > > the headers at install time and inject the PGDLLIMPORT. We know which > > symbols it is on account of what we're getting in the DEF file. > > > > Not saying that's not a very ugly solution, but it might work? > > It's apparently not enough. I tried with the autovacuum_max_workers GUC, > and it still errors out. > If I add a PGDLLIMPORT, there's a link error when trying to access the > variable: > error LNK2019: unresolved external symbol __imp_autovacuum_max_workers > referenced in function... > > If I use PGDLLEXPORT I simply get: > error LNK2001: unresolved external symbol autovacuum_max_workers

It works, but you can't use PGDLLIMPORT; you have to implement the import yourself without the help of __declspec(dllimport). Where you want autovacuum_max_workers you must instead write *((int*)__imp_autovacuum_max_workers)

Here's the comment I wrote on the topic in something I was working on:

/*
 * On Windows, a symbol is not accessible outside the executable or shared
 * library (PE object) that it is defined in unless explicitly exported in
 * the DLL interface.
 *
 * It must then be explicitly imported by objects that use it; Windows doesn't
 * do ELF-style fixups.
 *
 * The export part is usually accomplished by a __declspec(dllexport)
 * annotation on the symbol or a .DEF file supplied as linker input. Postgres
 * uses the .DEF approach, auto-exporting all symbols using
 * src\tools\msvc\gendef.pl . Internally this hides "symname" from the DLL
 * interface and instead generates an export symbol "__imp_symname" that is a
 * pointer to the value of "symname".
 *
 * The import part is usually done with the __declspec(dllimport) annotation on
 * the symbol. The PGDLLIMPORT macro expands to __declspec(dllimport) when
 * postgres headers are included during extension compilation. But not all the
 * symbols that pglogical needs are annotated with PGDLLIMPORT. Attempting to
 * access a symbol that is not so-annotated will fail at link time with an
 * error like
 *
 *     error LNK2001: unresolved external symbol criticalSharedRelcachesBuilt
 *
 * Because of gendefs.pl, postgres still exports the symbol even if it isn't
 * annotated PGDLLIMPORT. So we can just do the shorthand that
 * __declspec(dllimport) does for us in the preprocessor instead: replace each
 * symbol with its __imp_symbol indirection and dereference it.
 *
 * There's one wrinkle in this though. MSVC doesn't generate a definition for a
 * global data symbol that is neither initialized nor explicitly marked
 * __declspec(dllexport). gendefs.pl will think such symbols are references to
 * a symbol defined in another object file and will skip them without emitting
 * a DATA entry for them in the DEF file, so no __imp_ stub is generated in the
 * DLL interface. We can't use (*__imp_symbolname) if there's no import stub.
 *
 * If they're GUCs, we can round-trip them through their text values to read
 * them. Nothing should ever be assigning to GUC storage and there's no
 * reason to take the address of GUC storage, so this should work fine, albeit
 * slower. If we find any that aren't GUCs we're in trouble, but so far there
 * haven't been any.
 *
 * See also:
 *
 * - https://docs.microsoft.com/en-us/cpp/build/importing-data-using-declspec-dllimport
 * - https://docs.microsoft.com/en-us/cpp/build/importing-using-def-files
 * - https://docs.microsoft.com/en-us/cpp/build/exporting-from-a-dll-using-def-files
 * - https://docs.microsoft.com/en-us/cpp/build/determining-which-exporting-method-to-use
 */

This is gruesome and I hadn't planned to mention it, but now that someone has noticed that the .DEF file exports the symbols, I guess it does no harm. So can we just fix PGDLLIMPORT now?
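Put as code, the trick described in the comment amounts to this (Windows/MSVC only; the wrapper macro is an illustration, not from the original mail):

/*
 * The .DEF export generates an import stub __imp_<name> that points at the
 * backend's variable, so declare the stub ourselves and dereference it.
 */
extern int *__imp_autovacuum_max_workers;
#define autovacuum_max_workers_value (*__imp_autovacuum_max_workers)

Marking the variables PGDLLIMPORT in core would let extensions write plain autovacuum_max_workers and have the compiler do this for them.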
Re: Mark all GUC variable as PGDLLIMPORT
On Thu, 26 Aug 2021 at 01:51, Alvaro Herrera wrote: > On 2021-Aug-25, Magnus Hagander wrote: > > > The thing we need the PGDLLIMPORT definition for is to *import* them > > on the other end? > > Oh ... so modules that are willing to cheat can include their own > declarations of the variables they need, and mark them __declspec > (dllimport)? > > Damn. I was hoping nobody would notice that. I do exactly that in some extensions to work around some of this mess, but it is quite awkward and has its limitations.
Re: Mark all GUC variable as PGDLLIMPORT
On Wed, 25 Aug 2021 at 22:36, Magnus Hagander wrote: > On Wed, Aug 25, 2021 at 4:06 PM Robert Haas wrote: > > > > On Tue, Aug 24, 2021 at 5:06 PM Chapman Flack > wrote: > > > The thing is, I think I have somewhere a list of all the threads on > this > > > topic that I've read through since the first time I had to come with > my own > > > hat in hand asking for a PGDLLIMPORT on something, years ago now, and > > > I don't think I have ever seen one where it was as uncontroversial > > > as you suggest. > > > > It does tend to be controversial, but I think that's basically only > > because Tom Lane has reservations about it. I think if Tom dropped his > > opposition to this, nobody else would really care. And I think that > > would be a good thing for the project. > > > I have only one consideration about it, and that's a technical one :) > > Does this in some way have an effect on the size of the binary and/or > the time it takes to load it? > On *nix, no. On Windows, very, very minimally. We *should* be looking into making private symbols we can't make non-static have hidden visibility at link time, i.e. be DSO-private. This can have a huge impact on link-time optimisation and inlining. But doing so is quite orthogonal to the matter of fixing a linkage issue on Windows. By making select symbols hidden we'd be *reducing* the exposed set of functions and data symbols in a disciplined and progressive way on all platforms. Useful but different.
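For what it's worth, the toolchain mechanics being referred to look like this (a generic GCC/Clang sketch, not PostgreSQL code):

/* Build with -fvisibility=hidden so symbols default to DSO-private,
 * then re-export only the intended API surface: */
__attribute__((visibility("default"))) int exported_api(void);

static int file_private;    /* already invisible outside this file */
int dso_private __attribute__((visibility("hidden")));  /* DSO-private */

Hidden symbols can then be resolved, and potentially inlined, at link time without dynamic-relocation overhead, which is where the link-time-optimisation benefit mentioned above comes from.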
Re: Mark all GUC variable as PGDLLIMPORT
On Wed, 25 Aug 2021 at 03:13, Robert Haas wrote: > On Tue, Aug 24, 2021 at 2:52 PM Chapman Flack > wrote: > > I don't think that's true of the second proposal in [0]. I don't foresee > > a noticeable runtime cost unless there is a plausible workload that > > involves very frequent updates to GUC settings that are also of interest > > to a bunch of extensions. Maybe I'll take a stab at a POC. > > I'm not sure I fully understand that proposal, but I find it hard to > believe that we would seriously consider replacing every direct GUC > reference in the backend with something that goes through an API. Even > if didn't hurt performance, I think it would uglify the code a whole > lot. > It'd probably have to be something that resolves the GUC storage addresses at compile-time or once at runtime, if it's going to be used by core code. While some code doesn't hit a lot of GUCs, some *really* hammers some common GUCs. There are various issues with cache lines and pointer chasing that are beyond my low-level fu at work here. Adding a level of pointer indirection can be very expensive in the wrong situations. So you're probably looking at some kind of mess with token pasting, macros and static inlines. Ew.
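Purely to illustrate the kind of indirection being talked about (a sketch; guc_int_storage() is an invented lookup function, not an existing API):

static inline int
get_autovacuum_max_workers(void)
{
	static int *storage = NULL;		/* resolved once at runtime */

	if (storage == NULL)
		storage = guc_int_storage("autovacuum_max_workers");	/* hypothetical */
	return *storage;
}

Even this one-time resolution adds a branch and a pointer dereference to every read, which is exactly the cache-line and pointer-chasing cost that matters for hot GUCs.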
Re: Partition Check not updated when insert into a partition
On Thu, Aug 5, 2021 at 11:32 AM Amit Langote wrote: > On Wed, Jul 14, 2021 at 11:16 AM houzj.f...@fujitsu.com > > On Tuesday, July 13, 2021 2:52 AM Alvaro Herrera > > wrote: > > > Did you have a misbehaving test for the ATTACH case? > > > > Thanks for the response. > > Thank you both. > > > Yes, I think the following example of ATTACH doesn't work as expected. > > Yeah, need the fix for the ATTACH case too. > > Couple more things: > > * We must invalidate not just the "direct" partitions of the table > being attached/detached, but also any indirect ones, because all of > their partition constraints would need to contain (or no longer > contain) the root parent's partition constraint. > > * I think we should lock the partitions before sending the > invalidation. The ATTACH code already locks the descendents for a > different purpose, but DETACH doesn't, so the latter needs to be fixed > to match. > > I've updated Alvaro's patch to address these points. Maybe, we should > also add these cases to the regression and isolation suites? Apparently, I had posted a version of the patch that didn't even compile. I have fixed that in the attached and also added regression tests. Adding this to the next CF. -- Amit Langote EDB: http://www.enterprisedb.com v3-0001-Invalidate-partitions-of-table-being-attached-det.patch Description: Binary data
Re: Multi-Column List Partitioning
> + * isnulls is an array of boolean-tuples with key->partnatts booleans values > + * each. Currently only used for list partitioning, it stores whether a > > I think 'booleans' should be 'boolean'. > The trailing word 'each' is unnecessary. > bq. Supported new syantx to allow mentioning multiple key information. > > syantx -> syntax > + isDuplicate = checkForDuplicates(result, values); > + if (isDuplicate) > + continue; > > It seems the variable isDuplicate is not needed. The if statement can > directly check the return value from checkForDuplicates(). I agree that isDuplicate is not required. Thanks for sharing the comments. I will take care of these comments in the next patch. > + //TODO: Handle for multi-column cases > + for (j = 0; j < 1; j++) > > Is this part going to be updated in the next patch? Yes. The code changes related to partition-wise join are in progress. I will handle these in the next patch. Thanks & Regards, Nitin Jadhav On Thu, Aug 26, 2021 at 2:40 AM Zhihong Yu wrote: > > > > On Wed, Aug 25, 2021 at 5:41 AM Nitin Jadhav > wrote: >> >> > The new list bound binary search and related comparison support >> > function look a bit too verbose to me. I was expecting >> > partition_list_bsearch() to look very much like >> > partition_range_datum_bsearch(), but that is not the case. The >> > special case code that you wrote in partition_list_bsearch() seems >> > unnecessary, at least in that function. I'm talking about the code >> > fragment starting with this comment: >> > >> > I will look at other parts of the patch next week hopefully. For >> > now, attached is a delta patch that applies on top of your v1, which >> > does: >> > >> > * Simplify partition_list_bsearch() and partition_lbound_datum_cmp() >> > * Make qsort_partition_list_value_cmp simply call >> > partition_lbound_datum_cmp() instead of having its own logic to >> > compare input bounds >> > * Move partition_lbound_datum_cmp() into partbounds.c as a static >> > function (export seems unnecessary) >> > * Add a comment for PartitionBoundInfo.isnulls and remove that for >> > null_index >> >> Yes. You are right. The extra code added in partition_list_bsearch() >> is not required and thanks for sharing the delta patch. It looks good >> to me and I have incorporated the changes in the attached patch. >> >> > I guess you're perhaps trying to address the case where the caller >> > does not specify the values for all of the partition key columns, >> > which can happen when the partition pruning code needs to handle a set >> > of clauses matching only some of the partition key columns. But >> > that's a concern of the partition pruning code and so the special case >> > should be handled there (if at all), not in the binary search function >> > that is shared with other callers. Regarding that, I'm wondering if >> > we should require clauses matching all of the partition key columns to >> > be found for the pruning code to call the binary search, so do >> > something like get_matching_hash_bounds() does: >> > >> > Do you think that trying to match list partitions even with fewer keys >> > is worth the complexity of the implementation? That is, is the use >> > case to search for only a subset of partition key columns common >> > enough with list partitioning? >> > >> > If we do decide to implement the special case, remember that to do >> > that efficiently, we'd need to require that the subset of matched key >> > columns constitutes a prefix, because of the way the datums are >> > sorted. 
That is, match all partitions when the query only contains a >> > clause for b when the partition key is (a, b, c), but engage the >> > special case of pruning if the query contains clauses for a, or for a >> > and b. >> >> Thanks for the suggestion. Below is the implementation details for the >> partition pruning for multi column list partitioning. >> >> In the existing code (For single column list partitioning) >> 1. In gen_partprune_steps_internal(), we try to match the where >> clauses provided by the user with the partition key data using >> match_clause_to_partition_key(). Based on the match, this function can >> return many values like PARTCLAUSE_MATCH_CLAUSE, >> PARTCLAUSE_MATCH_NULLNESS, PARTCLAUSE_NOMATCH, etc. >> 2. In case of PARTCLAUSE_MATCH_CLAUSE, we generate steps using >> gen_prune_steps_from_opexps() (strategy-2) which generate and return a >> list of PartitionPruneStepOp that are based on OpExpr and BooleanTest >> clauses that have been matched to the partition key and it also takes >> care handling prefix of the partition keys. >> 3. In case of PARTCLAUSE_MATCH_NULLNESS, we generate steps using >> gen_prune_step_op() (strategy-1) which generates single >> PartitionPruneStepOp since the earlier list partitioning supports >> single column and there can be only one NULL value. In >> get_matching_list_bounds(), if the nullkeys is not empty, we fetch the >> partition index which accepts n
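Stepping back from the review details, the feature under discussion would let list partitioning key on several columns, roughly like this (hypothetical syntax from the patch series under review; the bound values are invented):

CREATE TABLE sales (region text, channel text, amount numeric)
    PARTITION BY LIST (region, channel);

CREATE TABLE sales_eu PARTITION OF sales
    FOR VALUES IN (('eu', 'web'), ('eu', 'phone'));

CREATE TABLE sales_us PARTITION OF sales
    FOR VALUES IN (('us', 'web'), ('us', 'phone'));

The pruning question above is then whether a query supplying only region (a prefix of the key) should still prune, or whether all key columns must be constrained.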
Re: Error code for checksum failure in origin.c
> On 27 Aug 2021, at 06:32, Amit Kapila wrote: > I think we need to backpatch this till 9.6 as this is introduced by > commit 5aa2350426. Any objections to that? No, that seems appropriate. -- Daniel Gustafsson https://vmware.com/
Re: perlcritic: prohibit map and grep in void context
> On 27 Aug 2021, at 08:10, Michael Paquier wrote: > > On Wed, Jul 28, 2021 at 01:26:23PM +0200, Daniel Gustafsson wrote: >> I'm fine with increasing this policy, but I don't have strong feelings. If >> we >> feel the perlcritic policy change is too much, I would still prefer to go >> ahead >> with the map rewrite part of the patch though. > > I have no issue either about the rewrite part of the patch, so I'd > tend to just do this part and move on. Daniel, would you like to > apply that? Sure, I can take care of that. -- Daniel Gustafsson https://vmware.com/
Re: Supply restore_command to pg_rewind via CLI argument
> On 29 June 2021, at 19:34, Alexey Kondratov wrote: > > On Fri, Jun 18, 2021 at 10:06 PM Alexey Kondratov wrote: >> On Fri, Jun 18, 2021 at 5:42 PM Andrey Borodin wrote: >>> >>> If we run 'pg_rewind --restore-target-wal' there must be restore_command in >>> config of target installation. But if the config is not within >>> $PGDATA\postgresql.conf pg_rewind cannot use it. >>> If we run postmaster with `-c >>> config_file=/etc/postgresql/10/data/postgresql.conf`, we simply cannot use >>> the feature. We solved the problem by putting config into PGDATA only >>> during pg_rewind, but this does not seem like a very robust solution. >>> >> >> Yeah, Michael was against it, while we had no good arguments, so >> Alexander removed it, IIRC. This example sounds reasonable to me. I >> also recall some complaints from PostgresPro support folks, that it is >> sad to not have a cli option to pass restore_command. However, I just >> thought about another recent feature --- ensure clean shutdown, which >> is turned on by default. So you usually run Postgres with one config, >> but pg_rewind may start it with another, although in single-user mode. >> Is it fine for you? We rewind a failed-over node, so clean shutdown was not performed. But I do not see how it could help anyway. To pass the restore command we had to set up a new config in PGDATA configured as standby, because either way we cannot set restore_command there. >>> Maybe we could add "-C, --target-restore-command=COMMAND target WAL >>> restore_command\n" as was proposed within earlier versions of the patch[0]? >>> Or instruct pg_rewind to pass config to 'postgres -C restore_command' run? >> >> Hm, adding --target-restore-command is the simplest way, sure, but >> forwarding something like '-c config_file=...' to postgres sounds >> interesting too. Could it have any use case beside providing a >> restore_command? I cannot imagine anything right now, but if any >> exist, then it could be a more universal approach. I think --target-restore-command is the best solution right now. >>> From my POV adding --target-restore-command is simplest way, I can extract >>> corresponding portions from original patch. >>> >> >> I will have a look, maybe I even already have this patch separately. I >> remember that we were considering adding this option to PostgresPro, >> when we did a backport of this feature. >> > > Here it is. I have slightly adapted the previous patch to the recent > pg_rewind changes. In this version -C does not conflict with -c, it > just overrides it.

Great, thanks!

There is a small bug:

+	/*
+	 * Take restore_command from the postgresql.conf only if it is not already
+	 * provided as a command line option.
+	 */
+	if (!restore_wal && restore_command == NULL)
 		return;

I think we should use the condition (!restore_wal || restore_command != NULL).

Besides this, the patch looks good and is ready for committer IMV.

Thanks!

Best regards, Andrey Borodin.