Re: Support logical replication of DDLs
On Wed, Apr 13, 2022 at 6:50 PM Amit Kapila wrote: > > On Wed, Apr 13, 2022 at 2:38 PM Dilip Kumar wrote: > > > > On Tue, Apr 12, 2022 at 4:25 PM Amit Kapila wrote: > > > > > > > The *initial* DDL replication is a different problem than DDL > > > > replication. The > > > > former requires a snapshot to read the current catalog data and build a > > > > CREATE > > > > command as part of the subscription process. The subsequent DDLs in > > > > that object > > > > will be handled by a different approach that is being discussed here. > > > > > > > > > > I think they are not completely independent because of the current way > > > to do initial sync followed by replication. The initial sync and > > > replication need some mechanism to ensure that one of those doesn't > > > overwrite the work done by the other. Now, the initial idea and patch > > > can be developed separately but I think both the patches have some > > > dependency. > > > > I agree with the point that their design can not be completely > > independent. They have some logical relationship of what schema will > > be copied by the initial sync and where is the exact boundary from > > which we will start sending as replication. And suppose first we only > > plan to implement the replication part then how the user will know > > what all schema user has to create and what will be replicated using > > DDL replication? Suppose the user takes a dump and copies all the > > schema and then creates the subscription, then how we are we going to > > handle the DDL concurrent to the subscription command? > > > > Right, I also don't see how it can be done in the current > implementation. So, I think even if we want to develop these two as > separate patches they need to be integrated to make the solution > complete. It would be better to develop them separately in terms of development speed but, yes, we perhaps need to integrate them at some points. 
I think that the initial DDL replication can be done when the relation's state is SUBREL_STATE_INIT. That is, at the very beginning of the table synchronization, the syncworker copies the table schema somehow, then starts the initial data copy. After that, syncworker or applyworker applies DML/DDL changes while catching up and streaming changes, respectively. Probably we can have it optional whether to copy schema only, data only, or both. Regards, -- Masahiko Sawada EDB: https://www.enterprisedb.com/
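The state progression described above could be sketched roughly like this. This is a standalone C toy for illustration only: SYNC_STATE_SCHEMACOPY and the copy_schema/copy_data flags are hypothetical, modeling the proposed optional schema-copy phase; only states like SUBREL_STATE_INIT exist in PostgreSQL today (see src/include/catalog/pg_subscription_rel.h).

```c
#include <assert.h>
#include <stdbool.h>

/* Simplified sketch of per-table sync states.  SYNC_STATE_SCHEMACOPY is a
 * hypothetical extra phase, not an existing PostgreSQL state. */
typedef enum
{
	SYNC_STATE_INIT,		/* like SUBREL_STATE_INIT: nothing done yet */
	SYNC_STATE_SCHEMACOPY,	/* hypothetical: copy table DDL from publisher */
	SYNC_STATE_DATASYNC,	/* initial data copy in progress */
	SYNC_STATE_SYNCDONE,	/* sync worker finished */
	SYNC_STATE_READY		/* apply worker streams changes */
} SyncState;

/* Advance the state machine; copy_schema/copy_data model the proposed
 * options to copy schema only, data only, or both. */
SyncState
next_sync_state(SyncState cur, bool copy_schema, bool copy_data)
{
	switch (cur)
	{
		case SYNC_STATE_INIT:
			return copy_schema ? SYNC_STATE_SCHEMACOPY
				: copy_data ? SYNC_STATE_DATASYNC : SYNC_STATE_SYNCDONE;
		case SYNC_STATE_SCHEMACOPY:
			return copy_data ? SYNC_STATE_DATASYNC : SYNC_STATE_SYNCDONE;
		case SYNC_STATE_DATASYNC:
			return SYNC_STATE_SYNCDONE;
		default:
			return SYNC_STATE_READY;
	}
}
```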
Re: Mark all GUC variable as PGDLLIMPORT
On Mon, May 09, 2022 at 09:23:47AM -0400, Robert Haas wrote: > Either of you please feel free to change these things, at least as far > as I'm concerned. Well, what about the attached then? While looking at all the headers in the tree, I have noticed that __pg_log_level is not marked, but we'd better paint a PGDLLIMPORT also for it? This is getting used by pgbench for some unlikely() business, as one example. -- Michael

diff --git a/src/include/common/logging.h b/src/include/common/logging.h
index 9f426c32d6..2ab9f0ea50 100644
--- a/src/include/common/logging.h
+++ b/src/include/common/logging.h
@@ -51,7 +51,7 @@ enum pg_log_level
 /*
  * __pg_log_level is the minimum log level that will actually be shown.
  */
-extern enum pg_log_level __pg_log_level;
+extern PGDLLIMPORT enum pg_log_level __pg_log_level;
 
 /*
  * A log message can have several parts. The primary message is required,
diff --git a/src/include/libpq/pqsignal.h b/src/include/libpq/pqsignal.h
index 016fc89bd9..41227a30e2 100644
--- a/src/include/libpq/pqsignal.h
+++ b/src/include/libpq/pqsignal.h
@@ -30,9 +30,9 @@ extern int pqsigsetmask(int mask);
 #define sigdelset(set, signum) (*(set) &= ~(sigmask(signum)))
 #endif			/* WIN32 */
 
-extern sigset_t UnBlockSig,
-			BlockSig,
-			StartupBlockSig;
+extern PGDLLIMPORT sigset_t UnBlockSig;
+extern PGDLLIMPORT sigset_t BlockSig;
+extern PGDLLIMPORT sigset_t StartupBlockSig;
 
 extern void pqinitmask(void);
diff --git a/src/tools/mark_pgdllimport.pl b/src/tools/mark_pgdllimport.pl
index 83b90db6ef..834fcac5b5 100755
--- a/src/tools/mark_pgdllimport.pl
+++ b/src/tools/mark_pgdllimport.pl
@@ -12,7 +12,8 @@
 # smart and may not catch all cases.
 #
 # It's probably a good idea to run pgindent on any files that this
-# script modifies before committing.
+# script modifies before committing. This script uses as arguments
+# a list of the header files to scan for the markings.
 #
 # Portions Copyright (c) 1996-2022, PostgreSQL Global Development Group
 # Portions Copyright (c) 1994, Regents of the University of California
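For context on what the marking buys: on Windows, a global variable living in the server executable can only be referenced from an extension DLL if it is declared with __declspec(dllimport). The sketch below is loosely modeled on the definition in src/include/c.h and is simplified (the real macro also handles the backend-build/export case); the variable name is invented for the example.

```c
#include <assert.h>

/* Rough sketch of PGDLLIMPORT: expands to __declspec(dllimport) for code
 * consuming the server's globals on Windows, and to nothing elsewhere.
 * Simplified from src/include/c.h. */
#if defined(_WIN32) && !defined(BUILDING_DLL)
#define PGDLLIMPORT __declspec(dllimport)
#else
#define PGDLLIMPORT
#endif

/* An extension header would declare a server global like so
 * (hypothetical variable name): */
extern PGDLLIMPORT int work_mem_sketch;

int			work_mem_sketch = 4096;	/* definition lives in the "server" */

/* "Extension" code referencing the global; on Windows this only links if
 * the declaration carries the marking. */
int
read_server_global(void)
{
	return work_mem_sketch;
}
```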
Re: An attempt to avoid locally-committed-but-not-replicated-to-standby-transactions in synchronous replication
On Mon, May 9, 2022 at 4:39 PM Andrey Borodin wrote: > > On 9 May 2022, at 14:44, Dilip Kumar wrote: > > > > IMHO, making it wait for some amount of time, based on GUC is not a > > complete solution. It is just a hack to avoid the problem in some > > cases. > > Disallowing cancelation of locally committed transactions is not a hack. It's > removing of a hack that was erroneously installed to make backend responsible > to Ctrl+C (or client side statement timeout). I might be missing something but based on my understanding the approach is not disallowing the query cancellation but it is just adding the configuration for how much to delay before canceling the query. That's the reason I mentioned that this is not a guaranteed solution. I mean with this configuration value also you cannot avoid problems in all the cases, right? -- Regards, Dilip Kumar EnterpriseDB: http://www.enterprisedb.com
Re: An attempt to avoid locally-committed-but-not-replicated-to-standby-transactions in synchronous replication
On Tue, May 10, 2022 at 1:18 PM Dilip Kumar wrote: > > On Mon, May 9, 2022 at 4:39 PM Andrey Borodin wrote: > > > > On 9 May 2022, at 14:44, Dilip Kumar wrote: > > > > > > IMHO, making it wait for some amount of time, based on GUC is not a > > > complete solution. It is just a hack to avoid the problem in some > > > cases. > > > > Disallowing cancelation of locally committed transactions is not a hack. > > It's removing of a hack that was erroneously installed to make backend > > responsible to Ctrl+C (or client side statement timeout). > > I might be missing something but based on my understanding the > approach is not disallowing the query cancellation but it is just > adding the configuration for how much to delay before canceling the > query. That's the reason I mentioned that this is not a guarenteed > solution. I mean with this configuration value also you can not avoid > problems in all the cases, right? Yes Dilip, the proposed GUC in v1 patch doesn't allow waiting forever for sync repl ack, in other words, doesn't allow blocking the pending query cancels or proc die interrupts forever. The backends may linger in case repl ack isn't received or sync replicas aren't reachable? Users may have to set the GUC to a 'reasonable value'. If okay, I can make the GUC behave this way - value 0 existing behaviour i.e. no wait for sync repl ack, just process query cancels and proc die interrupts immediately; value -1 wait unboundedly for the ack; value > 0 wait for specified milliseconds for the ack. Regards, Bharath Rupireddy.
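A standalone C sketch of the three-way GUC semantics proposed here (all names are invented for illustration, not taken from the actual patch): 0 means process the cancel immediately, -1 means wait until the sync-replica ack arrives, and N > 0 bounds the wait to N milliseconds.

```c
#include <assert.h>

/* Hypothetical GUC: how long to delay a query cancel / proc die while
 * waiting for a synchronous-replication ack. */
#define WAIT_FOREVER (-1)

typedef struct
{
	int			wait_for_ack_ms;	/* the GUC value */
} CancelPolicy;

/* Decide whether to keep waiting, given how long we have waited so far
 * and whether the ack has arrived.  Returns nonzero to keep waiting. */
int
should_keep_waiting(const CancelPolicy *p, int waited_ms, int ack_received)
{
	if (ack_received)
		return 0;				/* ack is in: safe to proceed */
	if (p->wait_for_ack_ms == 0)
		return 0;				/* existing behaviour: no wait */
	if (p->wait_for_ack_ms == WAIT_FOREVER)
		return 1;				/* block until the ack */
	return waited_ms < p->wait_for_ack_ms;	/* bounded wait */
}
```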
Re: Privileges on PUBLICATION
Euler Taveira wrote: > On Mon, May 9, 2022, at 11:09 AM, Antonin Houska wrote: > > Now that the user can specify rows and columns to be omitted from the logical > replication [1], I suppose hiding rows and columns from the subscriber is an > important use case. However, since the subscription connection user (i.e. the > user specified in the CREATE SUBSCRIPTION ... CONNECTION ... command) needs > SELECT permission on the replicated table (on the publication side), he can > just use another publication (which has different filters or no filters at > all) to get the supposedly-hidden data replicated. > > The required privileges were not relaxed on publisher after the row filter > and > column list features. It is not just to "create another publication". Create > publications require CREATE privilege on databases (that is *not* granted to > PUBLIC).If you have an untrusted user that could bypass your rules about > hidden > data, it is better to review your user privileges. > > postgres=# CREATE ROLE foo REPLICATION LOGIN; > CREATE ROLE > postgres=# \c - foo > You are now connected to database "postgres" as user "foo". > postgres=> CREATE PUBLICATION pub1; > ERROR: permission denied for database postgres > > The documentation [1] says > > "The role used for the replication connection must have the REPLICATION > attribute (or be a superuser)." > > You can use role foo for the replication connection but role foo couldn't be a > superuser. In this case, even if role foo open a connection to database > postgres, a publication cannot be created due to lack of privileges. > > Don't we need privileges on publication (e.g GRANT USAGE ON PUBLICATION ...) > now? > > Maybe. We rely on CREATE privilege on databases right now. If you say that > GRANT USAGE ON PUBLICATION is just a command that will have the same effect as > REPLICATION property [1] has right now, I would say it won't. Are you aiming a > fine-grained access control on publisher? 
The configuration I'm thinking of is multiple replicas reading data from the same master. For example, consider "foo" and "bar" roles, used by "subscr_foo" and "subscr_bar" subscriptions respectively. (Therefore, both roles need the REPLICATION option.) The subscriptions "subscr_foo" and "subscr_bar" are located in "db_foo" and "db_bar" databases respectively. On the master side, there are two publications: "pub_foo" and "pub_bar", to be used by "subscr_foo" and "subscr_bar" subscriptions respectively. The publications replicate the same table, but each with a different row filter. The problem is that the admin of "db_foo" can add the "pub_bar" publication to the "subscr_foo" subscription, and thus get the data that his "pub_foo" would filter out. Likewise, the admin of "db_bar" can "steal" the data from "pub_foo" by adding that publication to "subscr_bar". In this case, the existing publications are misused, so the CREATE PUBLICATION privileges do not help. Since the REPLICATION option of a role is cluster-wide, but I need specific roles to be restricted to specific publications, it can actually be called fine-grained access control as you say. > [1] https://www.postgresql.org/docs/devel/logical-replication-security.html -- Antonin Houska Web: https://www.cybertec-postgresql.com
Re: Privileges on PUBLICATION
Amit Kapila wrote: > On Tue, May 10, 2022 at 12:16 AM Euler Taveira wrote: > > > > On Mon, May 9, 2022, at 11:09 AM, Antonin Houska wrote: > > > > Now that the user can specify rows and columns to be omitted from the > > logical > > replication [1], I suppose hiding rows and columns from the subscriber is an > > important use case. However, since the subscription connection user (i.e. > > the > > user specified in the CREATE SUBSCRIPTION ... CONNECTION ... command) needs > > SELECT permission on the replicated table (on the publication side), he can > > just use another publication (which has different filters or no filters at > > all) to get the supposedly-hidden data replicated. > > > > The required privileges were not relaxed on publisher after the row filter > > and > > column list features. It is not just to "create another publication". Create > > publications require CREATE privilege on databases (that is *not* granted to > > PUBLIC).If you have an untrusted user that could bypass your rules about > > hidden > > data, it is better to review your user privileges. > > > > Also, to create a subscription (which combines multiple publications > to bypass rules), a user must be a superuser. So, isn't that a > sufficient guarantee that users shouldn't be able to bypass such > rules? My understanding is that the rows/columns filtering is a way for the *publisher* to control which data is available to particular replica. From this point of view, the publication privileges would just make the control complete. -- Antonin Houska Web: https://www.cybertec-postgresql.com
Re: make MaxBackends available in _PG_init
On Fri, May 06, 2022 at 07:51:43PM -0400, Robert Haas wrote: > On Fri, May 6, 2022 at 5:15 PM Nathan Bossart > wrote: >> On Fri, May 06, 2022 at 08:27:11AM -0700, Nathan Bossart wrote: >> > +1, I'll put together a new patch set. >> >> As promised... > > Looks reasonable to me. 0001 looks sensible seen from here with this approach. > Anyone else have thoughts? I agree that removing support for the unloading part would be nice to clean up now on HEAD. Note that 0002 is missing the removal of one reference to _PG_fini in xfunc.sgml ( markup). -- Michael signature.asc Description: PGP signature
Re: Perform streaming logical transactions by background workers and parallel apply
On Tue, May 10, 2022 at 10:39 AM Masahiko Sawada wrote: > > On Wed, May 4, 2022 at 8:44 AM Peter Smith wrote: > > > > On Tue, May 3, 2022 at 5:16 PM Peter Smith wrote: > > > > > ... > > > > > Avoiding unexpected differences like this is why I suggested the > > > option should have to be explicitly enabled instead of being on by > > > default as it is in the current patch. See my review comment #14 [1]. > > > It means the user won't have to change their existing code as a > > > workaround. > > > > > > -- > > > [1] > > > https://www.postgresql.org/message-id/CAHut%2BPuqYP5eD5wcSCtk%3Da6KuMjat2UCzqyGoE7sieCaBsVskQ%40mail.gmail.com > > > > > > > Sorry I was wrong above. It seems this behaviour was already changed > > in the latest patch v5 so now the option value 'on' means what it > > always did. Thanks! > > Having it optional seems a good idea. BTW can the user configure how > many apply bgworkers can be used per subscription or in the whole > system? Like max_sync_workers_per_subscription, is it better to have a > configuration parameter or a subscription option for that? If so, > setting it to 0 probably means to disable the parallel apply feature. > Yeah, that might be useful but we are already giving an option while creating a subscription whether to allow parallelism, so will it be useful to give one more way to disable this feature? OTOH, having something like max_parallel_apply_workers/max_bg_apply_workers at the system level can give better control for how much parallelism the user wishes to allow for apply work. If we have such a new parameter then I think max_logical_replication_workers should include apply workers, parallel apply workers, and table synchronization? In such a case, don't we need to think of increasing the default value of max_logical_replication_workers? -- With Regards, Amit Kapila.
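The worker-accounting question above can be sketched as follows. All names and the inclusion policy (parallel apply workers counted against max_logical_replication_workers alongside apply and tablesync workers) are assumptions for illustration, not settled behaviour.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical pool accounting for logical-replication workers. */
typedef struct
{
	int			max_logical_replication_workers;	/* the cap */
	int			apply_workers;
	int			tablesync_workers;
	int			parallel_apply_workers;
} WorkerPool;

/* A new parallel apply worker may launch only if it fits under the cap
 * together with everything already running. */
bool
can_launch_parallel_apply(const WorkerPool *p)
{
	int			in_use = p->apply_workers + p->tablesync_workers +
		p->parallel_apply_workers;

	return in_use < p->max_logical_replication_workers;
}
```

This is why the default of max_logical_replication_workers might need to grow if parallel apply workers are drawn from the same pool: with the current default of 4, one apply worker plus two tablesync workers would leave room for only a single parallel apply worker.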
Re: Perform streaming logical transactions by background workers and parallel apply
On Tue, May 10, 2022 at 10:35 AM Masahiko Sawada wrote: > > On Wed, May 4, 2022 at 12:50 PM Amit Kapila wrote: > > > > On Tue, May 3, 2022 at 9:45 AM Amit Kapila wrote: > > > > > > On Mon, May 2, 2022 at 5:06 PM Masahiko Sawada > > > wrote: > > > > > > > > On Mon, May 2, 2022 at 6:09 PM Amit Kapila > > > > wrote: > > > > > > > > > > On Mon, May 2, 2022 at 11:47 AM Masahiko Sawada > > > > > wrote: > > > > > > > > > > > > > > > > > > Are you planning to support "Transaction dependency" Amit mentioned > > > > > > in > > > > > > his first mail in this patch? IIUC since the background apply worker > > > > > > applies the streamed changes as soon as receiving them from the main > > > > > > apply worker, a conflict that doesn't happen in the current > > > > > > streaming > > > > > > logical replication could happen. > > > > > > > > > > > > > > > > This patch seems to be waiting for stream_stop to finish, so I don't > > > > > see how the issues related to "Transaction dependency" can arise? What > > > > > type of conflict/issues you have in mind? > > > > > > > > Suppose we set both publisher and subscriber: > > > > > > > > On publisher: > > > > create table test (i int); > > > > insert into test values (0); > > > > create publication test_pub for table test; > > > > > > > > On subscriber: > > > > create table test (i int primary key); > > > > create subscription test_sub connection '...' publication test_pub; -- > > > > value 0 is replicated via initial sync > > > > > > > > Now, both 'test' tables have value 0. 
> > > > > > > > And suppose two concurrent transactions are executed on the publisher > > > > in following order: > > > > > > > > TX-1: > > > > begin; > > > > insert into test select generate_series(0, 1); -- changes will be > > > > streamed; > > > > > > > > TX-2: > > > > begin; > > > > delete from test where i = 0; > > > > commit; > > > > > > > > TX-1: > > > > commit; > > > > > > > > With the current streaming logical replication, these changes will be > > > > applied successfully since the deletion is applied before the > > > > (streamed) insertion. Whereas with the apply bgworker, it fails due to > > > > an unique constraint violation since the insertion is applied first. > > > > I've confirmed that it happens with v5 patch. > > > > > > > > > > Good point but I am not completely sure if doing transaction > > > dependency tracking for such cases is really worth it. I feel for such > > > concurrent cases users can anyway now also get conflicts, it is just a > > > matter of timing. One more thing to check transaction dependency, we > > > might need to spill the data for streaming transactions in which case > > > we might lose all the benefits of doing it via a background worker. Do > > > we see any simple way to avoid this? > > > > > I agree that it is just a matter of timing. I think new issues that > haven't happened on the current streaming logical replication > depending on the timing could happen with this feature and vice versa. > Here by vice versa, do you mean some problems that can happen with current code won't happen after new implementation? If so, can you give one such example? > > > > I think the other kind of problem that can happen here is delete > > followed by an insert. If in the example provided by you, TX-1 > > performs delete (say it is large enough to cause streaming) and TX-2 > > performs insert then I think it will block the apply worker because > > insert will start waiting infinitely. 
Currently, I think it will lead > > to conflict due to insert but that is still solvable by allowing users > > to remove conflicting rows. > > > > It seems both these problems are due to the reason that the table on > > publisher and subscriber has different constraints otherwise, we would > > have seen the same behavior on the publisher as well. > > > > There could be a few ways to avoid these and similar problems: > > a. detect the difference in constraints between publisher and > > subscribers like primary key and probably others (like whether there > > is any volatile function present in index expression) when applying > > the change and then we give ERROR to the user that she must change the > > streaming mode to 'spill' instead of 'apply' (aka parallel apply). > > b. Same as (a) but instead of ERROR just LOG this information and > > change the mode to spill for the transactions that operate on that > > particular relation. > > Given that it doesn't introduce a new kind of problem I don't think we > need special treatment for that at least in this feature. > Isn't the problem related to infinite wait by insert as explained in my previous email (in the above-quoted text) a new kind of problem that won't exist in the current implementation? -- With Regards, Amit Kapila.
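The ordering hazard from the example upthread can be reproduced with a toy model. This is not replication code, just a minimal unique-set simulation showing why applying the streamed insert before the concurrent delete fails on the subscriber (whose table has a primary key) while the publisher's original order succeeds.

```c
#include <assert.h>
#include <stdbool.h>

/* Minimal model of the subscriber table "test (i int primary key)":
 * a fixed-size set of ints with a uniqueness check. */
#define CAP 16
typedef struct
{
	int			vals[CAP];
	int			n;
} UniqueTable;

static bool
tbl_contains(const UniqueTable *t, int v)
{
	for (int j = 0; j < t->n; j++)
		if (t->vals[j] == v)
			return true;
	return false;
}

/* Returns false on a unique-constraint violation. */
bool
tbl_insert(UniqueTable *t, int v)
{
	if (tbl_contains(t, v) || t->n == CAP)
		return false;
	t->vals[t->n++] = v;
	return true;
}

void
tbl_delete(UniqueTable *t, int v)
{
	for (int j = 0; j < t->n; j++)
		if (t->vals[j] == v)
		{
			t->vals[j] = t->vals[--t->n];
			return;
		}
}

/* Current behaviour: TX-2's delete is applied before TX-1's streamed
 * inserts, so re-inserting key 0 succeeds. */
bool
apply_delete_first(UniqueTable *t)
{
	tbl_delete(t, 0);
	return tbl_insert(t, 0) && tbl_insert(t, 1);
}

/* Parallel apply: the streamed inserts land first and key 0 collides with
 * the pre-existing row 0 copied by initial sync. */
bool
apply_insert_first(UniqueTable *t)
{
	bool		ok = tbl_insert(t, 0) && tbl_insert(t, 1);

	tbl_delete(t, 0);
	return ok;
}
```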
Allowing REINDEX to have an optional name
A minor issue, and patch. REINDEX DATABASE currently requires you to write REINDEX DATABASE dbname, which makes this a little less usable than we might like. REINDEX on the catalog can cause deadlocks, which also makes REINDEX DATABASE not much use in practice, and is the reason there is no test for REINDEX DATABASE. Another reason why it is a little less usable than we might like. Seems we should do something about these historic issues in the name of product usability. Attached patch allows new syntax for REINDEX DATABASE, without needing to specify dbname. That version of the command skips catalog tables, as a way of avoiding the known deadlocks. Patch also adds a test. -- Simon Riggs http://www.EnterpriseDB.com/ reindex_not_require_database_name.v2.patch Description: Binary data
Re: Support logical replication of DDLs
On Tue, May 10, 2022 at 12:32 PM Masahiko Sawada wrote: > > On Wed, Apr 13, 2022 at 6:50 PM Amit Kapila wrote: > > > > On Wed, Apr 13, 2022 at 2:38 PM Dilip Kumar wrote: > > > > > > On Tue, Apr 12, 2022 at 4:25 PM Amit Kapila > > > wrote: > > > > > > > > > The *initial* DDL replication is a different problem than DDL > > > > > replication. The > > > > > former requires a snapshot to read the current catalog data and build > > > > > a CREATE > > > > > command as part of the subscription process. The subsequent DDLs in > > > > > that object > > > > > will be handled by a different approach that is being discussed here. > > > > > > > > > > > > > I think they are not completely independent because of the current way > > > > to do initial sync followed by replication. The initial sync and > > > > replication need some mechanism to ensure that one of those doesn't > > > > overwrite the work done by the other. Now, the initial idea and patch > > > > can be developed separately but I think both the patches have some > > > > dependency. > > > > > > I agree with the point that their design can not be completely > > > independent. They have some logical relationship of what schema will > > > be copied by the initial sync and where is the exact boundary from > > > which we will start sending as replication. And suppose first we only > > > plan to implement the replication part then how the user will know > > > what all schema user has to create and what will be replicated using > > > DDL replication? Suppose the user takes a dump and copies all the > > > schema and then creates the subscription, then how we are we going to > > > handle the DDL concurrent to the subscription command? > > > > > > > Right, I also don't see how it can be done in the current > > implementation. So, I think even if we want to develop these two as > > separate patches they need to be integrated to make the solution > > complete. 
> > It would be better to develop them separately in terms of development > speed but, yes, we perhaps need to integrate them at some points. > > I think that the initial DDL replication can be done when the > relation's state is SUBREL_STATE_INIT. That is, at the very beginning > of the table synchronization, the syncworker copies the table schema > somehow, then starts the initial data copy. After that, syncworker or > applyworker applies DML/DDL changes while catching up and streaming > changes, respectively. Probably we can have it optional whether to > copy schema only, data only, or both. > This sounds okay for copying table schema but we can have other objects like functions, procedures, views, etc. So, we may need altogether a separate mechanism to copy all the published objects. -- With Regards, Amit Kapila.
Re: Hash index build performance tweak from sorting
On Sat, 30 Apr 2022 at 12:12, Amit Kapila wrote: > > Few comments on the patch: > 1. I think it is better to use DatumGetUInt32 to fetch the hash key as > the nearby code is using. > 2. You may want to change the below comment in HSpool > /* > * We sort the hash keys based on the buckets they belong to. Below masks > * are used in _hash_hashkey2bucket to determine the bucket of given hash > * key. > */ Addressed in new patch, v2. On Wed, 4 May 2022 at 11:27, Amit Kapila wrote: > > So, we should go with this unless someone else sees any flaw here. Cool, thanks. -- Simon Riggs http://www.EnterpriseDB.com/ hash_sort_by_hash.v2.patch Description: Binary data
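For readers following along, the bucket mapping referenced in the quoted HSpool comment works roughly like this. A simplified standalone sketch of _hash_hashkey2bucket-style masking (see src/backend/access/hash/hashutil.c) plus a qsort by bucket, which is what makes the index build touch buckets sequentially; this is illustrative, not the actual patch code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Masks/maxbucket kept in globals so the qsort comparator can see them;
 * the real code passes them through the HSpool. */
static uint32_t g_maxbucket;
static uint32_t g_highmask;
static uint32_t g_lowmask;

/* Simplified version of _hash_hashkey2bucket's masking. */
uint32_t
hashkey2bucket(uint32_t hashkey)
{
	uint32_t	bucket = hashkey & g_highmask;

	if (bucket > g_maxbucket)
		bucket = bucket & g_lowmask;
	return bucket;
}

static int
cmp_by_bucket(const void *a, const void *b)
{
	uint32_t	ba = hashkey2bucket(*(const uint32_t *) a);
	uint32_t	bb = hashkey2bucket(*(const uint32_t *) b);

	return (ba > bb) - (ba < bb);
}

/* Sort the keys so tuples destined for the same bucket are adjacent. */
void
sort_hash_keys(uint32_t *keys, size_t n,
			   uint32_t maxbucket, uint32_t highmask, uint32_t lowmask)
{
	g_maxbucket = maxbucket;
	g_highmask = highmask;
	g_lowmask = lowmask;
	qsort(keys, n, sizeof(uint32_t), cmp_by_bucket);
}
```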
Re: [PoC] Improve dead tuple storage for lazy vacuum
On Tue, May 10, 2022 at 8:52 AM Masahiko Sawada wrote: > > Overall, radix tree implementations have good numbers. Once we got an > agreement on moving in this direction, I'll start a new thread for > that and move the implementation further; there are many things to do > and discuss: deletion, API design, SIMD support, more tests etc. +1 (FWIW, I think the current thread is still fine.) -- John Naylor EDB: http://www.enterprisedb.com
Re: Support logical replication of DDLs
On Fri, May 6, 2022 at 11:24 PM Amit Kapila wrote: > As we have hacked CreatePublication function for this POC, the > regression tests are not passing but we can easily change it so that > we invoke new functionality with the syntax proposed in this thread or > with some other syntax and we shall do that in the next patch unless > this approach is not worth pursuing. > > This POC is prepared by Ajin Cherian, Hou-San, and me. > > Thoughts? > > [1] - > https://www.postgresql.org/message-id/20150215044814.GL3391%40alvh.no-ip.org I have updated Amit's patch by including a publication action "create" when creating a publication, which is turned off by default. Now the 'make check' tests pass. I also fixed a problem where creating tables failed when the table has a primary key. regards, Ajin Cherian Fujitsu Australia v2-0001-Fix-race-in-032_relfilenode_reuse.pl.patch Description: Binary data v2-0002-Functions-to-deparse-DDL-commands.patch Description: Binary data
Re: An attempt to avoid locally-committed-but-not-replicated-to-standby-transactions in synchronous replication
> On 10 May 2022, at 12:59, Bharath Rupireddy wrote: > > If okay, I can make the GUC behave this way - value 0 existing > behaviour i.e. no wait for sync repl ack, just process query cancels > and proc die interrupts immediately; value -1 wait unboundedly for the > ack; value > 0 wait for specified milliseconds for the ack. +1 if we make -1 and 0 the only valid values. > query cancels or proc die interrupts Please note that a typical HA tool would need to handle query cancels and proc die interrupts differently. When the network is partitioned and a standby is promoted somewhere, you definitely want an infinite wait for cancels. Yet at some point you will want to shut down postgres without a core dump, so proc die needs to be processed. Thanks! Best regards, Andrey Borodin.
Re: Allowing REINDEX to have an optional name
Hi, On Tuesday, 10 May 2022 at 10:13 +0100, Simon Riggs wrote: > A minor issue, and patch. > > REINDEX DATABASE currently requires you to write REINDEX DATABASE > dbname, which makes this a little less usable than we might like. > > REINDEX on the catalog can cause deadlocks, which also makes REINDEX > DATABASE not much use in practice, and is the reason there is no test > for REINDEX DATABASE. Another reason why it is a little less usable > than we might like. > > Seems we should do something about these historic issues in the name > of product usability. > Wow, I just recently had a look into that code and talked with my colleagues about how the current behavior annoyed me last week, and there you are! This community rocks ;) > Attached patch allows new syntax for REINDEX DATABASE, without > needing > to specify dbname. That version of the command skips catalog tables, > as a way of avoiding the known deadlocks. Patch also adds a test. >

+	/* Unqualified REINDEX DATABASE will skip catalog tables */
+	if (objectKind == REINDEX_OBJECT_DATABASE &&
+		objectName == NULL &&
+		IsSystemClass(relid, classtuple))
+		continue;

Hmm, shouldn't we just print a NOTICE or something like this in addition to this check to tell the user that we are *not* really reindexing all things (and probably give him a hint to use REINDEX SYSTEM to cover them)? Thanks, Bernd
Re: Support logical replication of DDLs
On Tue, May 10, 2022 at 4:59 PM Ajin Cherian wrote: > > On Fri, May 6, 2022 at 11:24 PM Amit Kapila wrote: > > As we have hacked CreatePublication function for this POC, the > > regression tests are not passing but we can easily change it so that > > we invoke new functionality with the syntax proposed in this thread or > > with some other syntax and we shall do that in the next patch unless > > this approach is not worth pursuing. > > > > This POC is prepared by Ajin Cherian, Hou-San, and me. > > > > Thoughts? > > > > [1] - > > https://www.postgresql.org/message-id/20150215044814.GL3391%40alvh.no-ip.org > > I have updated Amit's patch by including a public action "create" when > creating publication which is turned off by default. > Now the 'make check' tests pass. I also fixed a problem that failed to > create tables when the table has a primary key. > Are these the correct set of patches? I don't see a DDL support patch. -- With Regards, Amit Kapila.
How about renaming XLogFileRead() to XLogFileOpenForRead() and XLogFileOpen() to XLogFileOpenForWrite()?
Hi All, Currently, in postgres we have two different functions that are specially used to open the WAL files for reading and writing purposes. The first one is XLogFileOpen() that is just used to open the WAL file so that we can write WAL data in it. And then we have another function named XLogFileRead that does the same thing but is used when reading the WAL files during recovery time. How about renaming the function XLogFileRead to XLogFileOpenForRead and the other one can be renamed to XLogFileOpenForWrite. I think it will make the function names clearer and improve readability. At least XLogFileRead doesn't look good to me; from the function name it appears like we are trying to read a WAL file here, but actually we are opening it so that it can be read by some other routine. Also I see that we are passing emode to the XLogFileRead function which is not being used anywhere in the function, so can we remove it? -- With Regards, Ashutosh Sharma.
Re: How about renaming XLogFileRead() to XLogFileOpenForRead() and XLogFileOpen() to XLogFileOpenForWrite()?
On Tue, May 10, 2022 at 6:16 PM Ashutosh Sharma wrote: > > Hi All, > > Currently, in postgres we have two different functions that are > specially used to open the WAL files for reading and writing purposes. > The first one is XLogFileOpen() that is just used to open the WAL file > so that we can write WAL data in it. And then we have another function > named XLogFileRead that does the same thing but is used when reading > the WAL files during recovery time. How about renaming the function > XLogFileRead to XLogFileOpenForRead and the other one can be renamed > to XLogFileOpenForWrite. I think it will make the function name more > clear and increase the readability. At least XlogFileRead doesn't look > good to me, from the function name it actually appears like we are > trying to read a WAL file here but actually we are opening it so that > it can be read by some other routine. > > Also I see that we are passing emode to the XLogFileRead function > which is not being used anywhere in the function, so can we remove it? Renaming XLogFileOpen to XLogFileOpenForWrite while it uses O_RDWR, not O_WRONLY, is sort of conflicting. Also, I'm concerned that XLogFileOpen is an extern function; the external modules using it might break. XLogFileRead uses O_RDONLY and is a static function, so it might be okay to change the name. The only concern is that it creates a diff with the older versions, as we usually don't backport renaming of functions or variables/code improvements/not-so-critical changes. Having said that, IMHO, the existing functions and their names look fine to me (developers can read the function comments to understand their usage though). Regards, Bharath Rupireddy.
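To make the O_RDWR point concrete, here is a tiny standalone sketch (the file path and function names are invented for the example, not PostgreSQL code): a descriptor opened with O_RDWR can be read back as well as written, so a "ForWrite" suffix would only tell half the story for a function that opens with that flag.

```c
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/* Analogue of XLogFileRead's open: read-only. */
int
wal_file_open_for_read(const char *path)
{
	return open(path, O_RDONLY);
}

/* Analogue of XLogFileOpen's open: O_RDWR, so the same descriptor can
 * still be used for reads, despite the "ForWrite" naming. */
int
wal_file_open_for_write(const char *path)
{
	return open(path, O_RDWR | O_CREAT, 0600);
}
```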
Re: Proposal: add a debug message about using geqo
If we add that information to EXPLAIN output, the user won't need access to server logs. Maybe we need it in both places. On Tue, May 10, 2022 at 6:35 AM KAWAMOTO Masaya wrote: > > Hi, > > During query tuning, users may want to check if GEQO is used or not > to generate a plan. However, users can not know it by simply counting > the number of tables that appear in SQL. I know we can know it by > enabling GEQO_DEBUG flag, but it needs recompiling, so I think it is > inconvenient. > > So, I would like to propose to add a debug level message that shows > when PostgreSQL use GEQO. That enables users to easily see it by > just changing log_min_messages. > > Use cases are as follows: > - When investigating about the result of planning, user can determine > whether the plan is chosen by the standard planning or GEQO. > > - When tuning PostgreSQL, user can determine the suitable value of > geqo_threshold parameter. > > Best regards. > > -- > KAWAMOTO Masaya > SRA OSS, Inc. Japan -- Best Wishes, Ashutosh Bapat
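The planner decision being discussed reduces to a simple predicate; a sketch, simplified from the check in make_rel_from_joinlist (details in the real code may differ), showing the point where the proposed DEBUG message or EXPLAIN annotation would be emitted:

```c
#include <assert.h>
#include <stdbool.h>

/* GEQO kicks in when geqo is enabled and the join list has at least
 * geqo_threshold items; this is where a DEBUG-level "using GEQO" message
 * could be logged so users can tune geqo_threshold. */
bool
use_geqo(bool geqo_enabled, int geqo_threshold, int levels_needed)
{
	return geqo_enabled && levels_needed >= geqo_threshold;
}
```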
Re: psql now shows zero elapsed time after an error
Hello, Thanks for the catch and the proposed fix! Indeed, on errors the timing is not updated appropriately. ISTM that the best course is to update the elapsed time whenever a result is obtained, so that a sensible value is always available. See attached patch which is a variant of Richard's version.

fabien=# SELECT 1 as one \; SELECT 1/0 \; SELECT 2 as two;
┌─────┐
│ one │
├─────┤
│   1 │
└─────┘
(1 row)

ERROR:  division by zero
Time: 0,352 ms

Probably it would be appropriate to add a test case. I'll propose something later.

-- Fabien.

diff --git a/src/bin/psql/common.c b/src/bin/psql/common.c
index feb1d547d4..773673cc62 100644
--- a/src/bin/psql/common.c
+++ b/src/bin/psql/common.c
@@ -1560,6 +1560,16 @@ ExecQueryAndProcessResults(const char *query, double *elapsed_msec, bool *svpt_g
 			else
 				result = PQgetResult(pset.db);
 
+			/*
+			 * Get current timing measure in case an error occurs
+			 */
+			if (timing)
+			{
+				INSTR_TIME_SET_CURRENT(after);
+				INSTR_TIME_SUBTRACT(after, before);
+				*elapsed_msec = INSTR_TIME_GET_MILLISEC(after);
+			}
+
 			continue;
 		}
 		else if (svpt_gone_p && !*svpt_gone_p)
@@ -1612,7 +1622,7 @@ ExecQueryAndProcessResults(const char *query, double *elapsed_msec, bool *svpt_g
 		last = (next_result == NULL);
 
 		/*
-		 * Get timing measure before printing the last result.
+		 * Update current timing measure.
 		 *
 		 * It will include the display of previous results, if any.
 		 * This cannot be helped because the server goes on processing
@@ -1623,7 +1633,7 @@ ExecQueryAndProcessResults(const char *query, double *elapsed_msec, bool *svpt_g
 		 * With combined queries, timing must be understood as an upper bound
 		 * of the time spent processing them.
 		 */
-		if (last && timing)
+		if (timing)
 		{
 			INSTR_TIME_SET_CURRENT(after);
 			INSTR_TIME_SUBTRACT(after, before);
Re: Allowing REINDEX to have an optional name
On Tue, May 10, 2022 at 2:43 PM Simon Riggs wrote: > > A minor issue, and patch. > > REINDEX DATABASE currently requires you to write REINDEX DATABASE > dbname, which makes this a little less usable than we might like. > > REINDEX on the catalog can cause deadlocks, which also makes REINDEX > DATABASE not much use in practice, and is the reason there is no test > for REINDEX DATABASE. Another reason why it is a little less usable > than we might like. > > Seems we should do something about these historic issues in the name > of product usability. > > Attached patch allows new syntax for REINDEX DATABASE, without needing > to specify dbname. That version of the command skips catalog tables, > as a way of avoiding the known deadlocks. Patch also adds a test. > From the patch it looks like, with the patch applied, running REINDEX DATABASE without a database name is equivalent to running REINDEX DATABASE <dbname> except that it skips reindexing the shared catalogs. Is that correct? Though the patch adds the following change + Indexes on shared system catalogs are also processed, unless the + database name is omitted, in which case system catalog indexes are skipped. the syntax looks unintuitive. I think REINDEX DATABASE reindexing the current database is a good usability improvement in itself. But skipping the shared catalogs needs explicit syntax. Not sure how feasible it is, but something like REINDEX DATABASE skip SHARED/SYSTEM. -- Best Wishes, Ashutosh Bapat
Re: JSON constructors and window functions
Hi, On 4/4/22 2:19 PM, Andrew Dunstan wrote: On 4/4/22 12:33, Andres Freund wrote: It can, as Jaime's original post showed. But on further consideration I'm thinking this area needs some rework. ISTM that it might be a whole lot simpler and comprehensible to generate the json first without bothering about null values or duplicate keys and then in the finalizer check for null values to be skipped and duplicate keys. That way we would need to keep far less state for the aggregation functions, although it might be marginally less efficient. Thoughts? This is still on the open items list[1]. Given this is a user-triggerable crash and we are approaching PG15 Beta 1, I wanted to check in and see if there was any additional work required to eliminate the crash, or if the work at this point is just optimization. If the latter, I'd suggest we open up a new open item for it. Thanks, Jonathan [1] https://wiki.postgresql.org/wiki/PostgreSQL_15_Open_Items OpenPGP_signature Description: OpenPGP digital signature
Re: Column Filtering in Logical Replication
Hi, On 4/19/22 12:53 AM, Amit Kapila wrote: On Tue, Apr 19, 2022 at 6:58 AM Masahiko Sawada wrote: +1. I also think it's a bug so back-patching makes sense to me. Pushed. Thanks Tomas and Sawada-San. This is still on the PG15 open items list[1] though marked as with a fix. Did dd4ab6fd resolve the issue, or does this need more work? Thanks, Jonathan [1] https://wiki.postgresql.org/wiki/PostgreSQL_15_Open_Items OpenPGP_signature Description: OpenPGP digital signature
Re: Allowing REINDEX to have an optional name
On Tue, 10 May 2022 at 14:47, Ashutosh Bapat wrote: > > On Tue, May 10, 2022 at 2:43 PM Simon Riggs > wrote: > > > > A minor issue, and patch. > > > > REINDEX DATABASE currently requires you to write REINDEX DATABASE > > dbname, which makes this a little less usable than we might like. > > > > REINDEX on the catalog can cause deadlocks, which also makes REINDEX > > DATABASE not much use in practice, and is the reason there is no test > > for REINDEX DATABASE. Another reason why it is a little less usable > > than we might like. > > > > Seems we should do something about these historic issues in the name > > of product usability. > > > > Attached patch allows new syntax for REINDEX DATABASE, without needing > > to specify dbname. That version of the command skips catalog tables, > > as a way of avoiding the known deadlocks. Patch also adds a test. > > > > From the patch it looks like with the patch applied running REINDEX > DATABASE is equivalent to running REINDEX DATABASE <dbname> > except reindexing the shared catalogs. Is that correct? Yes > Though the patch adds the following change > + Indexes on shared system catalogs are also processed, unless the > + database name is omitted, in which case system catalog indexes > are skipped. > > the syntax looks unintuitive. > > I think REINDEX DATABASE reindexing the current database is a good > usability improvement in itself. But skipping the shared catalogs > needs explicit syntax. Not sure how feasible it is but something > like REINDEX DATABASE skip SHARED/SYSTEM. There are two commands:

REINDEX DATABASE does everything except the system catalogs
REINDEX SYSTEM does the system catalogs only

So taken together, the two commands seem intuitive to me. It is designed like this because it is dangerous to REINDEX the system catalogs because of potential deadlocks, so we want a way to avoid that problem. Perhaps I can improve the docs more, will look. -- Simon Riggs http://www.EnterpriseDB.com/
Re: Asynchronous and "direct" IO support for PostgreSQL.
Hi Andres, > > The code is at > > https://github.com/anarazel/postgres/tree/aio > > Just FYI the cfbot says that this version of the patchset doesn't > apply anymore, and it seems that your branch was only rebased to > 43c1c4f (Sept. 21st) which doesn't rebase cleanly: After watching your recent talk "IO in PostgreSQL: Past, Present, Future" [1] I decided to invest some of my time into this patchset. It looks like at the very least it could use a reviewer, or maybe two :) Unfortunately, it's a bit difficult to work with the patchset at the moment. Any chance we may expect a rebased version for the July CF? > Comments? Questions? Personally, I'm very enthusiastic about this patchset. However, a set of 39 patches seems to be unrealistic to test and/or review and/or keep up to date. The 64 bit XIDs patchset [2] is much less complicated, but still it got the feedback that it should be split into more patches and CF entries. Any chance we could decompose this effort? For instance, I doubt that we need all the backends in the first implementation. The fallback "worker" one and the io_uring one will suffice. Other backends can be added as separate features. Considering that in any case the "worker" backend shouldn't cause any significant performance degradation, maybe we could start even without io_uring. BTW, do we need Posix AIO at all, given your feedback on this API? Also, what if we migrate to AIO/DIO one part of the system at a time? As I understood from your talk, sequential scans will benefit most from AIO/DIO. Will it be possible to improve them first, while the rest of the system continues using buffered IO? [1]: https://www.youtube.com/watch?v=3Oj7fBAqVTw [2]: https://commitfest.postgresql.org/38/3594/ -- Best regards, Aleksander Alekseev
Re: How about renaming XLogFileRead() to XLogFileOpenForRead() and XLogFileOpen() to XLogFileOpenForWrite()?
On Tue, May 10, 2022 at 6:46 PM Bharath Rupireddy wrote: > > On Tue, May 10, 2022 at 6:16 PM Ashutosh Sharma wrote: > > > > Hi All, > > > > Currently, in postgres we have two different functions that are > > specially used to open the WAL files for reading and writing purposes. > > The first one is XLogFileOpen() that is just used to open the WAL file > > so that we can write WAL data in it. And then we have another function > > named XLogFileRead that does the same thing but is used when reading > > the WAL files during recovery time. How about renaming the function > > XLogFileRead to XLogFileOpenForRead and the other one can be renamed > > to XLogFileOpenForWrite. I think it will make the function name more > > clear and increase the readability. At least XlogFileRead doesn't look > > good to me, from the function name it actually appears like we are > > trying to read a WAL file here but actually we are opening it so that > > it can be read by some other routine. > > > > Also I see that we are passing emode to the XLogFileRead function > > which is not being used anywhere in the function, so can we remove it? > > Renaming XLogFileOpen to XLogFileOpenForWrite while it uses O_RDWR, > not O_WRONLY, is sort of conflicting. Also, I'm concerned that > XLogFileOpen is an extern function, the external modules using it > might break. Why would the external modules open WAL files to perform write operations? AFAIU from the code, this function is specifically written to open WAL files during the WAL write operation. So I don't see any problem in renaming it to XLogFileOpenForWrite(). In fact, as I said, it does increase the readability. And likewise, XLogFileRead can be renamed to XLogFileOpenForRead, which you seem to be okay with. -- With Regards, Ashutosh Sharma.
Re: JSON constructors and window functions
On 2022-05-10 Tu 09:51, Jonathan S. Katz wrote: > Hi, > > On 4/4/22 2:19 PM, Andrew Dunstan wrote: >> >> On 4/4/22 12:33, Andres Freund wrote: > >> It can, as Jaime's original post showed. >> >> But on further consideration I'm thinking this area needs some rework. >> ISTM that it might be a whole lot simpler and comprehensible to generate >> the json first without bothering about null values or duplicate keys and >> then in the finalizer check for null values to be skipped and duplicate >> keys. That way we would need to keep far less state for the aggregation >> functions, although it might be marginally less efficient. Thoughts? > > This is still on the open items list[1]. Given this is a > user-triggerable crash and we are approaching PG15 Beta 1, I wanted to > check in and see if there was any additional work required to > eliminate the crash, or if the work at this point is just optimization. > > If the latter, I'd suggest we open up a new open item for it. > > Thanks, > > Jonathan > > [1] https://wiki.postgresql.org/wiki/PostgreSQL_15_Open_Items I believe all the issues here have been fixed. See commit 112fdb3528 cheers andrew -- Andrew Dunstan EDB: https://www.enterprisedb.com
Re: JSON constructors and window functions
On 5/10/22 10:25 AM, Andrew Dunstan wrote: I believe all the issues here have been fixed. See commit 112fdb3528 Thanks! I have updated Open Items. Jonathan OpenPGP_signature Description: OpenPGP digital signature
First draft of the PG 15 release notes
I have completed the first draft of the PG 15 release notes and you can see the results here:

https://momjian.us/pgsql_docs/release-15.html

The feature count is similar to recent major releases:

	release-10  195
	release-11  185
	release-12  198
	release-13  183
	release-14  229
	--> release-15  186

I assume there will be major adjustments in the next few weeks based on feedback. -- Bruce Momjian https://momjian.us EDB https://enterprisedb.com Indecision is a decision. Inaction is an action. Mark Batterson
Re: Support logical replication of DDLs
> > > I agree that it adds to our maintainability effort, like every time we > > > enhance any DDL or add a new DDL that needs to be replicated, we > > > probably need to change the deparsing code. But OTOH this approach > > > seems to provide better flexibility. So, in the long run, maybe the > > > effort is worthwhile. I am not completely sure at this stage which > > > approach is better but I thought it is worth evaluating this approach > > > as Alvaro and Robert seem to prefer this idea. > > > > +1, IMHO with deparsing logic it would be easy to handle the mixed DDL > > commands like ALTER TABLE REWRITE. But the only thing is that we will > > have to write the deparsing code for all the utility commands so there > > will be a huge amount of new code to maintain. > > Actually, the largest stumbling block on this, IMO, is having a good > mechanism to verify that all DDLs are supported. The deparsing code > itself is typically not very bad, as long as we have a sure way to twist > every contributor's hand into writing it, which is what an automated > test mechanism would give us. > > The code in test_ddl_deparse is a pretty lame start, not nearly good > enough by a thousand miles. My real intention was to have a test > harness that would first run a special SQL script to install DDL > capture, then run all the regular src/test/regress scripts, and then at > the end ensure that all the DDL scripts were properly reproduced -- for > example transmit them to another database, replay them there, and dump > both databases and compare them. However, there were challenges which I > no longer remember and we were unable to complete this, and we are where > we are. > > Thanks for rebasing that old code, BTW. I agree that deparsing could be a very useful utility on its own. Not only for SQL command replication between PostgreSQL servers, but also potentially feasible for SQL command replication between PostgreSQL and other database systems.
For example, one could assemble the json representation of the SQL parse tree back to a SQL command that can be run in MySQL. But that requires different assembling rules and code for different database systems. If we're envisioning this kind of flexibility that the deparsing utility can offer, then I think it's better to develop the deparsing utility as an extension itself. Regards, Zheng
Re: First draft of the PG 15 release notes
I'm guessing this should be "trailing", not training? > Prevent numeric literals from having non-numeric training characters (Peter > Eisentraut)
Re: make MaxBackends available in _PG_init
On Tue, May 10, 2022 at 05:55:12PM +0900, Michael Paquier wrote: > I agree that removing support for the unloading part would be nice to > clean up now on HEAD. Note that 0002 is missing the removal of one > reference to _PG_fini in xfunc.sgml ( markup). Oops, sorry about that. -- Nathan Bossart Amazon Web Services: https://aws.amazon.com

From e19bc075c5e98b655afa6ecc183c7de7adf217a8 Mon Sep 17 00:00:00 2001
From: Nathan Bossart
Date: Mon, 18 Apr 2022 15:25:37 -0700
Subject: [PATCH v7 1/2] Add a new shmem_request_hook hook.

Currently, preloaded libraries are expected to request additional shared memory and LWLocks in _PG_init(). However, it is not unusual for such requests to depend on MaxBackends, which won't be initialized at that time. Such requests could also depend on GUCs that other modules might change. This introduces a new hook where modules can safely use MaxBackends and GUCs to request additional shared memory and LWLocks.

Furthermore, this change restricts requests for shared memory and LWLocks to this hook. Previously, libraries could make requests until the size of the main shared memory segment was calculated. Unlike before, we no longer silently ignore requests received at invalid times. Instead, we FATAL if someone tries to request additional shared memory or LWLocks outside of the hook.
Authors: Julien Rouhaud, Nathan Bossart
Discussion: https://postgr.es/m/20220412210112.GA2065815%40nathanxps13
---
 contrib/pg_prewarm/autoprewarm.c              | 17 +++-
 .../pg_stat_statements/pg_stat_statements.c   | 27 +--
 src/backend/postmaster/postmaster.c           |  5
 src/backend/storage/ipc/ipci.c                | 20 +-
 src/backend/storage/lmgr/lwlock.c             | 18 -
 src/backend/utils/init/miscinit.c             | 15 +++
 src/include/miscadmin.h                       |  5
 src/tools/pgindent/typedefs.list              |  1 +
 8 files changed, 72 insertions(+), 36 deletions(-)

diff --git a/contrib/pg_prewarm/autoprewarm.c b/contrib/pg_prewarm/autoprewarm.c
index 45e012a63a..c0c4f5d9ca 100644
--- a/contrib/pg_prewarm/autoprewarm.c
+++ b/contrib/pg_prewarm/autoprewarm.c
@@ -96,6 +96,8 @@ static void apw_start_database_worker(void);
 static bool apw_init_shmem(void);
 static void apw_detach_shmem(int code, Datum arg);
 static int	apw_compare_blockinfo(const void *p, const void *q);
+static void autoprewarm_shmem_request(void);
+static shmem_request_hook_type prev_shmem_request_hook = NULL;
 
 /* Pointer to shared-memory state. */
 static AutoPrewarmSharedState *apw_state = NULL;
@@ -139,13 +141,26 @@ _PG_init(void)
 	MarkGUCPrefixReserved("pg_prewarm");
 
-	RequestAddinShmemSpace(MAXALIGN(sizeof(AutoPrewarmSharedState)));
+	prev_shmem_request_hook = shmem_request_hook;
+	shmem_request_hook = autoprewarm_shmem_request;
 
 	/* Register autoprewarm worker, if enabled. */
 	if (autoprewarm)
 		apw_start_leader_worker();
 }
 
+/*
+ * Requests any additional shared memory required for autoprewarm.
+ */
+static void
+autoprewarm_shmem_request(void)
+{
+	if (prev_shmem_request_hook)
+		prev_shmem_request_hook();
+
+	RequestAddinShmemSpace(MAXALIGN(sizeof(AutoPrewarmSharedState)));
+}
+
 /*
  * Main entry point for the leader autoprewarm process.  Per-database workers
  * have a separate entry point.
diff --git a/contrib/pg_stat_statements/pg_stat_statements.c b/contrib/pg_stat_statements/pg_stat_statements.c
index df2ce63790..87b75d779e 100644
--- a/contrib/pg_stat_statements/pg_stat_statements.c
+++ b/contrib/pg_stat_statements/pg_stat_statements.c
@@ -252,6 +252,7 @@ static int	exec_nested_level = 0;
 static int	plan_nested_level = 0;
 
 /* Saved hook values in case of unload */
+static shmem_request_hook_type prev_shmem_request_hook = NULL;
 static shmem_startup_hook_type prev_shmem_startup_hook = NULL;
 static post_parse_analyze_hook_type prev_post_parse_analyze_hook = NULL;
 static planner_hook_type prev_planner_hook = NULL;
@@ -317,6 +318,7 @@ PG_FUNCTION_INFO_V1(pg_stat_statements_1_10);
 PG_FUNCTION_INFO_V1(pg_stat_statements);
 PG_FUNCTION_INFO_V1(pg_stat_statements_info);
 
+static void pgss_shmem_request(void);
 static void pgss_shmem_startup(void);
 static void pgss_shmem_shutdown(int code, Datum arg);
 static void pgss_post_parse_analyze(ParseState *pstate, Query *query,
@@ -452,17 +454,11 @@ _PG_init(void)
 	MarkGUCPrefixReserved("pg_stat_statements");
 
-	/*
-	 * Request additional shared resources.  (These are no-ops if we're not in
-	 * the postmaster process.)  We'll allocate or attach to the shared
-	 * resources in pgss_shmem_startup().
-	 */
-	RequestAddinShmemSpace(pgss_memsize());
-	RequestNamedLWLockTranche("pg_stat_statements", 1);
-
 	/*
 	 * Install hooks.
 	 */
+	prev_shmem_request_hook = shmem_request_hook;
+	shmem_request_hook = pgss_shmem_request;
 	prev_shmem_startup_hook = shmem_startup_hook;
 	shmem_startup_hook = pgss_shmem_startup;
 	prev_post_parse_analyze_hook = post_parse_analy
Re: Estimating HugePages Requirements?
On Mon, May 09, 2022 at 03:53:24PM +0900, Michael Paquier wrote: > I have looked at the patch posted at [1], and I don't quite understand > why you need the extra dance with log_min_messages. Why don't you > just set the GUC at the end of the code path in PostmasterMain() where > we print non-runtime-computed parameters? The log_min_messages dance avoids extra output when inspecting non-runtime-computed GUCs, like this:

~/pgdata$ postgres -D . -C log_min_messages -c log_min_messages=debug5
debug5
2022-05-10 09:06:04.728 PDT [3715607] DEBUG:  shmem_exit(0): 0 before_shmem_exit callbacks to make
2022-05-10 09:06:04.728 PDT [3715607] DEBUG:  shmem_exit(0): 0 on_shmem_exit callbacks to make
2022-05-10 09:06:04.728 PDT [3715607] DEBUG:  proc_exit(0): 0 callbacks to make
2022-05-10 09:06:04.728 PDT [3715607] DEBUG:  exit(0)

AFAICT you need to set log_min_messages to at least DEBUG3 to see extra output for the non-runtime-computed GUCs, so it might not be worth the added complexity. > I am not really worrying > about users deciding to set log_min_messages to PANIC in > postgresql.conf when it comes to postgres -C, TBH, as they'd miss the > FATAL messages if the command is attempted on a server already > starting. I don't have a strong opinion on this one. -- Nathan Bossart Amazon Web Services: https://aws.amazon.com
Re: First draft of the PG 15 release notes
On Tue, May 10, 2022 at 04:58:50PM +0100, Geoff Winkless wrote: > I'm guessing this should be "trailing", not training? > > > Prevent numeric literals from having non-numeric training characters (Peter > > Eisentraut) Thanks, fixed. -- Bruce Momjian https://momjian.us EDB https://enterprisedb.com Indecision is a decision. Inaction is an action. Mark Batterson
Re: First draft of the PG 15 release notes
On 10-05-2022 at 17:44, Bruce Momjian wrote:

I have completed the first draft of the PG 15 release notes and you can see the results here: https://momjian.us/pgsql_docs/release-15.html

typos:

'accept empty array' should be
'to accept empty array'

'super users' should be
'superusers'
  (several times)

'The new options causes the column names'
'The new options cause the column names'

'were always effected.'
'were always affected.'
  (I think...)

'Previous the actual schema'
'Previously the actual schema'

'inforcement' should be
'enforcement'
  (surely?)

'server slide' should be
'server side'

'Add extensions to define their own' should be
'Allow extensions to define their own'

And one strangely unfinished sentence:
'They also can only be'

Erik Rijkers
Re: First draft of the PG 15 release notes
Thanks for writing this. Comments from my first read:

| This mode could cause server startup failure if the database server stopped abruptly while in this mode.

This sentence both begins and ends with "this mode".

| This allows query hash operations to use double the amount of work_mem memory as other operations.

Should it say "node" ?

| Allow tsvector_delete_arr() and tsvector_setweight_by_filter() accept empty array elements (Jean-Christophe Arnu)

TO accept

I don't think this should be in the "compatibility" section?

| This accepts numeric formats like ".1" and "1.", and disallow trailing junk after numeric literals, like "1.type()".

disallows with an ess

| This will cause setseed() followed by random() to return a different value on older servers.

*than* older servers

| The Postgres default has always been to treat NULL indexed values as distinct, but this can now be changed by creating constraints and indexes using UNIQUE NULLS NOT DISTINCT.

should not say default, since it wasn't previously configurable. "The previous behavior was ..."

| Have extended statistics track statistics for a table's children separately (Tomas Vondra, Justin Pryzby)
| Regular statistics already tracked child and non-child statistics separately.

"separately" seems vague. Currently in v10-v13, extended stats are collected for partitioned tables, and collected for the "SELECT FROM ONLY"/parent case for inheritance parents. This changes to also collect stats for the "SELECT FROM tbl*" case. See also: 20220204214103.gt23...@telsasoft.com.

| Allow members of the pg_checkpointer predefined role to run the CHECKPOINT command (Jeff Davis)
| Previously these views could only be run by super users.
| Allow members of the pg_read_all_stats predefined role to access the views pg_backend_memory_contexts and pg_shmem_allocations (Bharath Rupireddy)
| Previously these views could only be run by super users.
checkpoint is not a view (and views cannot be run)

| Previously runtime-computed values data_checksums, wal_segment_size, and data_directory_mode would report values that would not be accurate on the running server. They also can only be

be what ?

| Add server variable allow_in_place_tablespaces for tablespace testing (Thomas Munro)

This is a developer setting, so doesn't need to be mentioned ?

| Add function pg_settings_get_flags() to get the flags of server-side variables (Justin Pryzby)

IMO this is the same, but I think maybe Michael thinks about it differently...

| Allow WAL full page writes to use LZ4 and ZSTD compression (Andrey Borodin, Justin Pryzby)
| Add support for LZ4 and ZSTD compression of server-side base backups (Jeevan Ladhe, Robert Haas)
| Allow pg_basebackup to decompress LZ4 and ZSTD compressed server-side base backups, and LZ4 and ZSTD compress output files (Dipesh Pandit, Jeevan Ladhe)
| Add configure option --with-zstd to enable ZSTD build (Jeevan Ladhe, Robert Haas, Michael Paquier)

Maybe these should say "Zstandard" ? See 586955dddecc95e0003262a3954ae83b68ce0372.

| The new options causes the column names to be output, and optionally verified on input.

option

| Previous the actual schema name was used.

Previously

| When EXPLAIN references the temporary object schema, refer to it as "pg_temp" (Amul Sul)
| When specifying fractional interval values in units greater than months, round to the nearest month (Bruce Momjian)
| Limit support of psql to servers running PostgreSQL 9.2 and later (Tom Lane)
| Limit support of pg_dump and pg_dumpall to servers running PostgreSQL 9.2 and later (Tom Lane)
| Limit support of pg_upgrade to old servers running PostgreSQL 9.2 and later (Tom Lane)
| Remove server support for old BASE_BACKUP command syntax and base backup protocol (Robert Haas)

Do these need to be in the "compatibility" section ?
| Fix inforcement of PL/pgSQL variable CONSTANT markings (Tom Lane)

enforcement

| Allow IP address matching against a server's certificate Subject Alternative Name (Jacob Champion)

Should say "server certificate's" ?

| Allow libpq's SSL private to be owned by the root user (David Steele)

private *key*

| Have psql output all output if multiple queries are passed to the server at once (Fabien Coelho)

all *results* ?

| This can be disabled setting SHOW_ALL_RESULTS.

disabled *by* setting

| Allow pg_basebackup's --compress option to control the compression method (Michael Paquier, Robert Haas)

Should probably say "compression method and options"

| Allow pg_basebackup to decompress LZ4 and ZSTD compressed server-side base backups, and LZ4 and ZSTD compress output files (Dipesh Pandit, Jeevan Ladhe)
| Allow pg_basebackup to compress on the server slide and decompress on the client side before storage (Dipesh Pandit)

Maybe these should be combined into one entry ?

| Add the LZ4 compression method to pg_receivewal (Georgios Kokolatos)
| This is enabled via --compression-method=lz4 an
Re: bogus: logical replication rows/cols combinations
On 5/9/22 05:45, Amit Kapila wrote: > On Sun, May 8, 2022 at 11:41 PM Tomas Vondra > wrote: >> >> On 5/7/22 07:36, Amit Kapila wrote: >>> On Mon, May 2, 2022 at 6:11 PM Alvaro Herrera >>> wrote: On 2022-May-02, Amit Kapila wrote: > I think it is possible to expose a list of publications for each > walsender as it is stored in each walsenders > LogicalDecodingContext->output_plugin_private. AFAIK, each walsender > can have one such LogicalDecodingContext and we can probably share it > via shared memory? I guess we need to create a DSM each time a walsender opens a connection, at START_REPLICATION time. Then ALTER PUBLICATION needs to connect to all DSMs of all running walsenders and see if they are reading from it. Is that what you have in mind? Alternatively, we could have one DSM per publication with a PID array of all walsenders that are sending it (each walsender needs to add its PID as it starts). The latter might be better. >>> >>> While thinking about using DSM here, I came across one of your commits >>> f2f9fcb303 which seems to indicate that it is not a good idea to rely >>> on it but I think you have changed dynamic shared memory to fixed >>> shared memory usage because that was more suitable rather than DSM is >>> not portable. Because I see a commit bcbd940806 where we have removed >>> the 'none' option of dynamic_shared_memory_type. So, I think it should >>> be okay to use DSM in this context. What do you think? >>> >> >> Why would any of this be needed? >> >> ALTER PUBLICATION will invalidate the RelationSyncEntry entries in all >> walsenders, no? So AFAICS it should be enough to enforce the limitations >> in get_rel_sync_entry, >> > > Yes, that should be sufficient to enforce limitations in > get_rel_sync_entry() but it will lead to the following behavior: > a. The Alter Publication command will be successful but later in the > logs, the error will be logged and the user needs to check it and take > appropriate action. 
Till that time the walsender will be in an error > loop which means it will restart and again lead to the same error till > the user takes some action. > b. As we use historic snapshots, even after the user takes action, > say by changing the publication, it won't be reflected. So, the option for > the user would be to drop their subscription. > > Am I missing something? If not, then are we okay with such behavior? > If yes, then I think it would be much easier implementation-wise and > probably advisable at this point. We can document it so that users are > careful and can take necessary action if they get into such a > situation. Anyway, we can improve this in the future as you also suggested > earlier. > >> which is necessary anyway because the subscriber >> may not be connected when ALTER PUBLICATION gets executed. >> > > If we are not okay with the resultant behavior of detecting this in > get_rel_sync_entry(), then we can solve this in some other way as > Alvaro has indicated in one of his responses, which is to detect that > at start replication time, probably on the subscriber side. > IMO that behavior is acceptable. We have to do that check anyway, and the subscription may start failing after ALTER PUBLICATION for a number of other reasons anyway so the user needs/should check the logs. And if needed, we can improve this and start doing the proactive checks during ALTER PUBLICATION too. regards -- Tomas Vondra EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: Column Filtering in Logical Replication
On 5/10/22 15:55, Jonathan S. Katz wrote: > Hi, > > On 4/19/22 12:53 AM, Amit Kapila wrote: >> On Tue, Apr 19, 2022 at 6:58 AM Masahiko Sawada >> wrote: >>> >>> +1. I also think it's a bug so back-patching makes sense to me. >>> >> >> Pushed. Thanks Tomas and Sawada-San. > > This is still on the PG15 open items list[1] though marked as with a fix. > > Did dd4ab6fd resolve the issue, or does this need more work? > I believe that's fixed, the buildfarm does not seem to show any relevant failures in subscriptionCheck since dd4ab6fd got committed. regards -- Tomas Vondra EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: Column Filtering in Logical Replication
On 5/10/22 3:17 PM, Tomas Vondra wrote: On 5/10/22 15:55, Jonathan S. Katz wrote: Hi, On 4/19/22 12:53 AM, Amit Kapila wrote: On Tue, Apr 19, 2022 at 6:58 AM Masahiko Sawada wrote: +1. I also think it's a bug so back-patching makes sense to me. Pushed. Thanks Tomas and Sawada-San. This is still on the PG15 open items list[1] though marked as with a fix. Did dd4ab6fd resolve the issue, or does this need more work? I believe that's fixed, the buildfarm does not seem to show any relevant failures in subscriptionCheck since dd4ab6fd got committed. Great. I'm moving it off of open items. Thanks for confirming! Jonathan OpenPGP_signature Description: OpenPGP digital signature
Re: First draft of the PG 15 release notes
On Tue, May 10, 2022 at 08:07:45PM +0200, Erik Rijkers wrote: > And one strangely unfinished sentence: > 'They also can only be' I suspect this meant to highlight that "postgres -C" with runtime-computed GUCs doesn't work if the server is running. -- Nathan Bossart Amazon Web Services: https://aws.amazon.com
Re: First draft of the PG 15 release notes
Agreed on all of these, and URL contents updated. Thanks. --- On Tue, May 10, 2022 at 08:07:45PM +0200, Erik Rijkers wrote: > Op 10-05-2022 om 17:44 schreef Bruce Momjian: > > I have completed the first draft of the PG 15 release notes and you can > > see the results here: > > > > https://momjian.us/pgsql_docs/release-15.html > > typos: > > 'accept empty array' should be > 'to accept empty array' > > 'super users' should be > 'superusers' >(several times) > > 'The new options causes the column names' > 'The new options cause the column names' > > 'were always effected.' > 'were always affected.' >(I think...) > > 'Previous the actual schema' > 'Previously the actual schema' > > 'inforcement' should be > 'enforcement' >(surely?) > > 'server slide' should be > 'server side' > > 'Add extensions to define their own' should be > 'Allow extensions to define their own' > > > And one strangely unfinished sentence: > 'They also can only be' > > > Erik Rijkers > > -- Bruce Momjian https://momjian.us EDB https://enterprisedb.com Indecision is a decision. Inaction is an action. Mark Batterson
Re: First draft of the PG 15 release notes
On 5/10/22 11:44 AM, Bruce Momjian wrote: I have completed the first draft of the PG 15 release notes and you can see the results here: https://momjian.us/pgsql_docs/release-15.html Thanks for pulling this together. + Allow logical replication to transfer sequence changes I believe this was reverted in 2c7ea57e5, unless some other parts of this work made it in. Thanks, Jonathan
Re: First draft of the PG 15 release notes
On Tue, May 10, 2022 at 01:09:35PM -0500, Justin Pryzby wrote: > Thanks for writting this. > > Comments from my first read: > > | This mode could cause server startup failure if the database server stopped > abruptly while in this mode. > > This sentence both begins and ends with "this mode". Agreed, I rewrote this. > | This allows query hash operations to use double the amount of work_mem > memory as other operations. > > Should it say "node" ? Uh, I think users think of things like operations, e.g. sort operation vs sort node. > | Allow tsvector_delete_arr() and tsvector_setweight_by_filter() accept > empty array elements (Jean-Christophe Arnu) > > TO accept > I don't think this should be in the "compatibility" section? Yes, moved to Function. > | This accepts numeric formats like ".1" and "1.", and disallow trailing > junk after numeric literals, like "1.type()". > > disallows with an ess Fixed. > | This will cause setseed() followed by random() to return a different value > on older servers. > > *than* older servers Fixed. > | The Postgres default has always been to treat NULL indexed values as > distinct, but this can now be changed by creating constraints and indexes > using UNIQUE NULLS NOT DISTINCT. > > should not say default, since it wasn't previously configurable. > "The previous behavior was ..." Agreed, reworded. > | Have extended statistics track statistics for a table's children separately > (Tomas Vondra, Justin Pryzby) > | Regular statistics already tracked child and non-child statistics > separately. > > "separately" seems vague. Currently in v10-v13, extended stats are collected > for partitioned tables, and collected for the "SELECT FROM ONLY"/parent case > for inheritance parents. This changes to also collect stats for the "SELECT > FROM tbl*" case. See also: 20220204214103.gt23...@telsasoft.com. Agreed, reworded. Can you check you like my new wording? 
> | Allow members of the pg_checkpointer predefined role to run the CHECKPOINT > command (Jeff Davis) > | Previously these views could only be run by super users. > | Allow members of the pg_read_all_stats predefined role to access the views > pg_backend_memory_contexts and pg_shmem_allocations (Bharath Rupireddy) > | Previously these views could only be run by super users. > > checkpoint is not a view (and views cannot be run) Fixed, was copy/paste error. > | Previously runtime-computed values data_checksums, wal_segment_size, and > data_directory_mode would report values that would not be accurate on the > running server. They also can only be > > be what ? Removed. > | Add server variable allow_in_place_tablespaces for tablespace testing > (Thomas Munro) > > This is a developer setting, so doesn't need to be mentioned ? Moved to Source Code. > | Add function pg_settings_get_flags() to get the flags of server-side > variables (Justin Pryzby) > > IMO this is the same, but I think maybe Michael things about it differently... Uh, I thought it might have user value as well as developer value. > | Allow WAL full page writes to use LZ4 and ZSTD compression (Andrey Borodin, > Justin Pryzby) > | Add support for LZ4 and ZSTD compression of server-side base backups > (Jeevan Ladhe, Robert Haas) > | Allow pg_basebackup to decompress LZ4 and ZSTD compressed server-side base > backups, and LZ4 and ZSTD compress output files (Dipesh Pandit, Jeevan Ladhe) > | Add configure option --with-zstd to enable ZSTD build (Jeevan Ladhe, Robert > Haas, Michael Paquier) > > Maybe these should say "Zstandard" ? See > 586955dddecc95e0003262a3954ae83b68ce0372. I wasn't aware that ZSTD stood for that, so updated. > | The new options causes the column names to be output, and optionally > verified on input. > > option Fixed. > > | Previous the actual schema name was used. > > Previously Fixed.
> | When EXPLAIN references the temporary object schema, refer to it as > "pg_temp" (Amul Sul) > | When specifying fractional interval values in units greater than months, > round to the nearest month (Bruce Momjian) > | Limit support of psql to servers running PostgreSQL 9.2 and later (Tom > Lane) > | Limit support of pg_dump and pg_dumpall to servers running PostgreSQL 9.2 > and later (Tom Lane) > | Limit support of pg_upgrade to old servers running PostgreSQL 9.2 and later > (Tom Lane) > | Remove server support for old BASE_BACKUP command syntax and base backup > protocol (Robert Haas) > > Do these need to be in the "compatibility" section ? Uh, I think of compatibility as breakage, while removing support for something doesn't seem like breakage. The protocol removal of BASE_BACKUP only relates to people writing tools, I thought, so no breakage for non-internals users. I didn't think the fractional interval change would be a breakage, though maybe it is. I didn't think EXPLAIN changes were user-parsed, so no breakage? > | Fix inforcement of PL/pgSQL variable CONSTANT marking
Re: First draft of the PG 15 release notes
On Tue, May 10, 2022 at 12:44:56PM -0700, Nathan Bossart wrote: > On Tue, May 10, 2022 at 08:07:45PM +0200, Erik Rijkers wrote: > > And one strangely unfinished sentence: > > 'They also can only be' > > I suspect this meant to highlight that "postgres -C" with runtime-computed > GUCs doesn't work if the server is running. Yes, you are correct --- wording updated. -- Bruce Momjian https://momjian.us EDB https://enterprisedb.com Indecision is a decision. Inaction is an action. Mark Batterson
Re: First draft of the PG 15 release notes
On Tue, May 10, 2022 at 04:17:59PM -0400, Jonathan Katz wrote: > On 5/10/22 11:44 AM, Bruce Momjian wrote: > > I have completed the first draft of the PG 15 release notes and you can > > see the results here: > > > > https://momjian.us/pgsql_docs/release-15.html > > Thanks for pulling this together. > > + Allow logical replication to transfer sequence changes > > I believe this was reverted in 2c7ea57e5, unless some other parts of this > work made it in. Yes, sorry, I missed that. Oddly, the unlogged sequence patch was retained, even though there is no value for it on the primary. I removed the sentence that mentioned that benefit from the release notes since it doesn't apply to PG 15 anymore. -- Bruce Momjian https://momjian.us EDB https://enterprisedb.com Indecision is a decision. Inaction is an action. Mark Batterson
Re: First draft of the PG 15 release notes
On Tue, May 10, 2022 at 04:18:10PM -0400, Bruce Momjian wrote: > > "separately" seems vague. Currently in v10-v13, extended stats are > > collected > > for partitioned tables, and collected for the "SELECT FROM ONLY"/parent case > > for inheritance parents. This changes to also collect stats for the "SELECT > > FROM tbl*" case. See also: 20220204214103.gt23...@telsasoft.com. > > Agreed, reworded. Can you check you like my new wording? Now it says: | Allow extended statistics to record statistics for a parent with all it children (Tomas Vondra, Justin Pryzby) it should say "its" children. > > | Add function pg_settings_get_flags() to get the flags of server-side > > variables (Justin Pryzby) > > > > IMO this is the same, but I think maybe Michael things about it > > differently... > > Uh, I thought it might hvae user value as well as developer. The list of flags it includes is defined as "the flags we needed to deprecate ./check_guc", but it could conceivably be useful to someone else... But it seems more like allow_in_place_tablespaces; it could go into the "source code" section, or be removed. > > | When EXPLAIN references the temporary object schema, refer to it as > > "pg_temp" (Amul Sul) > > | When specifying fractional interval values in units greater than months, > > round to the nearest month (Bruce Momjian) > > | Limit support of psql to servers running PostgreSQL 9.2 and later (Tom > > Lane) > > | Limit support of pg_dump and pg_dumpall to servers running PostgreSQL 9.2 > > and later (Tom Lane) > > | Limit support of pg_upgrade to old servers running PostgreSQL 9.2 and > > later (Tom Lane) > > | Remove server support for old BASE_BACKUP command syntax and base backup > > protocol (Robert Haas) > > > > Do these need to be in the "compatibility" section ? > > Uh, I think of compatibility as breakage, while removing support for > something doesn't seem like breakage. 
I think removing support which breaks a user-facing behavior is presumptively a compatibility issue. > I didn't think EXPLAIN changes were user-parsed, so no breakage? Why would we have explain(format json/xml) if it wasn't meant to be parsed ? At one point I was parsing its xml. I'll let others comment about the rest of the list. > > | Automatically export server variables using PGDLLIMPORT on Windows > > (Robert Haas) > > > > I don't think it's "automatic" ? > > Yes, reworded. Maybe it's a tiny bit better to say: | Export all server variables on Windows using PGDLLIMPORT (Robert Haas) (Otherwise, "all server variables using PGDLLIMPORT" could sound like only those "server variables [which were] using PGDLLIMPORT" were exported). > > | Allow informational escape sequences to be used in postgres_fdw's > > application name (Hayato Kuroda, Fujii Masao) > > > > I don't think this should be a separate entry > > Uh, the entry above is about per-connection application name, while this > is about escapes --- seems different to me, and hard to combine. 449ab635052 postgres_fdw: Allow application_name of remote connection to be set via GUC. 6e0cb3dec10 postgres_fdw: Allow postgres_fdw.application_name to include escape sequences. 94c49d53402 postgres_fdw: Make postgres_fdw.application_name support more escape sequences. You have one entry for 449a, and one entry where you've combined 6e0c and 94c4. My point is that the 2nd two commits changed the behavior of the first commit, and I don't see why an end-user would want to know about the intermediate behavior from the middle of the development cycle when escape sequences weren't expanded. So I don't know why they'd be listed separately. -- Justin
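[Editor's note: for readers following the thread, the feature those three commits add up to can be exercised roughly as below. This is a sketch only; the exact set of supported escape sequences is documented in the postgres_fdw chapter, and the foreign table name here is made up.]

-- Set the GUC for the session (it can also be set in postgresql.conf
-- or per role).  Escape sequences are expanded when postgres_fdw opens
-- a remote connection, e.g. %d = local database name, %u = local user.
SET postgres_fdw.application_name = 'myapp_%d_%u';

-- Any statement that opens a new postgres_fdw connection will then show
-- the expanded string in the remote server's pg_stat_activity.
SELECT * FROM remote_orders;  -- hypothetical foreign table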
Re: First draft of the PG 15 release notes
> I have completed the first draft of the PG 15 release notes and you can > see the results here: > > https://momjian.us/pgsql_docs/release-15.html > Allow pgbench to retry after serialization and deadlock failures (Yugo > Nagata, Marina Polyakova) This is in the "Additional Modules" section. I think this should be in the "Client Applications" section because pgbench lives in bin directory, not in contrib directory. Actually, pgbench was in the "Client Applications" section in the PG 14 release note. Best regards, -- Tatsuo Ishii SRA OSS, Inc. Japan English: http://www.sraoss.co.jp/index_en.php Japanese:http://www.sraoss.co.jp
Re: First draft of the PG 15 release notes
On Tue, May 10, 2022 at 04:02:35PM -0500, Justin Pryzby wrote: > On Tue, May 10, 2022 at 04:18:10PM -0400, Bruce Momjian wrote: > > > "separately" seems vague. Currently in v10-v13, extended stats are > > > collected > > > for partitioned tables, and collected for the "SELECT FROM ONLY"/parent > > > case > > > for inheritance parents. This changes to also collect stats for the > > > "SELECT > > > FROM tbl*" case. See also: 20220204214103.gt23...@telsasoft.com. > > > > Agreed, reworded. Can you check you like my new wording? > > Now it says: > | Allow extended statistics to record statistics for a parent with all it > children (Tomas Vondra, Justin Pryzby) > > it should say "its" children. Fixed. > > > | Add function pg_settings_get_flags() to get the flags of server-side > > > variables (Justin Pryzby) > > > > > > IMO this is the same, but I think maybe Michael things about it > > > differently... > > > > Uh, I thought it might hvae user value as well as developer. > > The list of flags it includes is defined as "the flags we needed to deprecate > ./check_guc", but it could conceivably be useful to someone else... But it > seems more like allow_in_place_tablespaces; it could go into the "source code" > section, or be removed. Okay, I moved into source code. > > > | When EXPLAIN references the temporary object schema, refer to it as > > > "pg_temp" (Amul Sul) > > > | When specifying fractional interval values in units greater than > > > months, round to the nearest month (Bruce Momjian) > > > | Limit support of psql to servers running PostgreSQL 9.2 and later (Tom > > > Lane) > > > | Limit support of pg_dump and pg_dumpall to servers running PostgreSQL > > > 9.2 and later (Tom Lane) > > > | Limit support of pg_upgrade to old servers running PostgreSQL 9.2 and > > > later (Tom Lane) > > > | Remove server support for old BASE_BACKUP command syntax and base > > > backup protocol (Robert Haas) > > > > > > Do these need to be in the "compatibility" section ? 
> > > > Uh, I think of compatibility as breakage, while removing support for > > something doesn't seem like breakage. > > I think removing support which breaks a user-facing behavior is presumptively > a > compatibility issue. I moved the EXPLAIN and fractional interval items to the compatibility section. I didn't change the psql and pg_dump items since they are already in their own sections and are clearly support removal rather than direct changes in behavior that would be expected. However, if someone else feels they should be moved, I will move them, so someone please reply if you feel that way. > > I didn't think EXPLAIN changes were user-parsed, so no breakage? > > Why would we have explain(format json/xml) if it wasn't meant to be parsed ? > At one point I was parsing its xml. > > I'll let other's comment about the rest of the list. Good point on the formatted EXPLAIN output being affected, which is why moving it does make sense. > > > | Automatically export server variables using PGDLLIMPORT on Windows > > > (Robert Haas) > > > > > > I don't think it's "automatic" ? > > > > Yes, reworded. > > Maybe it's a tiny bit better to say: > | Export all server variables on Windows using PGDLLIMPORT (Robert Haas) > > (Otherwise, "all server variables using PGDLLIMPORT" could sound like only > those "server variables [which were] using PGDLLIMPORT" were exported). Ah, yes, I see the improvement, done. > > > | Allow informational escape sequences to be used in postgres_fdw's > > > application name (Hayato Kuroda, Fujii Masao) > > > > > > I don't think this should be a separate entry > > > > Uh, the entry above is about per-connection application name, while this > > is about escapes --- seems different to me, and hard to combine. > > 449ab635052 postgres_fdw: Allow application_name of remote connection to be > set via GUC. > 6e0cb3dec10 postgres_fdw: Allow postgres_fdw.application_name to include > escape sequences. 
> 94c49d53402 postgres_fdw: Make postgres_fdw.application_name support more > escape sequences. > > You have one entry for 449a, and one entry where you've combined 6e0c and > 94c4. > > My point is that the 2nd two commits changed the behavior of the first commit, > and I don't see why an end-user would want to know about the intermediate > behavior from the middle of the development cycle when escape sequences > weren't > expanded. So I don't know why they'd be listed separately. I see your point --- postgres_fdw.application_name supports escapes that the normal application_name does not. I combined all three items now. Thanks. The new entry is: Add server variable postgres_fdw.application_name to control the application name of postgres_fdw connections (Hayato Kuroda) Previously the remote application_name could only be set on the remote server or via postgres_fdw connection speci
Re: JSON Functions and Operators Docs for v15
On 2022-05-04 We 15:14, Andrew Dunstan wrote: > On 2022-05-04 We 11:39, Tom Lane wrote: >> "David G. Johnston" writes: >>> Is there a thread I'm not finding where the upcoming JSON function >>> documentation is being made reasonably usable after doubling its size with >>> all the new JSON Table features that we've added? If nothing else, the >>> table of contents at the top of the page needs to be greatly expanded to >>> make seeing and navigating to all that is available a possibility. >> The entire structure of that text needs to be rethought, IMO, as it >> has been written with precisely no concern for fitting into our >> hard-won structure for func.sgml. Andrew muttered something about >> rewriting it awhile ago, but I don't know what progress he's made. >> > Yes, I've been clearing the decks a bit, but I'm working on it now, > should have something within the next week. Running slightly long on this. Will definitely post a patch by COB Friday. cheers andrew -- Andrew Dunstan EDB: https://www.enterprisedb.com
Re: First draft of the PG 15 release notes
>> I have completed the first draft of the PG 15 release notes and you can >> see the results here: >> >> https://momjian.us/pgsql_docs/release-15.html > >> Allow pgbench to retry after serialization and deadlock failures (Yugo >> Nagata, Marina Polyakova) > > This is in the "Additional Modules" section. I think this should be in > the "Client Applications" section because pgbench lives in bin > directory, not in contrib directory. Actually, pgbench was in the > "Client Applications" section in the PG 14 release note. I think you missed this: commit 06ba4a63b85e5aa47b325c3235c16c05a0b58b96 Use COPY FREEZE in pgbench for faster benchmark table population. Best regards, -- Tatsuo Ishii SRA OSS, Inc. Japan English: http://www.sraoss.co.jp/index_en.php Japanese:http://www.sraoss.co.jp
Re: First draft of the PG 15 release notes
On Wed, May 11, 2022 at 06:47:13AM +0900, Tatsuo Ishii wrote: > >> I have completed the first draft of the PG 15 release notes and you can > >> see the results here: > >> > >> https://momjian.us/pgsql_docs/release-15.html > > > >> Allow pgbench to retry after serialization and deadlock failures (Yugo > >> Nagata, Marina Polyakova) > > > > This is in the "Additional Modules" section. I think this should be in > > the "Client Applications" section because pgbench lives in bin > > directory, not in contrib directory. Actually, pgbench was in the > > "Client Applications" section in the PG 14 release note. > > I think you missed this: > > commit 06ba4a63b85e5aa47b325c3235c16c05a0b58b96 > Use COPY FREEZE in pgbench for faster benchmark table population. I didn't mention it since it is automatic and I didn't think pgbench load time was significant enough to mention. -- Bruce Momjian https://momjian.us EDB https://enterprisedb.com Indecision is a decision. Inaction is an action. Mark Batterson
Re: First draft of the PG 15 release notes
On Tue, May 10, 2022 at 05:45:01PM -0400, Bruce Momjian wrote: > I see your point --- postgres_fdw.application_name supports escapes that > the normal application_name does not. I combined all three items now. > Thanks. The new entry is: The URL is updated with the current commits: https://momjian.us/pgsql_docs/release-15.html -- Bruce Momjian https://momjian.us EDB https://enterprisedb.com Indecision is a decision. Inaction is an action. Mark Batterson
Re: First draft of the PG 15 release notes
> On May 10, 2022, at 8:44 AM, Bruce Momjian wrote: > > I have completed the first draft of the PG 15 release notes and you can > see the results here Thanks, Bruce! This release note: • Prevent logical replication into tables where the subscription owner is subject to the table's row-level security policies (Mark Dilger) ... should mention that, independent of any RLS considerations, subscriptions are now applied under the privilege of the subscription owner. I don't think we can fit it in the release note, but the basic idea is that:

CREATE SUBSCRIPTION ... CONNECTION '...' PUBLICATION ... WITH (enabled = false);
ALTER SUBSCRIPTION ... OWNER TO nonsuperuser_whoever;
ALTER SUBSCRIPTION ... ENABLE;

can be used to replicate a subscription without sync or apply workers operating as superuser. That's the main advantage. Previously, subscriptions always ran with superuser privilege, which creates security concerns if the publisher is malicious (or foolish). Avoiding any unintentional bypassing of RLS was just a necessary detail to close the security loophole, not the main point of the security enhancement. — Mark Dilger EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: configure openldap crash warning
I wrote: > OK. I will push the configure change once the release freeze lifts. And done. regards, tom lane
Re: BufferAlloc: don't take two simultaneous locks
В Пт, 06/05/2022 в 10:26 -0400, Robert Haas пишет: > On Thu, Apr 21, 2022 at 6:58 PM Yura Sokolov wrote: > > At the master state: > > - SharedBufHash is not declared as HASH_FIXED_SIZE > > - get_hash_entry falls back to element_alloc too fast (just if it doesn't > > found free entry in current freelist partition). > > - get_hash_entry has races. > > - if there are small number of spare items (and NUM_BUFFER_PARTITIONS is > > small number) and HASH_FIXED_SIZE is set, it becomes contended and > > therefore slow. > > > > HASH_REUSE solves (for shared buffers) most of this issues. Free list > > became rare fallback, so HASH_FIXED_SIZE for SharedBufHash doesn't lead > > to performance hit. And with fair number of spare items, get_hash_entry > > will find free entry despite its races. > > Hmm, I see. The idea of trying to arrange to reuse entries rather than > pushing them onto a freelist and immediately trying to take them off > again is an interesting one, and I kind of like it. But I can't > imagine that anyone would commit this patch the way you have it. It's > way too much action at a distance. If any ereport(ERROR,...) could > happen between the HASH_REUSE operation and the subsequent HASH_ENTER, > it would be disastrous, and those things are separated by multiple > levels of call stack across different modules, so mistakes would be > easy to make. If this could be made into something dynahash takes care > of internally without requiring extensive cooperation with the calling > code, I think it would very possibly be accepted. > > One approach would be to have a hash_replace() call that takes two > const void * arguments, one to delete and one to insert. Then maybe > you propagate that idea upward and have, similarly, a BufTableReplace > operation that uses that, and then the bufmgr code calls > BufTableReplace instead of BufTableDelete. Maybe there are other > better ideas out there... No. 
While HASH_REUSE is a good addition to the overall performance improvement of the patch, it is not required for the major gain. The major gain comes from not taking two partition locks simultaneously. hash_replace would require two locks, so it is not an option. regards - Yura
Re: Query generates infinite loop
> > Less sure about that. ISTM the reason that the previous proposal failed > was that it introduced too much ambiguity about how to resolve > unknown-type arguments. Wouldn't the same problems arise here? > If I recall, the problem was that the lack of a date-specific generate_series function would result in a date value being coerced to timestamp, and thus adding generate_series(date, date, step) would change behavior of existing code, and that was a POLA violation (among other bad things). By adding a different function, there is no prior behavior to worry about. So we should be safe with the following signatures doing the right thing, yes?: generate_finite_series(start timestamp, step interval, num_elements integer) generate_finite_series(start date, step integer, num_elements integer) generate_finite_series(start date, step interval year to month, num_elements integer)
Re: Query generates infinite loop
Corey Huinker writes: >> Less sure about that. ISTM the reason that the previous proposal failed >> was that it introduced too much ambiguity about how to resolve >> unknown-type arguments. Wouldn't the same problems arise here? > By adding a different function, there is no prior behavior to worry about. True, that's one less thing to worry about. > So we should be safe with the following signatures doing the right thing, > yes?: > generate_finite_series(start timestamp, step interval, num_elements > integer) > generate_finite_series(start date, step integer, num_elements integer) > generate_finite_series(start date, step interval year to month, > num_elements integer) No. You can experiment with it easily enough using stub functions:

regression=# create function generate_finite_series(start timestamp, step interval, num_elements
regression(# integer) returns timestamp as 'select $1' language sql;
CREATE FUNCTION
regression=# create function generate_finite_series(start date, step integer, num_elements integer) returns timestamp as 'select $1' language sql;
CREATE FUNCTION
regression=# create function generate_finite_series(start date, step interval year to month,
regression(# num_elements integer) returns timestamp as 'select $1' language sql;
CREATE FUNCTION
regression=# select generate_finite_series(current_date, '1 day', 10);
ERROR:  function generate_finite_series(date, unknown, integer) is not unique
LINE 1: select generate_finite_series(current_date, '1 day', 10);
               ^
HINT:  Could not choose a best candidate function. You might need to add explicit type casts.

It's even worse if the first argument is also an unknown-type literal. Sure, you could add explicit casts to force the choice of variant, but then ease of use went out the window somewhere --- and IMO this proposal is mostly about ease of use, since there's no fundamentally new functionality.
It looks like you could make it work with just these three variants:

regression=# \df generate_finite_series
                                                          List of functions
 Schema |          Name           |      Result data type       |                           Argument data types                           | Type
--------+-------------------------+-----------------------------+-------------------------------------------------------------------------+------
 public | generate_finite_series | timestamp without time zone | start date, step interval, num_elements integer                         | func
 public | generate_finite_series | timestamp with time zone    | start timestamp with time zone, step interval, num_elements integer    | func
 public | generate_finite_series | timestamp without time zone | start timestamp without time zone, step interval, num_elements integer | func
(3 rows)

I get non-error results with these:

regression=# select generate_finite_series(current_date, '1 day', 10);
 generate_finite_series
------------------------
 2022-05-10 00:00:00
(1 row)

regression=# select generate_finite_series('now', '1 day', 10);
    generate_finite_series
-------------------------------
 2022-05-10 19:35:33.773738-04
(1 row)

That shows that an unknown-type literal in the first argument will default to timestamptz given these choices, which seems like a sane default. BTW, you don't get to say "interval year to month" as a function argument, or at least it won't do anything useful. If you want to restrict the contents of the interval it'll have to be a runtime check inside the function. regards, tom lane
Re: First draft of the PG 15 release notes
On 5/10/22 4:18 PM, Bruce Momjian wrote: On Tue, May 10, 2022 at 01:09:35PM -0500, Justin Pryzby wrote: | Allow libpq's SSL private to be owned by the root user (David Steele) private *key* I changed it to "private key file". This was backpatched to all supported versions[1]. While I'm a huge fan of this behavior change for a plethora of reasons, I'm not sure if this should be included as part of the PG15 release notes, given it's in the 14.3 et al. Thanks, Jonathan [1] https://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=2a1f84636dc335a3edf53a8361ae44bb2ae00093
Re: First draft of the PG 15 release notes
On Wed, 11 May 2022 at 03:44, Bruce Momjian wrote: > > I have completed the first draft of the PG 15 release notes and you can > see the results here: > > https://momjian.us/pgsql_docs/release-15.html Thanks for doing that tedious work. I think the sort improvements done in v15 are worth a mention under General Performance. The commits for this were 91e9e89dc, 40af10b57 and 697492434. I've been running a few benchmarks between v14 and v15 over the past few days and a fairly average case speedup is about 25%, but there are cases where I've seen up to 400%. I think the increase is to an extent that we maybe should have considered making tweaks in cost_tuplesort(). I saw some plans that ran in about 60% of the time by disabling Hash Agg and allowing Sort / Group Agg to do the work. David
Re: First draft of the PG 15 release notes
"Jonathan S. Katz" writes: > On 5/10/22 4:18 PM, Bruce Momjian wrote: > | Allow libpq's SSL private to be owned by the root user (David Steele) > This was backpatched to all supported versions[1]. While I'm a huge fan > of this behavior change for a plethora of reasons, I'm not sure if this > should be included as part of the PG15 release notes, given it's in the > 14.3 et al. It should not. However, the backpatch happened later than the commit to HEAD, and our git_changelog tool is not smart enough to match them up, so Bruce didn't see that there were followup commits. That's a generic hazard for major-release notes; if you spot any other cases please do mention them. regards, tom lane
Re: First draft of the PG 15 release notes
On Tue, May 10, 2022 at 04:18:10PM -0400, Bruce Momjian wrote: > > | Add the LZ4 compression method to pg_receivewal (Georgios Kokolatos) > > | This is enabled via --compression-method=lz4 and requires binaries to be > > built using --with-lz4. > > | Redesign pg_receivewal's compression options (Georgios Kokolatos) > > | The new --compression-method option controls the type of compression, > > rather than just relying on --compress. > > > > It's --compress since 042a923ad. > > Yep, fixed. It now says: | The new --compression option controls the type of compression, rather than just relying on --compress. But the option is --compress, and not --compression.
Re: Rewriting the test of pg_upgrade as a TAP test - take three - remastered set
On Mon, May 09, 2022 at 12:18:39PM +0900, Michael Paquier wrote: > All these fixes lead me to the attached patch. I have applied this stuff as of 7dd3ee5, in time for beta1, and closed the open item. One difference is that I've added one backslash surrounding the double quote at the beginning *and* the end of the database name in the patch. However, the original case was different, with: - At the beginning of the database name, one backslash before and after the double quote. - At the end of the database name, two backslashes before the double quote and three after the double quote. -- Michael
Re: First draft of the PG 15 release notes
| Remove incorrect duplicate partition tables in system view pg_publication_tables (Hou Zhijie) should say "partitions" ? "Do not show partitions whose parents are also published" (is that accurate?) | Allow system and TOAST B-tree indexes to efficiently store duplicates (Peter Geoghegan) | Previously de-duplication was disabled for these types of indexes. I think the user-facing change here is that (in addition to being "allowed"), it's now enabled by default for catalog indexes. "Enable de-duplication of system indexes by default". | Prevent changes to columns only indexed by BRIN indexes from preventing HOT updates (Josef Simanek) says "prevent" twice. "Allow HOT updates when changed columns are only indexed by BRIN indexes" (or "avoid precluding...") | Improve the performance of window functions that use row_number(), rank(), and count() (David Rowley) The essential feature is a new kind of "prosupport", which is implemented for those core functions. I suggest to add another sentence about how prosupport can also be added to user-defined/non-core functions. | Store server-level statistics in shared memory (Kyotaro Horiguchi, Andres Freund, Melanie Plageman) Should this be called "cumulative" statistics? As in b3abca68106d518ce5d3c0d9a1e0ec02a647ceda. | Allows view access to be controlled by privileges of the view user (Christoph Heiss) Allow | New function "The new function .." (a few times) | Improve the parallel pg_dump performance of TOAST tables (Tom Lane) I don't think this needs to be mentioned, unless maybe folded into an entry like "improve performance when dumping with many objects or relations with large toast tables". | Allow pg_basebackup to decompress LZ4 and Zstandard compressed server-side base backups, and LZ4 and Zstandard compress output files (Dipesh Pandit, Jeevan Ladhe) maybe: "... and to compress output files with LZ4 and Zstandard." 
| Add direct I/O support to macOS (Thomas Munro) | This only works if max_wal_senders=0 and wal_level=minimal. I think this should mention that it's only for WAL. | Remove status reporting during pg_upgrade operation if the output is not a terminal (Andres Freund) Maybe: "By default, do not output status information unless the output is a terminal" | Add new protocol message COMPRESSION and COMPRESSION_DETAIL to specify the compression method and level (Robert Haas) s/level/options/ ? | Prevent DROP DATABASE, DROP TABLESPACE, and ALTER DATABASE SET TABLESPACE from occasionally failing during concurrent use on Windows (Thomas Munro) Maybe this doesn't need to be mentioned ? | Fix pg_statio_all_tables to sum values for the rare case of TOAST tables with multiple indexes (Andrei Zubkov) | Previously such cases would have one row for each index. Doesn't need to be mentioned ? It doesn't seem like a "compatibility" issue anyway. Should this be included? 6b94e7a6da2 Consider fractional paths in generate_orderedappend_paths Should any of these be listed as incompatible changes (some of these I asked before, but the others are from another list). 95ab1e0a9db interval: round values when spilling to months 9cd28c2e5f1 Remove server support for old BASE_BACKUP command syntax. 0d4513b6138 Remove server support for the previous base backup protocol. ccd10a9bfa5 Fix enforcement of PL/pgSQL variable CONSTANT markings (Tom Lane) 38bfae36526 pg_upgrade: Move all the files generated internally to a subdirectory 376ce3e404b Prefer $HOME when looking up the current user's home directory. 7844c9918a4 psql: Show all query results by default 17a856d08be Change aggregated log format of pgbench. ? 73508475d69 Remove pg_atoi() ? aa64f23b029 Remove MaxBackends variable in favor of GetMaxBackends() function. ? d816f366bc4 psql: Make SSL info display more compact ? 27b02e070fd pg_upgrade: Don't print progress status when output is not a tty. ? ab4fd4f868e Remove 'datlastsysoid'.
Re: First draft of the PG 15 release notes
On Tue, May 10, 2022 at 03:12:18PM -0700, Mark Dilger wrote: > > > > On May 10, 2022, at 8:44 AM, Bruce Momjian wrote: > > > > I have completed the first draft of the PG 15 release notes and you can > > see the results here > > > Thanks, Bruce! This release note: > > • Prevent logical replication into tables where the subscription owner > is subject to the table's row-level security policies (Mark Dilger) > > ... should mention, independent of any RLS considerations, subscriptions are > now applied under the privilege of the subscription owner. I don't think we > can fit it in the release note, but the basic idea is that: > > CREATE SUBSCRIPTION ... CONNECTION '...' PUBLICATION ... WITH (enabled > = false); > ALTER SUBSCRIPTION ... OWNER TO nonsuperuser_whoever; > ALTER SUBSCRIPTION ... ENABLE; > > can be used to replicate a subscription without sync or apply workers > operating as superuser. That's the main advantage. Previously, > subscriptions always ran with superuser privilege, which creates security > concerns if the publisher is malicious (or foolish). Avoiding any > unintentional bypassing of RLS was just a necessary detail to close the > security loophole, not the main point of the security enhancement. Oh, interesting. New text: Allow logical replication to run as the owner of the publication (Mark Dilger) Because row-level security policies are not checked, only superusers, roles with bypassrls, and table owners can replicate into tables with row-level security policies. How is this? -- Bruce Momjian https://momjian.us EDB https://enterprisedb.com Indecision is a decision. Inaction is an action. Mark Batterson
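[The sequence Mark quotes, restated as plain SQL for readability; subscription, role, and publication names are assumed, and the connection string is elided as in the original:]

```sql
-- Create the subscription disabled, while still a superuser
CREATE SUBSCRIPTION mysub
    CONNECTION '...'              -- elided, as in the original
    PUBLICATION mypub
    WITH (enabled = false);

-- Hand it to a non-superuser, then enable it; the sync and apply
-- workers now run with that role's privileges rather than superuser
ALTER SUBSCRIPTION mysub OWNER TO replication_user;
ALTER SUBSCRIPTION mysub ENABLE;
```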
Re: First draft of the PG 15 release notes
On Tue, May 10, 2022 at 08:14:28PM -0400, Jonathan Katz wrote: > On 5/10/22 4:18 PM, Bruce Momjian wrote: > > On Tue, May 10, 2022 at 01:09:35PM -0500, Justin Pryzby wrote: > > > > > > | Allow libpq's SSL private to be owned by the root user (David Steele) > > > > > > private *key* > > > > I changed it to "private key file". > > This was backpatched to all supported versions[1]. While I'm a huge fan of > this behavior change for a plethora of reasons, I'm not sure if this should > be included as part of the PG15 release notes, given it's in the 14.3 et al. Right, it should be removed, and I have done so. -- Bruce Momjian https://momjian.us EDB https://enterprisedb.com Indecision is a decision. Inaction is an action. Mark Batterson
Re: First draft of the PG 15 release notes
On Tue, May 10, 2022 at 09:16:33PM -0400, Tom Lane wrote: > "Jonathan S. Katz" writes: > > On 5/10/22 4:18 PM, Bruce Momjian wrote: > > | Allow libpq's SSL private to be owned by the root user (David Steele) > > > This was backpatched to all supported versions[1]. While I'm a huge fan > > of this behavior change for a plethora of reasons, I'm not sure if this > > should be included as part of the PG15 release notes, given it's in the > > 14.3 et al. > > It should not. However, the backpatch happened later than the commit > to HEAD, and our git_changelog tool is not smart enough to match them > up, so Bruce didn't see that there were followup commits. That's a > generic hazard for major-release notes; if you spot any other cases > please do mention them. Yes, known problem. :-( -- Bruce Momjian https://momjian.us EDB https://enterprisedb.com Indecision is a decision. Inaction is an action. Mark Batterson
Re: First draft of the PG 15 release notes
> On May 10, 2022, at 6:46 PM, Bruce Momjian wrote: > > Allow logical replication to run as the owner of the publication Make that "owner of the subscription". This change operates on the subscriber-side. — Mark Dilger EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: First draft of the PG 15 release notes
On Tue, May 10, 2022 at 08:28:54PM -0500, Justin Pryzby wrote: > On Tue, May 10, 2022 at 04:18:10PM -0400, Bruce Momjian wrote: > > > | Add the LZ4 compression method to pg_receivewal (Georgios Kokolatos) > > > | This is enabled via --compression-method=lz4 and requires binaries to > > > be built using --with-lz4. > > > | Redesign pg_receivewal's compression options (Georgios Kokolatos) > > > | The new --compression-method option controls the type of compression, > > > rather than just relying on --compress. > > > > > > It's --compress since 042a923ad. > > > > Yep, fixed. > > It now says: > > | The new --compression option controls the type of compression, rather than > just relying on --compress. > > But the option is --compress, and not --compression. Agreed, fixed. -- Bruce Momjian https://momjian.us EDB https://enterprisedb.com Indecision is a decision. Inaction is an action. Mark Batterson
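[For context, a sketch of the resulting PG 15 invocations being described; the archive path is assumed, and LZ4 support requires binaries built with --with-lz4:]

```shell
# Receive WAL into a local directory, compressing each segment with LZ4
pg_receivewal -D /path/to/wal_archive --compress=lz4

# gzip remains available through the same --compress option
pg_receivewal -D /path/to/wal_archive --compress=gzip
```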
Re: First draft of the PG 15 release notes
On Tue, May 10, 2022 at 06:49:48PM -0700, Mark Dilger wrote: > > > > On May 10, 2022, at 6:46 PM, Bruce Momjian wrote: > > > > Allow logical replication to run as the owner of the publication > > Make that "owner of the subscription". This change operates on the > subscriber-side. Thanks, done. -- Bruce Momjian https://momjian.us EDB https://enterprisedb.com Indecision is a decision. Inaction is an action. Mark Batterson
Re: First draft of the PG 15 release notes
On Wed, May 11, 2022 at 12:39:41PM +1200, David Rowley wrote: > On Wed, 11 May 2022 at 03:44, Bruce Momjian wrote: > > > > I have completed the first draft of the PG 15 release notes and you can > > see the results here: > > > > https://momjian.us/pgsql_docs/release-15.html > > Thanks for doing that tedious work. > > I think the sort improvements done in v15 are worth a mention under > General Performance. The commits for this were 91e9e89dc, 40af10b57 > and 697492434. I've been running a few benchmarks between v14 and v15 > over the past few days and a fairly average case speedup is about 25%. > but there are cases where I've seen up to 400%. I think the increase > is to an extent that we maybe should have considered making tweaks in > cost_tuplesort(). I saw some plans that ran in about 60% of the time > by disabling Hash Agg and allowing Sort / Group Agg to do the work. Good point. Do you have any suggested text? I can't really see it clearly based on the commits, except "sorting is faster". -- Bruce Momjian https://momjian.us EDB https://enterprisedb.com Indecision is a decision. Inaction is an action. Mark Batterson
Re: typos
I found a bunch more typos; a couple from codespell, and several which are the result of looking for previously-reported typos, like: time git log origin --grep '[tT]ypo' --word-diff -U1 |grep -Eo '\[-[[:lower:]]+-\]' |sed 's/^\[-//; s/-\]$//' |sort -u |grep -Fxvwf /usr/share/dict/words >badwords.txt time grep -rhoFwf badwords.txt doc |sort -u >not-badwords.txt time grep -Fxvwf not-badwords.txt ./badwords.txt >./badwords.txt.new time grep -rhoIFwf ./badwords.txt.new src --incl='*.[chly]' --incl='*.p[lm]' |sort |uniq -c |sort -nr |less >From 2166aa51840094f4b738f4a9af1ad6bd94a97cff Mon Sep 17 00:00:00 2001 From: Justin Pryzby Date: Mon, 18 Apr 2022 10:13:09 -0500 Subject: [PATCH v2022051001 1/2] doc: ANDed and ORed --- doc/src/sgml/indexam.sgml | 2 +- doc/src/sgml/indices.sgml | 4 ++-- doc/src/sgml/logical-replication.sgml | 2 +- 3 files changed, 4 insertions(+), 4 deletions(-) diff --git a/doc/src/sgml/indexam.sgml b/doc/src/sgml/indexam.sgml index d4163c96e9f..16669f4086a 100644 --- a/doc/src/sgml/indexam.sgml +++ b/doc/src/sgml/indexam.sgml @@ -843,7 +843,7 @@ amparallelrescan (IndexScanDesc scan); constant, where the index key is one of the columns of the index and the operator is one of the members of the operator family associated with that index column. An index scan has zero or more scan - keys, which are implicitly ANDed — the returned tuples are expected + keys, which are implicitly AND-ed — the returned tuples are expected to satisfy all the indicated conditions. diff --git a/doc/src/sgml/indices.sgml b/doc/src/sgml/indices.sgml index 023157d8884..890c6d3451c 100644 --- a/doc/src/sgml/indices.sgml +++ b/doc/src/sgml/indices.sgml @@ -597,7 +597,7 @@ CREATE INDEX test3_desc_index ON test3 (id DESC NULLS LAST); a query like WHERE x = 42 OR x = 47 OR x = 53 OR x = 99 could be broken down into four separate scans of an index on x, each scan using one of the query clauses. The results of these scans are - then ORed together to produce the result. 
Another example is that if we + then OR-ed together to produce the result. Another example is that if we have separate indexes on x and y, one possible implementation of a query like WHERE x = 5 AND y = 6 is to use each index with the appropriate query clause and then AND together @@ -608,7 +608,7 @@ CREATE INDEX test3_desc_index ON test3 (id DESC NULLS LAST); To combine multiple indexes, the system scans each needed index and prepares a bitmap in memory giving the locations of table rows that are reported as matching that index's conditions. - The bitmaps are then ANDed and ORed together as needed by the query. + The bitmaps are then AND-ed and OR-ed together as needed by the query. Finally, the actual table rows are visited and returned. The table rows are visited in physical order, because that is how the bitmap is laid out; this means that any ordering of the original indexes is lost, and diff --git a/doc/src/sgml/logical-replication.sgml b/doc/src/sgml/logical-replication.sgml index 145ea71d61b..d2939fec71c 100644 --- a/doc/src/sgml/logical-replication.sgml +++ b/doc/src/sgml/logical-replication.sgml @@ -477,7 +477,7 @@ If the subscription has several publications in which the same table has been published with different row filters (for the same publish -operation), those expressions get ORed together, so that rows satisfying +operation), those expressions get OR-ed together, so that rows satisfying any of the expressions will be replicated. 
This means all the other row filters for the same table become redundant if: -- 2.17.1 >From a566141ebb01fc48cb766339553b600755ca4dca Mon Sep 17 00:00:00 2001 From: Justin Pryzby Date: Tue, 10 May 2022 19:02:02 -0500 Subject: [PATCH v2022051001 2/2] typos --- contrib/citext/expected/citext.out | 2 +- contrib/citext/expected/citext_1.out| 2 +- contrib/citext/sql/citext.sql | 2 +- src/backend/executor/execGrouping.c | 2 +- src/backend/parser/parse_expr.c | 2 +- src/backend/replication/basebackup_target.c | 2 +- src/backend/utils/adt/int8.c| 2 +- src/bin/pg_basebackup/bbstreamer_tar.c | 2 +- src/bin/pg_rewind/t/007_standby_source.pl | 2 +- src/bin/pg_rewind/t/009_growing_files.pl| 2 +- src/test/regress/expected/psql.out | 2 +- src/test/regress/sql/psql.sql | 2 +- src/test/ssl/t/SSL/Backend/OpenSSL.pm | 2 +- 13 files changed, 13 insertions(+), 13 deletions(-) diff --git a/contrib/citext/expected/citext.out b/contrib/citext/expected/citext.out index 5afcc50920e..1c555981363 100644 --- a/contrib/citext/expected/citext.out +++ b/contrib/citext/expected/citext.out @@ -2270,7 +2270,7 @@ SELECT COUNT(*) = 8::bigint AS t FROM try; INSERT INTO try VALUES ( to_char( now()::timestamp, 'HH12:MI:SS') ), - ( to_char( now() + '1 se
Re: make MaxBackends available in _PG_init
On Tue, May 10, 2022 at 08:56:28AM -0700, Nathan Bossart wrote: > On Tue, May 10, 2022 at 05:55:12PM +0900, Michael Paquier wrote: >> I agree that removing support for the unloading part would be nice to >> clean up now on HEAD. Note that 0002 is missing the removal of one >> reference to _PG_fini in xfunc.sgml ( markup). > > Oops, sorry about that. 0002 looks pretty good from here. If it were me, I would apply 0002 first to avoid the extra tweak in pg_stat_statements with _PG_fini in 0001. Regarding 0001, I have spotted an extra issue in the documentation. xfunc.sgml has a section called "Shared Memory and LWLocks" about the use of RequestNamedLWLockTranche() and RequestAddinShmemSpace() called from _PG_init(), which would now be wrong as we'd need the hook. IMO, the docs should mention the registration of the hook in _PG_init(), the APIs involved, and should mention to look at pg_stat_statements.c for a code sample (we tend to avoid the addition of sample code in the docs lately as we have no way to check their compilation automatically). Robert, are you planning to look at what's proposed and push these? FWIW, I am familiar enough with the problems at hand to do this in time for beta1 (aka by tomorrow to give it room in the buildfarm), but I don't want to step on your feet, either. -- Michael signature.asc Description: PGP signature
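[For readers following along, the pattern under discussion — shared-memory requests moving out of _PG_init() and into the new hook — looks roughly like this. This is a fragment modeled loosely on pg_stat_statements, not compilable on its own: PG_MODULE_MAGIC and the rest of the extension boilerplate are omitted, and "myext" names are placeholders.]

```c
#include "postgres.h"

#include "miscadmin.h"
#include "storage/ipc.h"
#include "storage/lwlock.h"
#include "storage/shmem.h"

static shmem_request_hook_type prev_shmem_request_hook = NULL;

/* Request shared resources here instead of directly in _PG_init() */
static void
myext_shmem_request(void)
{
	if (prev_shmem_request_hook)
		prev_shmem_request_hook();

	/* GetMaxBackends() is now valid at this point, unlike in _PG_init() */
	RequestAddinShmemSpace(sizeof(int) * GetMaxBackends());
	RequestNamedLWLockTranche("myext", 1);
}

void
_PG_init(void)
{
	if (!process_shared_preload_libraries_in_progress)
		return;

	/* Chain onto any previously installed hook */
	prev_shmem_request_hook = shmem_request_hook;
	shmem_request_hook = myext_shmem_request;
}
```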
Re: First draft of the PG 15 release notes
On Wed, 11 May 2022 at 14:02, Bruce Momjian wrote: > > On Wed, May 11, 2022 at 12:39:41PM +1200, David Rowley wrote: > > I think the sort improvements done in v15 are worth a mention under > > General Performance. The commits for this were 91e9e89dc, 40af10b57 > > and 697492434. I've been running a few benchmarks between v14 and v15 > > over the past few days and a fairly average case speedup is about 25%. > > but there are cases where I've seen up to 400%. I think the increase > > is to an extent that we maybe should have considered making tweaks in > > cost_tuplesort(). I saw some plans that ran in about 60% of the time > > by disabling Hash Agg and allowing Sort / Group Agg to do the work. > > Good point. Do you have any suggested text? I can't really see it > clearly based on the commits, except "sorting is faster". If we're going to lump those into a single line then maybe something along the lines of: * Reduce memory consumption and improve performance of sorting tuples in memory I think one line is fine from a user's perspective, but it's slightly harder to know the order of the names in the credits given the 3 independent commits. David
RE: Data is copied twice when specifying both child and parent table in publication
On Monday, May 9, 2022 10:51 AM wangw.f...@fujitsu.com wrote: > Attach new patches. > The patch for HEAD: > 1. Modify the approach. Enhance the API of function > pg_get_publication_tables to handle one publication or an array of > publications. > The patch for REL14: > 1. Improve the table sync SQL. [suggestions by Shi yu] Hi, thank you for updating the patch ! Minor comments on your patch for HEAD v2. (1) commit message sentence I suggest below sentence. Kindly change from "... when subscribing to both publications using one subscription, the data is replicated twice in inital copy" to "subscribing to both publications from one subscription causes initial copy twice". (2) unused variable pg_publication.c: In function ‘pg_get_publication_tables’: pg_publication.c:1091:11: warning: unused variable ‘pubname’ [-Wunused-variable] char*pubname; We can remove this. (3) free of allocated memory In the pg_get_publication_tables(), we don't free 'elems'. Don't we need it ? (4) some coding alignments 4-1. + List *tables_viaroot = NIL, ... + *current_table = NIL; I suggest we can put some variables into the condition for the first time call of this function, like tables_viaroot and current_table. When you agree, kindly change it. 4-2. + /* +* Publications support partitioned tables, although all changes +* are replicated using leaf partition identity and schema, so we +* only need those. +*/ + if (publication->alltables) + { + current_table = GetAllTablesPublicationRelations(publication->pubviaroot); + } This is not related to the change itself and now we are inheriting the previous curly brackets, but I think there's no harm in removing it, since it's only for one statement. Best Regards, Takamichi Osumi
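[Per the thread subject, a minimal sketch of the scenario the patch addresses; all object names are assumed:]

```sql
-- On the publisher: a partition published both via its parent (using
-- root identity) and directly in a second publication
CREATE TABLE parent (a int) PARTITION BY RANGE (a);
CREATE TABLE child PARTITION OF parent FOR VALUES FROM (0) TO (10);
CREATE PUBLICATION pub_parent FOR TABLE parent
    WITH (publish_via_partition_root = true);
CREATE PUBLICATION pub_child FOR TABLE child;

-- On the subscriber: one subscription to both publications; without the
-- fix, the initial table sync copies the partition's data twice
CREATE SUBSCRIPTION sub CONNECTION '...'
    PUBLICATION pub_parent, pub_child;
```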
Re: First draft of the PG 15 release notes (sorting)
On Wed, May 11, 2022 at 12:39:41PM +1200, David Rowley wrote: > I think the sort improvements done in v15 are worth a mention under > General Performance. The commits for this were 91e9e89dc, 40af10b57 > and 697492434. I've been running a few benchmarks between v14 and v15 > over the past few days and a fairly average case speedup is about 25%. > but there are cases where I've seen up to 400%. I think the increase > is to an extent that we maybe should have considered making tweaks in > cost_tuplesort(). I saw some plans that ran in about 60% of the time > by disabling Hash Agg and allowing Sort / Group Agg to do the work. Is there any reason not to consider it now ? Either for v15 or v15+1. I wonder if this is also relevant. 65014000b35 Replace polyphase merge algorithm with a simple balanced k-way merge. -- Justin
Re: First draft of the PG 15 release notes
On 5/10/22 9:48 PM, Bruce Momjian wrote: On Tue, May 10, 2022 at 09:16:33PM -0400, Tom Lane wrote: "Jonathan S. Katz" writes: On 5/10/22 4:18 PM, Bruce Momjian wrote: | Allow libpq's SSL private to be owned by the root user (David Steele) This was backpatched to all supported versions[1]. While I'm a huge fan of this behavior change for a plethora of reasons, I'm not sure if this should be included as part of the PG15 release notes, given it's in the 14.3 et al. It should not. However, the backpatch happened later than the commit to HEAD, and our git_changelog tool is not smart enough to match them up, so Bruce didn't see that there were followup commits. That's a generic hazard for major-release notes; if you spot any other cases please do mention them. Yes, known problem. :-( Got it. I did a scan of the release notes and diff'd with the 14.[1-3] and did not find any additional overlap. Jonathan OpenPGP_signature Description: OpenPGP digital signature
Re: First draft of the PG 15 release notes
On 5/10/22 10:31 PM, David Rowley wrote: On Wed, 11 May 2022 at 14:02, Bruce Momjian wrote: On Wed, May 11, 2022 at 12:39:41PM +1200, David Rowley wrote: I think the sort improvements done in v15 are worth a mention under General Performance. The commits for this were 91e9e89dc, 40af10b57 and 697492434. I've been running a few benchmarks between v14 and v15 over the past few days and a fairly average case speedup is about 25%. but there are cases where I've seen up to 400%. I think the increase is to an extent that we maybe should have considered making tweaks in cost_tuplesort(). I saw some plans that ran in about 60% of the time by disabling Hash Agg and allowing Sort / Group Agg to do the work. Good point. Do you have any suggested text? I can't really see it clearly based on the commits, except "sorting is faster". If we're going to lump those into a single line then maybe something along the lines of: * Reduce memory consumption and improve performance of sorting tuples in memory I think one line is fine from a user's perspective, but it's slightly harder to know the order of the names in the credits given the 3 independent commits. I think a brief description following the one-liner would be useful for the release notes. If you can share a few more details about the benchmarks, we can expand on the one-liner in the release announcement (as this sounds like one of those cool, buzz-y things :) Jonathan OpenPGP_signature Description: OpenPGP digital signature
Re: First draft of the PG 15 release notes (sorting)
On Wed, 11 May 2022 at 14:38, Justin Pryzby wrote: > > On Wed, May 11, 2022 at 12:39:41PM +1200, David Rowley wrote: > > I think the sort improvements done in v15 are worth a mention under > > General Performance. The commits for this were 91e9e89dc, 40af10b57 > > and 697492434. I've been running a few benchmarks between v14 and v15 > > over the past few days and a fairly average case speedup is about 25%. > > but there are cases where I've seen up to 400%. I think the increase > > is to an extent that we maybe should have considered making tweaks in > > cost_tuplesort(). I saw some plans that ran in about 60% of the time > > by disabling Hash Agg and allowing Sort / Group Agg to do the work. > > Is there any reason not to consider it now ? Either for v15 or v15+1. If the changes done had resulted in a change to the number of expected operations as far as big-O notation goes, then I think we might be able to do something. However, nothing changed in the number of operations. We only sped up the constant factors. If it were possible to adjust those constant factors based on some performance benchmark results that were spat out by some single machine somewhere, then maybe we could do some tweaks. The problem is that to know that we're actually making some meaningful improvements to the costs, we'd want to get the opinion of >1 machine and likely >1 CPU architecture. That feels like something that would be much better to do during a release cycle rather than at this very late hour. The majority of my benchmarks were on AMD zen2 hardware. That's likely not going to reflect well on what the average hardware is that runs PostgreSQL. Also, I've no idea at this stage what we'd even do to cost_tuplesort(). The nruns calculation is a bit fuzzy and never really took into account the power-of-2 wastage that 40af10b57 reduces. Maybe there's some argument for adjusting the 2.0 constant in compute_cpu_sort_cost() based on what's done in 697492434. 
But there's plenty of datatypes that don't use the new sort specialization functions. Would we really want to add extra code to the planner to get it to try and figure that out? David
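[The plan comparison David describes — disabling Hash Agg so Sort / Group Agg does the work — can be reproduced along these lines; the scratch table and row counts are assumed:]

```sql
-- Build a throwaway table with many groups
CREATE TABLE t AS
    SELECT (random() * 1000000)::int AS g
    FROM generate_series(1, 5000000);
ANALYZE t;

-- The planner will typically pick HashAggregate here
EXPLAIN ANALYZE SELECT g, count(*) FROM t GROUP BY g;

-- Force the Sort + GroupAggregate plan shape and compare timings
SET enable_hashagg = off;
EXPLAIN ANALYZE SELECT g, count(*) FROM t GROUP BY g;
RESET enable_hashagg;
```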
Re: Column Filtering in Logical Replication
On Tue, May 10, 2022 at 7:25 PM Jonathan S. Katz wrote: > > On 4/19/22 12:53 AM, Amit Kapila wrote: > > On Tue, Apr 19, 2022 at 6:58 AM Masahiko Sawada > > wrote: > >> > >> +1. I also think it's a bug so back-patching makes sense to me. > >> > > > > Pushed. Thanks Tomas and Sawada-San. > > This is still on the PG15 open items list[1] though marked as with a fix. > > Did dd4ab6fd resolve the issue, or does this need more work? > The commit dd4ab6fd resolved this issue. I didn't notice it after that commit. -- With Regards, Amit Kapila.
Re: bogus: logical replication rows/cols combinations
On Wed, May 11, 2022 at 12:35 AM Tomas Vondra wrote: > > On 5/9/22 05:45, Amit Kapila wrote: > > On Sun, May 8, 2022 at 11:41 PM Tomas Vondra > > wrote: > >> > >> On 5/7/22 07:36, Amit Kapila wrote: > >>> On Mon, May 2, 2022 at 6:11 PM Alvaro Herrera > >>> wrote: > > On 2022-May-02, Amit Kapila wrote: > > > > I think it is possible to expose a list of publications for each > > walsender as it is stored in each walsenders > > LogicalDecodingContext->output_plugin_private. AFAIK, each walsender > > can have one such LogicalDecodingContext and we can probably share it > > via shared memory? > > I guess we need to create a DSM each time a walsender opens a > connection, at START_REPLICATION time. Then ALTER PUBLICATION needs to > connect to all DSMs of all running walsenders and see if they are > reading from it. Is that what you have in mind? Alternatively, we > could have one DSM per publication with a PID array of all walsenders > that are sending it (each walsender needs to add its PID as it starts). > The latter might be better. > > >>> > >>> While thinking about using DSM here, I came across one of your commits > >>> f2f9fcb303 which seems to indicate that it is not a good idea to rely > >>> on it but I think you have changed dynamic shared memory to fixed > >>> shared memory usage because that was more suitable rather than DSM is > >>> not portable. Because I see a commit bcbd940806 where we have removed > >>> the 'none' option of dynamic_shared_memory_type. So, I think it should > >>> be okay to use DSM in this context. What do you think? > >>> > >> > >> Why would any of this be needed? > >> > >> ALTER PUBLICATION will invalidate the RelationSyncEntry entries in all > >> walsenders, no? So AFAICS it should be enough to enforce the limitations > >> in get_rel_sync_entry, > >> > > > > Yes, that should be sufficient to enforce limitations in > > get_rel_sync_entry() but it will lead to the following behavior: > > a. 
The Alter Publication command will be successful but later in the > > logs, the error will be logged and the user needs to check it and take > > appropriate action. Till that time the walsender will be in an error > > loop which means it will restart and again lead to the same error till > > the user takes some action. > > b. As we use historic snapshots, so even after the user takes action > > say by changing publication, it won't be reflected. So, the option for > > the user would be to drop their subscription. > > > > Am, I missing something? If not, then are we okay with such behavior? > > If yes, then I think it would be much easier implementation-wise and > > probably advisable at this point. We can document it so that users are > > careful and can take necessary action if they get into such a > > situation. Any way we can improve this in future as you also suggested > > earlier. > > > >> which is necessary anyway because the subscriber > >> may not be connected when ALTER PUBLICATION gets executed. > >> > > > > If we are not okay with the resultant behavior of detecting this in > > get_rel_sync_entry(), then we can solve this in some other way as > > Alvaro has indicated in one of his responses which is to detect that > > at start replication time probably in the subscriber-side. > > > > IMO that behavior is acceptable. > Fair enough, then we should go with a simpler approach to detect it in pgoutput.c (get_rel_sync_entry). > We have to do that check anyway, and > the subscription may start failing after ALTER PUBLICATION for a number > of other reasons anyway so the user needs/should check the logs. > I think ALTER PUBLICATION won't ever lead to failure in walsender. Sure, users can do something due to which subscriber-side failures can happen due to constraint failures. Do you have some specific cases in mind? > And if needed, we can improve this and start doing the proactive-checks > during ALTER PUBLICATION too. > Agreed. -- With Regards, Amit Kapila.
Re: First draft of the PG 15 release notes
On Tue, May 10, 2022 at 08:31:17PM -0500, Justin Pryzby wrote: > | Remove incorrect duplicate partition tables in system view > pg_publication_tables (Hou Zhijie) > > should say "partitions" ? > "Do not show partitions whose parents are also published" (is that accurate?) I went with: Remove incorrect duplicate partitions in system view pg_publication_tables (Hou Zhijie) > | Allow system and TOAST B-tree indexes to efficiently store duplicates > (Peter Geoghegan) > | Previously de-duplication was disabled for these types of indexes. > > I think the user-facing change here is that (in addition to being "allowed"), > it's now enabled by default for catalog indexes. "Enable de-duplication of > system indexes by default". I went with: Enable system and TOAST B-tree indexes to efficiently store duplicates (Peter Geoghegan) > | Prevent changes to columns only indexed by BRIN indexes from preventing HOT > updates (Josef Simanek) > > says "prevent" twice. > "Allow HOT updates when changed columns are only indexed by BRIN indexes" > (or "avoid precluding...") I went with: Prevent changes to columns only indexed by BRIN indexes from disabling HOT updates (Josef Simanek) > | Improve the performance of window functions that use row_number(), rank(), > and count() (David Rowley) > > The essential feature is a new kind of "prosupport", which is implemented for > those core functions. I suggest to add another sentence about how prosupport > can also be added to user-defined/non-core functions. Uh, I don't see how "prosupport" would be relevant for users to know about. > | Store server-level statistics in shared memory (Kyotaro Horiguchi, Andres > Freund, Melanie Plageman) > > Should this be called "cumulative" statistics? As in > b3abca68106d518ce5d3c0d9a1e0ec02a647ceda. Uh, they are counters, which I guess is cumulative, but that doesn't seem very descriptive. 
The documentation calls it the statistics collector, but I am not sure we even have that anymore with an in-memory implementation. I am kind of not sure what to call it. > | Allows view access to be controlled by privileges of the view user > (Christoph Heiss) > > Allow Fixed. > | New function > > "The new function .." (a few times) Uh, I only see it once. > | Improve the parallel pg_dump performance of TOAST tables (Tom Lane) > > I don't think this needs to be mentioned, unless maybe folded into an entry > like "improve performance when dumping with many objects or relations with > large toast tables". I mentioned it because I thought users who tried parallelism might find it is faster now so they should re-test it, no? > | Allow pg_basebackup to decompress LZ4 and Zstandard compressed server-side > base backups, and LZ4 and Zstandard compress output files (Dipesh Pandit, > Jeevan Ladhe) > > maybe: "... and to compress output files with LZ4 and Zstandard." Yes, I like that better, done. > | Add direct I/O support to macOS (Thomas Munro) > | This only works if max_wal_senders=0 and wal_level=minimal. > > I think this should mention that it's only for WAL. Agreed, done. > | Remove status reporting during pg_upgrade operation if the output is not a > terminal (Andres Freund) > > Maybe: "By default, do not output status information unless the output is a > terminal" I went with: Disable default status reporting during pg_upgrade operation if the output is not a terminal (Andres Freund) > | Add new protocol message COMPRESSION and COMPRESSION_DETAIL to specify the > compression method and level (Robert Haas) > > s/level/options/ ? Ah, yes, this changed to be more generic than level, done. > | Prevent DROP DATABASE, DROP TABLESPACE, and ALTER DATABASE SET TABLESPACE > from occasionally failing during concurrent use on Windows (Thomas Munro) > > Maybe this doesn't need to be mentioned ? 
Uh, the previous behavior seems pretty bad so I wanted to mention it will not happen anymore. > | Fix pg_statio_all_tables to sum values for the rare case of TOAST tables > with multiple indexes (Andrei Zubkov) > | Previously such cases would have one row for each index. > > Doesn't need to be mentioned ? > It doesn't seem like a "compatibility" issue anyway. Uh, there were certain cases where multiple indexes happened and I think we need to tell people it is no longer a problem to work around, no? > Should this be included? > 6b94e7a6da2 Consider fractional paths in generate_orderedappend_paths I looked at that but didn't see how it would be relevant for users. Do you have a suggestion for text? > Should any of these be listed as incompatible changes (some of these I asked > before, but the others are from another list). > > 95ab1e0a9db interval: round values when spilling to months Yes, moved already. > 9cd28c2e5f1 Remove server support for old BASE_BACKUP command syntax. Seems internal-only so moved to Source Code. > 0d4513b6138 Remove server s
Re: Perform streaming logical transactions by background workers and parallel apply
On Tue, May 10, 2022 at 6:10 PM Amit Kapila wrote: > > On Tue, May 10, 2022 at 10:35 AM Masahiko Sawada > wrote: > > > > On Wed, May 4, 2022 at 12:50 PM Amit Kapila wrote: > > > > > > On Tue, May 3, 2022 at 9:45 AM Amit Kapila > > > wrote: > > > > > > > > On Mon, May 2, 2022 at 5:06 PM Masahiko Sawada > > > > wrote: > > > > > > > > > > On Mon, May 2, 2022 at 6:09 PM Amit Kapila > > > > > wrote: > > > > > > > > > > > > On Mon, May 2, 2022 at 11:47 AM Masahiko Sawada > > > > > > wrote: > > > > > > > > > > > > > > > > > > > > > Are you planning to support "Transaction dependency" Amit > > > > > > > mentioned in > > > > > > > his first mail in this patch? IIUC since the background apply > > > > > > > worker > > > > > > > applies the streamed changes as soon as receiving them from the > > > > > > > main > > > > > > > apply worker, a conflict that doesn't happen in the current > > > > > > > streaming > > > > > > > logical replication could happen. > > > > > > > > > > > > > > > > > > > This patch seems to be waiting for stream_stop to finish, so I don't > > > > > > see how the issues related to "Transaction dependency" can arise? > > > > > > What > > > > > > type of conflict/issues you have in mind? > > > > > > > > > > Suppose we set both publisher and subscriber: > > > > > > > > > > On publisher: > > > > > create table test (i int); > > > > > insert into test values (0); > > > > > create publication test_pub for table test; > > > > > > > > > > On subscriber: > > > > > create table test (i int primary key); > > > > > create subscription test_sub connection '...' publication test_pub; -- > > > > > value 0 is replicated via initial sync > > > > > > > > > > Now, both 'test' tables have value 0. 
> > > > > > > > > > And suppose two concurrent transactions are executed on the publisher > > > > > in the following order: > > > > > > > > > > TX-1: > > > > > begin; > > > > > insert into test select generate_series(0, 1); -- changes will be > > > > > streamed; > > > > > > > > > > TX-2: > > > > > begin; > > > > > delete from test where i = 0; > > > > > commit; > > > > > > > > > > TX-1: > > > > > commit; > > > > > > > > > > With the current streaming logical replication, these changes will be > > > > > applied successfully since the deletion is applied before the > > > > > (streamed) insertion. Whereas with the apply bgworker, it fails due to > > > > > a unique constraint violation since the insertion is applied first. > > > > > I've confirmed that it happens with the v5 patch. > > > > > > > > > > > > > Good point but I am not completely sure if doing transaction > > > > dependency tracking for such cases is really worth it. I feel for such > > > > concurrent cases users can anyway now also get conflicts, it is just a > > > > matter of timing. One more thing: to check transaction dependency, we > > > > might need to spill the data for streaming transactions in which case > > > > we might lose all the benefits of doing it via a background worker. Do > > > > we see any simple way to avoid this? > > > > > > > > I agree that it is just a matter of timing. I think new issues that > > haven't happened on the current streaming logical replication > > depending on the timing could happen with this feature and vice versa. > > > > Here by vice versa, do you mean some problems that can happen with > current code won't happen after new implementation? If so, can you > give one such example? > > > > > > > I think the other kind of problem that can happen here is delete > > > followed by an insert.
If in the example provided by you, TX-1 > > > performs delete (say it is large enough to cause streaming) and TX-2 > > > performs insert then I think it will block the apply worker because > > > insert will start waiting infinitely. Currently, I think it will lead > > > to conflict due to insert but that is still solvable by allowing users > > > to remove conflicting rows. > > > > > > It seems both these problems are due to the reason that the table on > > > publisher and subscriber has different constraints otherwise, we would > > > have seen the same behavior on the publisher as well. > > > > > > There could be a few ways to avoid these and similar problems: > > > a. detect the difference in constraints between publisher and > > > subscribers like primary key and probably others (like whether there > > > is any volatile function present in index expression) when applying > > > the change and then we give ERROR to the user that she must change the > > > streaming mode to 'spill' instead of 'apply' (aka parallel apply). > > > b. Same as (a) but instead of ERROR just LOG this information and > > > change the mode to spill for the transactions that operate on that > > > particular relation. > > > > Given that it doesn't introduce a new kind of problem I don't think we > > need special treatment for that at least in this feature. > > > > Isn't the problem related to infinite wait by insert as explained in > my previous
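The insert-vs-delete ordering hazard discussed above comes down to which transaction's changes reach the unique index first. The toy C model below is not PostgreSQL code — the table and apply functions are invented for illustration — but it replays the two transactions from the example in both orders against a one-column table with a unique constraint:

```c
#include <stdbool.h>

/* Toy single-column table with a unique constraint, at most 16 rows. */
typedef struct
{
    int rows[16];
    int nrows;
} ToyTable;

/* Returns false on a unique-constraint violation. */
static bool
toy_insert(ToyTable *t, int val)
{
    for (int i = 0; i < t->nrows; i++)
        if (t->rows[i] == val)
            return false;       /* duplicate key */
    t->rows[t->nrows++] = val;
    return true;
}

static void
toy_delete(ToyTable *t, int val)
{
    for (int i = 0; i < t->nrows; i++)
        if (t->rows[i] == val)
        {
            t->rows[i] = t->rows[--t->nrows];
            return;
        }
}

/* Current behavior: streamed changes are spilled, so commit order is
 * preserved and TX-2's delete is applied before TX-1's insert. */
static bool
apply_delete_first(void)
{
    ToyTable t = {{0}, 1};      /* initial sync replicated value 0 */

    toy_delete(&t, 0);          /* TX-2 */
    return toy_insert(&t, 0) && toy_insert(&t, 1);  /* TX-1 */
}

/* Parallel apply: TX-1's streamed insert reaches the table before
 * TX-2's delete has been applied. */
static bool
apply_insert_first(void)
{
    ToyTable t = {{0}, 1};
    bool ok = toy_insert(&t, 0) && toy_insert(&t, 1);   /* TX-1, streamed */

    toy_delete(&t, 0);          /* TX-2 */
    return ok;
}
```

Applying the delete first succeeds, while applying the streamed insert eagerly hits the duplicate key — the same conflict-by-timing that the thread describes.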
Re: strange slow query - lost lot of time somewhere
On Tue, 10 May 2022 at 14:22, Justin Pryzby wrote: > On Fri, May 06, 2022 at 09:27:57PM +1200, David Rowley wrote: > > I'm very tempted to change the EXPLAIN output in at least master to > > display the initial and final (maximum) hash table sizes. Wondering if > > anyone would object to that? > > No objection to add it to v15. > > I'll point out that "Cache Mode" was added to EXPLAIN between 11.1 and 11.2 > without controversy, so this could conceivably be backpatched to v14, too. > > commit 6c32c0977783fae217b5eaa1d22d26c96e5b0085 This is seemingly a good point, but I don't really think it's a case of just keeping the EXPLAIN output stable in minor versions, it's more about adding new fields to structs. I just went and wrote the patch and the fundamental difference seems to be that what I did in 6c32c0977 managed to only add a new field in the empty padding between two fields. That resulted in no fields in the struct being pushed up in their address offset. The idea here is not to break any extension that's already been compiled that references some field that comes after that. In the patch I've just written, I've had to add some fields which causes sizeof(MemoizeState) to go up resulting in the offsets of some later fields changing. One thing I'll say about this patch is that I found it annoying that I had to add code to cache_lookup() when we failed to find an entry. That's probably not the end of the world as that's only for cache misses. Ideally, I'd just be looking at the size of the hash table at the end of execution, however, naturally, we must show the EXPLAIN output before we shut down the executor. I just copied the Hash Join output. 
It looks like: # alter table tt alter column a set (n_distinct=4); ALTER TABLE # analyze tt; # explain (analyze, costs off, timing off) select * from tt inner join t2 on tt.a=t2.a; QUERY PLAN - Nested Loop (actual rows=100 loops=1) -> Seq Scan on tt (actual rows=100 loops=1) -> Memoize (actual rows=1 loops=100) Cache Key: tt.a Cache Mode: logical Hits: 90 Misses: 10 Evictions: 0 Overflows: 0 Memory Usage: 2kB Hash Buckets: 16 (originally 4) -> Index Only Scan using t2_pkey on t2 (actual rows=1 loops=10) Index Cond: (a = tt.a) Heap Fetches: 0 Planning Time: 0.483 ms Execution Time: 862.860 ms (12 rows) Does anyone have any views about the attached patch going into v15? David diff --git a/src/backend/commands/explain.c b/src/backend/commands/explain.c index d2a2479822..c535d63883 100644 --- a/src/backend/commands/explain.c +++ b/src/backend/commands/explain.c @@ -3177,6 +3177,11 @@ show_memoize_info(MemoizeState *mstate, List *ancestors, ExplainState *es) ExplainPropertyInteger("Cache Evictions", NULL, mstate->stats.cache_evictions, es); ExplainPropertyInteger("Cache Overflows", NULL, mstate->stats.cache_overflows, es); ExplainPropertyInteger("Peak Memory Usage", "kB", memPeakKb, es); + ExplainPropertyInteger("Maximum Hash Buckets", NULL, + mstate->stats.hashsize_max, es); + ExplainPropertyInteger("Original Hash Buckets", NULL, + mstate->stats.hashsize_orig, es); + } else { @@ -3188,6 +3193,14 @@ show_memoize_info(MemoizeState *mstate, List *ancestors, ExplainState *es) mstate->stats.cache_evictions, mstate->stats.cache_overflows, memPeakKb); + ExplainIndentText(es); + if (mstate->stats.hashsize_max != mstate->stats.hashsize_orig) + appendStringInfo(es->str, "Hash Buckets: %u (originally %u)\n", + mstate->stats.hashsize_max, + mstate->stats.hashsize_orig); + else + appendStringInfo(es->str, "Hash Buckets: %u\n", + mstate->stats.hashsize_max); } } @@ -3227,6 +3240,14 @@ show_memoize_info(MemoizeState *mstate, List *ancestors, ExplainState *es) si->cache_hits, 
si->cache_misses, si->cache_evictions, si->cache_overflows, memPeakKb); + ExplainIndentText(es); + if (si->hashsize_max != si->hashsize_
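The struct-layout point above — adding fields into an existing padding hole keeps later offsets stable, while adding more fields than the hole can absorb shifts them and breaks already-compiled extensions — can be demonstrated with `offsetof`. The struct names below are invented stand-ins, not the real MemoizeState; `_Alignas(8)` (C11) is used so the padding hole exists on any platform:

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical v1 layout of an executor-state struct, as an extension
 * originally compiled it: a 4-byte hole follows 'hits'. */
typedef struct StateV1
{
    int32_t hits;
    /* 4 bytes of padding here */
    _Alignas(8) double mem_peak;
    int64_t evictions;
} StateV1;

/* One new field fits in the old padding hole: every later member keeps
 * its offset, so old binaries keep working. */
typedef struct StateV2a
{
    int32_t  hits;
    uint32_t hashsize_max;      /* new field, lives in the old padding */
    _Alignas(8) double mem_peak;
    int64_t evictions;
} StateV2a;

/* Two new fields overflow the hole: mem_peak and evictions move, and a
 * compiled extension reading the old offsets would now see garbage. */
typedef struct StateV2b
{
    int32_t  hits;
    uint32_t hashsize_max;
    uint32_t hashsize_orig;     /* second new field: no padding left */
    _Alignas(8) double mem_peak;
    int64_t evictions;
} StateV2b;
```

This is exactly why the `Cache Mode` addition could be slipped into a minor release while the hash-bucket counters cannot: the former landed in a hole, the latter grow `sizeof(MemoizeState)`.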
Re: First draft of the PG 15 release notes
On Wed, 11 May 2022 at 15:41, Bruce Momjian wrote: > > On Tue, May 10, 2022 at 08:31:17PM -0500, Justin Pryzby wrote: > > | Improve the performance of window functions that use row_number(), > > rank(), and count() (David Rowley) > > > > The essential feature is a new kind of "prosupport", which is implemented > > for > > those core functions. I suggest to add another sentence about how prosupport > > can also be added to user-defined/non-core functions. > > Uh, I don't see how "prosupport" would be relevant for users to know > about. I'd say if it's not mentioned in our documentation then we shouldn't put it in the release notes. Currently, it's only documented in the source code. If someone felt strongly that something should be written in the documents about this then I'd say only then should we consider mentioning it in the release notes. David
Re: Perform streaming logical transactions by background workers and parallel apply
On Tue, May 10, 2022 at 5:59 PM Amit Kapila wrote: > > On Tue, May 10, 2022 at 10:39 AM Masahiko Sawada > wrote: > > > > On Wed, May 4, 2022 at 8:44 AM Peter Smith wrote: > > > > > > On Tue, May 3, 2022 at 5:16 PM Peter Smith wrote: > > > > > > > ... > > > > > > > Avoiding unexpected differences like this is why I suggested the > > > > option should have to be explicitly enabled instead of being on by > > > > default as it is in the current patch. See my review comment #14 [1]. > > > > It means the user won't have to change their existing code as a > > > > workaround. > > > > > > > > -- > > > > [1] > > > > https://www.postgresql.org/message-id/CAHut%2BPuqYP5eD5wcSCtk%3Da6KuMjat2UCzqyGoE7sieCaBsVskQ%40mail.gmail.com > > > > > > > > > > Sorry I was wrong above. It seems this behaviour was already changed > > > in the latest patch v5 so now the option value 'on' means what it > > > always did. Thanks! > > > > Having it optional seems a good idea. BTW can the user configure how > > many apply bgworkers can be used per subscription or in the whole > > system? Like max_sync_workers_per_subscription, is it better to have a > > configuration parameter or a subscription option for that? If so, > > setting it to 0 probably means to disable the parallel apply feature. > > > > Yeah, that might be useful but we are already giving an option while > creating a subscription whether to allow parallelism, so will it be > useful to give one more way to disable this feature? OTOH, having > something like max_parallel_apply_workers/max_bg_apply_workers at the > system level can give better control for how much parallelism the user > wishes to allow for apply work. Or we can have something like max_parallel_apply_workers_per_subscription that controls how many parallel apply workers can launch per subscription. That also gives better control for the number of parallel apply workers. 
> If we have such a new parameter then I > think max_logical_replication_workers should include apply workers, > parallel apply workers, and table synchronization? Agreed. > In such a case, > don't we need to think of increasing the default value of > max_logical_replication_workers? I think we would need to think about that if the parallel apply is enabled by default but given that it's disabled by default I'm fine with the current default value. Regards, -- Masahiko Sawada EDB: https://www.enterprisedb.com/
Re: make MaxBackends available in _PG_init
On Wed, May 11, 2022 at 11:18:29AM +0900, Michael Paquier wrote: > 0002 looks pretty good from here. If it were me, I would apply 0002 > first to avoid the extra tweak in pg_stat_statements with _PG_fini in > 0001. I swapped 0001 and 0002 in v8. > Regarding 0001, I have spotted an extra issue in the documentation. > xfunc.sgml has a section called "Shared Memory and LWLocks" about the > use of RequestNamedLWLockTranche() and RequestAddinShmemSpace() called > from _PG_init(), which would now be wrong as we'd need the hook. IMO, > the docs should mention the registration of the hook in _PG_init(), > the APIs involved, and should mention to look at pg_stat_statements.c > for a code sample (we tend to avoid the addition of sample code in the > docs lately as we have no way to check their compilation > automatically). Ah, good catch. I adjusted the docs as you suggested. -- Nathan Bossart Amazon Web Services: https://aws.amazon.com From 14b5c65fc03255e0a2e733c4b93a2d16d5b1303d Mon Sep 17 00:00:00 2001 From: Nathan Bossart Date: Fri, 6 May 2022 13:53:54 -0700 Subject: [PATCH v8 1/2] Remove logic for unloading dynamically loaded files. The code for unloading a library has been commented-out for over 12 years (602a9ef), and we're no closer to supporting it now than we were back then.
--- contrib/auto_explain/auto_explain.c | 14 contrib/passwordcheck/passwordcheck.c | 11 --- .../pg_stat_statements/pg_stat_statements.c | 18 - doc/src/sgml/xfunc.sgml | 18 + src/backend/postmaster/pgarch.c | 2 +- src/backend/utils/fmgr/dfmgr.c| 77 ++- src/backend/utils/fmgr/fmgr.c | 14 src/include/fmgr.h| 1 - src/pl/plpgsql/src/plpgsql.h | 2 - .../modules/delay_execution/delay_execution.c | 10 +-- .../ssl_passphrase_func.c | 7 -- .../modules/test_oat_hooks/test_oat_hooks.c | 22 +- .../modules/test_rls_hooks/test_rls_hooks.c | 17 13 files changed, 13 insertions(+), 200 deletions(-) diff --git a/contrib/auto_explain/auto_explain.c b/contrib/auto_explain/auto_explain.c index d3029f85ef..c9a0d947c8 100644 --- a/contrib/auto_explain/auto_explain.c +++ b/contrib/auto_explain/auto_explain.c @@ -77,7 +77,6 @@ static ExecutorFinish_hook_type prev_ExecutorFinish = NULL; static ExecutorEnd_hook_type prev_ExecutorEnd = NULL; void _PG_init(void); -void _PG_fini(void); static void explain_ExecutorStart(QueryDesc *queryDesc, int eflags); static void explain_ExecutorRun(QueryDesc *queryDesc, @@ -244,19 +243,6 @@ _PG_init(void) ExecutorEnd_hook = explain_ExecutorEnd; } -/* - * Module unload callback - */ -void -_PG_fini(void) -{ - /* Uninstall hooks. 
*/ - ExecutorStart_hook = prev_ExecutorStart; - ExecutorRun_hook = prev_ExecutorRun; - ExecutorFinish_hook = prev_ExecutorFinish; - ExecutorEnd_hook = prev_ExecutorEnd; -} - /* * ExecutorStart hook: start up logging if needed */ diff --git a/contrib/passwordcheck/passwordcheck.c b/contrib/passwordcheck/passwordcheck.c index 074836336d..9d8c58ded0 100644 --- a/contrib/passwordcheck/passwordcheck.c +++ b/contrib/passwordcheck/passwordcheck.c @@ -33,7 +33,6 @@ static check_password_hook_type prev_check_password_hook = NULL; #define MIN_PWD_LENGTH 8 extern void _PG_init(void); -extern void _PG_fini(void); /* * check_password @@ -149,13 +148,3 @@ _PG_init(void) prev_check_password_hook = check_password_hook; check_password_hook = check_password; } - -/* - * Module unload function - */ -void -_PG_fini(void) -{ - /* uninstall hook */ - check_password_hook = prev_check_password_hook; -} diff --git a/contrib/pg_stat_statements/pg_stat_statements.c b/contrib/pg_stat_statements/pg_stat_statements.c index df2ce63790..ceaad81a43 100644 --- a/contrib/pg_stat_statements/pg_stat_statements.c +++ b/contrib/pg_stat_statements/pg_stat_statements.c @@ -305,7 +305,6 @@ static bool pgss_save; /* whether to save stats across shutdown */ /* Function declarations */ void _PG_init(void); -void _PG_fini(void); PG_FUNCTION_INFO_V1(pg_stat_statements_reset); PG_FUNCTION_INFO_V1(pg_stat_statements_reset_1_7); @@ -481,23 +480,6 @@ _PG_init(void) ProcessUtility_hook = pgss_ProcessUtility; } -/* - * Module unload callback - */ -void -_PG_fini(void) -{ - /* Uninstall hooks. 
*/ - shmem_startup_hook = prev_shmem_startup_hook; - post_parse_analyze_hook = prev_post_parse_analyze_hook; - planner_hook = prev_planner_hook; - ExecutorStart_hook = prev_ExecutorStart; - ExecutorRun_hook = prev_ExecutorRun; - ExecutorFinish_hook = prev_ExecutorFinish; - ExecutorEnd_hook = prev_ExecutorEnd; - ProcessUtility_hook = prev_ProcessUtility; -} - /* * shmem_startup hook: allocate or attach to shared memory, * then load any pre-existing statistics from file. diff --git a/doc/src/sgml/xfunc.sgml b/doc/src/sgml/xfunc.sgml index 07c5fd198b..fbe718e3c2 100644 --- a/doc/src/
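The hook-registration pattern visible in the diffs above also explains why dropping `_PG_fini()` loses nothing: each module saves the previous hook pointer and installs its own, so restoring the saved pointer on unload is only safe for the most recently loaded module. A standalone toy (invented names, no PostgreSQL headers) shows the chain breaking:

```c
#include <stddef.h>

/* A toy hook chain mimicking PostgreSQL's ExecutorStart_hook pattern. */
typedef int (*toy_hook_type) (int);

static toy_hook_type toy_hook = NULL;

static int standard_impl(int x) { return x; }

/* Module A: adds 1, chaining to whatever was installed before it. */
static toy_hook_type a_prev = NULL;
static int a_impl(int x) { return (a_prev ? a_prev(x) : standard_impl(x)) + 1; }

/* Module B: doubles, also chaining. */
static toy_hook_type b_prev = NULL;
static int b_impl(int x) { return (b_prev ? b_prev(x) : standard_impl(x)) * 2; }

/* Each module's _PG_init equivalent: save prev, install own. */
static void load_a(void) { a_prev = toy_hook; toy_hook = a_impl; }
static void load_b(void) { b_prev = toy_hook; toy_hook = b_impl; }

/* Module A's hypothetical _PG_fini: restore the pointer it saved at
 * load time.  If B loaded after A, this silently unhooks B as well. */
static void unload_a(void) { toy_hook = a_prev; }

/* Call site: run the hook if installed, else the standard behavior. */
static int run(int x) { return toy_hook ? toy_hook(x) : standard_impl(x); }
```

With A then B loaded, `run(3)` goes through both modules; after A "uninstalls" itself, B's behavior vanishes too, because A's saved pointer predates B's registration — and if A's library were actually unmapped, `b_prev` would dangle besides.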
Re: Allowing REINDEX to have an optional name
On Tue, May 10, 2022 at 7:30 PM Simon Riggs wrote: > On Tue, 10 May 2022 at 14:47, Ashutosh Bapat > wrote: > > > > On Tue, May 10, 2022 at 2:43 PM Simon Riggs > > wrote: > > > > > > A minor issue, and patch. > > > > > > REINDEX DATABASE currently requires you to write REINDEX DATABASE > > > dbname, which makes this a little less usable than we might like. > > > > > > REINDEX on the catalog can cause deadlocks, which also makes REINDEX > > > DATABASE not much use in practice, and is the reason there is no test > > > for REINDEX DATABASE. Another reason why it is a little less usable > > > than we might like. > > > > > > Seems we should do something about these historic issues in the name > > > of product usability. > > > > > > Attached patch allows new syntax for REINDEX DATABASE, without needing > > > to specify dbname. That version of the command skips catalog tables, > > > as a way of avoiding the known deadlocks. Patch also adds a test. > > > > > > > From the patch it looks like with the patch applied running REINDEX > > DATABASE is equivalent to running REINDEX DATABASE dbname > > except reindexing the shared catalogs. Is that correct? > > Yes > > > Though the patch adds following change > > + Indexes on shared system catalogs are also processed, unless the > > + database name is omitted, in which case system catalog indexes > > are skipped. > > > > the syntax looks unintuitive. > > > > I think REINDEX DATABASE reindexing the current database is a good > > usability improvement in itself. But skipping the shared catalogs > > needs an explicit syntax. Not sure how feasible it is but something > > like REINDEX DATABASE skip SHARED/SYSTEM. > > There are two commands: > > REINDEX DATABASE does everything except system catalogs > REINDEX SYSTEM does system catalogs only > IIUC REINDEX DATABASE dbname does system catalogs as well, while REINDEX DATABASE does everything except system catalogs. That's confusing and unintuitive. Not providing the database name leads to ignoring system catalogs.
I won't expect that from this syntax. > So taken together, the two commands seem intuitive to me. > > It is designed like this because it is dangerous to REINDEX the system > catalogs because of potential deadlocks, so we want a way to avoid > that problem. > It's more clear if we add SKIP SYSTEM CATALOGS or some such explicit syntax. -- Best Wishes, Ashutosh
Re: Support logical replication of DDLs
On Wed, May 11, 2022 at 1:01 AM Zheng Li wrote: > > > > > I agree that it adds to our maintainability effort, like every time we > > > > enhance any DDL or add a new DDL that needs to be replicated, we > > > > probably need to change the deparsing code. But OTOH this approach > > > > seems to provide better flexibility. So, in the long run, maybe the > > > > effort is worthwhile. I am not completely sure at this stage which > > > > approach is better but I thought it is worth evaluating this approach > > > > as Alvaro and Robert seem to prefer this idea. > > > > > > +1, IMHO with deparsing logic it would be easy to handle the mixed DDL > > > commands like ALTER TABLE REWRITE. But the only thing is that we will > > > have to write the deparsing code for all the utility commands so there > > > will be a huge amount of new code to maintain. > > > > Actually, the largest stumbling block on this, IMO, is having a good > > mechanism to verify that all DDLs are supported. The deparsing code > > itself is typically not very bad, as long as we have a sure way to twist > > every contributor's hand into writing it, which is what an automated > > test mechanism would give us. > > > > The code in test_ddl_deparse is a pretty lame start, not nearly good > > enough by a thousand miles. My real intention was to have a test > > harness that would first run a special SQL script to install DDL > > capture, then run all the regular src/test/regress scripts, and then at > > the end ensure that all the DDL scripts were properly reproduced -- for > > example transmit them to another database, replay them there, and dump > > both databases and compare them. However, there were challenges which I > > no longer remember and we were unable to complete this, and we are where > > we are. > > > > Thanks for rebasing that old code, BTW. > > I agree that deparsing could be a very useful utility on its own. 
Not > only for SQL command replication between PostgreSQL servers, but also > potentially feasible for SQL command replication between PostgreSQL and > other database systems. For example, one could assemble the json > representation of the SQL parse tree back to a SQL command that can be > run in MySQL. But that requires different assembling rules and code > for different database systems. If we're envisioning this kind of > flexibility that the deparsing utility can offer, then I think it's > better to develop the deparsing utility as an extension itself. IIUC the event trigger can already provide such flexibility. That is, one could create an extension that creates an event trigger and in the event trigger function it deparses the SQL parse tree to a SQL command that can be run in MySQL. While having such flexibility, I’m fine with having a built-in deparsing utility in core. Regards, -- Masahiko Sawada EDB: https://www.enterprisedb.com/
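The assemble-from-a-structured-form idea can be sketched with a toy representation of CREATE TABLE. Everything below is invented for illustration — the real deparse output is a JSON document with a different shape — but it shows the key property being discussed: one tree, different per-target assembling rules:

```c
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Toy structured form of "CREATE TABLE name (col type, ...)". */
typedef struct
{
    const char *name;
    const char *type;           /* abstract type tag, e.g. "int4" */
} ToyColumn;

typedef struct
{
    const char *relname;
    ToyColumn   cols[4];
    int         ncols;
} ToyCreateTable;

/* Per-target type mapping: the same tree renders differently per system. */
static const char *
map_type(const char *t, bool mysql)
{
    if (strcmp(t, "int4") == 0)
        return mysql ? "INT" : "integer";
    if (strcmp(t, "text") == 0)
        return mysql ? "LONGTEXT" : "text";
    return t;
}

/* Assemble a CREATE TABLE command from the structured form. */
static void
assemble(const ToyCreateTable *ct, bool mysql, char *buf, size_t buflen)
{
    size_t      off = snprintf(buf, buflen, "CREATE TABLE %s (", ct->relname);

    for (int i = 0; i < ct->ncols; i++)
        off += snprintf(buf + off, buflen - off, "%s%s %s",
                        i ? ", " : "",
                        ct->cols[i].name,
                        map_type(ct->cols[i].type, mysql));
    snprintf(buf + off, buflen - off, ")");
}
```

The deparse step (parse tree to structured form) is the part the patch series implements inside the server; only the target-specific assembling rules would differ per system.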
Re: Perform streaming logical transactions by background workers and parallel apply
On Wed, May 11, 2022 at 9:17 AM Masahiko Sawada wrote: > > On Tue, May 10, 2022 at 6:10 PM Amit Kapila wrote: > > > > On Tue, May 10, 2022 at 10:35 AM Masahiko Sawada > > wrote: > > > > > > On Wed, May 4, 2022 at 12:50 PM Amit Kapila > > > wrote: > > > > > > > > > > > > I think the other kind of problem that can happen here is delete > > > > followed by an insert. If in the example provided by you, TX-1 > > > > performs delete (say it is large enough to cause streaming) and TX-2 > > > > performs insert then I think it will block the apply worker because > > > > insert will start waiting infinitely. Currently, I think it will lead > > > > to conflict due to insert but that is still solvable by allowing users > > > > to remove conflicting rows. > > > > > > > > It seems both these problems are due to the reason that the table on > > > > publisher and subscriber has different constraints otherwise, we would > > > > have seen the same behavior on the publisher as well. > > > > > > > > There could be a few ways to avoid these and similar problems: > > > > a. detect the difference in constraints between publisher and > > > > subscribers like primary key and probably others (like whether there > > > > is any volatile function present in index expression) when applying > > > > the change and then we give ERROR to the user that she must change the > > > > streaming mode to 'spill' instead of 'apply' (aka parallel apply). > > > > b. Same as (a) but instead of ERROR just LOG this information and > > > > change the mode to spill for the transactions that operate on that > > > > particular relation. > > > > > > Given that it doesn't introduce a new kind of problem I don't think we > > > need special treatment for that at least in this feature. > > > > > > > Isn't the problem related to infinite wait by insert as explained in > > my previous email (in the above-quoted text) a new kind of problem > > that won't exist in the current implementation? 
> > > > Sorry I had completely missed the point that the commit order won't be > changed. I agree that this new implementation would introduce a new > kind of issue as you mentioned above, and the opposite is not true. > > Regarding the case you explained in the previous email I also think it > will happen with the parallel apply feature. The apply worker will be > blocked until the conflict is resolved. I'm not sure how to avoid > that. It would not be easy to compare constraints between publisher > and subscribers when replicating partitioned tables. > I agree that partitioned tables need some more thought but in some simple cases where replication happens via individual partition tables (default), we can detect it as we do for normal tables. OTOH, when replication happens via root (publish_via_partition_root) it could be tricky as the partitions could be different on both sides. I think the cases where we can't safely identify the constraint difference won't be considered for apply via a new bg worker. -- With Regards, Amit Kapila.
Re: Perform streaming logical transactions by background workers and parallel apply
On Wed, May 11, 2022 at 9:35 AM Masahiko Sawada wrote: > > On Tue, May 10, 2022 at 5:59 PM Amit Kapila wrote: > > > > On Tue, May 10, 2022 at 10:39 AM Masahiko Sawada > > wrote: > > > > > > Having it optional seems a good idea. BTW can the user configure how > > > many apply bgworkers can be used per subscription or in the whole > > > system? Like max_sync_workers_per_subscription, is it better to have a > > > configuration parameter or a subscription option for that? If so, > > > setting it to 0 probably means to disable the parallel apply feature. > > > > > > > Yeah, that might be useful but we are already giving an option while > > creating a subscription whether to allow parallelism, so will it be > > useful to give one more way to disable this feature? OTOH, having > > something like max_parallel_apply_workers/max_bg_apply_workers at the > > system level can give better control for how much parallelism the user > > wishes to allow for apply work. > > Or we can have something like > max_parallel_apply_workers_per_subscription that controls how many > parallel apply workers can launch per subscription. That also gives > better control for the number of parallel apply workers. > I think we can go either way in this matter as both have their pros and cons. I feel limiting the parallel workers per subscription gives better control but OTOH, it may not allow max usage of parallelism because some quota from other subscriptions might remain unused. Let us see what Hou-San or others think on this matter? -- With Regards, Amit Kapila.
Re: Unstable tests for recovery conflict handling
On Thu, Apr 28, 2022 at 5:50 AM Mark Dilger wrote: > psql::1: ERROR: prepared transaction with identifier "xact_012_1" > does not exist > [10:26:16.314](1.215s) not ok 11 - Rollback of PGPROC_MAX_CACHED_SUBXIDS+ > prepared transaction on promoted standby > [10:26:16.314](0.000s) > [10:26:16.314](0.000s) # Failed test 'Rollback of > PGPROC_MAX_CACHED_SUBXIDS+ prepared transaction on promoted standby' > [10:26:16.314](0.000s) # at t/012_subtransactions.pl line 208. > [10:26:16.314](0.000s) # got: '3' > # expected: '0' FWIW I see that on my FBSD/clang system when I build with -fsanitize=undefined -fno-sanitize-recover=all. It's something to do with our stack depth detection and tricks being used by -fsanitize, because there's a stack depth exceeded message in the log.
Re: Rewriting the test of pg_upgrade as a TAP test - take three - remastered set
On Wed, May 11, 2022 at 10:29:44AM +0900, Michael Paquier wrote: > On Mon, May 09, 2022 at 12:18:39PM +0900, Michael Paquier wrote: > > All these fixes lead me to the attached patch. > > I have applied this stuff as of 7dd3ee5, in time for beta1, and closed > the open item. One difference is that I've added one backslash > surrounding the double quote at the beginning *and* the end of the > database name in the patch. However, the original case was different, > with: > - At the beginning of the database name, one backslash before and > after the double quote. > - At the end of the database name, two backslashes before the double > quote and three after the double quote. Why did you discontinue testing the longstanding test database name?
Re: Estimating HugePages Requirements?
On Tue, May 10, 2022 at 09:12:49AM -0700, Nathan Bossart wrote: > AFAICT you need to set log_min_messages to at least DEBUG3 to see extra > output for the non-runtime-computed GUCs, so it might not be worth the > added complexity. This set of messages has been showing up for ages with zero complaints from the field AFAIK, and nobody would use this level of logging except developers. One thing that overriding log_min_messages to FATAL does, however, is that it no longer shows those debug3 messages when querying a runtime-computed GUC, but that's the kind of thing we'd hide. Your patch would hide those entries in both cases. Perhaps we could do that, but at the end, I don't really see any need to complicate this code path more than necessary, and this is enough to silence the logs in the cases we care about basically all the time, even if the log levels are reduced a bit on a given cluster. Hence, I have applied the simplest solution to just enforce log_min_messages=FATAL when requesting a runtime GUC. -- Michael
Re: Allowing REINDEX to have an optional name
On Wed, May 11, 2022 at 09:54:17AM +0530, Ashutosh Bapat wrote: > REINDEX DATABASE dbname does system catalogs as well > REINDEX DATABASE does everything except system catalogs > > That's confusing and unintuitive. Agreed. Nobody is going to remember the difference. REINDEX's parsing grammar is designed to be extensible because we have the parenthesized flavor. Why don't you add an option there to skip the catalogs, like a SKIP_CATALOG? > Not providing the database name leads to ignoring system catalogs. I won't > expect that from this syntax. I don't disagree with having a shortened grammar where the database name is not required because one cannot reindex a database different than the one connected to, but changing a behavior based on such a grammar difference is not a good user experience. -- Michael
Re: wrong fds used for refilenodes after pg_upgrade relfilenode changes
On Tue, May 10, 2022 at 1:07 AM Robert Haas wrote: > On Sun, May 8, 2022 at 7:30 PM Thomas Munro wrote: > > LOG: still waiting for pid 1651417 to accept ProcSignalBarrier > > STATEMENT: alter database mydb set tablespace ts1; > This is a very good idea. OK, I pushed this, after making the ereport call look a bit more like others that talk about backend PIDs. > > Another thought is that it might be nice to be able to test with a > > dummy PSB that doesn't actually do anything. You could use it to see > > how fast your system processes it, while doing various other things, > > and to find/debug problems in other code that fails to handle > > interrupts correctly. Here's an attempt at that. I guess it could go > > into a src/test/modules/something instead of core, but on the other > > hand the PSB itself has to be in core anyway, so maybe not. Thoughts? > > No documentation yet, just seeing if people think this is worth > > having... better names/ideas welcome. > > I did this at one point, but I wasn't convinced it was going to find > enough bugs to be worth committing. It's OK if you're convinced of > things that didn't convince me, though. I'll leave this here for now.
Re: typos
On Tue, May 10, 2022 at 09:03:34PM -0500, Justin Pryzby wrote: > I found a bunch more typos; a couple from codespell, and several which are the > result of looking for previously-reported typos, like: Thanks, applied 0002. Regarding 0001, I don't really know which one of {AND,OR}ed or {AND,OR}-ed is better. Note that the code prefers the former, but your patch changes the docs to use the latter. -- Michael
Re: Crash in new pgstats code
On Tue, Apr 19, 2022 at 08:31:05PM +1200, Thomas Munro wrote: > On Tue, Apr 19, 2022 at 2:50 AM Andres Freund wrote: > > Kestrel won't go that far back even - I set it up 23 days ago... > > Here's a ~6 month old example from mylodon (I can't see much further > back than that with HTTP requests... I guess BF records are purged?): > > https://buildfarm.postgresql.org/cgi-bin/show_stage_log.pl?nm=mylodon&dt=2021-10-19%2022%3A57%3A54&stg=recovery-check Do we have anything remaining on this thread in light of the upcoming beta1? One fix has been pushed upthread, but it does not seem we are completely in the clear either. -- Michael
Re: TRAP: FailedAssertion("tabstat->trans == trans", File: "pgstat_relation.c", Line: 508
On Mon, Apr 25, 2022 at 03:18:52PM +0900, Michael Paquier wrote: > On Wed, Apr 20, 2022 at 01:03:20PM +0200, Erik Rijkers wrote: >> Yes, that seems to fix it: I applied that latter patch, and ran my program >> 250x without errors. Then I removed it again an it gave the error within >> 15x. > > That looks simple enough, indeed. Andres, are you planning to address > this issue? Ping. It looks annoying to release beta1 with that, as assertions are likely going to be enabled in a lot of test builds. -- Michael signature.asc Description: PGP signature