Re: ExecRTCheckPerms() and many prunable partitions
On 2022-Nov-29, Amit Langote wrote:
> Maybe, we should have the following there so that the PlannedStmt's
> contents don't point into the Query?
>
> newperminfo = copyObject(perminfo);

Hmm, I suppose if we want a separate RTEPermissionInfo node, we should instead do GetRTEPermissionInfo(rte) followed by AddRTEPermissionInfo(newrte) and avoid the somewhat cowboy-ish coding there.

-- 
Álvaro Herrera — Breisgau, Deutschland — https://www.EnterpriseDB.com/
Re: Collation version tracking for macOS
On 11/28/22 6:54 PM, Jeff Davis wrote:
> =# select * from pg_icu_collation_versions('en_US') order by icu_version;
>  icu_version | uca_version | collator_version
> -------------+-------------+------------------
>  ...
>  67.1        | 13.0        | 153.14
>  68.2        | 13.0        | 153.14
>  69.1        | 13.0        | 153.14
>  70.1        | 14.0        | 153.112
> (21 rows)
>
> This is good information, because it tells us that major library versions change more often than collation versions, empirically speaking.

It seems to me that the collator_version field is not a good version identifier to use. Just taking a quick glance at the ICU home page right now, it shows that all of the last 5 versions of ICU have included "additions and corrections" to locale data itself, including 68 to 69 where the collator version did not change. Is it possible that this "collator_version" only reflects the code that processes collation data to do comparisons/sorts, but it does not reflect updates to the locale data itself?

https://icu.unicode.org/

ICU v72 -> CLDR v42
ICU v71 -> CLDR v41
ICU v70 -> CLDR v40
ICU v69 -> CLDR v39
ICU v68 -> CLDR v38

-Jeremy

-- 
http://about.me/jeremy_schneider
Re: Collation version tracking for macOS
On Mon, Nov 28, 2022 at 11:49 PM Jeff Davis wrote:
> Not necessarily, #2-4 (at least as implemented in v7) can only load one major version at a time, so can't specify minor versions: https://www.postgresql.org/message-id/9f8e9b5a3352478d4cf7d6c0a5dd7e82496be4b6.ca...@j-davis.com
>
> With #1, you can provide control over the search order to find the symbol you want. Granted, if you want to specify that different collations look in different libraries for the same version, then it won't work, because the search order is global -- is that what you're worried about? If so, I think we need to compare it against the downsides of #2-4, which in my opinion are more serious.

You know more about this than I do, for sure, so don't let my vote back the project into a bad spot. But, yeah, the thing you mention here is what I'm worried about. Without a way to force a certain behavior for a particular collation, you don't have an escape valve if the global library ordering isn't doing what you want.

Your argument seems to at least partly be that #1 will be more usable on the whole, and that does seem like an important consideration. People may have a lot of collations, and adjusting them all individually could be difficult and unpleasant. However, I think it's also worth asking what options someone has if #1 can't be made to work due to a single ordering controlling every collation.

It's entirely possible that the scenario I'm worried about is too remote in practice to be concerned about. I don't know how this stuff works well enough to be certain. It's just that, on the basis of previous experience, (1) it's not that uncommon for people to actually end up in situations that we thought shouldn't ever happen, and (2) code that deals with collations is more untrustworthy than average.

-- 
Robert Haas
EDB: http://www.enterprisedb.com
Re: An attempt to avoid locally-committed-but-not-replicated-to-standby-transactions in synchronous replication
On Tue, Nov 29, 2022 at 8:42 AM SATYANARAYANA NARLAPURAM <satyanarlapu...@gmail.com> wrote:
> On Tue, Nov 29, 2022 at 8:29 AM Bruce Momjian wrote:
>> On Tue, Nov 29, 2022 at 08:14:10AM -0800, SATYANARAYANA NARLAPURAM wrote:
>>> 2. Process proc die immediately when a backend is waiting for sync replication acknowledgement, as it does today, however, upon restart, don't open up for business (don't accept read-only connections) unless the sync standbys have caught up.
>>>
>>> Are you planning to block connections or queries to the database? It would be good to allow connections and let them query the monitoring views but block the queries until the sync standbys have caught up. Otherwise, this leaves a monitoring hole. In the cloud, I presume superusers are allowed to connect and monitor (end customers are not the role members and can't query the data). The same can't be true for all the installations. Could you please add more details on your approach?
>>
>> I think ALTER SYSTEM should be allowed, particularly so you can modify synchronous_standby_names, no?
>
> Yes, a change in synchronous_standby_names is expected in this situation. IMHO, blocking all the connections is not a recommended approach.

How about allowing superusers (they can still read locally committed data) and users that are part of the pg_monitor role?
Re: Add 64-bit XIDs into PostgreSQL 15
On Tue, Nov 29, 2022 at 11:41:07AM -0500, Robert Haas wrote:
> My argument is that removing xidStopLimit is totally fine, because it only serves to stop the database. What to do about xidWarnLimit is a slightly more complex question. Certainly it can't be left untouched, because warning that we're about to shut down the database for lack of allocatable XIDs is not sensible if there is no such lack and we aren't going to shut it down. But I'm also not sure if the model is right. Doing nothing for a long time and then warning in every transaction when some threshold is crossed is an extreme behavior change. Right now that's somewhat justified because we're about to hit a brick wall at full speed, but if we remove the brick wall and replace it with a gentle pelting with rotten eggs, it's unclear that a similarly strenuous reaction is the right model. But that's also not to say that we should do nothing at all.

Yeah, we would probably need to warn on every 1 million transactions or something.

-- 
Bruce Momjian  https://momjian.us
EDB  https://enterprisedb.com

Embrace your flaws. They make you human, rather than perfect, which you will never be.
Re: An attempt to avoid locally-committed-but-not-replicated-to-standby-transactions in synchronous replication
On Tue, Nov 29, 2022 at 8:29 AM Bruce Momjian wrote:
> On Tue, Nov 29, 2022 at 08:14:10AM -0800, SATYANARAYANA NARLAPURAM wrote:
>>> 2. Process proc die immediately when a backend is waiting for sync replication acknowledgement, as it does today, however, upon restart, don't open up for business (don't accept read-only connections) unless the sync standbys have caught up.
>>
>> Are you planning to block connections or queries to the database? It would be good to allow connections and let them query the monitoring views but block the queries until the sync standbys have caught up. Otherwise, this leaves a monitoring hole. In the cloud, I presume superusers are allowed to connect and monitor (end customers are not the role members and can't query the data). The same can't be true for all the installations. Could you please add more details on your approach?
>
> I think ALTER SYSTEM should be allowed, particularly so you can modify synchronous_standby_names, no?

Yes, a change in synchronous_standby_names is expected in this situation. IMHO, blocking all the connections is not a recommended approach.
Re: Add 64-bit XIDs into PostgreSQL 15
On Tue, Nov 29, 2022 at 8:03 AM Chris Travers wrote:
> To be clear, I never suggested shutting down the database. What I have suggested is that repurposing the current approaching-xid-wraparound warnings to start complaining loudly when a threshold is exceeded would be helpful. I think it makes sense to make that threshold configurable, especially if we eventually have people running bloat-free table structures.
>
> There are two fundamental problems here. The first is that if, as you say, a table is progressively bloating and we are getting further and further behind on vacuuming and freezing, something is seriously wrong and we need to do something about it. In these cases, my experience is that vacuuming and related tools tend to suffer degraded performance, and determining how to solve the problem takes quite a bit more time than a routine bloat issue would. So what I am arguing against is treating the problem just as a bloat issue. If you get there due to vacuum being slow, something else is wrong and you are probably going to have to find and fix that as well in order to catch up. At least that's my experience.
>
> I don't object to the db continuing to run, allocate xids, etc. What I object to is it doing so silently where things are almost certainly going very wrong.

OK. My feeling is that the set of things we can do to warn the user is somewhat limited. I'm open to trying our best, but we need to have reasonable expectations. Sophisticated users will be monitoring for problems even if we do nothing to warn, and dumb ones won't look at their logs. Any feature that proposes to warn must aim at the users who are smart enough to check the logs but dumb enough not to have any more sophisticated monitoring. Such users certainly exist and are not even uncommon, but they aren't the only kind by a long shot.

My argument is that removing xidStopLimit is totally fine, because it only serves to stop the database. 
What to do about xidWarnLimit is a slightly more complex question. Certainly it can't be left untouched, because warning that we're about to shut down the database for lack of allocatable XIDs is not sensible if there is no such lack and we aren't going to shut it down. But I'm also not sure if the model is right. Doing nothing for a long time and then warning in every transaction when some threshold is crossed is an extreme behavior change. Right now that's somewhat justified because we're about to hit a brick wall at full speed, but if we remove the brick wall and replace it with a gentle pelting with rotten eggs, it's unclear that a similarly strenuous reaction is the right model. But that's also not to say that we should do nothing at all. -- Robert Haas EDB: http://www.enterprisedb.com
Re: An attempt to avoid locally-committed-but-not-replicated-to-standby-transactions in synchronous replication
On Tue, Nov 29, 2022 at 08:14:10AM -0800, SATYANARAYANA NARLAPURAM wrote:
>> 2. Process proc die immediately when a backend is waiting for sync replication acknowledgement, as it does today, however, upon restart, don't open up for business (don't accept read-only connections) unless the sync standbys have caught up.
>
> Are you planning to block connections or queries to the database? It would be good to allow connections and let them query the monitoring views but block the queries until the sync standbys have caught up. Otherwise, this leaves a monitoring hole. In the cloud, I presume superusers are allowed to connect and monitor (end customers are not the role members and can't query the data). The same can't be true for all the installations. Could you please add more details on your approach?

I think ALTER SYSTEM should be allowed, particularly so you can modify synchronous_standby_names, no?

-- 
Bruce Momjian  https://momjian.us
EDB  https://enterprisedb.com

Embrace your flaws. They make you human, rather than perfect, which you will never be.
Re: Patch: Global Unique Index
On Tue, 2022-11-29 at 13:58 +0100, Vik Fearing wrote:
> I disagree. A user does not need to know that a table is partitioned, and if the user wants a unique constraint on the table then making them type an extra word to get it is just annoying.

Hmm. But if I created a primary key without thinking too hard about it, only to discover later that dropping old partitions has become a problem, I would not be too happy either.

Yours,
Laurenz Albe
Re: Collation version tracking for macOS
On 11/28/22 14:11, Robert Haas wrote:
> On Wed, Nov 23, 2022 at 12:09 AM Thomas Munro wrote:
>> OK. Time for a new list of the various models we've discussed so far:
>>
>> 1. search-by-collversion: We introduce no new "library version" concept to COLLATION and DATABASE objects and little or no new syntax.
>> 2. lib-version-in-providers: We introduce a separate provider value for each ICU version, for example ICU63, plus an unversioned ICU like today.
>> 3. lib-version-in-attributes: We introduce daticuversion (alongside datcollversion) and collicuversion (alongside collversion). Similar to the above, but it's a separate property and the provider is always ICU. New syntax for CREATE/ALTER COLLATION/DATABASE to set and change ICU_VERSION.
>> 4. lib-version-in-locale: "63:en" from earlier versions. That was mostly a strawman proposal to avoid getting bogged down in syntax/catalogue/model change discussions while trying to prove that dlopen would even work. It doesn't sound like anyone really likes this.
>> 5. lib-version-in-collversion: We didn't explicitly discuss this before, but you hinted at it: we could just use u_getVersion() in [dat]collversion.
>
> I'd like to vote against #3 at least in the form that's described here. If we had three more libraries providing collations, it's likely that they would need versioning, too. So if we add an explicit notion of provider version, then it ought not to be specific to libicu.

+many

I think it's OK to decide that different library versions are different providers (your option #2), or that they are the same provider but give rise to different collations (your option #4), or that there can be multiple versions of each collation which are distinguished by some additional provider version field (your #3 made more generic). I think provider and collation version are distinct concepts. 
The provider ('c' versus 'i' for example) determines a unique code path in the backend due to different APIs, whereas collation version is related to a specific ordering given a set of characters.

I don't really understand #1 or #5 well enough to have an educated opinion, but I do think that #1 seems a bit magical. It hopes that the combination of a collation name and a datcollversion will be sufficient to find exactly one matching collation in a list of provided libraries. The advantage of that, as I understand it, is that if you do something to your system that causes the number of matches to go from one to zero, you can just throw another library on the pile and get the number back up to one. Woohoo! But there's a part of me that worries: what if the number goes up to two, and they're not all the same? Probably that's something that shouldn't happen, but if it does then I think there's kind of no way to fix it. With the other options, if there's some way to jigger the catalog state to match what you want to happen, you can always repair the situation somehow, because the library to be used for each collation is explicitly specified in some way, and you just have to get it to match what you want to have happen.

My vote is for something like #5. The collversion should indicate a specific immutable ordering behavior.

-- 
Joe Conway
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
Re: Add 64-bit XIDs into PostgreSQL 15
On Mon, Nov 28, 2022 at 4:53 PM Peter Geoghegan wrote:
> Imagine if we actually had 64-bit XIDs -- let's assume for a moment that it's a done deal. This raises a somewhat awkward question: do you just let the system get further and further behind on freezing, forever? We can all agree that 2 billion XIDs is very likely the wrong time to start refusing new XIDs -- because it just isn't designed with debt in mind. But what's the right time, if any? How much debt is too much?

I simply don't see a reason to ever stop the server entirely. I don't even agree with the idea of slowing down XID allocation, let alone refusing it completely.

When the range of allocated XIDs becomes too large, several bad things happen. First, we become unable to allocate new XIDs without corrupting the database. Second, pg_clog and other SLRUs become uncomfortably large. There may be some other things too that I'm not thinking about. But these things are not all equally bad. If these were medical problems, being unable to allocate new XIDs without data corruption would be a heart attack, and SLRUs getting bigger on disk would be acne. You don't handle problems of such wildly differing severity in the same way. When someone is having a heart attack, an ambulance rushes them to the hospital, running red lights as necessary. When someone has acne, you don't take them to the same hospital in the same ambulance and drive it at a slower rate of speed. You do something else entirely, and it's something that is in every way much less dramatic. There's no such thing as an attack of acne that's so bad that it requires an ambulance ride, but even a mild heart attack should result in a fast trip to the ER. So here. The two problems are so qualitatively different that the responses should also be qualitatively different.

> Admittedly this argument works a lot better with the failsafe than it does with xidStopLimit. Both are removed by the patch. 
I don't think the failsafe stuff should be removed, but it should probably be modified in some way. Running out of XIDs is the only valid reason for stopping the world, at least IMO, but it is definitely NOT the only reason for vacuuming more aggressively. -- Robert Haas EDB: http://www.enterprisedb.com
Re: An attempt to avoid locally-committed-but-not-replicated-to-standby-transactions in synchronous replication
On Sun, Nov 27, 2022 at 10:33 PM Bharath Rupireddy <bharath.rupireddyforpostg...@gmail.com> wrote:
> On Mon, Nov 28, 2022 at 12:57 AM Andrey Borodin wrote:
>>
>> Some funny stuff. If a user tries to cancel a non-replicated transaction, Azure Postgres will answer: "user requested cancel while waiting for synchronous replication ack. The COMMIT record has already flushed to WAL locally and might not have been replicated to the standby. We must wait here."
>> AWS RDS will answer: "ignoring request to cancel wait for synchronous replication"
>> Yandex Managed Postgres will answer: "canceling wait for synchronous replication due requested, but cancelation is not allowed. The transaction has already committed locally and might not have been replicated to the standby. We must wait here."
>>
>> So, for many services providing Postgres as a service it's only a matter of wording.
>
> Thanks for verifying the behaviour. And many thanks for an off-list chat.
>
> FWIW, I'm planning to prepare a patch as per the below idea, which is something similar to the initial proposal in this thread. Meanwhile, thoughts are welcome.
>
> 1. Disable query cancel/CTRL+C/SIGINT when a backend is waiting for sync replication acknowledgement.

+1

> 2. Process proc die immediately when a backend is waiting for sync replication acknowledgement, as it does today, however, upon restart, don't open up for business (don't accept read-only connections) unless the sync standbys have caught up.

Are you planning to block connections or queries to the database? It would be good to allow connections and let them query the monitoring views but block the queries until the sync standbys have caught up. Otherwise, this leaves a monitoring hole. In the cloud, I presume superusers are allowed to connect and monitor (end customers are not the role members and can't query the data). The same can't be true for all the installations. 
Could you please add more details on your approach?

> --
> Bharath Rupireddy
> PostgreSQL Contributors Team
> RDS Open Source Databases
> Amazon Web Services: https://aws.amazon.com
Re: fixing CREATEROLE
On Tue, Nov 29, 2022 at 2:32 AM wrote:
> I propose a slightly different syntax instead:
>
> ALTER DEFAULT PRIVILEGES GRANT CREATED ROLE TO role_specification WITH ...;
>
> This, together with the proposal above regarding the grantor, should be consistent.

I think that is more powerful than what I proposed but less fit for purpose. If alice is a CREATEROLE user and issues CREATE ROLE bob, my proposal allows alice to automatically obtain access to bob's privileges. Your proposal would allow that, but it would also allow alice to automatically confer bob's privileges on some third user, say charlie. Maybe that's useful to somebody, I don't know.

But one significant disadvantage of this is that every CREATEROLE user must have their own configuration. If we have CREATEROLE users alice, dave, and ellen, then alice needs to execute ALTER DEFAULT PRIVILEGES GRANT CREATED ROLE TO alice WITH ...; dave needs to do the same thing with dave instead of alice; and ellen needs to do the same thing with ellen instead of alice. There's no way to apply a system-wide configuration that applies nicely to all CREATEROLE users. A GUC would of course allow that, because it could be set in postgresql.conf and then overridden for particular databases, users, or sessions.

David claims that "these aren't privileges, they are memberships." I don't entirely agree with that, because I think that we're basically using memberships as a pseudonym for privileges where roles are concerned. However, it is true that there's no precedent for referring to role grants using the keyword PRIVILEGES at the SQL level, and the fact that the underlying machinery works in somewhat similar ways doesn't necessarily mean that it's OK to conflate the two concepts at the SQL level. So I'm still not very sold on this idea.

-- 
Robert Haas
EDB: http://www.enterprisedb.com
Re: Add 64-bit XIDs into PostgreSQL 15
On 11/29/22 09:46, Bruce Momjian wrote:
> As far as I know, all our freeze values are focused on avoiding XID wraparound. If XID wraparound is no longer an issue, we might find that our freeze limits can be much higher than they are now.

I'd be careful in that direction, as those values, together with maintenance_work_mem, also keep a lid on excessive index cleanup rounds.

Regards, Jan
Re: fixing CREATEROLE
On Tue, Nov 29, 2022 at 12:32 AM wrote:
>
> Is there any other argument to be made against ADP?

These aren't privileges, they are memberships. The pg_default_acl catalog is also per-database, while these settings should be present in a catalog which, like pg_authid, is cluster-wide. This latter point, for me, disqualifies the command itself from being used for this purpose. If we'd like to create ALTER DEFAULT MEMBERSHIP (and a corresponding cluster-wide catalog) then maybe the rest of the design would work within that.

> Note, that ADP allows much more than just creating a grant for the CREATEROLE user, which would be the case if the default GRANT was made TO the_create_role_user. But it could be made towards *other* users as well, so you could do something like this:
>
> CREATE ROLE alice CREATEROLE;
> CREATE ROLE bob;
>
> ALTER DEFAULT PRIVILEGES FOR alice GRANT CREATED ROLE TO bob WITH SET TRUE, INHERIT FALSE;

What does that accomplish? bob cannot create roles to actually exercise his privilege.

> This is much more flexible than role attributes or GUCs.

The main advantage of a GUC over a role attribute is that you can institute layers of defaults according to a given cluster's specific needs. ALTER ROLE SET (pg_db_role_setting - also cluster-wide) also comes into play; maybe alice wants auto-inherit while in db-a but not db-b (this would/will be more convincing if we end up having per-database roles). If we accept that some external configuration knowledge is going to influence the result of executing this command (Tom?) then it seems that all the features a GUC provides are desirable in determining how the final execution context is configured. Which makes sense, as this kind of thing is precisely what the GUC subsystem was designed to handle - session context environments related to the user and database presently connected.

David J.
Re: Support load balancing in libpq
Attached is a new version with the tests cleaned up a bit (more comments mostly). @Michael, did you have a chance to look at the last version? Because I feel that the patch is pretty much ready for a committer to look at, at this point.

Attachment: v5-0001-Support-load-balancing-in-libpq.patch
Re: Fix database creation during installchecks for ICU cluster
Hi,

Thanks for the patch! On 10/29/22 12:54, Marina Polyakova wrote:
> 1) The ECPG tests fail because they use the SQL_ASCII encoding [2], the database template0 uses the ICU locale provider and SQL_ASCII is not supported by ICU:
>
> $ make -C src/interfaces/ecpg/ installcheck
> ...
> == creating database "ecpg1_regression" ==
> ERROR: encoding "SQL_ASCII" is not supported with ICU provider
> ERROR: database "ecpg1_regression" does not exist
> command failed: "/home/marina/postgresql/master/my/inst/bin/psql" -X -c "CREATE DATABASE \"ecpg1_regression\" TEMPLATE=template0 ENCODING='SQL_ASCII'" -c "ALTER DATABASE \"ecpg1_regression\" SET lc_messages TO 'C';ALTER DATABASE \"ecpg1_regression\" SET lc_monetary TO 'C';ALTER DATABASE \"ecpg1_regression\" SET lc_numeric TO 'C';ALTER DATABASE \"ecpg1_regression\" SET lc_time TO 'C';ALTER DATABASE \"ecpg1_regression\" SET bytea_output TO 'hex';ALTER DATABASE \"ecpg1_regression\" SET timezone_abbreviations TO 'Default';" "postgres"

I can confirm that the same error happens on my end and your patch fixes the issue. But do the ECPG tests really require SQL_ASCII encoding? I removed the ECPG tests' encoding line [1], rebuilt, and 'make -C src/interfaces/ecpg/ installcheck' passed without applying your patch.

> 2) The option --no-locale in pg_regress is described as "use C locale" [3]. But in this case the created databases actually use the ICU locale provider with the ICU cluster locale from template0 (see diff_check_backend_used_provider.txt):
>
> $ make NO_LOCALE=1 installcheck

This works on my end without applying your patch. Commands I used are:

$ ./configure --with-icu --prefix=$PWD/build_dir
$ make && make install && export PATH=$PWD/build_dir/bin:$PATH
$ initdb --locale-provider icu --icu-locale en-US -D data && pg_ctl -D data -l logfile start
$ make NO_LOCALE=1 installcheck

[1] https://github.com/postgres/postgres/blob/ce20f8b9f4354b46b40fd6ebf7ce5c37d08747e0/src/interfaces/ecpg/test/Makefile#L18

Regards,
Nazir Bilal Yavuz
Re: Add 64-bit XIDs into PostgreSQL 15
On Tue, Nov 29, 2022 at 02:35:20PM +0100, Chris Travers wrote:
> So I think the problem is that PostgreSQL is becoming more and more scalable, hardware is becoming more capable, and certain use cases are continuing to scale up. Over time, we tend to find ourselves approaching the end of the runway at ever higher velocities. That's a problem that will get significantly worse over time.
>
> Of course, as I think we agree, the priorities should be (in order):
> 1. Avoid trouble
> 2. Recover from trouble early
> 3. Provide more and better options for recovery.

Warning about trouble is another area we should focus on here.

> I think 64bit xids are a very good idea, but they really fit in this bottom tier. Not being up against mathematical limits to the software when things are going bad is certainly a good thing. But I am really worried about the attitude that this patch really avoids trouble, because in many cases I don't think it does, and therefore I believe we need to make sure we are not reducing visibility of underlying problems.

As far as I know, all our freeze values are focused on avoiding XID wraparound. If XID wraparound is no longer an issue, we might find that our freeze limits can be much higher than they are now.

-- 
Bruce Momjian  https://momjian.us
EDB  https://enterprisedb.com

Embrace your flaws. They make you human, rather than perfect, which you will never be.
Re: Support logical replication of DDLs
On Tue, 29 Nov 2022 at 17:51, li jie wrote:
>
> I will continue to give feedback for this patch.

Thanks a lot, that will be very helpful for us.

> 1. LIKE STORAGE
> ```
> CREATE TABLE ctlt (a text, c text);
> ALTER TABLE ctlt ALTER COLUMN c SET STORAGE EXTERNAL;
> CREATE TABLE ctlt_storage (LIKE ctlt INCLUDING STORAGE);
> ```
>
> postgres=# \d+ ctlt_storage
>                              Table "public.ctlt_storage"
>  Column | Type | Collation | Nullable | Default | Storage  | Compression | Stats target | Description
> --------+------+-----------+----------+---------+----------+-------------+--------------+-------------
>  a      | text |           |          |         | extended |             |              |
>  c      | text |           |          |         | extended |             |              |
>
> It can be seen that the storage attribute of column c of table ctlt_storage is not replicated.
>
> After the CREATE TABLE LIKE statement is converted, the LIKE STORAGE attribute is lost, because it is difficult to express in the CREATE TABLE syntax. Maybe we need to add a statement for it, like 'ALTER TABLE ctlt_storage ALTER COLUMN c SET STORAGE EXTERNAL;'.
>
> 2. Referencing a column that is dropped in the same command.
> ```
> create table another (f1 int, f2 text, f3 text);
>
> alter table another
>   alter f1 type text using f2 || ' and ' || f3 || ' more',
>   alter f2 type bigint using f1 * 10,
>   drop column f3;
> ```
>
> The following error occurs downstream:
> ERROR: column "?dropped?column?" does not exist at character 206
> STATEMENT: ALTER TABLE public.another DROP COLUMN f3 , ALTER COLUMN f1 SET DATA TYPE pg_catalog.text COLLATE pg_catalog."default" USING (((f2 OPERATOR(pg_catalog.||) ' and '::pg_catalog.text) OPERATOR(pg_catalog.||) "?dropped?column?") OPERATOR(pg_catalog.||) ' more'::pg_catalog.text), ALTER COLUMN f2 SET DATA TYPE pg_catalog.int8 USING (f1 OPERATOR(pg_catalog.*) 10)
>
> Obviously, column f3 has been deleted and its name no longer exists. Maybe we need to keep it and save it in advance like a dropped object. However, ALTER TABLE is complex, and this problem also occurs in other similar scenarios. 
I will analyze these issues and post a patch to handle them.

Regards,
Vignesh
Re: Privileges on PUBLICATION
Antonin Houska wrote: > Peter Eisentraut wrote: > > > On 04.11.22 08:28, Antonin Houska wrote: > > > I thought about the whole concept a bit more and I doubt if the > > > PUBLICATION > > > privilege is the best approach. In particular, the user specified in > > > CREATE > > > SUBSCRIPTION ... CONNECTION ... (say "subscription user") needs to have > > > SELECT > > > privilege on the tables replicated. So if the DBA excludes some columns > > > from > > > the publication's column list and sets the (publication) privileges in > > > such a > > > way that the user cannot get the column values via other publications, the > > > user still can connect to the database directly and get values of the > > > excluded > > > columns. > > > > Why are the SELECT privileges needed? Maybe that's something to think about > > and maybe change. > > I haven't noticed an explanation in comments nor did I search in the mailing > list archives, but the question makes sense: the REPLICATION attribute of a > role is sufficient for streaming replication, so why should the logical > replication require additional privileges? > > Technically the SELECT privilege is needed because the sync worker does > actually execute SELECT query on the published tables. However, I realize now > that it's not checked by the output plugin. Thus if SELECT is revoked from the > "subscription user" after the table has been synchronized, the replication > continues to work. So the necessity for the SELECT privilege might be an > omission rather than a design choice. (Even the documentation says that the > SELECT privilege is needed only for the initial synchronization [1], however > it does not tell why.) > > > > As an alternative to the publication privileges, I think that the CREATE > > > SUBSCRIPTION command could grant ACL_SELECT automatically to the > > > subscription > > > user on the individual columns contained in the publication column list, > > > and > > > DROP SUBSCRIPTION would revoke that privilege. 
> > I think that approach is weird and unusual. Privileges and object creation should be separate operations.
>
> ok. Another approach would be to skip the check for the SELECT privilege (as well as the check for the USAGE privilege on the corresponding schema) if given column is being accessed via a publication which has it on its column list and if the subscription user has the USAGE privilege on that publication.
>
> So far I wasn't sure if we can do that because, if pg_upgrade grants the USAGE privilege on all publications to the "public" role, the DBAs who relied on the SELECT privileges might not notice that any role having the REPLICATION attribute can access all the published tables after the upgrade. (pg_upgrade can hardly do anything else because it has no information on the "subscription users", so it cannot convert the SELECT privilege on tables to the USAGE privileges on publications.)
>
> But now that I see that the logical replication doesn't check the SELECT privilege properly anyway, I think we can get rid of it.

The attached version tries to do that - as you can see in 0001, the SELECT privilege is not required for the walsender process. I also added a PUBLICATION_NAMES option to the COPY TO command so that the publisher knows which publications are subject to the ACL check. Only data of those publications are returned to the subscriber. (In the previous patch version the ACL checks were performed on the subscriber side, but I think that's not ideal in terms of security.)

I also added the regression tests for publications, enhanced psql (the \dRp+ command) so that it displays the publication ACL, and added a few missing pieces of documentation.

--
Antonin Houska
Web: https://www.cybertec-postgresql.com

publication_privileges_v03.tgz
Description: application/gzip
Re: Introduce a new view for checkpointer related stats
Hi,

On 11/28/22 6:58 PM, Robert Haas wrote:
> On Tue, Nov 22, 2022 at 3:53 PM Andres Freund wrote:
> > I think we should consider deprecating the pg_stat_bgwriter columns but leaving them in place for a few years. New stuff should only be added to pg_stat_checkpointer, but we don't need to break old monitoring queries.
>
> I vote to just remove them. I think that most people won't update their queries until they are forced to do so. I don't think it matters very much when we force them to do that. Our track record in following through on deprecations is pretty bad too, which is another consideration.

Same point of view.

Regards,

--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
'Flexible "partition pruning" hook' redux?
There's an old thread that interests me but which ended without any resolution/solution: https://www.postgresql.org/messageid/CA%2BHiwqGpoKmDg%2BTJdMfoeu3k4VQnxcfBon4fwBRp8ex_CTCsnA%40mail.gmail.com

Some of our basic requirements are:

1) All data must be labeled at a specific level (using SELinux multi-level security (MLS) policy).
2) Data of different levels cannot be stored in the same file on disk.
3) The Bell-LaPadula model must be applied, meaning read (select) down (return rows labeled at levels dominated by the querying process's level) is allowed; updates (insert/update/delete) can only be done to data at the same level as the executing process. BLM allows for write up, but in reality, since processes don't know about levels which dominate theirs, this doesn't happen.

In the past I've used RLS, sepgsql and some additional custom functions to create MLS databases, but this does not satisfy #2. Partitioning looks to be a way to achieve #2 and to possibly improve query performance, since partitions could be pruned based on the level of data stored in them. However, I'm not aware of a means to implement table-level dominance pruning.

The patch, in the thread noted above, proposed a hook to allow customized pruning of partitions, which is something I think would be useful. However, a number of questions and concerns were raised (some beyond my ability to even comprehend, since I don't have intimate knowledge of the code base) but never addressed.

What's the best way forward in a situation like this?

Ted
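The dominance-based pruning Ted describes is easy to model in isolation: one partition per level, readers see partitions their level dominates, writers touch only their own level. A minimal sketch (plain Python, purely illustrative; the level names are invented, levels are modeled as a simple total order, whereas real SELinux MLS labels form a lattice, and this is not PostgreSQL hook code):

```python
# Toy model of Bell-LaPadula pruning over level-labeled partitions.
LEVELS = {"unclassified": 0, "confidential": 1, "secret": 2, "top_secret": 3}

# One partition per level (requirement #2: no mixed levels in one file).
partitions = {name: [] for name in LEVELS}

def readable_partitions(process_level):
    """Read down: partitions at levels dominated by the process's level."""
    return [p for p, lvl in LEVELS.items() if lvl <= LEVELS[process_level]]

def writable_partitions(process_level):
    """Writes only at the process's own level (no write up in practice)."""
    return [p for p, lvl in LEVELS.items() if lvl == LEVELS[process_level]]

if __name__ == "__main__":
    print(sorted(readable_partitions("confidential")))
    print(writable_partitions("confidential"))
```

A hook of the kind the old thread proposed would, in effect, let an extension apply `readable_partitions()` during planning or execution so that higher-level partitions are never scanned at all.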
Re: Add 64-bit XIDs into PostgreSQL 15
On Mon, Nov 28, 2022 at 11:06 PM Peter Geoghegan wrote:
> On Mon, Nov 28, 2022 at 1:52 PM Bruce Momjian wrote:
> > I think the problem is that we still have bloat with 64-bit XIDs, specifically pg_xact and pg_multixact files. Yes, that bloat is less serious, but it is still an issue worth reporting in the server logs, though not serious enough to stop the server from write queries.
>
> That's definitely a big part of it.
>
> Again, I don't believe that the idea is fundamentally without merit. Just that it's not worth it, given that having more XID space is very much not something that I think fixes most of the problems. And given the real risk of serious bugs with something this invasive. I believe that it would be more useful to focus on just not getting into trouble in the first place, as well as on mitigating specific problems that lead to the system reaching xidStopLimit in practice. I don't think that there is any good reason to allow datfrozenxid to go past about a billion. When it does, the interesting questions are questions about what went wrong, and how that specific failure can be mitigated in a fairly direct way.
>
> We've already used way too much "XID space runway", so why should using even more help? It might, I suppose, but it almost seems like a distraction to me, as somebody that wants to make things better for users in general. As long as the system continues to misbehave (in whatever way it happens to be misbehaving), why should any amount of XID space ever be enough?

So I think the problem is that PostgreSQL is becoming more and more scalable, hardware is becoming more capable, and certain use cases are continuing to scale up. Over time, we tend to find ourselves approaching the end of the runway at ever higher velocities. That's a problem that will get significantly worse over time. Of course, as I think we agree, the priorities should be (in order):

1. Avoid trouble
2. Recover from trouble early
3.
Provide more and better options for recovery.

I think 64-bit XIDs are a very good idea, but they really fit in this bottom tier. Not being up against mathematical limits to the software when things are going bad is certainly a good thing. But I am really worried about the attitude that this patch really avoids trouble, because in many cases I don't think it does, and therefore I believe we need to make sure we are not reducing visibility of underlying problems.

> I think that we'll be able to get rid of freezing in a few years time. But as long as we have freezing, we have these problems.
>
> --
> Peter Geoghegan
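For readers following along, the "runway" is finite because 32-bit XIDs are compared circularly: an XID is "in the past" of another if it is within roughly two billion steps behind it modulo 2^32, which is essentially what TransactionIdPrecedes computes with signed 32-bit subtraction. A toy of that circular comparison (Python emulating the C int32 arithmetic; illustrative only, and it ignores PostgreSQL's reserved special XIDs):

```python
def xid_precedes(a, b):
    """Circular XID comparison: a < b modulo 2**32 (mimics int32 math)."""
    diff = (a - b) & 0xFFFFFFFF
    if diff >= 0x80000000:          # reinterpret as negative signed 32-bit
        diff -= 0x100000000
    return diff < 0

# Near wraparound: an XID allocated just after the counter wraps restarts
# at a small value, yet still compares as "later" than recent XIDs.
old, new = 2**32 - 10, 5            # new allocated after old, post-wrap
print(xid_precedes(old, new))       # True: old is in new's past

# The horizon: exactly 2**31 steps ahead still compares as "later",
# one step beyond that and the comparison flips -- hence the ~2 billion
# XID runway that vacuum/freezing must stay inside.
```

This is why "more runway" and "avoid trouble" are separate questions: the modular comparison works at any width, but the distance you can fall behind is always half the counter's range.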
Re: Logical Replication Custom Column Expression
On Fri, Nov 25, 2022 at 4:13 PM Stavros Koureas wrote:
>
> Yes, if the property is on the subscription side then it should be applied for all the tables that the connected publication is exposing. So if the property is enabled you should be sure that this origin column exists in all of the tables that the publication is exposing...

That would be too restrictive - not necessarily in your application, but generally. There could be some tables where consolidating rows with the same PK from different publishers into a single row on the subscriber would be desirable. I think we need to enable the property for every subscriber that intends to add the publisher column to the desired subscribed tables. But there should be another option per table which will indicate that the receiver should add the publisher when inserting rows into that table.

> Sure this is the complete idea, that the subscriber should match the PK of origin, as the subscription table will contain same key values from different origins, for example:

And yes, you probably need to change the way you reply to email on this list. Top-posting is generally avoided. See https://wiki.postgresql.org/wiki/Mailing_Lists.

--
Best Wishes,
Ashutosh Bapat
Re: SUBTRANS: Minimizing calls to SubTransSetParent()
Hi Tomas,

On Tue, Nov 29, 2022 at 14:06, Tomas Vondra wrote:
>
> On 11/29/22 13:49, Simon Riggs wrote:
> > On Thu, 17 Nov 2022 at 17:29, Simon Riggs wrote:
> >
> >> (yes, the last line shows x10 performance patched, that is not a typo)
> >
> > New version of patch, now just a one-line patch!
> >
> > Results show it's still a good win for XidInMVCCSnapshot().
>
> I'm a bit confused - for which workload/benchmark are there results? It's generally a good idea to share the scripts used to run the test and not just a chart.

The scripts have been attached to this thread with the initial performance results. Anyway, re-sending those (including a minor fix).

--
Julien Tachoires
EDB

subtrans-benchmark.tar.gz
Description: GNU Zip compressed data
Re: SUBTRANS: Minimizing calls to SubTransSetParent()
On 11/29/22 13:49, Simon Riggs wrote:
> On Thu, 17 Nov 2022 at 17:29, Simon Riggs wrote:
>
>> (yes, the last line shows x10 performance patched, that is not a typo)
>
> New version of patch, now just a one-line patch!
>
> Results show it's still a good win for XidInMVCCSnapshot().

I'm a bit confused - for which workload/benchmark are there results? It's generally a good idea to share the scripts used to run the test and not just a chart.

regards

--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Re: Add 64-bit XIDs into PostgreSQL 15
Hi; I suppose I must not have been clear in what I am suggesting we do and why. I will try to answer specific points below and then restate what I think the problem is, and what I think should be done about it.

On Mon, Nov 28, 2022 at 5:53 PM Robert Haas wrote:
> On Sat, Nov 26, 2022 at 4:08 AM Chris Travers wrote:
> > I didn't see any changes to pg_upgrade to make this change possible on upgrade. Is that also outside of the scope of your patch set? If so, how is that continuity supposed to be ensured?
>
> The scheme is documented in their 0006 patch, in a README.XID file. I'm not entirely confident that it's the best design and have argued against it in the past, but it's not crazy.

Right. Per previous discussion I thought there was some discussion of allowing people to run with the existing behavior. I must have been mistaken. If that is off the table then pg_upgrade and runtime replication checks don't matter.

> More generally, while I think there's plenty of stuff to be concerned about in this patch set and while I'm somewhat skeptical about the likelihood of its getting or staying committed, I can't really understand your concerns in particular. The thrust of your concern seems to be that if we allow people to get further behind, recovery will be more difficult. I'm not sure I see the problem. Suppose that we adopt this proposal and that it is bug-free. Now, consider a user who gets 8 billion XIDs behind. They probably have to vacuum pretty much every page in the database to do that, or at least every page in the tables that haven't been vacuumed recently. But that would likely also be true if they were 800 million XIDs behind, as is possible today. The effort to catch up doesn't increase linearly with how far behind you are, and is always bounded by the DB size.

Right. I agree with all of that.
> It is true that if the table is progressively bloating, it is likely to be more bloated by the time you are 8 billion XIDs behind than it was when you were 800 million XIDs behind. I don't see that as a very good reason not to adopt this patch, because you can bloat the table by an arbitrarily large amount while consuming only a small number of XIDs, even just 1 XID. Protecting against bloat is good, but shutting down the database when the XID age reaches a certain value is not a particularly effective way of doing that, so saying that we'll be hurting people by not shutting down the database at the point where we do so today doesn't ring true to me. I think that most people who get to the point of wraparound shutdown have workloads where bloat isn't a huge issue, because those who do start having problems with the bloat way before they run out of XIDs.

To be clear, I never suggested shutting down the database. What I have suggested is that repurposing the current approaching-xid-wraparound warnings to start complaining loudly when a threshold is exceeded would be helpful. I think it makes sense to make that threshold configurable, especially if we eventually have people running bloat-free table structures.

There are two fundamental problems here. The first is that if, as you say, a table is progressively bloating and we are getting further and further behind on vacuuming and freezing, something is seriously wrong and we need to do something about it. In these cases, my experience is that vacuuming and related tools tend to suffer degraded performance, and determining how to solve the problem takes quite a bit more time than a routine bloat issue would. So what I am arguing against is treating the problem just as a bloat issue. If you get there due to vacuum being slow, something else is wrong and you are probably going to have to find and fix that as well in order to catch up. At least that's my experience.
I don't object to the db continuing to run, allocate xids etc. What I object to is it doing so silently when things are almost certainly going very wrong.

> It would be entirely possible to add a parameter to the system that says "hey, you know we can keep running even if we're a shazillion XIDs behind, but instead shut down when we are behind by this number of XIDs." Then, if somebody wants to force an automatic shutdown at that point, they could, and I think that then the scenario you're worried about just can't happen any more. But isn't that a little bit silly? You could also just monitor how far behind you are and page the DBA when you get behind by more than a certain number of XIDs. Then, you wouldn't be risking a shutdown, and you'd still be able to stay on top of the XID ages of your tables.
>
> Philosophically, I disagree with the idea of shutting down the database completely in any situation in which a reasonable alternative exists. Losing read and write availability is really bad, and I don't think it's what users want. I think that most users want the database to
Re: Patch: Global Unique Index
On 11/24/22 19:15, Cary Huang wrote: On Thu, 24 Nov 2022 08:00:59 -0700 Thomas Kellerer wrote --- > Pavel Stehule wrote on 24.11.2022 at 07:03: > > There are many Oracle users that find global indexes useful despite > > their disadvantages. > > > > I have seen this mostly when the goal was to get the benefits of > > partition pruning at runtime which turned the full table scan (=Seq Scan) > > on huge tables to partition scans on much smaller partitions. > > Partition wise joins were also helpful for query performance. > > The substantially slower drop partition performance was accepted in those cases > > > > > > I think it would be nice to have the option in Postgres as well. > > > > I do agree however, that the global index should not be created automatically. > > > > Something like CREATE GLOBAL [UNIQUE] INDEX ... would be a lot better > > > > > > Is it necessary to use special marks like GLOBAL if this index will > > be partitioned, and uniqueness will be ensured by repeated > > evaluations? > > > > Or you think so there should be really forced one relation based > > index? > > > > I can imagine a unique index on partitions without a special mark, > > that will be partitioned, and a second variant classic index created > > over a partitioned table, that will be marked as GLOBAL. > > > My personal opinion is, that a global index should never be created > automatically. > > The user should consciously decide on using a feature > that might have a serious impact on performance in some areas. Agreed, if a unique index is created on non-partition key columns without including the special mark (partition key columns), it may be a mistake from the user. (At least I make this mistake all the time). Current PG will give you a warning to include the partition keys, which is good.
If we were to automatically turn that into a global unique index, the user may be using the feature without knowing and experiencing some performance impacts (to account for the extra uniqueness check in all partitions).

I disagree. A user does not need to know that a table is partitioned, and if the user wants a unique constraint on the table then making them type an extra word to get it is just annoying.

--
Vik Fearing
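The "extra uniqueness check in all partitions" Cary mentions is easy to picture: whichever syntax wins, a unique constraint on a non-partition-key column means every insert must probe every partition's local index, not just the target partition's. A toy model of that cross-partition cost (Python; purely illustrative, not how the patch implements it):

```python
class PartitionedTable:
    """Each partition keeps its own 'index' (a set of key values)."""
    def __init__(self, num_partitions):
        self.indexes = [set() for _ in range(num_partitions)]

    def insert(self, key, partition):
        # Global uniqueness: probe every partition's index, not just the
        # target one -- this is the per-insert cross-partition overhead.
        if any(key in idx for idx in self.indexes):
            raise ValueError(f"duplicate key {key!r}")
        self.indexes[partition].add(key)

t = PartitionedTable(4)
t.insert("a", 0)
t.insert("b", 3)
try:
    t.insert("a", 2)        # same key, different partition: rejected
except ValueError as e:
    print(e)
```

A local (partitioned) unique index would probe only `self.indexes[partition]`, which is why it can only guarantee uniqueness when the partition key is part of the indexed columns.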
Re: Bug in wait time when waiting on nested subtransaction
On Mon, 28 Nov 2022 at 21:10, Robert Haas wrote:
>
> On Mon, Nov 28, 2022 at 3:01 PM Tom Lane wrote:
> > One thing we need to be pretty careful of here is to not break the promise of atomic commit. At topmost commit, all subxacts must appear committed simultaneously. It's not quite clear to me whether we need a similar guarantee in the rollback case. It seems like we shouldn't, but maybe I'm missing something, in which case maybe the current behavior is correct?
>
> In short, I think Simon is right that there's a problem and right about which commit caused it, but I'm not sure what I think we ought to do about it.

I'm comfortable with ignoring it, on the basis that it *is* a performance optimization, but I suggest we keep the test (with modified output) and document the behavior, if we do.

The really big issue is the loss of performance we get from having subtrans point only to its immediate parent, which makes XidInMVCCSnapshot() go really slow in the presence of lots of subtransactions. So ignoring the issue on this thread will open the door for the optimization posted for this patch: https://commitfest.postgresql.org/40/3806/

--
Simon Riggs
http://www.EnterpriseDB.com/
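The trade-off Simon describes is easy to picture: if pg_subtrans records only each subtransaction's immediate parent, finding the topmost transaction for a visibility check means walking a chain, one lookup per nesting level, whereas pointing every subxact straight at the top collapses that to a single hop. A toy model (Python; illustrative only, not the actual pg_subtrans code):

```python
# parent[xid] -> immediate parent xid (top-level xids absent from the map)
parent = {}

def assign_subxact(xid, parent_xid):
    parent[xid] = parent_xid

def topmost(xid):
    """Walk immediate-parent links; O(nesting depth) per lookup."""
    steps = 0
    while xid in parent:
        xid = parent[xid]
        steps += 1
    return xid, steps

# A chain 100 <- 101 <- ... <- 110 of nested subtransactions.
for x in range(101, 111):
    assign_subxact(x, x - 1)

print(topmost(110))   # ten hops from the deepest subxact to xid 100

# The optimization on the other thread: record the topmost xid directly,
# so every lookup becomes a single hop regardless of nesting depth.
direct = {x: topmost(x)[0] for x in list(parent)}
print(direct[110])
```

The cost of the direct-pointer scheme is exactly the atomicity subtlety Tom raises: intermediate parents are no longer recorded, so per-subxact commit/abort state has to remain consistent with the topmost transaction at all times.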
Re: Introduce a new view for checkpointer related stats
On Mon, Nov 28, 2022 at 11:29 PM Robert Haas wrote: > > On Tue, Nov 22, 2022 at 3:53 PM Andres Freund wrote: > > I think we should consider deprecating the pg_stat_bgwriter columns but > > leaving them in place for a few years. New stuff should only be added to > > pg_stat_checkpointer, but we don't need to break old monitoring queries. > > I vote to just remove them. I think that most people won't update > their queries until they are forced to do so. I don't think it > matters very much when we force them to do that. > +1. -- With Regards, Amit Kapila.
Re: SUBTRANS: Minimizing calls to SubTransSetParent()
On Thu, 17 Nov 2022 at 17:29, Simon Riggs wrote:
> (yes, the last line shows x10 performance patched, that is not a typo)

New version of patch, now just a one-line patch!

Results show it's still a good win for XidInMVCCSnapshot().

--
Simon Riggs
http://www.EnterpriseDB.com/

subtrans_points_to_topxid.v13.patch
Description: Binary data
Re: GUC for temporarily disabling event triggers
> On 3 Nov 2022, at 21:47, Daniel Gustafsson wrote:
> The patch adds a new GUC, ignore_event_trigger with two option values, 'all' and 'none' (the login event patch had 'login' as well).

The attached v2 fixes a small bug which caused test failures in the CFBot.

--
Daniel Gustafsson
https://vmware.com/

v2-0001-Add-GUC-for-temporarily-disabling-event-triggers.patch
Description: Binary data
Re: Patch: Global Unique Index
On Fri, Nov 18, 2022 at 12:03:53PM +0300, Sergei Kornilov wrote:
> Hello
> Do we need new syntax actually? I think that a global unique index can be created automatically instead of raising an error "unique constraint on partitioned table must include all partitioning columns"

I may suggest even more of the new syntax. If someone has to implement sequential index checking on unique constraints, then it would be useful to be able to do that independent of partitioning also. E.g. for some kinds of manual partitions or for strangely designed datasets. Or for some of the table partitions instead for all of them.

For that reason, perhaps some other type of unique index -- that is not an index per se, but a check against a set of indexes -- could be added. Or, perhaps, not an index, but an EXCLUDE constraint of that kind.
Re: Perform streaming logical transactions by background workers and parallel apply
On Tue, Nov 29, 2022 at 10:18 AM houzj.f...@fujitsu.com wrote:
>
> Attach the new version patch which addressed all comments.

Review comments on v53-0001*
==

1.
Subscription *MySubscription = NULL;
-static bool MySubscriptionValid = false;
+bool MySubscriptionValid = false;

It seems still this variable is used in worker.c, so why has its scope changed?

2.
/* fields valid only when processing streamed transaction */
-static bool in_streamed_transaction = false;
+bool in_streamed_transaction = false;

Is it really required to change the scope of this variable? Can we think of exposing a macro or inline function to check it in applyparallelworker.c?

3.
should_apply_changes_for_rel(LogicalRepRelMapEntry *rel)
{
 if (am_tablesync_worker())
 return MyLogicalRepWorker->relid == rel->localreloid;
+ else if (am_parallel_apply_worker())
+ {
+ if (rel->state != SUBREL_STATE_READY)
+ ereport(ERROR,
+ (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+ errmsg("logical replication parallel apply worker for subscription \"%s\" will stop",

Is this check sufficient? What if the rel->state is SUBREL_STATE_UNKNOWN? I think that will be possible when the refresh publication has not been yet performed after adding a new relation to the publication. If that is true then won't we need to simply ignore that change and continue instead of erroring out? Can you please once test and check this case?

4.
+
+ case TRANS_PARALLEL_APPLY:
+ list_free(subxactlist);
+ subxactlist = NIL;
+
+ apply_handle_commit_internal(_data);

I don't think we need to retail pfree subxactlist as this is allocated in TopTransactionContext and will be freed at commit/prepare. This way freeing looks a bit ad hoc to me and you need to expose this list outside applyparallelworker.c which doesn't seem like a good idea to me either.

5.
+ apply_handle_commit_internal(_data);
+
+ pa_set_xact_state(MyParallelShared, PARALLEL_TRANS_FINISHED);
+ pa_unlock_transaction(xid, AccessShareLock);
+
+ elog(DEBUG1, "finished processing the transaction finish command");

I think in this and similar DEBUG logs, we can tell the exact command instead of writing 'finish'.

6.
apply_handle_stream_commit()
{
...
+ /*
+ * After sending the data to the parallel apply worker, wait for
+ * that worker to finish. This is necessary to maintain commit
+ * order which avoids failures due to transaction dependencies and
+ * deadlocks.
+ */
+ pa_wait_for_xact_finish(winfo);
+
+ pgstat_report_stat(false);
+ store_flush_position(commit_data.end_lsn);
+ stop_skipping_changes();
+
+ (void) pa_free_worker(winfo, xid);
...
}

apply_handle_stream_prepare(StringInfo s)
{
+
+ /*
+ * After sending the data to the parallel apply worker, wait for
+ * that worker to finish. This is necessary to maintain commit
+ * order which avoids failures due to transaction dependencies and
+ * deadlocks.
+ */
+ pa_wait_for_xact_finish(winfo);
+ (void) pa_free_worker(winfo, prepare_data.xid);
- /* unlink the files with serialized changes and subxact info. */
- stream_cleanup_files(MyLogicalRepWorker->subid, prepare_data.xid);
+ in_remote_transaction = false;
+
+ store_flush_position(prepare_data.end_lsn);

In both of the above functions, we should be consistent in calling the pa_free_worker() function, which I think should be immediately after pa_wait_for_xact_finish(). Is there a reason for not being consistent here?

7.
+ res = shm_mq_receive(winfo->error_mq_handle, , , true);
+
+ /*
+ * The leader will detach from the error queue and set it to NULL
+ * before preparing to stop all parallel apply workers, so we don't
+ * need to handle error messages anymore.
+ */
+ if (!winfo->error_mq_handle)
+ continue;

This check must be done before calling shm_mq_receive. So, changed it in the attached patch.

8.
@@ -2675,6 +3156,10 @@ store_flush_position(XLogRecPtr remote_lsn)
{
 FlushPosition *flushpos;

+ /* Skip for parallel apply workers. */
+ if (am_parallel_apply_worker())
+ return;

It is okay to always update the flush position by the leader apply worker, but I think the leader won't have an updated value for XactLastCommitEnd as the local transaction is committed by the parallel apply worker.

9.
@@ -3831,11 +4366,11 @@ ApplyWorkerMain(Datum main_arg)
 ereport(DEBUG1,
 (errmsg_internal("logical replication apply worker for subscription \"%s\" two_phase is %s",
- MySubscription->name,
- MySubscription->twophasestate == LOGICALREP_TWOPHASE_STATE_DISABLED ? "DISABLED" :
- MySubscription->twophasestate == LOGICALREP_TWOPHASE_STATE_PENDING ? "PENDING" :
- MySubscription->twophasestate == LOGICALREP_TWOPHASE_STATE_ENABLED ? "ENABLED" :
- "?")));
+ MySubscription->name,
+ MySubscription->twophasestate == LOGICALREP_TWOPHASE_STATE_DISABLED ? "DISABLED" :
+ MySubscription->twophasestate == LOGICALREP_TWOPHASE_STATE_PENDING ? "PENDING" :
+ MySubscription->twophasestate == LOGICALREP_TWOPHASE_STATE_ENABLED ? "ENABLED" :
+ "?")));

Is this change related to this patch?

10. What is the reason to expose
Re: Reducing power consumption on idle servers
On Sun, 20 Nov 2022 at 20:00, Simon Riggs wrote:
>
> On Thu, 24 Mar 2022 at 16:21, Robert Haas wrote:
> >
> > On Thu, Mar 24, 2022 at 12:02 PM Simon Riggs wrote:
> > > What changes will be acceptable for bgwriter, walwriter and logical worker?
> >
> > Hmm, I think it would be fine to introduce some kind of hibernation mechanism for logical workers. bgwriter and wal writer already have a hibernation mechanism, so I'm not sure what your concern is there exactly. In your initial email you said you weren't proposing changes there, but maybe that changed somewhere in the course of the subsequent discussion. If you could catch me up on your present thinking that would be helpful.
>
> Now that we seem to have solved the problem for Startup process, let's circle back to the others

Thanks for committing changes to the startup process.

> Bgwriter does hibernate currently, but only at 50x the bgwriter_delay. At default values that is 5s, but could be much less. So we need to move that up to 60s, same as others.
>
> WALwriter also hibernates currently, but only at 25x the wal_writer_delay. At default values that is 2.5s, but could be much less. So we need to move that up to 60s, same as others. At the same time, make sure that when we hibernate we use a new WaitEvent, similarly to the way bgwriter reports its hibernation state (which also helps test the patch).

Re-attaching patch for bgwriter and walwriter, so it is clear this is not yet committed. I've added this to the next CF, since the entry was closed when the startup patch was committed.

--
Simon Riggs
http://www.EnterpriseDB.com/

hibernate_bgwriter_walwriter.v5.patch
Description: Binary data
Re: Support logical replication of DDLs
I will continue to give feedback for this patch.

1. LIKE STORAGE
```
CREATE TABLE ctlt (a text, c text);
ALTER TABLE ctlt ALTER COLUMN c SET STORAGE EXTERNAL;
CREATE TABLE ctlt_storage (LIKE ctlt INCLUDING STORAGE);
```

postgres=# \d+ ctlt_storage
                              Table "public.ctlt_storage"
 Column | Type | Collation | Nullable | Default | Storage  | Compression | Stats target | Description
--------+------+-----------+----------+---------+----------+-------------+--------------+-------------
 a      | text |           |          |         | extended |             |              |
 c      | text |           |          |         | extended |             |              |

It can be seen that the storage attribute of column c of table ctlt_storage is not replicated. After the CREATE TABLE LIKE statement is converted, the LIKE STORAGE attribute is lost, because it is difficult to express it in the CREATE TABLE syntax. Maybe we need to add a statement for it, like 'ALTER TABLE ctlt_storage ALTER COLUMN c SET STORAGE EXTERNAL;'.

2. A referenced column may be dropped in the same command.
```
create table another (f1 int, f2 text, f3 text);
alter table another alter f1 type text using f2 || ' and ' || f3 || ' more', alter f2 type bigint using f1 * 10, drop column f3;
```

The following error occurs downstream:

ERROR: column "?dropped?column?" does not exist at character 206
STATEMENT: ALTER TABLE public.another DROP COLUMN f3 , ALTER COLUMN f1 SET DATA TYPE pg_catalog.text COLLATE pg_catalog."default" USING (((f2 OPERATOR(pg_catalog.||) ' and '::pg_catalog.text) OPERATOR(pg_catalog.||) "?dropped?column?") OPERATOR(pg_catalog.||) ' more'::pg_catalog.text), ALTER COLUMN f2 SET DATA TYPE pg_catalog.int8 USING (f1 OPERATOR(pg_catalog.*) 10)

Obviously, column f3 has been deleted and its name no longer exists. Maybe we need to keep it and save it in advance like a dropped object. However, ALTER TABLE is complex, and this problem also occurs in other similar scenarios. Thoughts?

Adger.
Re: O(n) tasks cause lengthy startups and checkpoints
On Mon, 28 Nov 2022 at 23:40, Nathan Bossart wrote:
>
> Okay, here is a new patch set. 0004 adds logic to prevent custodian tasks from delaying shutdown.

That all seems good, thanks.

The last important point for me is tests, in src/test/modules probably. It might be possible to reuse the final state of other modules' tests to test cleanup, or at least integrate a custodian test into each module.

--
Simon Riggs
http://www.EnterpriseDB.com/
Re: [PATCH] Infinite loop while acquiring new TOAST Oid
Hi! I'm working on this issue according to the plan Tom proposed above -

> I agree that we can't simply widen varatt_external to use 8 bytes for the toast ID in all cases. Also, I now get the point about avoiding use of globally assigned OIDs here: if the counter starts from zero for each table, then a variable-width varatt_external could actually be smaller than currently for many cases. However, that bit is somewhat orthogonal, and it's certainly not required for fixing the basic problem.

Have I understood correctly that you suppose using an individual counter for each TOAST table? I'm working on this approach, so we store counters in cache, but I see an issue with the first insert - when there is no counter in cache, we have to loop through the table with increasing steps to find an available one (i.e. after restart). Or do we still use a single global counter, but 64-bit with a wraparound?

> So it seems like the plan of attack ought to be:
> 1. Invent a new form or forms of varatt_external to allow different widths of the toast ID. Use the narrowest width possible for any given ID value.

I'm using the VARTAG field - there are plenty of available values, so there is no problem in distinguishing a regular toast pointer with a 'short' value id (4 bytes) from a long one (8 bytes). varatt_external currently is 32-bit aligned, so there is no reason in using a narrower type for value ids up to 16 bits. Or is it?

> 2. Allow TOAST tables/indexes to store either 4-byte or 8-byte IDs. (Conversion could be done as a side effect of table-rewrite operations, perhaps.)

Still on the toast/detoast part, will get to that later.

> 3. Localize ID selection so that tables can have small toast IDs even when other tables have many IDs. (Optional, could do later.)

Thank you!

--
Regards,
Nikita Malakhov
Postgres Professional
https://postgrespro.ru/
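The "narrowest width possible" idea - a tag selecting a 4-byte or 8-byte value ID, much as VARTAG distinguishes pointer variants - can be sketched with simple struct packing. This is a toy (Python; the tag values and byte layout are invented for illustration and are not the actual varatt_external or VARTAG format):

```python
import struct

TAG_ID32, TAG_ID64 = 0x01, 0x02   # invented tags, not real VARTAG values

def pack_toast_id(value_id):
    """Pick the narrowest of 4 or 8 bytes for the value ID."""
    if value_id < 2**32:
        return struct.pack("<BI", TAG_ID32, value_id)   # 1 + 4 bytes
    return struct.pack("<BQ", TAG_ID64, value_id)       # 1 + 8 bytes

def unpack_toast_id(buf):
    tag = buf[0]
    fmt = "<I" if tag == TAG_ID32 else "<Q"
    return struct.unpack_from(fmt, buf, 1)[0]

small = pack_toast_id(123456)
big = pack_toast_id(2**40)
print(len(small), len(big))        # 5 9
```

This also shows why per-table counters matter for the space argument: a table whose IDs restart from zero keeps using the short form, while a single global 64-bit counter would eventually push every table onto the wide form.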
Re: Reducing power consumption on idle servers
On Mon, 28 Nov 2022 at 23:16, Thomas Munro wrote:
>
> I found some more comments and some documentation to word-smith very lightly, and pushed.

Thanks

> The comments were stray references to the trigger file. It's a little confusing because the remaining mechanism also uses a file, but it uses a signal first so seems better to refer to promotion requests rather than files.

...and again

> /me wonders how many idle servers there are in the world

My estimate is there are 100 million servers in use worldwide, with only about 1% of them on a continuously busy duty cycle and most of them not in the cloud. If we guess that we save 10W when idle, then that saves some proportion of a GW. It's not a huge contribution to the effort, but if by doing this we inspire others to do the same, we may yet make a difference.

--
Simon Riggs
http://www.EnterpriseDB.com/
Re: Non-decimal integer literals
On Wed, 23 Nov 2022 at 08:56, David Rowley wrote:
>
> On Wed, 23 Nov 2022 at 21:54, David Rowley wrote:
> > I wonder if you'd be better off with something like:
> >
> > while (*ptr && isxdigit((unsigned char) *ptr))
> > {
> >     if (unlikely(tmp & UINT64CONST(0xF000000000000000)))
> >         goto out_of_range;
> >
> >     tmp = (tmp << 4) | hexlookup[(unsigned char) *ptr++];
> > }
>
> Here's a delta diff with it changed to work that way.

This isn't correct, because those functions are meant to accumulate a negative number in "tmp". The overflow check can't just ignore the final digit either, so I'm not sure how much this would end up saving once those issues are fixed.

Regards,
Dean
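To illustrate the two points Dean raises, here is a standalone sketch (an assumed helper, not the actual patch code): the parsed hex value is accumulated as a *negative* number in "tmp", so that INT64_MIN (whose magnitude does not fit in a positive int64) can be parsed, and every digit -- including the final one -- is range-checked before it is shifted in.

```c
#include <stdbool.h>
#include <stdint.h>

/* Parse the hex digit string s into *result, negating unless neg is set.
 * Returns false on a bad digit or on int64 overflow. */
static bool
parse_hex_i64(const char *s, bool neg, int64_t *result)
{
	int64_t		tmp = 0;

	if (*s == '\0')
		return false;

	for (; *s; s++)
	{
		int			digit;

		if (*s >= '0' && *s <= '9')
			digit = *s - '0';
		else if (*s >= 'a' && *s <= 'f')
			digit = *s - 'a' + 10;
		else if (*s >= 'A' && *s <= 'F')
			digit = *s - 'A' + 10;
		else
			return false;

		/* Would tmp * 16 - digit fall below INT64_MIN?  C division of a
		 * negative dividend truncates toward zero, which is exactly the
		 * ceiling we need here, so this check is tight for every digit,
		 * including the last. */
		if (tmp < (INT64_MIN + digit) / 16)
			return false;
		tmp = tmp * 16 - digit;
	}

	if (!neg)
	{
		if (tmp == INT64_MIN)
			return false;		/* +0x8000000000000000 is out of range */
		tmp = -tmp;
	}
	*result = tmp;
	return true;
}
```

With this shape, "-0x8000000000000000" parses successfully while "0x8000000000000000" is correctly rejected -- which an accumulate-positive loop with a top-nibble mask cannot distinguish.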
Re: [PoC] Reducing planning time when tables have many partitions
Dear Andrey and Thom,

Thank you for reviewing and testing the patch. I really apologize for my late response.

On Tue, Nov 8, 2022 at 8:31 PM Andrey Lepikhov wrote:
> Looking into find_em_for_rel() changes I see that you replaced
>     if (bms_is_subset(em->em_relids, rel->relids))
> with an assertion statement.
> According to get_ecmember_indexes(), the em_relids field of returned
> equivalence members can contain relids not mentioned in the relation.
> I don't understand why it works now. For example, we can sort by t1.x,
> but have an expression t1.x = t1.y * t2.z. Or have I missed something? If it
> is not a mistake, maybe add a comment on why the assertion doesn't fail here?

As you pointed out, changing the bms_is_subset() condition to an assertion is logically incorrect here. Thank you for telling me about it. I fixed it and attached the modified patch to this email.

On Thu, Nov 17, 2022 at 9:05 PM Thom Brown wrote:
> No issues with applying. Created 1024 partitions, each of which is
> partitioned into 64 partitions.
>
> I'm getting a generic planning time of 1415ms. Is that considered
> reasonable in this situation? Bear in mind that the planning time
> prior to this patch was 282311ms, so pretty much a 200x speedup.

Thank you for testing the patch with an actual query. This speedup is very impressive. When I used the original query with 1024 partitions, its planning time was about 200ms. Given that each partition is also partitioned in your workload, I think the result of 1415ms is reasonable.

--
Best regards,
Yuya Watari

Attachments:
v9-0001-Apply-eclass_member_speedup_v3.patch.patch
v9-0002-Revert-one-of-the-optimizations.patch
v9-0003-Use-conventional-algorithm-for-smaller-size-cases.patch
v9-0004-Fix-incorrect-assertion.patch
RE: Avoid distributing new catalog snapshot again for the transaction being decoded.
On Sat, Nov 26, 2022 at 19:50 PM Amit Kapila wrote:
> On Fri, Nov 25, 2022 at 5:30 PM Ashutosh Bapat wrote:
> >
> > Hi Hou,
> > Thanks for the patch. With a simple condition, we can eliminate the
> > need to queue the snapshot change in the current transaction and then
> > apply it. Saves some memory and computation. This looks useful.
> >
> > When the queued snapshot change is processed in
> > ReorderBufferProcessTXN(), we call SetupHistoricSnapshot(). But I
> > didn't find a path through which SetupHistoricSnapshot() is called
> > from SnapBuildCommitTxn().
>
> It will be called after SnapBuildCommitTxn() via ReorderBufferCommit()->
> ReorderBufferReplay()->ReorderBufferProcessTXN().
> But, I think what I don't see is how the snapshot we build in
> SnapBuildCommitTxn() will be assigned as a historic snapshot. We assign
> base_snapshot as a historic snapshot, and the new snapshot we build in
> SnapBuildCommitTxn() is assigned as base_snapshot only if the same is not set
> previously. I might be missing something, but if that is true then I don't
> think the patch is correct; OTOH I would expect some existing tests to fail
> if this change is incorrect.

Hi,

I also think that the snapshot we build in SnapBuildCommitTxn() will not be assigned as a historic snapshot. But I think that when the commandID message is processed in the function ReorderBufferProcessTXN, the snapshot of the current transaction will be updated. I also did some tests and found no problems. Here is my detailed analysis:

I think that when a transaction internally modifies the catalog, a record of XLOG_HEAP2_NEW_CID will be inserted into the WAL (see function log_heap_new_cid). Then during logical decoding, this record will be decoded into a change of type REORDER_BUFFER_CHANGE_INTERNAL_COMMAND_ID (see function ReorderBufferAddNewCommandId).
And I think the function ReorderBufferProcessTXN would update the historic snapshot (the subxip and curcid members of the snapshot) when processing this change.

And I have only one small comment:

+        * We've already done required modifications in snapshot for the
+        * transaction that just committed, so there's no need to add a new
+        * snapshot for the transaction again.
+        */
+       if (xid == txn->xid)
+           continue;

This comment seems a bit inaccurate. How about the comment below?
```
We don't need to add a snapshot to the transaction that just committed as it
will be able to see the new catalog contents after processing the new
commandID in ReorderBufferProcessTXN.
```

Regards,
Wang wei
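The effect of the `xid == txn->xid` early-exit discussed above can be modeled with a small standalone toy (this is not PostgreSQL code -- the Txn struct, distribute_snapshot(), and the version counter are invented for the example): when a catalog-modifying transaction commits, the new snapshot is distributed to every *other* in-progress transaction, and the committing transaction itself is skipped because it already sees the new catalog contents via its own command-id changes.

```c
#include <stddef.h>

typedef struct Txn
{
	int			xid;
	int			snapshot_version;	/* last snapshot queued to this txn */
} Txn;

/* Distribute snapshot "new_version" to all in-progress transactions except
 * the one that just committed; returns how many received it. */
static int
distribute_snapshot(Txn *txns, size_t ntxns, int committing_xid, int new_version)
{
	int			ndistributed = 0;

	for (size_t i = 0; i < ntxns; i++)
	{
		/* the condition from the patch: skip the committing txn itself */
		if (txns[i].xid == committing_xid)
			continue;
		txns[i].snapshot_version = new_version;
		ndistributed++;
	}
	return ndistributed;
}
```

The toy only shows the bookkeeping saved by the skip; in the real code the saving is the queued REORDER_BUFFER_CHANGE_INTERNAL_SNAPSHOT change and its later processing.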