Re: Could not run generate_unaccent_rules.py script when update unicode
On Wed, 27 Sep 2023 at 13:46, Michael Paquier wrote:
> On Wed, Sep 27, 2023 at 09:15:00AM +0800, Japin Li wrote:
>> On Wed, 27 Sep 2023 at 08:03, Michael Paquier wrote:
>>> I am not sure that many people run this script frequently so that may
>>> not be worth adding a check for a defined, still empty or incorrect
>>
>> Yeah, not frequently, however, it already be used by me, since we provide this
>> function, why not make it better?
>
> Well, I don't mind doing as you suggest.
>
>>> value, but..  If you were to change the Makefile we use in this path,
>>> how are you suggesting to change it?
>>
>> I provide a patch at bottom of in [1]. Attached here again.
>
> No file was attached so I somewhat missed it.  And indeed, you're
> right.  The current rule is useless when compiling without
> --with-python, as PYTHON is empty but defined.  What you are proposing
> is similar to what pgxs.mk does for bison and flex, and "python" would
> still be able to map to python3 or python2.7, which should be OK in
> most cases.
>
> I thought that this was quite old, but no, f85a485f8 was at the origin
> of that.  So I've applied the patch down to 13.

Thanks for your review and push!

-- 
Regards,
Japin Li
ChengDu WenWu Information Technology Co., Ltd.
Re: Invalidate the subscription worker in cases where a user loses their superuser status
On Wed, 27 Sept 2023 at 06:58, Peter Smith wrote:
>
> On Tue, Sep 26, 2023 at 11:57 PM vignesh C wrote:
> >
> > On Tue, 26 Sept 2023 at 13:03, Peter Smith wrote:
> > >
> > > Here are some comments for patch v2-0001.
> > >
> > > ==
> > > src/backend/replication/logical/worker.c
> > >
> > > 1. maybe_reread_subscription
> > >
> > > ereport(LOG,
> > > (errmsg("logical replication worker for subscription \"%s\"
> > > will restart because of a parameter change",
> > > MySubscription->name)));
> > >
> > > Is this really a "parameter change" though? It might be a stretch to
> > > call the user role change a subscription parameter change. Perhaps
> > > this case needs its own LOG message?
> >
> > When I was doing this change the same thought had come to my mind too
> > but later I noticed that in case of owner change there was no separate
> > log message. Since superuser check is somewhat similar to owner
> > change, I felt no need to make any change for this.
>
> Yeah, I had seen the same already before my comment. Anyway, YMMV.
>
> > > ==
> > > src/include/catalog/pg_subscription.h
> > >
> > > 2.
> > > char    *origin; /* Only publish data originating from the
> > >          * specified origin */
> > > + bool isownersuperuser; /* Is subscription owner superuser? */
> > > } Subscription;
> > >
> > > Is a new Subscription field member really necessary? The Subscription
> > > already has 'owner' -- why doesn't function
> > > maybe_reread_subscription() just check:
> > >
> > > (!superuser_arg(sub->owner) && superuser_arg(MySubscription->owner))
> >
> > We need the new variable so that we store this value when the worker
> > is started and check the current value with the value that was when
> > the worker was started. Since we need the value of the superuser
> > status when the worker is started, I feel this variable is required.
>
> OK. In that case, then shouldn't the patch replace the other
> superuser_arg() code already in function run_apply_worker() to make
> use of this variable? Otherwise, there are 2 ways of getting the same
> information.

Modified

> ==
> src/test/subscription/t/027_nosuperuser.pl
>
> I am suspicious that something may be wrong with either the code or
> the test because while experimenting, I accidentally found that even
> if I *completely* remove the important change below, the TAP test will
> still pass anyway.
>
> - !equal(newsub->publications, MySubscription->publications))
> + !equal(newsub->publications, MySubscription->publications) ||
> + (!newsub->isownersuperuser && MySubscription->isownersuperuser))

The test did not wait for the subscription to sync. Because the owner
had been changed to a non-superuser, the tablesync workers were getting
started later and failing with this error on HEAD. Now I have added a
wait for the subscription to sync, so the error will always come from
the apply worker restart, and when the test is run on HEAD we can also
see that the error message does not appear in the log. The v3 patch
attached at [1] has the changes for the same.

[1] - https://www.postgresql.org/message-id/CALDaNm2JwPmX0rhkTyjGRQmZjwbKax%3D3Ytgw2KY9msDPPNGmBg%40mail.gmail.com

Regards,
Vignesh
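The idea Vignesh describes — cache the owner's superuser status when the worker starts, then compare it against the current catalog state on each re-read — can be sketched as follows. This is an illustrative mock (types and names are simplified stand-ins, not the actual worker code):

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Mock of the relevant part of the Subscription struct: the worker
 * records the owner's superuser status at startup in
 * isownersuperuser, so a later catalog re-read can detect a change.
 */
typedef struct
{
	bool		isownersuperuser;	/* owner's status when worker started */
} MockSubscription;

/*
 * Restart only on a superuser -> non-superuser transition, so the
 * connection string gets revalidated (a password may now be required).
 */
static bool
needs_restart_for_owner_change(const MockSubscription *newsub,
							   const MockSubscription *mysub)
{
	return !newsub->isownersuperuser && mysub->isownersuperuser;
}
```

Note that gaining superuser status does not force a restart in this sketch; only losing it can make a previously valid connection string invalid.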
Re: Move global variables of pgoutput to plugin private scope.
On Wed, Sep 27, 2023 at 04:51:29AM +, Zhijie Hou (Fujitsu) wrote:
> While searching the code, I noticed one postgres fork where the
> PGOutputData is used in other places. Although it's a separate fork,
> it seems better to discuss the removal separately.
>
> [1] https://github.com/Tencent/TBase/blob/7cf7f8afbcab7290538ad5e65893561710be3dfa/src/backend/replication/walsender.c#L100

Indeed.  Interesting.
--
Michael
Re: Could not run generate_unaccent_rules.py script when update unicode
On Wed, Sep 27, 2023 at 09:15:00AM +0800, Japin Li wrote:
> On Wed, 27 Sep 2023 at 08:03, Michael Paquier wrote:
>> I am not sure that many people run this script frequently so that may
>> not be worth adding a check for a defined, still empty or incorrect
>
> Yeah, not frequently, however, it already be used by me, since we provide this
> function, why not make it better?

Well, I don't mind doing as you suggest.

>> value, but..  If you were to change the Makefile we use in this path,
>> how are you suggesting to change it?
>
> I provide a patch at bottom of in [1]. Attached here again.

No file was attached so I somewhat missed it.  And indeed, you're
right.  The current rule is useless when compiling without
--with-python, as PYTHON is empty but defined.  What you are proposing
is similar to what pgxs.mk does for bison and flex, and "python" would
still be able to map to python3 or python2.7, which should be OK in
most cases.

I thought that this was quite old, but no, f85a485f8 was at the origin
of that.  So I've applied the patch down to 13.
--
Michael
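The fallback discussed above — PYTHON being defined but empty when the tree is built without --with-python — can be sketched like this. This is a hypothetical Makefile fragment in the spirit of what pgxs.mk does for bison and flex, not the actual contrib/unaccent rule:

```make
# If configure did not find Python (built without --with-python),
# PYTHON is defined but empty; fall back to whatever "python" is on
# PATH, which typically maps to python3 (or python2.7 on older systems).
ifeq ($(PYTHON),)
PYTHON = python
endif

# Hypothetical rule name and arguments, for illustration only.
update-unicode:
	$(PYTHON) generate_unaccent_rules.py
```

With this shape, the rule still works for developers who have a usable python on PATH even when the server itself was configured without Python support.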
Re: Invalidate the subscription worker in cases where a user loses their superuser status
On Tue, 26 Sept 2023 at 13:03, Peter Smith wrote:
>
> Here are some comments for patch v2-0001.
>
> ==
> src/test/subscription/t/027_nosuperuser.pl
>
> 3.
> # The apply worker should get restarted after the superuser prvileges are
> # revoked for subscription owner alice.
>
> typo
>
> /prvileges/privileges/

Modified

> ~
>
> 4.
> +# After the user becomes non-superuser the apply worker should be restarted and
> +# it should fail with 'password is required' error as password option is not
> +# part of the connection string.
>
> /as password option/because the password option/

Modified

The attached v3 version patch has the changes for the same.

Regards,
Vignesh

From f2828562ad7619a6d1fb1844c6afba546df840b5 Mon Sep 17 00:00:00 2001
From: Vignesh C
Date: Fri, 22 Sep 2023 15:12:23 +0530
Subject: [PATCH v3] Restart the apply worker if the subscription owner has
 changed from superuser to non-superuser.

Restart the apply worker if the subscription owner has changed from
superuser to non-superuser. This is required so that the subscription
connection string gets revalidated to identify cases where the password
option is not specified as part of the connection string for
non-superuser.
---
 src/backend/catalog/pg_subscription.c       |  3 +++
 src/backend/commands/subscriptioncmds.c     |  4 ++--
 src/backend/replication/logical/tablesync.c |  3 +--
 src/backend/replication/logical/worker.c    | 18 ++-
 src/include/catalog/pg_subscription.h       |  1 +
 src/test/subscription/t/027_nosuperuser.pl  | 25 +
 6 files changed, 45 insertions(+), 9 deletions(-)

diff --git a/src/backend/catalog/pg_subscription.c b/src/backend/catalog/pg_subscription.c
index d07f88ce28..bc74b695c6 100644
--- a/src/backend/catalog/pg_subscription.c
+++ b/src/backend/catalog/pg_subscription.c
@@ -108,6 +108,9 @@ GetSubscription(Oid subid, bool missing_ok)
 								   Anum_pg_subscription_suborigin);
 	sub->origin = TextDatumGetCString(datum);
 
+	/* Get superuser for subscription owner */
+	sub->isownersuperuser = superuser_arg(sub->owner);
+
 	ReleaseSysCache(tup);
 
 	return sub;
diff --git a/src/backend/commands/subscriptioncmds.c b/src/backend/commands/subscriptioncmds.c
index 6fe111e98d..ac57b150d7 100644
--- a/src/backend/commands/subscriptioncmds.c
+++ b/src/backend/commands/subscriptioncmds.c
@@ -869,7 +869,7 @@ AlterSubscription_refresh(Subscription *sub, bool copy_data,
 	load_file("libpqwalreceiver", false);
 
 	/* Try to connect to the publisher. */
-	must_use_password = !superuser_arg(sub->owner) && sub->passwordrequired;
+	must_use_password = !sub->isownersuperuser && sub->passwordrequired;
 	wrconn = walrcv_connect(sub->conninfo, true, must_use_password,
 							sub->name, &err);
 	if (!wrconn)
@@ -1249,7 +1249,7 @@ AlterSubscription(ParseState *pstate, AlterSubscriptionStmt *stmt,
 				load_file("libpqwalreceiver", false);
 				/* Check the connection info string. */
 				walrcv_check_conninfo(stmt->conninfo,
-									  sub->passwordrequired && !superuser_arg(sub->owner));
+									  sub->passwordrequired && !sub->isownersuperuser);
 
 				values[Anum_pg_subscription_subconninfo - 1] =
 					CStringGetTextDatum(stmt->conninfo);
diff --git a/src/backend/replication/logical/tablesync.c b/src/backend/replication/logical/tablesync.c
index e2cee92cf2..03e4d6adbd 100644
--- a/src/backend/replication/logical/tablesync.c
+++ b/src/backend/replication/logical/tablesync.c
@@ -1278,9 +1278,8 @@ LogicalRepSyncTableStart(XLogRecPtr *origin_startpos)
 
 	/* Is the use of a password mandatory? */
 	must_use_password = MySubscription->passwordrequired &&
-		!superuser_arg(MySubscription->owner);
+		!MySubscription->isownersuperuser;
 
-	/* Note that the superuser_arg call can access the DB */
 	CommitTransactionCommand();
 
 	SpinLockAcquire(&MyLogicalRepWorker->relmutex);
diff --git a/src/backend/replication/logical/worker.c b/src/backend/replication/logical/worker.c
index 597947410f..f0e202928a 100644
--- a/src/backend/replication/logical/worker.c
+++ b/src/backend/replication/logical/worker.c
@@ -3942,7 +3942,8 @@ maybe_reread_subscription(void)
 	 * The launcher will start a new worker but note that the parallel apply
 	 * worker won't restart if the streaming option's value is changed from
 	 * 'parallel' to any other value or the server decides not to stream the
-	 * in-progress transaction.
+	 * in-progress transaction. Exit if the owner of the subscription has
+	 * changed from superuser to a non-superuser.
 	 */
 	if (strcmp(newsub->conninfo, MySubscription->conninfo) != 0 ||
 		strcmp(newsub->name, MySubscription->name) != 0 ||
@@ -3952,7 +3953,8 @@ maybe_reread_subscription(void)
 		newsub->passwordrequired != MySubscription->passwordrequired ||
 		strcmp(newsub->origin, MySubscription->origin) != 0 ||
 		newsub->owner != MySubscription->owner ||
-		!equal(newsub->publications, MySubscription->publications))
+		!equal(newsub->publications, MySubscription->publications) ||
+		(!newsub->isownersuperuser &&
Re: pg_upgrade --check fails to warn about abstime
On Fri, Sep 22, 2023 at 4:44 PM Alvaro Herrera wrote:
> On 2023-Sep-21, Tom Lane wrote:
>
> > Bruce Momjian writes:
> > > Wow, I never added code to pg_upgrade to check for that, and no one
> > > complained either.
> >
> > Yeah, so most people had indeed listened to warnings and moved away
> > from those datatypes.  I'm inclined to think that adding code for this
> > at this point is a bit of a waste of time.
>
> The migrations from versions prior to 12 have not stopped yet, and I did
> receive a complaint about it.  Because the change is so simple, I'm
> inclined to patch it anyway, late though it is.
>
> I decided to follow Tristan's advice to add the version number as a
> parameter to the new function; this way, the knowledge of where was what
> dropped is all in the callsite and none in the function.  It
> looked a bit schizoid otherwise.

Yeah, looks good to me.

> --
> Álvaro Herrera         48°01'N 7°57'E  —  https://www.EnterpriseDB.com/
> "Postgres is bloatware by design: it was built to house
>  PhD theses." (Joey Hellerstein, SIGMOD annual conference 2002)

--
Thanks & Regards,
Suraj kharage,
edbpostgres.com
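The design Alvaro describes — passing the removal version as a parameter so the callsite alone records what was dropped where — can be sketched like this. The struct, function name, and version representation below are illustrative, not the actual pg_upgrade code:

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical description of one removed datatype: its name and the
 * major version in which it was dropped (abstime and friends were
 * removed in PostgreSQL 12).
 */
struct removed_type_check
{
	const char *type_name;		/* e.g. "abstime" */
	int			removed_in;		/* major version that dropped it */
};

/*
 * The check is only needed when the upgrade crosses the removal
 * boundary: the old cluster may still contain the type, and the new
 * cluster no longer has it.
 */
static bool
needs_removed_type_check(const struct removed_type_check *chk,
						 int old_major, int new_major)
{
	return old_major < chk->removed_in && new_major >= chk->removed_in;
}
```

Keeping the version knowledge at the callsite means the generic function never needs its own hardcoded list of "what disappeared when".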
Re: Move global variables of pgoutput to plugin private scope.
On Wed, Sep 27, 2023 at 10:26 AM Michael Paquier wrote:
>
> On Wed, Sep 27, 2023 at 10:15:24AM +0530, Amit Kapila wrote:
> > It's like that from the beginning. Now, even if we want to move, your
> > suggestion is not directly related to this patch as we are just
> > changing one field, and that too to fix a bug. We should start a
> > separate thread to gather a broader consensus if we want to move the
> > exposed structure to an internal file.
>
> As you wish.

Thanks.

> You are planning to take care of the patches 0001 and
> 0002 posted on this thread, I guess?

I have tested and reviewed the
v2-0001-Maintain-publish_no_origin-in-output-plugin-priv and
v2-PG16-0001-Maintain-publish_no_origin-in-output-plugin-priva
patches posted in the email [1]. I'll push those unless there are more
comments on them.

I have briefly looked at v2-0002-Move-in_streaming-to-output-private-data
in the same email [1] but didn't think about it in detail (like whether
there is any live bug that can be fixed, or whether it is just an
improvement). If you want to look at and commit
v2-0002-Move-in_streaming-to-output-private-data, I am fine with that.

[1] - https://www.postgresql.org/message-id/OS0PR01MB57164B085332DB677DBFA8E994C3A%40OS0PR01MB5716.jpnprd01.prod.outlook.com

--
With Regards,
Amit Kapila.
Re: Move global variables of pgoutput to plugin private scope.
On Wed, Sep 27, 2023 at 10:15:24AM +0530, Amit Kapila wrote:
> It's like that from the beginning. Now, even if we want to move, your
> suggestion is not directly related to this patch as we are just
> changing one field, and that too to fix a bug. We should start a
> separate thread to gather a broader consensus if we want to move the
> exposed structure to an internal file.

As you wish.  You are planning to take care of the patches 0001 and
0002 posted on this thread, I guess?
--
Michael
RE: Move global variables of pgoutput to plugin private scope.
On Wednesday, September 27, 2023 12:45 PM Amit Kapila wrote:
>
> On Wed, Sep 27, 2023 at 9:46 AM Michael Paquier wrote:
> >
> > On Wed, Sep 27, 2023 at 09:39:19AM +0530, Amit Kapila wrote:
> > > On Wed, Sep 27, 2023 at 9:10 AM Michael Paquier wrote:
> > >> Err, actually, I am going to disagree here for the patch of HEAD.
> > >> It seems to me that there is zero need for pgoutput.h and we don't
> > >> need to show PGOutputData to the world. The structure is internal
> > >> to pgoutput.c and used only by its internal static routines.
> > >
> > > Do you disagree with the approach for the PG16 patch or HEAD? You
> > > mentioned HEAD but your argument sounds like you disagree with a
> > > different approach for PG16.
> >
> > Only HEAD where the structure should be moved from pgoutput.h to
> > pgoutput.c, IMO.
>
> It's like that from the beginning. Now, even if we want to move, your
> suggestion is not directly related to this patch as we are just changing
> one field, and that too to fix a bug. We should start a separate thread
> to gather a broader consensus if we want to move the exposed structure
> to an internal file.

While searching the code, I noticed one postgres fork where the
PGOutputData is used in other places. Although it's a separate fork,
it seems better to discuss the removal separately.

[1] https://github.com/Tencent/TBase/blob/7cf7f8afbcab7290538ad5e65893561710be3dfa/src/backend/replication/walsender.c#L100

Best Regards,
Hou zj
Re: Move global variables of pgoutput to plugin private scope.
On Wed, Sep 27, 2023 at 9:46 AM Michael Paquier wrote:
>
> On Wed, Sep 27, 2023 at 09:39:19AM +0530, Amit Kapila wrote:
> > On Wed, Sep 27, 2023 at 9:10 AM Michael Paquier wrote:
> >> Err, actually, I am going to disagree here for the patch of HEAD. It
> >> seems to me that there is zero need for pgoutput.h and we don't need
> >> to show PGOutputData to the world. The structure is internal to
> >> pgoutput.c and used only by its internal static routines.
> >
> > Do you disagree with the approach for the PG16 patch or HEAD? You
> > mentioned HEAD but your argument sounds like you disagree with a
> > different approach for PG16.
>
> Only HEAD where the structure should be moved from pgoutput.h to
> pgoutput.c, IMO.

It's like that from the beginning. Now, even if we want to move, your
suggestion is not directly related to this patch as we are just changing
one field, and that too to fix a bug. We should start a separate thread
to gather a broader consensus if we want to move the exposed structure
to an internal file.

--
With Regards,
Amit Kapila.
Re: Synchronizing slots from primary to standby
Here are some more review comments for the patch v19-0002.

This is a WIP; these review comments are all for the file slotsync.c.

==
src/backend/replication/logical/slotsync.c

1. wait_for_primary_slot_catchup

+ WalRcvExecResult *res;
+ TupleTableSlot *slot;
+ Oid slotRow[1] = {LSNOID};
+ StringInfoData cmd;
+ bool isnull;
+ XLogRecPtr restart_lsn;
+
+ for (;;)
+ {
+ int rc;

I could not recognize a reason why 'rc' is declared within the loop,
but none of the other local variables are. Personally, I'd declare all
variables at the deepest scope (e.g. inside the for loop).

~~~

2. get_local_synced_slot_names

+/*
+ * Get list of local logical slot names which are synchronized from
+ * primary and belongs to one of the DBs passed in.
+ */
+static List *
+get_local_synced_slot_names(Oid *dbids)
+{

IIUC, this function gets called only from the drop_obsolete_slots()
function. But I thought this list of local slot names (i.e. for the
dbids that this worker is handling) would be something that perhaps
could be initialized one time for the worker, instead of it being
re-calculated every single time the slots processing/dropping happens.
Isn't the current code expending too much effort recalculating over
and over but giving back the same list every time?

~~~

3. get_local_synced_slot_names

+ for (int i = 0; i < max_replication_slots; i++)
+ {
+ ReplicationSlot *s = &ReplicationSlotCtl->replication_slots[i];
+
+ /* Check if it is logical synchronized slot */
+ if (s->in_use && SlotIsLogical(s) && s->data.synced)
+ {
+ for (int j = 0; j < MySlotSyncWorker->dbcount; j++)
+ {

Loop variables are not declared in the common PG code way.

~~~

4. slot_exists_locally

+static bool
+slot_exists_locally(List *remote_slots, ReplicationSlot *local_slot,
+ bool *locally_invalidated)
+{
+ ListCell *cell;
+
+ foreach(cell, remote_slots)
+ {
+ RemoteSlot *remote_slot = (RemoteSlot *) lfirst(cell);
+
+ if (strcmp(remote_slot->name, NameStr(local_slot->data.name)) == 0)
+ {
+ /*
+ * if remote slot is marked as non-conflicting (i.e. not
+ * invalidated) but local slot is marked as invalidated, then set
+ * the bool.
+ */
+ if (!remote_slot->conflicting &&
+ SlotIsLogical(local_slot) &&
+ local_slot->data.invalidated != RS_INVAL_NONE)
+ *locally_invalidated = true;
+
+ return true;
+ }
+ }
+
+ return false;
+}

Why is there a SlotIsLogical(local_slot) check buried in this
function? How is slot_exists_locally() getting called with a
non-logical local_slot? Shouldn't that have been screened out long
before here?

~~~

5. use_slot_in_query

+static bool
+use_slot_in_query(char *slot_name, Oid *dbids)

There are multiple non-standard for-loop variable declarations in this
function.

~~~

6. compute_naptime

+ * The first slot managed by each worker is chosen for monitoring purpose.
+ * If the lsn of that slot changes during each sync-check time, then the
+ * nap time is kept at regular value of WORKER_DEFAULT_NAPTIME_MS.
+ * When no lsn change is observed for WORKER_INACTIVITY_THRESHOLD_MS
+ * time, then the nap time is increased to WORKER_INACTIVITY_NAPTIME_MS.
+ * This nap time is brought back to WORKER_DEFAULT_NAPTIME_MS as soon as
+ * lsn change is observed.

6a.
/regular value/the regular value/

/for WORKER_INACTIVITY_THRESHOLD_MS time/within the threshold period
(WORKER_INACTIVITY_THRESHOLD_MS)/

~

6b.
/as soon as lsn change is observed./as soon as another lsn change is
observed./

~~~

7.

+ * The caller is supposed to ignore return-value of 0. The 0 value is returned
+ * for the slots other that slot being monitored.
+ */
+static long
+compute_naptime(RemoteSlot *remote_slot)

This rule about the returning 0 seemed hacky to me. IMO this would be
a better API: pass long *naptime, which this function either updates
or doesn't update, depending on whether this is the "monitored" slot.
Knowing the current naptime is also useful to improve the function
logic (see the next review comment below).

Also, since this function is really only toggling naptime between 2
values, it would be helpful to assert that

Assert(*naptime == WORKER_DEFAULT_NAPTIME_MS ||
       *naptime == WORKER_INACTIVITY_NAPTIME_MS);

~~~

8.

+ if (NameStr(MySlotSyncWorker->monitoring_info.slot_name)[0] == '\0')
+ {
+ /*
+ * First time, just update the name and lsn and return regular
+ * nap time. Start comparison from next time onward.
+ */
+ strcpy(NameStr(MySlotSyncWorker->monitoring_info.slot_name),
+        remote_slot->name);

I wasn't sure why it was necessary to identify the "monitoring" slot
by name. Why doesn't compute_naptime just get called only for the 1st
slot found in the tuple loop, instead of all the strcmp business trying
to match monitor names? And, if the monitored slot gets "dropped", then
so what; next time another slot will be the first tuple so will
automatically take its place, right?

~~~

9.

+ /*
+ * If new received lsn (remote one) is different from what we have in
+ * our local slot, then update last_update_time.
+ */
+ if
RE: [PoC] pg_upgrade: allow to upgrade publisher node
Dear Bharath,

Thank you for reviewing!

> Thanks for the new patch. Here's a comment on v46:
>
> 1.
> +Datum
> +binary_upgrade_validate_wal_logical_end(PG_FUNCTION_ARGS)
>
> +{ oid => '8046', descr => 'for use by pg_upgrade',
> +  proname => 'binary_upgrade_validate_wal_logical_end', proisstrict => 'f',
> +  provolatile => 'v', proparallel => 'u', prorettype => 'bool',
> +  proargtypes => 'pg_lsn',
> +  prosrc => 'binary_upgrade_validate_wal_logical_end' },
>
> I think this patch can avoid catalog changes by turning
> binary_upgrade_validate_wal_logical_end a FRONTEND-only function
> sitting in xlogreader.c after making InitXLogReaderState(),
> ReadNextXLogRecord() FRONTEND-friendly (replace elog/ereport with
> pg_fatal or such). With this change and back-porting of commit
> e0b2eed0 to save logical slots at shutdown, the patch can help support
> upgrading logical replication slots on PG versions < 17.

Hmm, I think your suggestion may be questionable. If we implement the
upgrading function as FRONTEND-only (I have not checked its
feasibility), it means pg_upgrade uses the latest-version WAL reader
API to read WALs in the old-version cluster, which I don't think is
advisable.

Each WAL page header has a magic number, XLOG_PAGE_MAGIC, which
indicates the content of the WAL. The value has sometimes been changed
due to changes in the WAL contents, and some functions require that the
magic number be the same as expected, e.g., the startup process and the
pg_walinspect functions require that. Typically,
XLogReaderValidatePageHeader() ensures the equality.

Now some functions are ported from pg_walinspect, so the upgrading
function requires the same restriction. I think we should not ease the
restriction used to verify the completeness of the files.

The following are the call stacks of the ported functions down to
XLogReaderValidatePageHeader().
```
InitXLogReaderState()
  XLogFindNextRecord()
    ReadPageInternal()
      XLogReaderValidatePageHeader()
```

```
ReadNextXLogRecord()
  XLogReadRecord()
    XLogReadAhead()
      XLogDecodeNextRecord()
        ReadPageInternal()
          XLogReaderValidatePageHeader()
```

Best Regards,
Hayato Kuroda
FUJITSU LIMITED
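The magic-number restriction described above can be sketched in miniature. This is a mock (the struct layout and magic value are simplified stand-ins, not the real XLogPageHeaderData or XLOG_PAGE_MAGIC), showing only the shape of the check XLogReaderValidatePageHeader() performs:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Illustrative value only; the real XLOG_PAGE_MAGIC differs per major
 * version, which is exactly what makes cross-version reading fail. */
#define MY_XLOG_PAGE_MAGIC 0xD113

/* Simplified stand-in for the start of a WAL page header. */
typedef struct
{
	uint16_t	xlp_magic;		/* magic value for correctness checks */
	uint16_t	xlp_info;		/* flag bits */
} MockPageHeader;

/*
 * A reader built for one major version must reject pages written by
 * another: a mismatched magic means the WAL format may be different,
 * so parsing it would be unsafe.
 */
static bool
validate_page_magic(const MockPageHeader *hdr)
{
	return hdr->xlp_magic == MY_XLOG_PAGE_MAGIC;
}
```

This is why a frontend reader built from the latest sources cannot simply be pointed at an old cluster's WAL: the page headers it validates would carry a different magic.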
Re: Move global variables of pgoutput to plugin private scope.
On Wed, Sep 27, 2023 at 09:39:19AM +0530, Amit Kapila wrote:
> On Wed, Sep 27, 2023 at 9:10 AM Michael Paquier wrote:
>> Err, actually, I am going to disagree here for the patch of HEAD.  It
>> seems to me that there is zero need for pgoutput.h and we don't need
>> to show PGOutputData to the world.  The structure is internal to
>> pgoutput.c and used only by its internal static routines.
>
> Do you disagree with the approach for the PG16 patch or HEAD? You
> mentioned HEAD but your argument sounds like you disagree with a
> different approach for PG16.

Only HEAD, where the structure should be moved from pgoutput.h to
pgoutput.c, IMO.  The proposed patch for PG16 is OK as the size of the
structure should not change in a branch already released.
--
Michael
Re: Move global variables of pgoutput to plugin private scope.
On Wed, Sep 27, 2023 at 9:10 AM Michael Paquier wrote:
>
> On Tue, Sep 26, 2023 at 01:55:10PM +, Zhijie Hou (Fujitsu) wrote:
> > On Tuesday, September 26, 2023 4:40 PM Amit Kapila wrote:
> >> Do we really need a new parameter in above structure? Can't we just use the
> >> existing origin in the same structure? Please remember if this needs to be
> >> backpatched then it may not be good idea to add new parameter in the
> >> structure but apart from that having two members to represent similar
> >> information doesn't seem advisable to me. I feel for backbranch we can
> >> just use PGOutputData->origin for comparison and for HEAD, we can remove
> >> origin and just use a boolean to avoid any extra cost for comparisions
> >> for each change.
> >
> > OK, I agree to remove the origin string on HEAD and we can add that back
> > when we support other origin value. I also modified to use the string for
> > comparison as suggested for back-branch. I will also test it locally to
> > confirm it doesn't affect the perf.
>
> Err, actually, I am going to disagree here for the patch of HEAD. It
> seems to me that there is zero need for pgoutput.h and we don't need
> to show PGOutputData to the world. The structure is internal to
> pgoutput.c and used only by its internal static routines.

Do you disagree with the approach for the PG16 patch or HEAD? You
mentioned HEAD but your argument sounds like you disagree with a
different approach for PG16.

--
With Regards,
Amit Kapila.
Re: pg_upgrade and logical replication
On Tue, Sep 26, 2023 at 09:40:48AM +0530, Amit Kapila wrote:
> On Mon, Sep 25, 2023 at 11:43 AM Michael Paquier wrote:
>> Sure, that's assuming that the publisher side is upgraded.
>
> At some point, user needs to upgrade publisher and subscriber could
> itself have some publications defined which means the downstream
> subscribers will have the same problem.

Not always.  I take it as a valid case that one may want to create a
logical setup only for the sake of an upgrade, and trashes the
publisher after a failover to an upgraded subscriber node after the
latter has done a sync up of the data that's been added to the
relations tracked by the publications while the subscriber was
pg_upgrade'd.

>>> This is the primary reason why I prioritized to work on the publisher
>>> side before getting this patch done, otherwise, the solution for this
>>> patch was relatively clear.  I am not sure but I guess this could be
>>> the reason why originally we left it in the current state, otherwise,
>>> restoring subscription rel's or origin doesn't seem to be too much of
>>> an additional effort than what we are doing now.
>>
>> By "additional effort", you are referring to what the patch is doing,
>> with the binary dump of pg_subscription_rel, right?
>
> Yes.

Okay.  I'd like to move on with this stuff, then.  At least it helps in
maintaining data integrity when doing an upgrade with a logical setup.
The patch still needs more polishing, though..
--
Michael
Re: Move global variables of pgoutput to plugin private scope.
On Tue, Sep 26, 2023 at 01:55:10PM +, Zhijie Hou (Fujitsu) wrote:
> On Tuesday, September 26, 2023 4:40 PM Amit Kapila wrote:
>> Do we really need a new parameter in above structure? Can't we just use the
>> existing origin in the same structure? Please remember if this needs to be
>> backpatched then it may not be good idea to add new parameter in the
>> structure but apart from that having two members to represent similar
>> information doesn't seem advisable to me. I feel for backbranch we can
>> just use PGOutputData->origin for comparison and for HEAD, we can remove
>> origin and just use a boolean to avoid any extra cost for comparisions
>> for each change.
>
> OK, I agree to remove the origin string on HEAD and we can add that back
> when we support other origin value. I also modified to use the string for
> comparison as suggested for back-branch. I will also test it locally to
> confirm it doesn't affect the perf.

Err, actually, I am going to disagree here for the patch of HEAD.  It
seems to me that there is zero need for pgoutput.h and we don't need
to show PGOutputData to the world.  The structure is internal to
pgoutput.c and used only by its internal static routines.  Doing a
code search in the Debian repos or just on GitHub shows that it is
used nowhere else as well, which is not really surprising as the
structure is filled and maintained in the file.
--
Michael
Re: Index AmInsert Parameter Confused?
> On 27 Sep 2023, at 00:45, Matthias van de Meent wrote:
>
> On Tue, 26 Sept 2023 at 18:38, jacktby jacktby wrote:
>>
>> typedef bool (*aminsert_function) (Relation indexRelation,
>>    Datum *values,
>>    bool *isnull,
>>    ItemPointer heap_tid,
>>    Relation heapRelation,
>>    IndexUniqueCheck checkUnique,
>>    bool indexUnchanged,
>>    struct IndexInfo *indexInfo);
>>
>> Why is there a heap_tid? We haven't inserted the value, so where does
>> it come from?
>
> Index insertion only happens after the TableAM tuple has been
> inserted. As indexes refer to locations in the heap, this TID contains
> the TID of the table tuple that contains the indexed values, so that
> the index knows which tuple to refer to.
>
> Note that access/amapi.h describes only index AM APIs; it does not
> cover the table AM APIs described in access/tableam.h
>
> Kind regards,
>
> Matthias van de Meent

1. Thanks. So if we insert a tuple into a table which has an index on
it, pg will insert the tuple into the heap first, and then give the
heap TID from the heap to the index AM API, right?

2. I'm trying to implement a new index, but I only need the data held
in the index itself; I hope the tuple is not inserted into the heap,
because all the data I want can live in the index.
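The ordering Matthias describes (table AM insert first, then the index AM insert receiving the resulting TID) can be sketched with mock types. This is illustrative only; none of these structs are the real PostgreSQL Relation/ItemPointer machinery:

```c
#include <assert.h>
#include <stdint.h>

/* Mock of a tuple identifier: block number + offset within the block. */
typedef struct
{
	uint32_t	block;
	uint16_t	offset;
} MockTid;

typedef struct
{
	MockTid		tids[16];
	int			ntuples;
} MockHeap;

typedef struct
{
	int			key[16];
	MockTid		tid[16];
	int			nentries;
} MockIndex;

/* Step 1: the table AM stores the tuple and hands back its location. */
static MockTid
heap_insert(MockHeap *heap)
{
	MockTid		tid = {0, (uint16_t) (heap->ntuples + 1)};

	heap->tids[heap->ntuples++] = tid;
	return tid;
}

/* Step 2: the index AM stores the key together with that heap TID,
 * so an index scan can find its way back to the table tuple. */
static void
index_insert(MockIndex *idx, int key, MockTid heap_tid)
{
	idx->key[idx->nentries] = key;
	idx->tid[idx->nentries] = heap_tid;
	idx->nentries++;
}
```

This also answers point 2 above: in the standard table/index split, the index entry is only a pointer plus key; an access method that wants to hold all the data itself is closer to a table AM (or an index-only design) than to the aminsert contract.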
Re: Invalidate the subscription worker in cases where a user loses their superuser status
On Tue, Sep 26, 2023 at 11:57 PM vignesh C wrote: > > On Tue, 26 Sept 2023 at 13:03, Peter Smith wrote: > > > > Here are some comments for patch v2-0001. > > > > == > > src/backend/replication/logical/worker.c > > > > 1. maybe_reread_subscription > > > > ereport(LOG, > > (errmsg("logical replication worker for subscription \"%s\" > > will restart because of a parameter change", > > MySubscription->name))); > > > > Is this really a "parameter change" though? It might be a stretch to > > call the user role change a subscription parameter change. Perhaps > > this case needs its own LOG message? > > When I was doing this change the same thought had come to my mind too > but later I noticed that in case of owner change there was no separate > log message. Since superuser check is somewhat similar to owner > change, I felt no need to make any change for this. > Yeah, I had seen the same already before my comment. Anyway, YMMV. > > == > > src/include/catalog/pg_subscription.h > > > > 2. > > char*origin; /* Only publish data originating from the > > * specified origin */ > > + bool isownersuperuser; /* Is subscription owner superuser? */ > > } Subscription; > > > > > > Is a new Subscription field member really necessary? The Subscription > > already has 'owner' -- why doesn't function > > maybe_reread_subscription() just check: > > > > (!superuser_arg(sub->owner) && superuser_arg(MySubscription->owner)) > > We need the new variable so that we store this value when the worker > is started and check the current value with the value that was when > the worker was started. Since we need the value of the superuser > status when the worker is started, I feel this variable is required. > OK. In that case, then shouldn't the patch replace the other superuser_arg() code already in function run_apply_worker() to make use of this variable? Otherwise, there are 2 ways of getting the same information. 
== src/test/subscription/t/027_nosuperuser.pl I am suspicious that something may be wrong with either the code or the test because while experimenting, I accidentally found that even if I *completely* remove the important change below, the TAP test will still pass anyway. - !equal(newsub->publications, MySubscription->publications)) + !equal(newsub->publications, MySubscription->publications) || + (!newsub->isownersuperuser && MySubscription->isownersuperuser)) == Kind Regards, Peter Smith. Fujitsu Australia
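To make the review point concrete, here is a minimal sketch (invented names, not the actual patch code) of the restart condition under discussion: the apply worker captures the owner's superuser status when it starts, and on rereading the subscription it only needs to restart when that privilege has been *lost*, since that is the case where the running worker may hold more rights than the owner now has.

```c
#include <stdbool.h>

/*
 * Hypothetical sketch of the restart condition being reviewed above.
 * was_superuser_at_start models the stored isownersuperuser flag
 * captured at worker startup; is_superuser_now models the status seen
 * when the subscription catalog entry is re-read.
 */
static bool
need_worker_restart(bool was_superuser_at_start, bool is_superuser_now)
{
    /* Restart only when the owner has lost superuser. */
    return was_superuser_at_start && !is_superuser_now;
}
```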
Re: Fail hard if xlogreader.c fails on out-of-memory
On Wed, Sep 27, 2023 at 11:06:37AM +1300, Thomas Munro wrote: > On Tue, Sep 26, 2023 at 8:38 PM Michael Paquier wrote: > > Thoughts and/or comments are welcome. > > I don't have an opinion yet on your other thread about making this > stuff configurable for replicas, but for the simple crash recovery > case shown here, hard failure makes sense to me. > Recycled pages can't fool us into making a huge allocation any more. > If xl_tot_len implies more than one page but the next page's > xlp_pageaddr is too low, then either the xl_tot_len you read was > recycled garbage bits, or it was legitimate but the overwrite of the > following page didn't make it to disk; either way, we don't have a > record, so we have an end-of-wal condition. The xlp_rem_len check > defends against the second page making it to disk while the first one > still contains recycled garbage where the xl_tot_len should be*. > > What Michael wants to do now is remove the 2004-era assumption that > malloc failure implies bogus data. It must be pretty unlikely in a 64 > bit world with overcommitted virtual memory, but a legitimate > xl_tot_len can falsely end recovery and lose data, as reported from a > production case analysed by his colleagues. In other words, we can > actually distinguish between lack of resources and recycled bogus > data, so why treat them the same? Indeed. Hard failure is fine, and ENOMEM=end-of-WAL definitely isn't. > *A more detailed analysis would talk about sectors (page header is > atomic) I think the page header is atomic on POSIX-compliant filesystems but not atomic on ext4. That doesn't change the conclusion on $SUBJECT.
Re: Could not run generate_unaccent_rules.py script when update unicode
On Wed, 27 Sep 2023 at 08:03, Michael Paquier wrote: > On Tue, Sep 26, 2023 at 10:43:40AM +0800, Japin Li wrote: >> # Allow running this even without --with-python >> PYTHON ?= python >> >> $(srcdir)/unaccent.rules: generate_unaccent_rules.py >> ../../src/common/unicode/UnicodeData.txt Latin-ASCII.xml >> $(PYTHON) $< --unicode-data-file $(word 2,$^) --latin-ascii-file >> $(word 3,$^) >$@ >> >> It uses python to run generate_unaccent_rules.py. However, the ?= operator in >> a Makefile only checks whether a variable is defined, not whether it is >> empty. Since PYTHON is defined in src/Makefile.global, PYTHON ends up >> empty here when building without --with-python. > > I am not sure that many people run this script frequently so that may > not be worth adding a check for a defined, still empty or incorrect Yeah, not frequently; however, I already use it, and since we provide this function, why not make it better? > value, but.. If you were to change the Makefile we use in this path, > how are you suggesting to change it? I provided a patch at the bottom of [1]. Attached here again.

diff --git a/contrib/unaccent/Makefile b/contrib/unaccent/Makefile
index 652a3e774c..3ff49ba1e9 100644
--- a/contrib/unaccent/Makefile
+++ b/contrib/unaccent/Makefile
@@ -26,7 +26,9 @@ endif
 update-unicode: $(srcdir)/unaccent.rules
 
 # Allow running this even without --with-python
-PYTHON ?= python
+ifeq ($(PYTHON),)
+PYTHON = python
+endif
 
 $(srcdir)/unaccent.rules: generate_unaccent_rules.py ../../src/common/unicode/UnicodeData.txt Latin-ASCII.xml
 	$(PYTHON) $< --unicode-data-file $(word 2,$^) --latin-ascii-file $(word 3,$^) >$@

[1] https://www.postgresql.org/message-id/meyp282mb1669f86c0dc7b4dc48489cb0b6...@meyp282mb1669.ausp282.prod.outlook.com -- Regards, Japin Li ChengDu WenWu Information Technology Co., Ltd.
Re: Annoying build warnings from latest Apple toolchain
I wrote: > Since updating to Xcode 15.0, my macOS machines have been > spitting a bunch of linker-generated warnings. There > seems to be one instance of > ld: warning: -multiply_defined is obsolete > for each loadable module we link ... I poked into this a little more. We started using "-multiply_defined suppress" in commit 9df308697 of 2004-07-13, which was in the OS X 10.3 era. I failed to find any specific discussion of that switch in our archives, but the commit message suggests that I probably stole it from a patch the Fink project was carrying. Googling finds some non-authoritative claims that "-multiply_defined" has been a no-op since OS X 10.9 (Mavericks). I don't have anything older than 10.15 to check, but removing it on 10.15 does not seem to cause any problems. So I think we can safely just remove this switch from Makefile.shlib. The meson build process isn't invoking it either I think. The other thing will take a bit more work ... regards, tom lane
Re: Could not run generate_unaccent_rules.py script when update unicode
On Tue, Sep 26, 2023 at 10:43:40AM +0800, Japin Li wrote: > # Allow running this even without --with-python > PYTHON ?= python > > $(srcdir)/unaccent.rules: generate_unaccent_rules.py > ../../src/common/unicode/UnicodeData.txt Latin-ASCII.xml > $(PYTHON) $< --unicode-data-file $(word 2,$^) --latin-ascii-file > $(word 3,$^) >$@ > > It uses python to run generate_unaccent_rules.py. However, the ?= operator in > a Makefile only checks whether a variable is defined, not whether it is > empty. Since PYTHON is defined in src/Makefile.global, PYTHON ends up > empty here when building without --with-python. I am not sure that many people run this script frequently, so it may not be worth adding a check for a defined, still empty or incorrect value, but.. If you were to change the Makefile we use in this path, how are you suggesting to change it? -- Michael
Re: XLog size reductions: Reduced XLog record header size for PG17
On Mon, Sep 25, 2023 at 07:40:00PM +0200, Matthias van de Meent wrote: > On Wed, 20 Sept 2023 at 07:06, Michael Paquier wrote: >> #define COPY_HEADER_FIELD(_dst, _size)\ >> do {\ >> -if (remaining < _size)\ >> +if (remaining < (_size))\ >> goto shortdata_err;\ >> >> There are a couple of stylistic changes like this one, that I guess >> could just use their own patch to make these macros easier to use. > > They actually fix complaints of my IDE, but are otherwise indeed stylistic. Oh, OK. I just use an old-school terminal, but no objections to changing these if they make life easier for some hackers. Still, that feels independent of what you are proposing here. -- Michael
Re: Correct the documentation for work_mem
On Wed, Sep 27, 2023 at 02:05:44AM +1300, David Rowley wrote: > On Tue, 12 Sept 2023 at 03:03, Bruce Momjian wrote: > > > > On Mon, Sep 11, 2023 at 10:02:55PM +1200, David Rowley wrote: > > > It's certainly not a show-stopper. I do believe the patch makes some > > > improvements. The reason I'd prefer to see either "and" or "and/or" > > > in place of "or" is because the text is trying to imply that many of > > > these operations can run at the same time. I'm struggling to > > > understand why, given that there could be many sorts and many hashes > > > going on at once that we'd claim it could only be one *or* the other. > > > If we have 12 sorts and 4 hashes then that's not "several sort or hash > > > operations", it's "several sort and hash operations". Of course, it > > > could just be sorts or just hashes, so "and/or" works fine for that. > > > > Yes, I see your point and went with "and", updated patch attached. > > Looks good to me. Patch applied back to Postgres 11. -- Bruce Momjian https://momjian.us EDB https://enterprisedb.com Only you can decide what is important to you.
Re: [PoC/RFC] Multiple passwords, interval expirations
On Mon, 2023-09-25 at 00:31 -0700, Gurjeet Singh wrote: > Please see attached v4 of the patch. The patch takes care of rebase > to > the master/17-devel branch, and includes some changes, too. FWIW I got some failures applying. I didn't investigate much, and instead I looked at your git branch (7a35619e). > Moreover, before the patch, in case of CheckPasswordAuth(), the error > (if any) would have been thrown _after_ network communication done by > sendAuthRequest() call. But with this patch, the error is thrown > before the network interaction, hence this changes the order of > network interaction and the error message. This may have security > implications, too, but I'm unable to articulate one right now. You mean before v3 or before v4? Is this currently a problem in v4? > Open question: If a client is capable of providing just md5 passwords > handshake, and because of pg_hba.conf setting, or because the role > has > at least one SCRAM password (essentially the 3rd case you mention > above: pg_hba md5 + md5 and scram pws -> scram), the server will > respond with a SASL/SCRAM authentication response, and that would > break the backwards compatibility and will deny access to the client. > Does this make it necessary to use a newer libpq/client library? Perhaps you can try the MD5 passwords first, and only if they fail, move on to try scram passwords? > Comments? IIUC, for the case of multiple scram passwords, we use the salt to select the right scram password, and then proceed from there? I'm not very excited about the idea of naming passwords, or having passwords with default names. I can't think of anything better right now, so it might be OK. > - Add tests > - Add/update documentation These are needed to provide better review. Regards, Jeff Davis
Re: pg_rewind with cascade standby doesn't work well
On Tue, Sep 26, 2023 at 06:44:50PM +0300, Aleksander Alekseev wrote: >> And also, I'm afraid that I'm not sure what kind of tests I have to make >> for fix this behavior. Would you mind giving me some advice? > > Personally I would prefer not to increase the scope of work. Your TAP > test added in 0001 seems to be adequate. Yeah, agreed. I'm OK with what you are proposing, basically (the test could be made a bit cheaper actually). /* - * For Hot Standby, we could treat this like a Shutdown Checkpoint, - * but this case is rarer and harder to test, so the benefit doesn't - * outweigh the potential extra cost of maintenance. + * For Hot Standby, we could treat this like an end-of-recovery checkpoint */ +RequestCheckpoint(CHECKPOINT_IMMEDIATE | CHECKPOINT_WAIT); I don't understand what you want to change here. Archive recovery and crash recovery are two different things, still this code would be triggered even if there is no standby.signal, aka the node is not a standby. Why isn't this stuff conditional? >> BTW, I was able to >> reproduce the assertion failure Kuwamura-san reported, even after applying >> your latest patch from the thread. > > Do you mean that the test fails or it doesn't but there are other > steps to reproduce the issue? I get it as Fujii-san testing the patch from [1], still failing the test from [2]: [1]: https://www.postgresql.org/message-id/zarvomifjze7f...@paquier.xyz [2]: https://www.postgresql.org/message-id/camyc8qrye7mkyvpvghct5gpanamp8ss_trbraqxcpbx14vi...@mail.gmail.com I would be surprised, actually, because the patch from [1] would cause step 7 of the test to fail: the patch causes standby.signal or recovery.signal to be required. Anyway, this specific issue, if any, had better be discussed on the other thread. I need to address a few comments there as well and was planning to get back to it. It is possible that I've missed something on the other thread with the restrictions I was proposing in the latest version of the patch. 
For this thread, let's focus on the pg_rewind case and how we want to treat these records to improve the cascading case. -- Michael
Re: Fail hard if xlogreader.c fails on out-of-memory
On Wed, Sep 27, 2023 at 11:06:37AM +1300, Thomas Munro wrote: > I don't have an opinion yet on your other thread about making this > stuff configurable for replicas, but for the simple crash recovery > case shown here, hard failure makes sense to me. Also, if we conclude that we're OK with just failing hard all the time for crash recovery and archive recovery on OOM, the other patch is not really required. That would be disruptive for standbys in some cases, still perhaps OK in the long-term. I am wondering if people have lost data because of this problem on production systems, actually.. It would not be possible to know that it happened until you see a page on disk that has a somewhat valid LSN, still an LSN older than the position currently being inserted, and that could show up in various forms. Even that could get hidden quickly if WAL is written at a fast pace after a crash recovery. A standby promotion at an older LSN would be unlikely, as monitoring solutions discard standbys lagging behind by N bytes. > *A more detailed analysis would talk about sectors (page header is > atomic), and consider whether we're only trying to defend ourselves > against recycled pages written by PostgreSQL (yes), arbitrary random > data (no, but it's probably still pretty good) or someone trying to > trick us (no, and we don't stand a chance). WAL would not be the only part of the system that would get borked if arbitrary bytes can be inserted into what's read from disk, random or not. -- Michael
Re: Is this a problem in GenericXLogFinish()?
On Wed, 2023-09-27 at 00:14 +0300, Heikki Linnakangas wrote: > Looks correct. You now loop through all the block IDs three times, > however. I wonder if that is measurably slower, but even if it's not, > was there a reason you wanted to move the XLogRegisterBuffer() calls > to > a separate loop? I did so to correspond more closely to what's outlined in the README and in other places in the code, where marking the buffers dirty happens before XLogBeginInsert(). It didn't occur to me that one extra loop would matter, but I can combine them again. It would be a bit more concise to do the XLogBeginInsert() first (like before) and then register the buffers in the same loop that does the writes and marks the buffers dirty. Updated patch attached. > > Hmm, I'm sure there are exceptions but log_newpage_range() actually > seems to be doing the right thing; it calls MarkBufferDirty() before > XLogInsert(). It only calls it after XLogRegisterBuffer() though, and > I > concur that XLogRegisterBuffer() would be the logical place for the > assertion. We could tighten this up, require that you call > MarkBufferDirty() before XLogRegisterBuffer(), and fix all the > callers. That site is pretty trivial to fix, but there are also a couple places in hash.c and hashovfl.c that are registering a clean page and it's not clear to me exactly what's going on. Regards, Jeff Davis From 9548bb49865db3a04bbdda4c63df611142e22964 Mon Sep 17 00:00:00 2001 From: Jeff Davis Date: Tue, 26 Sep 2023 12:10:11 -0700 Subject: [PATCH v2] Fix GenericXLogFinish(). Mark the buffers dirty before writing WAL. 
Discussion: https://postgr.es/m/25104133-7df8-cae3-b9a2-1c0aaa1c0...@iki.fi --- src/backend/access/transam/generic_xlog.c | 53 ++- 1 file changed, 23 insertions(+), 30 deletions(-) diff --git a/src/backend/access/transam/generic_xlog.c b/src/backend/access/transam/generic_xlog.c index 6c68191ca6..2d5ff4aa7c 100644 --- a/src/backend/access/transam/generic_xlog.c +++ b/src/backend/access/transam/generic_xlog.c @@ -347,6 +347,10 @@ GenericXLogFinish(GenericXLogState *state) START_CRIT_SECTION(); + /* + * Compute deltas if necessary, write changes to buffers, mark + * buffers dirty, and register changes. + */ for (i = 0; i < MAX_GENERIC_XLOG_PAGES; i++) { PageData *pageData = >pages[i]; @@ -359,41 +363,31 @@ GenericXLogFinish(GenericXLogState *state) page = BufferGetPage(pageData->buffer); pageHeader = (PageHeader) pageData->image; + /* delta not needed for a full page image */ + if (!(pageData->flags & GENERIC_XLOG_FULL_IMAGE)) +computeDelta(pageData, page, (Page) pageData->image); + + /* + * Apply the image, being careful to zero the "hole" between + * pd_lower and pd_upper in order to avoid divergence between + * actual page state and what replay would produce. + */ + memcpy(page, pageData->image, pageHeader->pd_lower); + memset(page + pageHeader->pd_lower, 0, + pageHeader->pd_upper - pageHeader->pd_lower); + memcpy(page + pageHeader->pd_upper, + pageData->image + pageHeader->pd_upper, + BLCKSZ - pageHeader->pd_upper); + + MarkBufferDirty(pageData->buffer); + if (pageData->flags & GENERIC_XLOG_FULL_IMAGE) { -/* - * A full-page image does not require us to supply any xlog - * data. Just apply the image, being careful to zero the - * "hole" between pd_lower and pd_upper in order to avoid - * divergence between actual page state and what replay would - * produce. 
- */ -memcpy(page, pageData->image, pageHeader->pd_lower); -memset(page + pageHeader->pd_lower, 0, - pageHeader->pd_upper - pageHeader->pd_lower); -memcpy(page + pageHeader->pd_upper, - pageData->image + pageHeader->pd_upper, - BLCKSZ - pageHeader->pd_upper); - XLogRegisterBuffer(i, pageData->buffer, REGBUF_FORCE_IMAGE | REGBUF_STANDARD); } else { -/* - * In normal mode, calculate delta and write it as xlog data - * associated with this page. - */ -computeDelta(pageData, page, (Page) pageData->image); - -/* Apply the image, with zeroed "hole" as above */ -memcpy(page, pageData->image, pageHeader->pd_lower); -memset(page + pageHeader->pd_lower, 0, - pageHeader->pd_upper - pageHeader->pd_lower); -memcpy(page + pageHeader->pd_upper, - pageData->image + pageHeader->pd_upper, - BLCKSZ - pageHeader->pd_upper); - XLogRegisterBuffer(i, pageData->buffer, REGBUF_STANDARD); XLogRegisterBufData(i, pageData->delta, pageData->deltaLen); } @@ -402,7 +396,7 @@ GenericXLogFinish(GenericXLogState *state) /* Insert xlog record */ lsn = XLogInsert(RM_GENERIC_ID, 0); - /* Set LSN and mark buffers dirty */ + /* Set LSN */ for (i = 0; i < MAX_GENERIC_XLOG_PAGES; i++) { PageData *pageData = >pages[i]; @@ -410,7 +404,6 @@ GenericXLogFinish(GenericXLogState *state) if
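The image-apply step that the patch keeps in both branches reduces to a small standalone sketch (simplified signature on raw bytes; the real GenericXLogFinish() works on Page/PageHeader structs): copy the new page image over the buffer, but zero the "hole" between pd_lower and pd_upper, because replay of a full-page image also writes zeroes there, keeping the actual page state and the replayed page state byte-identical.

```c
#include <stdint.h>
#include <string.h>

#define BLCKSZ 8192             /* heap page size; 8kB by default */

/*
 * Hedged sketch of applying a page image while zeroing the hole
 * between pd_lower and pd_upper.  Not the actual PostgreSQL code.
 */
static void
apply_image(char *page, const char *image,
            uint16_t pd_lower, uint16_t pd_upper)
{
    memcpy(page, image, pd_lower);                   /* header + line pointers */
    memset(page + pd_lower, 0, pd_upper - pd_lower); /* zero the "hole" */
    memcpy(page + pd_upper, image + pd_upper,        /* tuple data at page end */
           BLCKSZ - pd_upper);
}
```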
Re: Document that server will start even if it's unable to open some TCP/IP ports
On Tue, Sep 12, 2023 at 05:25:44PM -0700, Gurjeet Singh wrote: > On Fri, Sep 8, 2023 at 7:52 AM Bruce Momjian wrote: > > > > On Thu, Sep 7, 2023 at 09:21:07PM -0700, Nathan Bossart wrote: > > > On Thu, Sep 07, 2023 at 07:13:44PM -0400, Bruce Momjian wrote: > > > > On Thu, Sep 7, 2023 at 02:54:13PM -0700, Nathan Bossart wrote: > > > >> IMO the phrase "open a port" is kind of nonstandard. I think we > > > >> should say > > > >> something along the lines of > > > >> > > > >>If listen_addresses is not empty, the server will start only if it > > > >> can > > > >>listen on at least one of the specified addresses. A warning will > > > >> be > > > >>emitted for any addresses that the server cannot listen on. > > > > > > > > Good idea, updated patch attached. > > > > > > I still think we should say "listen on an address" instead of "open a > > > port," but otherwise it LGTM. > > > > Agreed, I never liked the "port" mention. I couldn't figure how to get > > "open" out of the warning sentence though. Updated patch attached. > > LGTM. Patch applied back to PG 11. -- Bruce Momjian https://momjian.us EDB https://enterprisedb.com Only you can decide what is important to you.
Re: pg_upgrade instructions involving "rsync --size-only" might lead to standby corruption?
On Thu, Sep 7, 2023 at 01:52:45PM -0400, Bruce Momjian wrote: > On Mon, Jul 10, 2023 at 02:37:24PM -0700, Nikolay Samokhvalov wrote: > > Maybe. It will require changes in other parts of this doc. > > Thinking (here: > > https://gitlab.com/postgres/postgres/-/merge_requests/18/diffs) > > > > Meanwhile, attached is v2 > > > > thanks for the comments > > I looked over this issue thoroughly and I think I see the cause of the > confusion. In step 8 we say: > > 8. Stop both servers > Streaming replication and log-shipping standby servers can remain > --- > running until a later step. > > Of course this has to be "must" and it would be good to explain why, > which I have done in the attached patch. > > Secondly, in step 9 we say "verify the LSNs", but have a parenthetical > sentence that explains why they might not match: > > (There will be a mismatch if old standby servers were shut down before > the old primary or if the old standby servers are still running.) > > People might take that to mean that it is okay if this is the reason > they don't match, which is incorrect. Better to tell them to keep the > streaming replication and log-shipping servers running so we don't need > that sentence. > > The instructions are already long so I am hesitant to add more text > without a clear purpose. Patch applied back to PG 11. -- Bruce Momjian https://momjian.us EDB https://enterprisedb.com Only you can decide what is important to you.
Annoying build warnings from latest Apple toolchain
Since updating to Xcode 15.0, my macOS machines have been spitting a bunch of linker-generated warnings. There seems to be one instance of ld: warning: -multiply_defined is obsolete for each loadable module we link, and some program links complain ld: warning: ignoring duplicate libraries: '-lpgcommon', '-lpgport' You can see these in the build logs for both sifaka [1] and indri [2], so MacPorts isn't affecting it. I'd held out some hope that this was a transient problem due to the underlying OS still being Ventura, but I just updated another machine to shiny new Sonoma (14.0), and it's still doing it. Guess we gotta do something about it. We used to need "-multiply_defined suppress" to suppress other linker warnings. I tried removing that from the Makefile.shlib recipes for Darwin, and those complaints go away while no new ones appear, so that's good --- but I wonder whether slightly older toolchain versions will still want it. As for the duplicate libraries, yup guilty as charged, but I think we were doing that intentionally to satisfy some other old toolchains. I wonder whether removing the duplication will create new problems. regards, tom lane [1] https://buildfarm.postgresql.org/cgi-bin/show_stage_log.pl?nm=sifaka=2023-09-26%2021%3A09%3A01=build [2] https://buildfarm.postgresql.org/cgi-bin/show_stage_log.pl?nm=indri=2023-09-26%2021%3A42%3A17=build
Re: Fail hard if xlogreader.c fails on out-of-memory
On Tue, Sep 26, 2023 at 8:38 PM Michael Paquier wrote: > Thoughts and/or comments are welcome. I don't have an opinion yet on your other thread about making this stuff configurable for replicas, but for the simple crash recovery case shown here, hard failure makes sense to me. Here are some interesting points in the history of this topic:

1999 30659d43: xl_len is 16 bit, fixed size buffer in later commits
2001 7d4d5c00: WAL files recycled, xlp_pageaddr added
2004 0ffe11ab: xl_len is 32 bit, dynamic buffer, malloc() failure ends redo
2005 21fda22e: xl_tot_len and xl_len co-exist
2014 2c03216d: xl_tot_len fully replaces xl_len
2018 70b4f82a: xl_tot_len > 1GB ends redo
2023 8fcb32db: don't let xl_tot_len > 1GB be logged!
2023 bae868ca: check next xlp_pageaddr, xlp_rem_len before allocating

Recycled pages can't fool us into making a huge allocation any more. If xl_tot_len implies more than one page but the next page's xlp_pageaddr is too low, then either the xl_tot_len you read was recycled garbage bits, or it was legitimate but the overwrite of the following page didn't make it to disk; either way, we don't have a record, so we have an end-of-wal condition. The xlp_rem_len check defends against the second page making it to disk while the first one still contains recycled garbage where the xl_tot_len should be*. What Michael wants to do now is remove the 2004-era assumption that malloc failure implies bogus data. It must be pretty unlikely in a 64 bit world with overcommitted virtual memory, but a legitimate xl_tot_len can falsely end recovery and lose data, as reported from a production case analysed by his colleagues. In other words, we can actually distinguish between lack of resources and recycled bogus data, so why treat them the same? For comparison, if you run out of disk space during recovery we don't say "oh well, that's enough redoing for today, the computer is full, let's forget about the rest of the WAL and start accepting new transactions!".
The machine running recovery has certain resource requirements relative to the machine that generated the WAL, and if they're not met it just can't do it. It's the same if various other allocations fail. The new situation is that we are now verifying that xl_tot_len was actually logged by PostgreSQL, so if we can't allocate space for it, we can't replay the WAL. *A more detailed analysis would talk about sectors (page header is atomic), and consider whether we're only trying to defend ourselves against recycled pages written by PostgreSQL (yes), arbitrary random data (no, but it's probably still pretty good) or someone trying to trick us (no, and we don't stand a chance).
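The cross-check described above can be condensed into a small standalone sketch (invented helper names; the real logic lives in xlogreader.c and inspects full page headers): before trusting an xl_tot_len that implies a multi-page record, verify that the next WAL page's xlp_pageaddr is the address we expect. A stale address means the length field was recycled garbage or the following page never reached disk, i.e. an end-of-WAL condition rather than a reason to attempt a huge allocation.

```c
#include <stdbool.h>
#include <stdint.h>

#define XLOG_BLCKSZ 8192        /* WAL page size; 8kB by default */

/* Does a record starting at rec_start spill onto the following WAL page? */
static bool
record_spans_next_page(uint64_t rec_start, uint32_t xl_tot_len)
{
    uint64_t page_start = rec_start - (rec_start % XLOG_BLCKSZ);

    return rec_start + xl_tot_len > page_start + XLOG_BLCKSZ;
}

/*
 * Hypothetical sketch of the bae868ca-style check: a multi-page record
 * whose continuation page carries a stale xlp_pageaddr is end-of-WAL.
 */
static bool
is_end_of_wal(uint64_t rec_start, uint32_t xl_tot_len,
              uint64_t next_xlp_pageaddr)
{
    uint64_t expected_next =
        rec_start - (rec_start % XLOG_BLCKSZ) + XLOG_BLCKSZ;

    if (!record_spans_next_page(rec_start, xl_tot_len))
        return false;           /* single-page record: no cross-check needed */
    return next_xlp_pageaddr < expected_next;   /* stale page => no record */
}
```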
Re: dikkop seems unhappy because of openssl stuff (FreeBSD 14-BETA1)
Tomas Vondra writes: > Hmmm, I got to install BETA2 yesterday, but I still see the tcl failure: Huh. I'm baffled as to what's up there. Is it possible that this is actually a hardware-based difference? I didn't think there was much difference between Pi 3B and Pi 4, but we're running out of other explanations. > I wonder what's the difference between the systems ... All I did was > write the BETA2 image to the SD card and install a couple of packages: I reinstalled BETA3, since that's out now, but see no change in behavior. I did discover that plperl works for me after adding --with-openssl to the configure options. Not sure if it's worth digging any further than that. regards, tom lane
Re: [PGdocs] fix description for handling pf non-ASCII characters
Sep 26, 2023 1:10:55 PM Tom Lane : > "Karl O. Pinc" writes: >> For the last hunk you'd change around "anything". Write: >> "... it will be truncated to less than NAMEDATALEN characters and >> the bytes of the string which are not printable ASCII characters ...". > >> Notice that I have also changed "that" to "which" just above. >> I _think_ this is better English. > > No, I'm pretty sure you're mistaken. It's been a long time since > high school English, but the way I think this works is that "that" > introduces a restrictive clause, which narrows the scope of what > you are saying. That is, you say "that" when you want to talk > about only the bytes of the string that aren't ASCII. But "which" > introduces a non-restrictive clause that adds information or > commentary. If you say "bytes of the string which are not ASCII", > you are effectively making a side assertion that no byte of the > string is ASCII. Which is not the meaning you want here. Makes sense to me. "That" it is. Thanks for the help. I never would have figured that out.
Re: [PATCH] pgrowlocks: Make mode names consistent with docs
On Thu, Sep 7, 2023 at 12:58:29PM -0400, Bruce Momjian wrote: > You are right something is wrong. However, I looked at your patch and I > am thinking we need to go the other way and add "For" in the upper > block, rather than removing it in the lower one. I have two reasons. > Looking at the code block: > > case MultiXactStatusUpdate: > snprintf(buf, NCHARS, "Update"); > break; > case MultiXactStatusNoKeyUpdate: > snprintf(buf, NCHARS, "No Key Update"); > break; > case MultiXactStatusForUpdate: > snprintf(buf, NCHARS, "For Update"); > break; > case MultiXactStatusForNoKeyUpdate: > snprintf(buf, NCHARS, "For No Key Update"); > break; > case MultiXactStatusForShare: > snprintf(buf, NCHARS, "Share"); > break; > case MultiXactStatusForKeyShare: > snprintf(buf, NCHARS, "Key Share"); > break; > > You will notice there are "For" and non-"For" versions of "Update" and > "No Key Update". Notice that "For" appears in the macro names for the > "For" macro versions of update, but "For" does not appear in the "Share" > and "Key Share" versions, though the macro has "For". 
> > Second, notice that the "For" and non-"For" match in the block below > that, which suggests it is correct, and the non-"For" block is later: > > values[Atnum_modes] = palloc(NCHARS); > if (infomask & HEAP_XMAX_LOCK_ONLY) > { > if (HEAP_XMAX_IS_SHR_LOCKED(infomask)) > snprintf(values[Atnum_modes], NCHARS, "{For Share}"); > else if (HEAP_XMAX_IS_KEYSHR_LOCKED(infomask)) > snprintf(values[Atnum_modes], NCHARS, "{For Key Share}"); > else if (HEAP_XMAX_IS_EXCL_LOCKED(infomask)) > { > if (tuple->t_data->t_infomask2 & HEAP_KEYS_UPDATED) > snprintf(values[Atnum_modes], NCHARS, "{For Update}"); > else > snprintf(values[Atnum_modes], NCHARS, "{For No Key Update}"); > } > else > /* neither keyshare nor exclusive bit it set */ > snprintf(values[Atnum_modes], NCHARS, >"{transient upgrade status}"); > } > else > { > if (tuple->t_data->t_infomask2 & HEAP_KEYS_UPDATED) > snprintf(values[Atnum_modes], NCHARS, "{Update}"); > else > snprintf(values[Atnum_modes], NCHARS, "{No Key Update}"); > } > > I therefore suggest this attached patch, which should be marked as an > incompatibility in PG 17. Patch applied to master. -- Bruce Momjian https://momjian.us EDB https://enterprisedb.com Only you can decide what is important to you.
Re: Better help output for pgbench -I init_steps
On 22.09.23 22:01, Tristen Raab wrote: The following review has been posted through the commitfest application:

make installcheck-world: not tested
Implements feature: tested, passed
Spec compliant: not tested
Documentation: not tested

Hello, I've reviewed all 4 of your patches; each one applies and builds correctly. I think I prefer variant 2. Currently, we only have 8 steps, so it might be overkill to separate them out into a different option. +1 to this from Peter. Variant 2 is nicely formatted with lots of information, which I feel better solves the problem this patch is trying to address. Committed variant 2. I just changed the initial capitalization of the sentences to be more consistent with the surrounding text.
Re: Is this a problem in GenericXLogFinish()?
On 26/09/2023 22:32, Jeff Davis wrote: On Mon, 2023-09-25 at 13:04 +0300, Heikki Linnakangas wrote: Yes, that's a problem. Patch attached. I rearranged the code a bit to follow the expected pattern of: write, mark dirty, WAL, set LSN. I think computing the deltas could also be moved earlier, outside of the critical section, but I'm not sure that would be useful. Looks correct. You now loop through all the block IDs three times, however. I wonder if that is measurably slower, but even if it's not, was there a reason you wanted to move the XLogRegisterBuffer() calls to a separate loop? Do you have a suggestion for any kind of test addition, or should we just review carefully? I wish we had an assertion for that. XLogInsert() could assert that the page is already marked dirty, for example. Unfortunately that's not always the case, e.g. log_newpage_range(). Hmm, I'm sure there are exceptions but log_newpage_range() actually seems to be doing the right thing; it calls MarkBufferDirty() before XLogInsert(). It only calls it after XLogRegisterBuffer() though, and I concur that XLogRegisterBuffer() would be the logical place for the assertion. We could tighten this up, require that you call MarkBufferDirty() before XLogRegisterBuffer(), and fix all the callers. -- Heikki Linnakangas Neon (https://neon.tech)
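The tightening floated above can be modeled in miniature (toy types and invented names, nothing from the real codebase): if XLogRegisterBuffer() required the buffer to already be marked dirty, an assertion could catch "WAL logged but buffer left clean" bugs like the one fixed in this thread, instead of leaving them to surface as corruption later.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for a shared buffer and its state flags. */
typedef struct
{
    bool dirty;
    bool registered;
} ToyBuffer;

static void
mark_buffer_dirty(ToyBuffer *buf)
{
    buf->dirty = true;
}

/*
 * Toy stand-in for XLogRegisterBuffer() with the proposed rule:
 * callers must have marked the buffer dirty before registering it.
 */
static void
xlog_register_buffer(ToyBuffer *buf)
{
    assert(buf->dirty);         /* proposed assertion: dirty before registered */
    buf->registered = true;
}
```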
Re: Obsolete reference to pg_relation in comment
On Thu, Sep 7, 2023 at 10:44:25AM +0200, Daniel Gustafsson wrote: > > On 6 Sep 2023, at 21:13, Bruce Momjian wrote: > > On Wed, Jul 26, 2023 at 05:14:08PM -0400, Tom Lane wrote: > > >> I think we should reword this to just generically claim that holding > >> the Relation reference open for the whole transaction reduces overhead. > > > > How is this attached patch? > > Reads good to me, +1. Patch applied to master. I didn't think backpatching it made much sense since it is so localized. -- Bruce Momjian https://momjian.us EDB https://enterprisedb.com Only you can decide what is important to you.
Re: Add const qualifiers
On 9/26/23 06:34, Peter Eisentraut wrote: On 09.09.23 21:03, David Steele wrote: On 9/1/23 11:39, David Steele wrote: Hackers, I noticed that there was a mismatch between the const qualifiers for excludeDirContents in src/backend/backup/basebackup.c and src/bin/pg_rewind/file_map.c and that led me to use ^static const.*\*.*= to do a quick search for similar cases. I think at the least we should make excludeDirContents match, but the rest of the changes seem like a good idea as well. Added to 2023-11 CF. committed Thank you, Peter!
Re: Is this a problem in GenericXLogFinish()?
On Mon, 2023-09-25 at 13:04 +0300, Heikki Linnakangas wrote: > Yes, that's a problem. Patch attached. I rearranged the code a bit to follow the expected pattern of: write, mark dirty, WAL, set LSN. I think computing the deltas could also be moved earlier, outside of the critical section, but I'm not sure that would be useful. Do you have a suggestion for any kind of test addition, or should we just review carefully? > I wish we had an assertion for that. XLogInsert() could assert that > the > page is already marked dirty, for example. Unfortunately that's not always the case, e.g. log_newpage_range(). Regards, Jeff Davis From c54749a7939721f0a838115abce09967216cbc0f Mon Sep 17 00:00:00 2001 From: Jeff Davis Date: Tue, 26 Sep 2023 12:10:11 -0700 Subject: [PATCH v1] Fix GenericXLogFinish(). Mark the buffers dirty before writing WAL. Discussion: https://postgr.es/m/25104133-7df8-cae3-b9a2-1c0aaa1c0...@iki.fi --- src/backend/access/transam/generic_xlog.c | 66 --- 1 file changed, 34 insertions(+), 32 deletions(-) diff --git a/src/backend/access/transam/generic_xlog.c b/src/backend/access/transam/generic_xlog.c index 6c68191ca6..2b7e054ebe 100644 --- a/src/backend/access/transam/generic_xlog.c +++ b/src/backend/access/transam/generic_xlog.c @@ -343,10 +343,12 @@ GenericXLogFinish(GenericXLogState *state) if (state->isLogged) { /* Logged relation: make xlog record in critical section. */ - XLogBeginInsert(); - START_CRIT_SECTION(); + /* + * Compute deltas if necessary, write changes to buffers, and mark + * them dirty. 
+ */ for (i = 0; i < MAX_GENERIC_XLOG_PAGES; i++) { PageData *pageData = &state->pages[i]; @@ -359,41 +361,42 @@ GenericXLogFinish(GenericXLogState *state) page = BufferGetPage(pageData->buffer); pageHeader = (PageHeader) pageData->image; + /* delta not needed for a full page image */ + if (!(pageData->flags & GENERIC_XLOG_FULL_IMAGE)) +computeDelta(pageData, page, (Page) pageData->image); + + /* + * Apply the image, being careful to zero the "hole" between + * pd_lower and pd_upper in order to avoid divergence between + * actual page state and what replay would produce. + */ + memcpy(page, pageData->image, pageHeader->pd_lower); + memset(page + pageHeader->pd_lower, 0, + pageHeader->pd_upper - pageHeader->pd_lower); + memcpy(page + pageHeader->pd_upper, + pageData->image + pageHeader->pd_upper, + BLCKSZ - pageHeader->pd_upper); + + MarkBufferDirty(pageData->buffer); + } + + XLogBeginInsert(); + + /* Register buffers */ + for (i = 0; i < MAX_GENERIC_XLOG_PAGES; i++) + { + PageData *pageData = &state->pages[i]; + + if (BufferIsInvalid(pageData->buffer)) +continue; + if (pageData->flags & GENERIC_XLOG_FULL_IMAGE) { -/* - * A full-page image does not require us to supply any xlog - * data. Just apply the image, being careful to zero the - * "hole" between pd_lower and pd_upper in order to avoid - * divergence between actual page state and what replay would - * produce. - */ -memcpy(page, pageData->image, pageHeader->pd_lower); -memset(page + pageHeader->pd_lower, 0, - pageHeader->pd_upper - pageHeader->pd_lower); -memcpy(page + pageHeader->pd_upper, - pageData->image + pageHeader->pd_upper, - BLCKSZ - pageHeader->pd_upper); - XLogRegisterBuffer(i, pageData->buffer, REGBUF_FORCE_IMAGE | REGBUF_STANDARD); } else { -/* - * In normal mode, calculate delta and write it as xlog data - * associated with this page.
- */ -computeDelta(pageData, page, (Page) pageData->image); - -/* Apply the image, with zeroed "hole" as above */ -memcpy(page, pageData->image, pageHeader->pd_lower); -memset(page + pageHeader->pd_lower, 0, - pageHeader->pd_upper - pageHeader->pd_lower); -memcpy(page + pageHeader->pd_upper, - pageData->image + pageHeader->pd_upper, - BLCKSZ - pageHeader->pd_upper); - XLogRegisterBuffer(i, pageData->buffer, REGBUF_STANDARD); XLogRegisterBufData(i, pageData->delta, pageData->deltaLen); } @@ -402,7 +405,7 @@ GenericXLogFinish(GenericXLogState *state) /* Insert xlog record */ lsn = XLogInsert(RM_GENERIC_ID, 0); - /* Set LSN and mark buffers dirty */ + /* Set LSN */ for (i = 0; i < MAX_GENERIC_XLOG_PAGES; i++) { PageData *pageData = &state->pages[i]; @@ -410,7 +413,6 @@ GenericXLogFinish(GenericXLogState *state) if (BufferIsInvalid(pageData->buffer)) continue; PageSetLSN(BufferGetPage(pageData->buffer), lsn); - MarkBufferDirty(pageData->buffer); } END_CRIT_SECTION(); } -- 2.34.1
Re: [PATCH] Add inline comments to the pg_hba_file_rules view
Hi! On 26.09.23 15:19, Peter Eisentraut wrote: On 04.09.23 11:54, Jim Jones wrote: This patch proposes the column "comment" to the pg_hba_file_rules view. It basically parses the inline comment (if any) of a valid pg_hba.conf entry and displays it in the new column. For such pg_hba entries ... host db jim 127.0.0.1/32 md5 # foo host db jim 127.0.0.1/32 md5 #bar host db jim 127.0.0.1/32 md5 # #foo# I'm skeptical about this. First, there are multiple commenting styles. The end-of-line style is less common in my experience, because pg_hba.conf lines tend to be long. Another style is # foo host db jim 127.0.0.1/32 md5 # bar host db jim 127.0.0.1/32 md5 or even as a block # foo and bar host db jim 127.0.0.1/32 md5 host db jim 127.0.0.1/32 md5 Another potential problem is that maybe people don't want their comments leaked out of the file. Who knows what they have written in there. I also considered this for a while. That's why I suggested only inline comments. On second thought, I agree that making only certain types of comments "accessible" by the pg_hba_file_rules view can be misleading and can possibly leak sensitive info if misused. I think we should leave file comments be file comments. If we want some annotations to be exported to higher-level views, we should make that an intentional and explicit separate feature. My first suggestion [1] was to use a different character (other than '#'), but a good point was made, that it would add more complexity to hba.c, which is already complex enough. My main motivation with this feature is to be able to annotate pg_hba entries in a way that can be read using the pg_hba_file_rules view via SQL - these annotations might contain information like tags, client (application) names or any relevant info regarding the granted access. This info would help me to generate some reports that contain client access information.
I can sort of achieve something similar using pg_read_file(),[2] but I thought it would be nice to have it directly from the view. Do you think that this feature is in general not a good idea? Or perhaps a different annotation method would address your concerns? Thank you very much for taking a look into it! Jim 1- https://www.postgresql.org/message-id/flat/3fec6550-93b0-b542-b203-b0054aaee...@uni-muenster.de 2- https://www.postgresql.org/message-id/b63625ca-580f-14dc-7e7c-f90cd4d95cf7%40uni-muenster.de
Re: [PGdocs] fix description for handling of non-ASCII characters
"Karl O. Pinc" writes: > For the last hunk you'd change around "anything". Write: > "... it will be truncated to less than NAMEDATALEN characters and > the bytes of the string which are not printable ASCII characters ...". > Notice that I have also changed "that" to "which" just above. > I _think_ this is better English. No, I'm pretty sure you're mistaken. It's been a long time since high school English, but the way I think this works is that "that" introduces a restrictive clause, which narrows the scope of what you are saying. That is, you say "that" when you want to talk about only the bytes of the string that aren't ASCII. But "which" introduces a non-restrictive clause that adds information or commentary. If you say "bytes of the string which are not ASCII", you are effectively making a side assertion that no byte of the string is ASCII. Which is not the meaning you want here. A smell test that works for native speakers (not sure how helpful it is for others) is: if the sentence would read well with commas or parens added before and after the clause, then it's probably non-restrictive and should use "which". If it looks wrong that way then it's a restrictive clause and should use "that". regards, tom lane
Re: Questions about the new subscription parameter: password_required
On Tue, Sep 26, 2023 at 1:00 PM Jeff Davis wrote: > As I said earlier, I think the best thing to do is to just have a > section that describes when to use password_required, what specific > things you should do to satisfy that case, and what caveats you should > avoid. Something like: > > "If you want to have a subscription using a connection string without > a password managed by a non-superuser, then: [ insert SQL steps here ]. > Warning: if the connection string doesn't contain a password, make sure > to set password_required=false before transferring ownership, otherwise > it will start failing." > > Documenting the behavior is good, too, but I find the behavior > difficult to document, so examples will help. Yeah, I think something like that could make sense, with an appropriate amount of word-smithing. -- Robert Haas EDB: http://www.enterprisedb.com
Re: Eager page freeze criteria clarification
On Tue, Sep 26, 2023 at 11:11 AM Andres Freund wrote: > That I'd like you to expand on "using the RedoRecPtr of the latest checkpoint > rather than the LSN of the previous vacuum." - I can think of ways of doing so > that could end up with quite different behaviour... Yeah, me too. I'm not sure what is best. > As long as the most extreme cases are prevented, unnecessary freezing is imo > far less harmful than freezing too little. > > I'm worried that using something as long as 100-200% > time-between-recent-checkpoints won't handle insert-mostly workloads well, > which IME are also workloads suffering quite badly under our current scheme - > and which are quite common. I wrote about this problem in my other reply and I'm curious as to your thoughts about it. Basically, suppose we forget about all of Melanie's tests except for three cases: (1) an insert-only table, (2) an update-heavy workload with uniform distribution, and (3) an update-heavy workload with skew. In case (1), freezing is good. In case (2), freezing is bad. In case (3), freezing is good for cooler pages and bad for hotter ones. I postulate that any recency-of-modification threshold that handles (1) well will handle (2) poorly, and that the only way to get both right is to take some other factor into account. You seem to be arguing that we can just freeze aggressively in case (2) and it won't cost much, but it doesn't sound to me like Melanie believes that and I don't think I do either. > > This doesn't seem completely stupid, but I fear it would behave > > dramatically differently on a workload a little smaller than s_b vs. > > one a little larger than s_b, and that doesn't seem good. > > Hm. I'm not sure that that's a real problem. In the case of a workload bigger > than s_b, having to actually read the page again increases the cost of > freezing later, even if the workload is just a bit bigger than s_b. That is true, but I don't think it means that there is no problem.
It could lead to a situation where, for a while, a table never needs any significant freezing, because we always freeze aggressively. When it grows large enough, we suddenly stop freezing it aggressively, and now it starts experiencing vacuums that do a whole bunch of work all at once. A user who notices that is likely to be pretty confused about what happened, and maybe not too happy when they find out. -- Robert Haas EDB: http://www.enterprisedb.com
Re: [PGdocs] fix description for handling of non-ASCII characters
On Tue, 26 Sep 2023 13:40:26 + "Hayato Kuroda (Fujitsu)" wrote: > Your effort is quite helpful for me. You're welcome. > Before replying to your comments, I thought I should show the difference > between versions. Regarding old versions (here PG15 was used), > non-ASCIIs (like Japanese) are replaced with '?'. > > ``` > psql (15.4) > Type "help" for help. > > postgres=# SET application_name TO 'あああ'; > SET > postgres=# SHOW application_name ; > application_name > -- > ? > (1 row) > ``` > > As for the HEAD, as my patch said, non-ASCIIs are replaced > with hexadecimal representations. (Were my terminologies correct?). > > ``` > psql (17devel) > Type "help" for help. > > postgres=# SET application_name TO 'あああ'; > SET > postgres=# SHOW application_name ; >application_name > -- > \xe3\x81\x82\xe3\x81\x82\xe3\x81\x82 > (1 row) > ``` I think your terminology is correct, but see my suggestion at bottom. Never hurts to have output to look at. I noticed here and when reading the patch that changed the output -- the patch is operating on bytes, not characters per se. > > First, it is now best practice in the PG docs to > > put a line break at the end of each sentence. > > At least for the sentences on the lines you change. > > (No need to update the whole document!) Please > > do this in the next version of your patch. I don't > > know if this is a requirement for acceptance by > > a committer, but it won't hurt. > > I didn't know that. Could you please tell me if you have a source? I thought I could find an email but the search is taking forever. If I find something I'll let you know. I could even be mis-remembering, but it's a nice practice regardless. There are a number of undocumented conventions. Another is the use of gender neutral language. The place to discuss doc conventions/styles would be the pgsql-docs list. (If you felt like asking there.)
You could try submitting another patch to add various doc conventions to the official docs at https://www.postgresql.org/docs/current/docguide-style.html :-) > Anyway, I put a line break for each sentence for now. Thanks. A related thing that's nice to have is to limit the line length of the documentation source to 80 characters or less. 79 is probably best. Since the source text around your patch conforms to this convention you should also. > IIUC the word "Postgres" cannot be used in the doc. I think you're right. > Based on all your comments, I changed as below. What do you think? > > ``` > Characters that are not printable ASCII, like > \x03, are replaced with the > PostgreSQL <link linkend="sql-syntax-strings-escape">C-style escaped hexadecimal byte > value</link>. ``` > > > Since both of you have looked and confirmed that the > > actual behavior matches the new documentation I have not > > done this. > > I showed the result again, please see. Should the committer be interested, your patch applies cleanly and the docs build as expected. Also, based on the comments in the patch which changed the system's behavior, I believe that your patch updates all the relevant places in the documentation. > > But, have either of you checked that we're really talking about > > replacing everything outside the 7-bit ASCII encodings? > > My reading of the commit referenced in the first email of this > > thread says that it's everything outside of the _printable_ > > ASCII encodings, ASCII values outside of the range 32 to 127, > > inclusive. > > > > Please check. The docs should probably say "printable ASCII", > > or "non-printable ASCII", depending. I think the meaning > > of "printable ASCII" is widely enough known to be 32-127. > > So "printable" is good enough, right? > > For me, "non-printable ASCII" sounds like control characters. So > I used the sentence "Characters that are not printable ASCII ... are > replaced with...". Please tell me if you have a better explanation?
Your explanation sounds good to me. I now think that you should consider another change to your wording. Instead of starting with "Characters that are not printable ASCII ..." consider writing "The bytes of the string which are not printable ASCII ...". Notice above that characters (e.g. あ) generate output for each non-ASCII byte (e.g. \xe3\x81\x82). So my thought is that the docs should be talking about bytes. For the last hunk you'd change around "anything". Write: "... it will be truncated to less than NAMEDATALEN characters and the bytes of the string which are not printable ASCII characters ...". Notice that I have also changed "that" to "which" just above. I _think_ this is better English. See sense 3 of: https://en.wiktionary.org/wiki/which See also the first paragraph of: https://en.wikipedia.org/wiki/Relative_pronoun If the comments above move you, send another patch. Seems to me we're close to sending this on to a committer. Regards, Karl
Re: Avoid a possible out-of-bounds access (src/backend/optimizer/util/relnode.c)
On Tue, Sep 26, 2023 at 9:30 AM Ranier Vilela wrote: > On Tue, Sep 26, 2023 at 7:34 AM Ashutosh Bapat < > ashutosh.bapat@gmail.com> wrote: >> On Tue, Sep 26, 2023 at 3:32 PM David Rowley >> wrote: >> > >> > find_base_rel() could be made more robust for free by just casting the >> > relid and simple_rel_array_size to uint32 while checking that relid < >> > root->simple_rel_array_size. The 0th element should be NULL anyway, >> > so "if (rel)" should let relid==0 calls through and allow that to >> > ERROR still. I see that just changes a "jle" to "jnb" vs adding an >> > additional jump for Ranier's version. [1] >> >> That's a good suggestion. >> >> I am fine with find_base_rel() as it is today as well. But >> future-proofing it seems to be fine too. >> >> > >> > It seems worth not making find_base_rel() more expensive than it is >> > today as commonly we just reference root->simple_rel_array[n] directly >> > anyway because it's cheaper. It would be nice if we didn't add further >> > overhead to find_base_rel() as this would make the case for using >> > PlannerInfo.simple_rel_array directly even stronger. >> >> I am curious, is the overhead in find_base_rel() impacting overall >> performance? >> > It seems to me that it adds a LEA instruction. > https://godbolt.org/z/b4jK3PErE > > Although it doesn't seem like much, > I believe the solution (casting to unsigned) seems better. > So +1. > As suggested, casting is the best option that does not add any overhead and improves the robustness of the find_base_rel function. I propose patch v1, with the addition of a fix for the find_base_rel_ignore_join function, which despite not appearing in Coverity reports, suffers from the same problem. While at it, I also propose a scope reduction, as well as making the root parameter const, which is very appropriate. best regards, Ranier Vilela v1-0001-Avoid-possible-out-of-bounds-access.patch Description: Binary data
Re: Questions about the new subscription parameter: password_required
On Tue, 2023-09-26 at 18:21 +0200, Benoit Lobréau wrote: > On 9/26/23 16:27, Benoit Lobréau wrote: > > I will try to come up with a documentation patch. > > This is my attempt at a documentation patch. > + If the ownership of a subscription with password_required=true + is transferred to a non-superuser, they will gain full control over the subscription + but will not be able to modify its connection string. I think you mean false, right? + If the ownership of a subscription with password_required=true + has been transferred to a non-superuser, it must be reverted to a superuser for + the DROP operation to succeed. That's only needed if the superuser transfers a subscription with password_required=true to a non-superuser and the connection string does not contain a password. In that case, the subscription is already in a failing state, not just for DROP. Ideally we'd have some other warning in the docs not to do that -- maybe in CREATE and ALTER. Also, if the subscription is in that kind of failing state, there are other ways to get out of it as well, like disabling it and setting connection=none, then dropping it. The whole thing is fairly delicate. As soon as you work slightly outside of the intended use, password_required starts causing unexpected things to happen. As I said earlier, I think the best thing to do is to just have a section that describes when to use password_required, what specific things you should do to satisfy that case, and what caveats you should avoid. Something like: "If you want to have a subscription using a connection string without a password managed by a non-superuser, then: [ insert SQL steps here ]. Warning: if the connection string doesn't contain a password, make sure to set password_required=false before transferring ownership, otherwise it will start failing." Documenting the behavior is good, too, but I find the behavior difficult to document, so examples will help. Regards, Jeff Davis
Re: Index AmInsert Parameter Confused?
On Tue, 26 Sept 2023 at 18:38, jacktby jacktby wrote: > > typedef bool (*aminsert_function) (Relation indexRelation, > Datum *values, > bool *isnull, > ItemPointer heap_tid, > Relation heapRelation, > IndexUniqueCheck checkUnique, > bool indexUnchanged, > struct IndexInfo *indexInfo); > > Why is there a heap_tid? We haven't inserted the value, so where does it come > from? Index insertion only happens after the TableAM tuple has been inserted. As indexes refer to locations in the heap, this TID contains the TID of the table tuple that contains the indexed values, so that the index knows which tuple to refer to. Note that access/amapi.h describes only index AM APIs; it does not cover the table AM APIs described in access/tableam.h. Kind regards, Matthias van de Meent
Re: False "pg_serial": apparent wraparound” in logs
On 25/08/2023 07:29, Imseih (AWS), Sami wrote: diff --git a/src/backend/storage/lmgr/predicate.c b/src/backend/storage/lmgr/predicate.c index 1af41213b4..7e7be3b885 100644 --- a/src/backend/storage/lmgr/predicate.c +++ b/src/backend/storage/lmgr/predicate.c @@ -992,6 +992,13 @@ SerialSetActiveSerXmin(TransactionId xid) serialControl->tailXid = xid; + /* +* If the SLRU is being used, set the latest page number to +* the current tail xid. +*/ + if (serialControl->headPage > 0) + SerialSlruCtl->shared->latest_page_number = SerialPage(serialControl->tailXid); + LWLockRelease(SerialSLRULock); } I don't really understand what exactly the problem is, or how this fixes it. But this doesn't feel right: Firstly, isn't headPage == 0 also a valid value? We initialize headPage to -1 when it's not in use. Secondly, shouldn't we set it to the page corresponding to headXid rather than tailXid. Thirdly, I don't think this code should have any business setting latest_page_number directly. latest_page_number is set in SimpleLruZeroPage(). Are we missing a call to SimpleLruZeroPage() somewhere? -- Heikki Linnakangas Neon (https://neon.tech)
Re: Questions about the new subscription parameter: password_required
On 9/26/23 16:27, Benoit Lobréau wrote: I will try to come up with a documentation patch. This is my attempt at a documentation patch. -- Benoit Lobréau Consultant http://dalibo.com From a73baa91032fff37ef039168c276508553830f86 Mon Sep 17 00:00:00 2001 From: benoit Date: Tue, 26 Sep 2023 18:07:47 +0200 Subject: [PATCH] Doc patch for require_password Add documentation to ALTER / DROP SUBSCRIPTION regarding non-superuser subscriptions with require_password=true. --- doc/src/sgml/ref/alter_subscription.sgml | 3 +++ doc/src/sgml/ref/drop_subscription.sgml | 6 ++ 2 files changed, 9 insertions(+) diff --git a/doc/src/sgml/ref/alter_subscription.sgml b/doc/src/sgml/ref/alter_subscription.sgml index a85e04e4d6..0bbe7e2335 100644 --- a/doc/src/sgml/ref/alter_subscription.sgml +++ b/doc/src/sgml/ref/alter_subscription.sgml @@ -51,6 +51,9 @@ ALTER SUBSCRIPTION name RENAME TO < to alter the owner, you must be able to SET ROLE to the new owning role. If the subscription has password_required=false, only superusers can modify it. + If the ownership of a subscription with password_required=true + is transferred to a non-superuser, they will gain full control over the subscription + but will not be able to modify its connection string. diff --git a/doc/src/sgml/ref/drop_subscription.sgml b/doc/src/sgml/ref/drop_subscription.sgml index 2a67bdea91..8ec743abd0 100644 --- a/doc/src/sgml/ref/drop_subscription.sgml +++ b/doc/src/sgml/ref/drop_subscription.sgml @@ -102,6 +102,12 @@ DROP SUBSCRIPTION [ IF EXISTS ] nameDROP SUBSCRIPTION cannot be executed inside a transaction block. + + + If the ownership of a subscription with password_required=true + has been transferred to a non-superuser, it must be reverted to a superuser for + the DROP operation to succeed. + -- 2.41.0
Re: Remove MSVC scripts from the tree
On 2023-09-26 Tu 01:25, NINGWEI CHEN wrote: On Fri, 22 Sep 2023 10:12:29 +0900 Michael Paquier wrote: As of today, I can see that the only buildfarm members relying on these scripts are bowerbird and hamerkop, so these two would fail if the patch attached were to be applied today. I am adding Andrew D. and the maintainers of hamerkop (SRA-OSS) in CC for comments. hamerkop is not yet prepared for Meson builds, but we plan to work on this support soon. If we go with Meson builds exclusively right now, we will have to temporarily remove the master/HEAD for a while. Best Regards. I don't think we should switch to that until you're ready. cheers andrew -- Andrew Dunstan EDB:https://www.enterprisedb.com
Re: Eager page freeze criteria clarification
On Tue, Sep 26, 2023 at 8:19 AM Andres Freund wrote: > However, I'm not at all convinced doing this on a system wide level is a good > idea. Databases do often contain multiple types of workloads at the same > time. E.g., we want to freeze aggressively in a database that has the bulk of > its size in archival partitions but has lots of unfrozen data in an active > partition. And databases have often loads of data that's going to change > frequently / isn't long lived, and we don't want to super aggressively freeze > that, just because it's a large portion of the data. I didn't say that we should always have most of the data in the database frozen, though. Just that we can reasonably be more lazy about freezing the remainder of pages if we observe that most pages are already frozen. How they got that way is another discussion. I also think that the absolute amount of debt (measured in physical units such as unfrozen pages) should be kept under control. But that isn't something that can ever be expected to work on the basis of a simple threshold -- if only because autovacuum scheduling just doesn't work that way, and can't really be adapted to work that way. -- Peter Geoghegan
Re: [PATCH] Add inline comments to the pg_hba_file_rules view
Also a reluctant -1, as the comment-at-EOL style is very rare in my experience over the years of seeing many a pg_hba file.
Re: pg_rewind with cascade standby doesn't work well
Hi, > >> IMO a test is needed that makes sure no one is going to break this in > >> the future. > > > > You definitely need more complex test scenarios for that. If you can > > come up with new ways to make the TAP tests of pg_rewind more modular > > in handling more complicated node setups, that would be a nice > > addition, for example. > > I'm sorry for lacking tests. For now, I started off with a simple test > that causes the problem I mentioned. The updated WIP patch 0001 includes > the new test for pg_rewind. Many thanks for a quick update. > And also, I'm afraid that I'm not sure what kind of tests I have to make > to fix this behavior. Would you mind giving me some advice? Personally I would prefer not to increase the scope of work. Your TAP test added in 0001 seems to be adequate. > BTW, I was able to > reproduce the assertion failure Kuwamura-san reported, even after applying > your latest patch from the thread. Do you mean that the test fails or it doesn't but there are other steps to reproduce the issue? -- Best regards, Aleksander Alekseev
Re: XLog size reductions: smaller XLRec block header for PG17
Hi, > This scheme is reused later for the XLogRecord xl_tot_len field over > at [0], and FWIW is thus being reused. Sure, it's tailored to this WAL > use case, but IMO we're getting good value from it. We don't use > protobuf or JSON for WAL, we use our own serialization format. Having > some specialized encoding/decoding in that format for certain fields > is IMO quite acceptable. > > > Yes - I proposed that and wrote an implementation of reasonably efficient > > varint encoding. Here's my prototype: > > https://postgr.es/m/20221004234952.anrguppx5owewb6n%40awork3.anarazel.de > > As I mentioned on that thread, that prototype has a significant > probability of doing nothing to improve WAL size, or even increasing > the WAL size for installations which consume a lot of OIDs. > > > I think it's a bad tradeoff to write lots of custom varint encodings, just > > to > > eke out a bit more space savings. > > This is only a single "custom" varint encoding though, if you can even > call it that. It makes a field's size depend on flags set in another > byte, which is not that much different from the existing use of > XLR_BLOCK_ID_DATA_[LONG, SHORT]. > > > The increase in code complexity IMO makes it a bad tradeoff. > > Pardon me for asking, but what would you consider to be a good > tradeoff then? I think the code relating to the WAL storage format is > about as simple as you can get it within the feature set it provides > and the size of the resulting records. While I think there is still > much to gain w.r.t. WAL record size, I don't think we can get much of > those improvements without adding at least some amount of complexity, > something I think to be true for most components in PostgreSQL. > > So, except for redesigning significant parts of the public WAL APIs, > are we just going to ignore any potential improvements because they > "increase code complexity"? Here are my two cents. I definitely see the value in having reusable varint encoding.
This would probably be a good option for extendable TOAST pointers, compression dictionaries, and perhaps other patches. In this particular case however Matthias has good arguments that this is not the right tool for this particular task, IMO. We don't use rbtrees for everything that needs a map from x to y. Hash tables have other compromises. Sometimes one container is a better fit, sometimes the other, sometimes none and we implement a fine tuned container for the given case. Same here. -- Best regards, Aleksander Alekseev
Re: SET ROLE documentation improvement
On Mon, Sep 25, 2023 at 3:09 PM Nathan Bossart wrote: > On Fri, Sep 15, 2023 at 02:36:16PM -0700, Yurii Rashkovskii wrote: > > On Fri, Sep 15, 2023 at 1:47 PM Nathan Bossart > > > wrote: > >> I think another issue is that the aforementioned note doesn't mention > the > >> new SET option added in 3d14e17. > > > > How do you think we should word it in that note to make it useful? > > Maybe something like this: > > The current session user must have the SET option for the specified > role_name, either directly or indirectly via a chain of memberships > with the SET option. > This is a good start, indeed. I've amended my patch to include it. -- Y. V2-0001-Minor-improvement-to-SET-ROLE-documentation.patch Description: Binary data
Re: CREATE FUNCTION ... SEARCH { DEFAULT | SYSTEM | SESSION }
On Mon, Sep 25, 2023 at 1:56 PM Jeff Davis wrote: > Do users like Bob do that today? If not, what causes you to expect them > to do so in the future? What I would say is that if there's a reasonable way of securing your stuff and you don't make use of it, that's your problem. If securing your stuff is unreasonably difficult, that's a product problem. I think that setting the search_path on your own functions is a basic precaution that you should take if you are worried about multi-user security. I do not believe it is realistic to eliminate that requirement, and if people like Bob don't do that today and can't be made to do that in the future, then I think it's just hopeless. In contrast, I think that the precautions that you need to take when doing anything to a table owned by another user are unreasonably complex and not very realistic for anyone to take on a routine basis. Even if you validate that there's nothing malicious before you access the table, the table owner can change that at any time, so it's very hard to reliably protect yourself. In terms of whether people like Bob actually DO do that today, I'd say probably some do and others don't. I think that the overwhelming majority of PostgreSQL users simply aren't concerned about multi-user security. They either have a single user account that is used for everything, or say one account for the application and another for interactive access, or they have a bunch of users but basically all of those people are freely accessing each other's stuff and they're not really concerned with firewalling them from each other. Among the small percentage of users who are really concerned with making sure that users can't get into each others accounts, I would expect that knowing that you need to control search_path is fairly common, but it's hard to say. I haven't actually met many such users. 
> > I think it's self-evident that a SQL function's behavior is > > not guaranteed to be invariant under all possible values of > > search_path. > > It's certainly not self-evident in a literal sense. I think you mean > that it's "obvious" or something, and perhaps that narrow question is, > but it's also not terribly helpful. > > If the important behaviors here were so obvious, how did we end up in > this mess in the first place? I feel like this isn't really responsive to the argument that I was and am making, and I'm worried that we're going down a rat-hole here. I wondered after reading this whether I had misused the term self-evident, but when I did a Google search for "self-evident" the definition that comes up is "not needing to be demonstrated or explained; obvious." I am not saying that everyone is going to realize that you probably ought to be setting search_path on all of your functions in any kind of multi-user environment, and maybe even in a single-user environment just to avoid weird failures if you ever change your default search_path. What I am saying is that if you stop to think about what search_path does while looking at any SQL function you've ever written, you should probably realize pretty quickly that the behavior of your function is search_path-dependent, and indeed that the behavior of every other SQL function you've ever written is probably search_path-dependent, too. I think the problem here isn't really that this is hard to understand, but that many people have not stopped to think about it. Maybe it is obvious to you what we ought to do about that, but it is not obvious to me. As I have said, I think that changing the behavior of CREATE FUNCTION or CREATE PROCEDURE so that some search_path control is the default is worth considering. 
However, I think that such a change inevitably breaks backward compatibility, and I don't think we have enough people weighing in on this thread to think that we can just go do that even if everyone agreed on precisely what was to be done, and I think it is pretty clear that we do not have unanimous agreement. > > Respectfully, I find this position unreasonable, to the point of > > finding it difficult to take seriously. > > Which part exactly is unreasonable? I interpreted you to be saying that we can't expect people to set search_path on their functions. And I just don't buy that. We have made mistakes in that area in PostgreSQL itself and had to fix them later, and we may make more mistakes again in the future, so if you think we need better documentation or better defaults, I think you might be right. But if you think it's a crazy idea for people running PostgreSQL in multi-user environments to understand that setting search_path on all of their functions and procedures is essential, I disagree. They've got to understand that, because it's not that complicated, and there's no real alternative. > I think analogies to unix are what caused a lot of the problems we have > today, because postgres is a very different world. In unix-like > environments, a file is just a file; in postgres, we have all kinds
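The basic precaution discussed above — pinning search_path on one's own functions — can be sketched as follows (a minimal illustration with invented object names, not code from the thread):

```sql
-- Names inside the function body resolve against a fixed search_path,
-- regardless of the caller's setting; pg_temp is listed last so temporary
-- objects cannot capture references.
CREATE FUNCTION app.account_balance(acct integer) RETURNS numeric
LANGUAGE sql STABLE
SET search_path = app, pg_catalog, pg_temp
AS $$ SELECT balance FROM accounts WHERE id = acct $$;
```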
Re: Eager page freeze criteria clarification
Hi, On 2023-09-25 14:16:46 -0700, Peter Geoghegan wrote: > On Mon, Sep 25, 2023 at 11:45 AM Robert Haas wrote: > I'm surprised that there hasn't been any discussion of the absolute > amount of system-wide freeze debt on this thread. If 90% of the pages > in the entire database are frozen, it'll generally be okay if we make > the wrong call by freezing lazily when we shouldn't. This is doubly > true within small to medium sized tables, where the cost of catching > up on freezing cannot ever be too bad (concentrations of unfrozen > pages in one big table are what really hurt users). I generally agree with the idea of using "freeze debt" as an input - IIRC I proposed using the number of frozen pages vs number of pages as the input to the heuristic in one of the other threads about the topic a while back. I also think we should read a portion of all-visible-but-not-frozen pages during non-aggressive vacuums to manage the freeze debt. However, I'm not at all convinced doing this on a system wide level is a good idea. Databases do often contain multiple types of workloads at the same time. E.g., we want to freeze aggressively in a database that has the bulk of its size in archival partitions but has lots of unfrozen data in an active partition. And databases have often loads of data that's going to change frequently / isn't long lived, and we don't want to super aggressively freeze that, just because it's a large portion of the data. Greetings, Andres Freund
Re: pg_resetwal tests, logging, and docs update
Hi, > Hmm, I think I like "where" better. OK. > Attached is an updated patch set where I have split the changes into > smaller pieces. The last two patches still have some open questions > about what certain constants mean etc. The other patches should be settled. The patches 0001..0005 seem to be ready and rather independent. I suggest merging them and continue discussing the rest of the patches. -- Best regards, Aleksander Alekseev
Re: Eager page freeze criteria clarification
Hi, On 2023-09-25 14:45:07 -0400, Robert Haas wrote: > On Fri, Sep 8, 2023 at 12:07 AM Andres Freund wrote: > > > Downthread, I proposed using the RedoRecPtr of the latest checkpoint > > > rather than the LSN of the previous vacuum. I still like that idea. > > > > Assuming that "downthread" references > > https://postgr.es/m/CA%2BTgmoYb670VcDFbekjn2YQOKF9a7e-kBFoj2WJF1HtH7YPaWQ%40mail.gmail.com > > could you sketch out the logic you're imagining a bit more? > > I'm not exactly sure what the question is here. That I'd like you to expand on "using the RedoRecPtr of the latest checkpoint rather than the LSN of the previous vacuum." - I can think of ways of doing so that could end up with quite different behaviour... > > Perhaps we can mix both approaches. We can use the LSN and time of the last > > vacuum to establish an LSN->time mapping that's reasonably accurate for a > > relation. For infrequently vacuumed tables we can use the time between > > checkpoints to establish a *more aggressive* cutoff for freezing than what a > > percent-of-time-since-last-vacuum approach would provide. If e.g. a table gets > > vacuumed every 100 hours and checkpoint timeout is 1 hour, no realistic > > percent-of-time-since-last-vacuum setting will allow freezing, as all dirty > > pages will be too new. To allow freezing a decent proportion of those, we > > could allow freezing pages that lived longer than ~20% > > time-between-recent-checkpoints. > > Yeah, I don't know if that's exactly the right idea, but I think it's > in the direction that I was thinking about. I'd even be happy with > 100% of the time-between-recent checkpoints, maybe even 200% of > time-between-recent checkpoints. But I think there probably should be > some threshold beyond which we say "look, this doesn't look like it > gets touched that much, let's just freeze it so we don't have to come > back to it again later." 
As long as the most extreme cases are prevented, unnecessary freezing is IMO far less harmful than freezing too little. I'm worried that using something as long as 100-200% time-between-recent-checkpoints won't handle insert-mostly workloads well, which IME are also workloads suffering quite badly under our current scheme - and which are quite common. > I think part of the calculus here should probably be that when the > freeze threshold is long, the potential gains from making it even > longer are not that much. If I change the freeze threshold on a table > from 1 minute to 1 hour, I can potentially save uselessly freezing > that page 59 times per hour, every hour, forever, if the page always > gets modified right after I touch it. If I change the freeze threshold > on a table from 1 hour to 1 day, I can only save 23 unnecessary > freezes per day. Percentage-wise, the overhead of being wrong is the > same in both cases: I can have as many extra freeze operations as I > have page modifications, if I pick the worst possible times to freeze > in every case. But in absolute terms, the savings in the second > scenario are a lot less. I think if a user is accessing a table > frequently, the overhead of jamming a useless freeze in between every > table access is going to be a lot more noticeable than when the table > is only accessed every once in a while. And I also think it's a lot > less likely that we'll reliably get it wrong. Workloads that touch a > page and then touch it again ~N seconds later can exist for all values > of N, but I bet they're way more common for small values of N than > large ones. True. And with larger Ns it also becomes more likely that we'd need to freeze the rows anyway. I've seen tables being hit with several anti-wraparound vacuums a day, but not several anti-wraparound vacuums a minute... > Is there also a need for a similar guard in the other direction? 
Let's > say that autovacuum_naptime=15s and on some particular table it > triggers every time. I've actually seen this on small queue tables. Do > you think that, in such tables, we should freeze pages that haven't > been modified in 15s? I don't think it matters much: proportionally to the workload of rewriting nearly all of the table every few seconds, the overhead of freezing a bunch of already dirty pages is negligible. > > Hm, possibly stupid idea: What about using shared_buffers residency as a > > factor? If vacuum had to read in a page to vacuum it, a) we would need read > > IO > > to freeze it later, as we'll soon evict the page via the ringbuffer b) > > non-residency indicates the page isn't constantly being modified? > > This doesn't seem completely stupid, but I fear it would behave > dramatically differently on a workload a little smaller than s_b vs. > one a little larger than s_b, and that doesn't seem good. Hm. I'm not sure that that's a real problem. In the case of a workload bigger than s_b, having to actually read the page again increases the cost of freezing later, even if the workload is just a bit bigger than s_b. Greetings,
Re: Build the docs if there are changes in docs and don't run other tasks if the changes are only in docs
Hi, On Tue, 26 Sept 2023 at 13:48, Peter Eisentraut wrote: > > On 25.09.23 12:56, Nazir Bilal Yavuz wrote: > > + # Only run if a specific OS is not requested and if there are changes in > > docs > > + # or in the CI files. > > + skip: > > > +$CIRRUS_CHANGE_MESSAGE =~ '.*\nci-os-only:.*' || > > +!changesInclude('doc/**', > > +'.cirrus.yml', > > +'.cirrus.tasks.yml', > > +'src/backend/catalog/sql_feature_packages.txt', > > +'src/backend/catalog/sql_features.txt', > > +'src/backend/utils/errcodes.txt', > > +'src/backend/utils/activity/wait_event_names.txt', > > + > > 'src/backend/utils/activity/generate-wait_event_types.pl', > > +'src/include/parser/kwlist.h') > > This is kind of annoying. Now we need to maintain yet another list of > these dependencies and keep it in sync with the build systems. I agree. > > I think meson can produce a dependency tree from a build. Maybe we > could use that somehow and have Cirrus cache it between runs? I will check that. > > Also note that there are also dependencies in the other direction. For > example, the psql help is compiled from XML DocBook sources. So your > other patch would also need to include similar changesInclude() clauses. > If there are more cases like this, it may not be worth it. Instead, we can just: - Build the docs when the doc related files are changed (This still creates a dependency like you said). - Skip CI completely if the README files are changed. What are your opinions on these? Regards, Nazir Bilal Yavuz Microsoft
Index AmInsert Parameter Confused?
typedef bool (*aminsert_function) (Relation indexRelation, Datum *values, bool *isnull, ItemPointer heap_tid, Relation heapRelation, IndexUniqueCheck checkUnique, bool indexUnchanged, struct IndexInfo *indexInfo); Why is there a heap_tid? We haven't inserted the value into the index yet, so where does it come from?
Re: Questions about the new subscription parameter: password_required
On 9/22/23 21:58, Robert Haas wrote: I think that there normally shouldn't be any problem here, because if form->subpasswordrequired is true, we expect that the connection string should contain a password which the remote side actually uses, or we expect the subscription to be owned by the superuser. If neither of those things is the case, then either the superuser made a subscription that doesn't use a password owned by a non-superuser without fixing subpasswordrequired, or else the configuration on the remote side has changed so that it now doesn't use the password when formerly it did. In the first case, perhaps it would be fine to go ahead and drop the slot, but in the second case I don't think it's OK from a security point of view, because the command is going to behave the same way no matter who executes the drop command, and a non-superuser who drops the slot shouldn't be permitted to rely on the postgres process's identity to do anything on a remote node -- including dropping a replication slot. So I tentatively think that this behavior is correct. I must admit I hadn't considered the implication when the configuration on the remote side has changed and we use a non-superuser. I see how it could be problematic. I will try to come up with a documentation patch. Maybe you're altering it in a way that doesn't involve any connections or any changes to the connection string? There's no security issue if, say, you rename it. I looked at the code again. Indeed, of the ALTER SUBSCRIPTION commands, only ALTER SUBSCRIPTION .. CONNECTION uses walrcv_check_conninfo(). I checked the other thread (Re: [16+] subscription can end up in inconsistent state [1]) and will try the patch. Is it the thread you were referring to earlier? [1] https://www.postgresql.org/message-id/flat/5dff4caf26f45ce224a33a5e18e110b93a351b2f.camel%40j-davis.com#ff4a06505de317b1ad436b8102a69446 -- Benoit Lobréau Consultant http://dalibo.com
Re: Partial aggregates pushdown
On Tue, Sep 26, 2023 at 06:26:25AM +, fujii.y...@df.mitsubishielectric.co.jp wrote: > Hi Mr.Bruce. > > I think this needs to be explained in the docs. I am ready to adjust the > > patch to improve the wording whenever you are > > ready. Should I do it now and post an updated version for you to use? > The following explanation was omitted from the documentation, so I added it. > > * false - no check > > * true, remove version < sender - check > I have responded to your comment, but if there is a problem with the wording, > could you please suggest a correction? I like your new wording, thanks. -- Bruce Momjian https://momjian.us EDB https://enterprisedb.com Only you can decide what is important to you.
What's the ItemPointer's meaning here?
/* Typedef for callback function for table_index_build_scan */ typedef void (*IndexBuildCallback) (Relation index, ItemPointer tid, Datum *values, bool *isnull, bool tupleIsAlive, void *state); When we build an index on an existing heap table, PostgreSQL reads the tuples one by one and passes each tuple's heap TID as this ItemPointer, is that right?
Re: dikkop seems unhappy because of openssl stuff (FreeBSD 14-BETA1)
On 9/20/23 20:09, Tomas Vondra wrote: > On 9/20/23 19:59, Tomas Vondra wrote: >> >> >> On 9/20/23 01:24, Tom Lane wrote: >>> Tomas Vondra writes: bsd@freebsd:~ $ tclsh8.6 % clock scan "1/26/2010" time value too large/small to represent >>> >>> In hopes of replicating this, I tried installing FreeBSD 14-BETA2 >>> aarch64 on my Pi 3B. This test case works fine: >>> >>> $ tclsh8.6 >>> % clock scan "1/26/2010" >>> 1264482000 >>> >>> $ tclsh8.7 >>> % clock scan "1/26/2010" >>> 1264482000 >>> >>> and unsurprisingly, pltcl's regression tests pass. I surmise >>> that something is broken in BETA1 that they fixed in BETA2. >>> >>> plpython works too, with the python 3.9 package (and no older >>> python). >>> >>> However, all is not peachy, because plperl doesn't work. >>> Trying to CREATE EXTENSION either plperl or plperlu leads >>> to a libperl panic: >>> >>> pl_regression=# create extension plperl; >>> server closed the connection unexpectedly >>> This probably means the server terminated abnormally >>> before or while processing the request. >>> The connection to the server was lost. Attempting reset: Succeeded. >>> >>> with this in the postmaster log: >>> >>> panic: pthread_key_create failed >>> >>> That message is certainly not ours, so it must be coming out of libperl. >>> >>> Another thing that seemed strange is that ecpg's preproc.o takes >>> O(forever) to compile. I killed the build after observing that the >>> compiler had gotten to 40 minutes of CPU time, and redid that step >>> with PROFILE=-O0, which allowed it to compile in 20 seconds or so. >>> (I also tried -O1, but gave up after a few minutes.) This machine >>> can compile the main backend grammar in a minute or two, so there is >>> something very odd there. >>> >>> I'm coming to the conclusion that 14-BETA is, well, beta grade. >>> I'll be interested to see if you get the same results when you >>> update to BETA2. >> >> Thanks, I'll try that when I'll be at the office next week. 
>> > > FWIW when I disabled tcl, the tests pass (it's running with --nostatus > --nosend, so it's not visible on the buildfarm site). Including the > plperl stuff: > > == running regression test queries== > test plperl ... ok 397 ms > test plperl_lc... ok 152 ms > test plperl_trigger ... ok 374 ms > test plperl_shared... ok 163 ms > test plperl_elog ... ok 184 ms > test plperl_util ... ok 210 ms > test plperl_init ... ok 150 ms > test plperlu ... ok 117 ms > test plperl_array ... ok 228 ms > test plperl_call ... ok 189 ms > test plperl_transaction ... ok 412 ms > test plperl_plperlu ... ok 238 ms > > == > All 12 tests passed. > == > > I wonder if this got broken between BETA1 and BETA2. > Hmmm, I got to install BETA2 yesterday, but I still se the tcl failure: select tcl_date_week(2010,1,26); - tcl_date_week - 04 -(1 row) - +ERROR: time value too large/small to represent +CONTEXT: time value too large/small to represent +while executing +"ConvertLocalToUTC $date[set date {}] $TZData($timezone) 2361222" +(procedure "FreeScan" line 86) +invoked from within +"FreeScan $string $base $timezone $locale" +(procedure "::tcl::clock::scan" line 68) +invoked from within +"::tcl::clock::scan 1/26/2010" +("uplevel" body line 1) +invoked from within +"uplevel 1 [info level 0]" +(procedure "::tcl::clock::scan" line 4) +invoked from within +"clock scan "$2/$3/$1"" +(procedure "__PLTcl_proc_55335" line 3) +invoked from within +"__PLTcl_proc_55335 2010 1 26" +in PL/Tcl function "tcl_date_week" select tcl_date_week(2001,10,24); I wonder what's the difference between the systems ... All I did was writing the BETA2 image to SD card, and install a couple packages: pkg install xml2c libxslt gettext-tools ccache tcl tcl87 \ p5-Test-Harness p5-IPC-Run gmake htop bash screen \ python tcl86 nano p5-Test-LWP-UserAgent \ p5-LWP-Protocol-https And then perl ./run_branches.pl --run-all --nosend --nostatus --verbose with the buildfarm config used by dikkop. 
regards -- Tomas Vondra EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: Invalidate the subscription worker in cases where a user loses their superuser status
On Tue, 26 Sept 2023 at 13:03, Peter Smith wrote: > > Here are some comments for patch v2-0001. > > == > src/backend/replication/logical/worker.c > > 1. maybe_reread_subscription > > ereport(LOG, > (errmsg("logical replication worker for subscription \"%s\" > will restart because of a parameter change", > MySubscription->name))); > > Is this really a "parameter change" though? It might be a stretch to > call the user role change a subscription parameter change. Perhaps > this case needs its own LOG message? When I was doing this change the same thought had come to my mind too but later I noticed that in case of owner change there was no separate log message. Since superuser check is somewhat similar to owner change, I felt no need to make any change for this. > == > src/include/catalog/pg_subscription.h > > 2. > char*origin; /* Only publish data originating from the > * specified origin */ > + bool isownersuperuser; /* Is subscription owner superuser? */ > } Subscription; > > > Is a new Subscription field member really necessary? The Subscription > already has 'owner' -- why doesn't function > maybe_reread_subscription() just check: > > (!superuser_arg(sub->owner) && superuser_arg(MySubscription->owner)) We need the new variable so that we store this value when the worker is started and check the current value with the value that was when the worker was started. Since we need the value of the superuser status when the worker is started, I feel this variable is required. > == > src/test/subscription/t/027_nosuperuser.pl > > 3. > # The apply worker should get restarted after the superuser prvileges are > # revoked for subscription owner alice. > > typo > > /prvileges/privileges/ >. > ~ I will change this in the next version > 4. > +# After the user becomes non-superuser the apply worker should be restarted > and > +# it should fail with 'password is required' error as password option is not > +# part of the connection string. 
> > /as password option/because the password option/ I will change this in the next version Regards, Vignesh
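For context, the scenario the patch (and 027_nosuperuser.pl) exercises can be sketched roughly as follows (hypothetical names; the restart-on-superuser-loss behavior is what the patch under review adds):

```sql
-- A subscription owned by superuser alice needs no password in its
-- connection string.
ALTER SUBSCRIPTION regress_sub OWNER TO alice;
-- Revoking superuser should invalidate the apply worker so it restarts
-- and re-evaluates the password requirement.
ALTER ROLE alice NOSUPERUSER;
```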
Re: [PATCH] Add inline comments to the pg_hba_file_rules view
> On 26 Sep 2023, at 15:19, Peter Eisentraut wrote: > > On 04.09.23 11:54, Jim Jones wrote: >> This patch proposes adding the column "comment" to the pg_hba_file_rules view. It >> basically parses the inline comment (if any) of a valid pg_hba.conf entry >> and displays it in the new column. >> For such pg_hba entries ... >> host db jim 127.0.0.1/32 md5 # foo >> host db jim 127.0.0.1/32 md5 #bar >> host db jim 127.0.0.1/32 md5 # #foo# > > I'm skeptical about this. > > First, there are multiple commenting styles. The end-of-line style is less > common in my experience, because pg_hba.conf lines tend to be long. Another > style is > > # foo > host db jim 127.0.0.1/32 md5 > # bar > host db jim 127.0.0.1/32 md5 > > or even as a block > > # foo and bar > host db jim 127.0.0.1/32 md5 > host db jim 127.0.0.1/32 md5 Or even a more complicated one (which I've seen variants of in production) where only horizontal whitespace separates two subsequent lines of comments: # Block comment host db jim 127.0.0.1/32 md5 #end of line multi- #line comment # A new block comment directly following host db jim 127.0.0.1/32 md5 > I think we should leave file comments be file comments. If we want some > annotations to be exported to higher-level views, we should make that an > intentional and explicit separate feature. +1 -- Daniel Gustafsson
RE: Move global variables of pgoutput to plugin private scope.
On Tuesday, September 26, 2023 4:40 PM Amit Kapila wrote: > > On Tue, Sep 19, 2023 at 12:48 PM Zhijie Hou (Fujitsu) > wrote: > > > > - static bool publish_no_origin; > > > > This flag is also local to a pgoutput instance, and we didn't reset the > > flag in the output shutdown callback, so if we consume changes from > > different slots, then the second call would reuse the flag value that > > is set in the first call, which is unexpected. To completely avoid this > > issue, we think we'd better move this flag to the output plugin private data > structure. > > > > Example: > > SELECT data FROM pg_logical_slot_peek_binary_changes('isolation_slot_1', > NULL, NULL, 'proto_version', '1', 'publication_names', 'pub', 'origin', > 'none'); --- > Set origin in this call. > > SELECT data FROM pg_logical_slot_peek_binary_changes('isolation_slot_2', > NULL, NULL, 'proto_version', '1', 'publication_names', 'pub'); -- Didn't set > origin, but will reuse the origin flag in the first call. > > > > char*origin; > + bool publish_no_origin; > } PGOutputData; > > Do we really need a new parameter in the above structure? Can't we just use the > existing origin in the same structure? Please remember if this needs to be > backpatched then it may not be a good idea to add a new parameter in the > structure, but apart from that having two members to represent similar > information doesn't seem advisable to me. I feel for back-branches we can just > use > PGOutputData->origin for comparison and for HEAD, we can remove origin > and just use a boolean to avoid any extra cost for comparisons for each > change. OK, I agree to remove the origin string on HEAD and we can add that back when we support other origin values. I also modified to use the string for comparison as suggested for back-branches. I will also test it locally to confirm it doesn't affect the perf. > > Can we add a test case to cover this case? Added one in replorigin.sql. 
Attached is the patch set for publish_no_origin and in_streaming, including the patch (v2-PG16-0001) for back-branches. Since the patch for the hash table needs more thought, I didn't post it this time. Best Regards, Hou zj v2-0001-Maintain-publish_no_origin-in-output-plugin-priv.patch v2-PG16-0001-Maintain-publish_no_origin-in-output-plugin-priva.patch v2-0002-Move-in_streaming-to-output-private-data.patch
RE: Move global variables of pgoutput to plugin private scope.
On Tuesday, September 19, 2023 1:44 PM Michael Paquier wrote: > > On Tue, Sep 19, 2023 at 04:10:39AM +, Zhijie Hou (Fujitsu) wrote: > > Currently we have several global variables in pgoutput, but each of > > them is inherently local to an individual pgoutput instance. This > > could cause issues if we switch to a different output plugin instance in > > one session and could fail to reset their values in case of errors. The > > analysis for each variable is as > > follows: > > (Moved the last block of the message as per the relationship between > RelationSyncCache and publications_valid). > > > - static HTAB *RelationSyncCache = NULL; > > > > pgoutput creates this hash table under cacheMemoryContext to remember > > the relation schemas that have been sent, but it's local to an > > individual pgoutput instance, and because it's under a global memory > > context, the hashtable is not properly cleared in error paths, which > > means it has a risk of being accessed in a different output plugin instance. > This was also mentioned in another thread[2]. > > > > So I think we'd better allocate this under the output plugin private context. > > > > But note that, instead of completely moving the hash table into the > > output plugin private data, we need to keep the static pointer > > variable for the map to be accessed by the syscache callbacks. This is > > because syscache callbacks won't be un-registered even after shutting > > down the output plugin, so we need a static pointer to cache the map pointer > so that callbacks can check it. > > > > - static bool publications_valid; > > > > I thought we need to move this to private data as well, but we need to > > access this in a syscache callback, which means we need to keep the static > variable. 
> > FWIW, I think that keeping publications_valid makes the code kind of confusing > once 0001 is applied, because this makes the handling of the cached data for > relations and publications even more inconsistent than it is now, with a mixed > bag of two different logics caused by the relationship between the synced > relation cache and the publication > cache: RelationSyncCache tracks if relations should be rebuilt, while > publications_valid does it for the publication data, but both are still > static and > could be shared by multiple pgoutput contexts. On top of that, > publications_valid is hidden at the top of pgoutput.c within a bunch of > declarations and no comments to explain why it's here (spoiler: to handle the > cache rebuilds with its reset in the cache callback). > > I agree that CacheMemoryContext is not really a good idea to cache the data > only proper to a pgoutput session and that tracking a context in the output > data makes the whole cleanup attractive, but it also seems to me that we > should avoid entirely the use of relcache callbacks if the intention is to > have one > RelationSyncEntry per pgoutput. The patch does something different than > HEAD and than having one RelationSyncEntry per pgoutput: RelationSyncEntry > can reference *everything*, with its data stored in multiple memory contexts > as > of one per pgoutput. It looks like RelationSyncEntry should be a list or a > hash > table, at least, so that it can refer to multiple pgoutput states. Knowing > that a > session can only use one replication slot with MyReplicationSlot, not sure > that's > worth bothering with. As a whole, > 0001 with its changes for RelationSyncCache don't seem like an improvement > to me. > Thanks for your comments. Currently, I am not sure how to avoid the use of the syscache callback functions, so I think the change for RelationSyncCache needs more thought and I will retry later if I find another way. Best Regards, Hou zj
RE: [PGdocs] fix description for handling pf non-ASCII characters
Dear Karl, Thank you for reviewing! PSA new version. > I see a few problems with the English and style of the patches > and am commenting below and have signed up as a reviewer. Your effort is quite helpful for me. > At > commitfest.postgresql.org I have marked the thread > as needing author attention. Hayato, you will need > to mark the thread as needing review when you have > replied to this message. Sure. I will change the status after posting the patch. Before replying your comments, I thought I should show the difference between versions. Regarding old versions (here PG15 was used), non-ASCIIs (like Japanese) are replaced with '?'. ``` psql (15.4) Type "help" for help. postgres=# SET application_name TO 'あああ'; SET postgres=# SHOW application_name ; application_name -- ? (1 row) ``` As for the HEAD, as my patch said, non-ASCIIs are replaced with hexadecimal representations. (Were my terminologies correct?). ``` psql (17devel) Type "help" for help. postgres=# SET application_name TO 'あああ'; SET postgres=# SHOW application_name ; application_name -- \xe3\x81\x82\xe3\x81\x82\xe3\x81\x82 (1 row) ``` Note that non-printable ASCIIs are also replaced with the same rule. ``` psql (15.4) Type "help" for help. postgres=# SET application_name TO E'\x03'; SET postgres=# SHOW application_name ; application_name -- ? (1 row) psql (17devel) Type "help" for help. postgres=# SET application_name TO E'\x03'; SET postgres=# SHOW application_name ; application_name -- \x03 (1 row) ``` > Now, to reviewing the patch: > > First, it is now best practice in the PG docs to > put a line break at the end of each sentence. > At least for the sentences on the lines you change. > (No need to update the whole document!) Please > do this in the next version of your patch. I don't > know if this is a requirement for acceptance by > a committer, but it won't hurt. I didn't know that. Could you please tell me if you have a source? Anyway, I put a line break for each sentences for now. 
> diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml > index e700782d3c..a4ce99ba4d 100644 > --- a/doc/src/sgml/config.sgml > +++ b/doc/src/sgml/config.sgml > @@ -7040,9 +7040,8 @@ local0.*/var/log/postgresql > The name will be displayed in the > pg_stat_activity view and included in CSV log > entries. It can also be included in regular log entries via the linkend="guc-log-line-prefix"/> parameter. > -Only printable ASCII characters may be used in the > -application_name value. Other characters > will be > -replaced with question marks (?). > +Non-ASCII characters used in the > application_name > +will be replaced with hexadecimal strings. > > > > > Don't use the future tense to describe the system's behavior. Instead > of "will be" write "are". (Yes, this change would be an improvement > on the original text. We should fix it while we're working on it > and our attention is focused.) > > It is more accurate, if I understand the issue, to say that characters > are replaced with hexadecimal _representations_ of the input bytes. > Finally, it would be good to know what representation we're talking > about. Perhaps just give the \xhh example and say: the Postgres > C-style escaped hexadecimal byte value. And hyperlink to > https://www.postgresql.org/docs/16/sql-syntax-lexical.html#SQL-SYNTAX-ST > RINGS-ESCAPE > > (The docbook would be, depending on text you want to link: > > C-style escaped hexadecimal > byte value. > > I think. You link to id="someidvalue" attribute values.) IIUC the word " Postgres" cannot be used in the doc. Based on your all comments, I changed as below. How do you think? ``` Characters that are not printable ASCII, like \x03, are replaced with the PostgreSQL C-style escaped hexadecimal byte value. ``` > @@ -8037,10 +8036,9 @@ COPY postgres_log FROM > '/full/path/to/logfile.csv' WITH csv; > The name can be any string of less > than NAMEDATALEN characters (64 characters > in > a standard > -build). 
Only printable ASCII characters may be used in the
> -cluster_name value. Other characters will be
> -replaced with question marks (?). No name is shown
> -if this parameter is set to the empty string '' (which is
> +build). Non-ASCII characters used in the cluster_name
> +will be replaced with hexadecimal strings. No name is shown if this
> +parameter is set to the empty string '' (which is
>  the default). This parameter can only be set at server start.
>
> Same review as for the first patch hunk.

Fixed as above. You can refer to the patch.

> diff --git a/doc/src/sgml/postgres-fdw.sgml
> b/doc/src/sgml/postgres-fdw.sgml
> index 5062d712e7..98785e87ea 100644
> ---
Re: [PATCH] Add inline comments to the pg_hba_file_rules view
On 04.09.23 11:54, Jim Jones wrote:

This patch proposes the column "comment" to the pg_hba_file_rules view. It basically parses the inline comment (if any) of a valid pg_hba.conf entry and displays it in the new column. For such pg_hba entries ...

host db jim 127.0.0.1/32 md5 # foo
host db jim 127.0.0.1/32 md5 #bar
host db jim 127.0.0.1/32 md5 # #foo#

I'm skeptical about this. First, there are multiple commenting styles. The end-of-line style is less common in my experience, because pg_hba.conf lines tend to be long. Another style is

# foo
host db jim 127.0.0.1/32 md5

# bar
host db jim 127.0.0.1/32 md5

or even as a block

# foo and bar
host db jim 127.0.0.1/32 md5
host db jim 127.0.0.1/32 md5

Another potential problem is that maybe people don't want their comments leaked out of the file. Who knows what they have written in there. I think we should leave file comments be file comments. If we want some annotations to be exported to higher-level views, we should make that an intentional and explicit separate feature.
Re: Partial aggregates pushdown
fujii.y...@df.mitsubishielectric.co.jp wrote on 2023-09-25 06:18:

Hi Mr.Bruce, Mr.Pyhalov, Mr.Finnerty, hackers. Thank you for your valuable comments. I sincerely apologize for the very late reply. Here is a response to your comments and a fix to the patch.

Tuesday, August 8, 2023 at 3:31 Bruce Momjian
> I have modified the program except for the point "if the version of the remote server is less than PG17".
> Instead, we have addressed the following.
> "If check_partial_aggregate_support is true and the remote server version is older than the local server
> version, postgres_fdw does not assume that the partial aggregate function is on the remote server unless
> the partial aggregate function and the aggregate function match."
> The reason for this is to maintain compatibility with any aggregate function that does not support partial
> aggregate in one version of V1 (V1 is PG17 or higher), even if the next version supports partial aggregate.
> For example, string_agg does not support partial aggregation in PG15, but it will support partial aggregation
> in PG16.

Just to clarify, I think you are saying:

If check_partial_aggregate_support is true and the remote server version is older than the local server version, postgres_fdw checks if the partial aggregate function exists on the remote server during planning and only uses it if it does.

I tried to phrase it in a positive way, and mentioned the plan time distinction. Also, I am sorry I was away for most of July and am just getting to this.

Thanks for your comment. In the documentation, the description of check_partial_aggregate_support is as follows (please see postgres-fdw.sgml).

--
check_partial_aggregate_support (boolean)

Only if this option is true, during query planning, postgres_fdw connects to the remote server and checks if the remote server version is older than the local server version.
If so, postgres_fdw assumes that, for each built-in aggregate function, the partial aggregate function is not defined on the remote server unless the partial aggregate function and the aggregate function match. The default is false.
--

Thursday, 20 July 2023 19:23 Alexander Pyhalov:
fujii.y...@df.mitsubishielectric.co.jp wrote on 2023-07-19 03:43:
> Hi Mr.Pyhalov, hackers.
> 3)
> I modified the patch to safely do a partial aggregate pushdown for
> queries which contain having clauses.

Hi. Sorry, but I don't see how it could work.

We apologize for any inconvenience caused. Thanks to Pyhalov's and Jim's comments, I have realized that I made a fundamental mistake regarding the pushdown of the HAVING clause and the difficulty of achieving it when performing partial aggregate pushdown. So I removed the code for pushdown of the HAVING clause when performing partial aggregate pushdown.

Thursday, 20 July 2023 19:23 Alexander Pyhalov:
As for changes in planner.c (setGroupClausePartial()) I have several questions.

1) Why don't we add non_group_exprs to pathtarget->exprs when partial_target->exprs is not set?

2) We replace extra->partial_target->exprs with partial_target->exprs after processing. Why are we sure that after this tleSortGroupRef is correct?

Response to 1) The code you pointed out was unnecessary. I have removed this code. Also, the process of adding a PlaceHolderVar's expr to partial_target was missing, so I fixed this.

Response to 2) The procedure for making extra->groupClausePartial and extra->partial_target in make_partial_grouping_target in this patch is as follows.

STEP1. From grouping_target->exprs, extract the Aggrefs, and the Vars and PlaceHolderVars that are not included in any Aggref.

STEP2. setGroupClausePartial sets a copy of the original groupClause to extra->groupClausePartial and sets a copy of the original partial_target to extra->partial_target.

STEP3. setGroupClausePartial adds the Vars and PlaceHolderVars from STEP1 to partial_target.
The sortgroupref value added to partial_target->sortgrouprefs is set to (the maximum value of the existing sortgrouprefs) + 1. setGroupClausePartial then adds a SortGroupClause sgc whose sgc->tleSortGroupRef matches that sortgroupref to the GroupClause list.

STEP4. add_new_columns_to_pathtarget adds STEP1's Aggrefs to partial_target.

Due to STEP2, the list of tleSortGroupRefs set in extra->groupClausePartial is not duplicated.

Do you mean that extra->partial_target->sortgrouprefs is not replaced, and so we preserve tleSortGroupRef numbers? I'm suspicious about rewriting extra->partial_target->exprs with partial_target->exprs - I'm still not sure that we don't lose information added by add_column_to_pathtarget() to extra->partial_target->exprs. Also look at the following example.

EXPLAIN VERBOSE SELECT count(*), (b/2)::numeric FROM pagg_tab GROUP BY b/2 ORDER BY 1;
QUERY PLAN
---
Sort (cost=511.35..511.47 rows=50
Re: Correct the documentation for work_mem
On Tue, 12 Sept 2023 at 03:03, Bruce Momjian wrote: > > On Mon, Sep 11, 2023 at 10:02:55PM +1200, David Rowley wrote: > > It's certainly not a show-stopper. I do believe the patch makes some > > improvements. The reason I'd prefer to see either "and" or "and/or" > > in place of "or" is because the text is trying to imply that many of > > these operations can run at the same time. I'm struggling to > > understand why, given that there could be many sorts and many hashes > > going on at once that we'd claim it could only be one *or* the other. > > If we have 12 sorts and 4 hashes then that's not "several sort or hash > > operations", it's "several sort and hash operations". Of course, it > > could just be sorts or just hashes, so "and/or" works fine for that. > > Yes, I see your point and went with "and", updated patch attached. Looks good to me. David
[Code Cleanup] : Small code cleanup in twophase.sql
Hi,

PFA a small code cleanup in twophase.sql, which contains a drop table statement for 'test_prepared_savepoint'. That statement appears, to me, to be missing from the cleanup section of that file. To support it I have the below points:

1) Grepping for the table 'test_prepared_savepoint' shows occurrences only in the twophase.out & twophase.sql files. This means the table is local to that sql test file and not used in any other test file.

2) I don't see any comment on why this was not added in the cleanup section of twophase.sql, while drops for the other two test tables are done.

3) I ran "make check-world" with the patch and I don't see any failures.

Kindly correct me if I missed anything.

Regards, Nishant (EDB).

v1-0001-Small-code-cleanup-in-twophase.sql.patch
Description: Binary data
Re: Add test module for Table Access Method
On Mon, Jun 5, 2023 at 1:24 PM Fabrízio de Royes Mello <fabriziome...@gmail.com> wrote:
>
> On Sat, Jun 3, 2023 at 7:42 PM Fabrízio de Royes Mello <fabriziome...@gmail.com> wrote:
> >
> > Hi all,
> >
> > During the PGCon Unconference session about Table Access Method, one missing item pointed out was that we currently lack documentation and examples of TAM.
> >
> > So in order to improve things a bit in this area I'm proposing to add a test module for Table Access Method, similar to what we already have for Index Access Method.
> >
> > This code is based on the "blackhole_am" implemented by Michael Paquier: https://github.com/michaelpq/pg_plugins/tree/main/blackhole_am
> >
> > Just added some more tests, ran pgindent and also organized some comments and README.txt a bit.
>
> Rebased version.

-- Fabrízio de Royes Mello

From 5b6642b520874f4ca7023fc33d2e8e875fb64693 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Fabr=C3=ADzio=20de=20Royes=20Mello?=
Date: Sat, 3 Jun 2023 17:23:05 -0400
Subject: [PATCH v3] Add test module for Table Access Method

---
 src/test/modules/Makefile                     |   1 +
 src/test/modules/dummy_table_am/.gitignore    |   3 +
 src/test/modules/dummy_table_am/Makefile      |  20 +
 src/test/modules/dummy_table_am/README        |   5 +
 .../dummy_table_am/dummy_table_am--1.0.sql    |  14 +
 .../modules/dummy_table_am/dummy_table_am.c   | 500 ++
 .../dummy_table_am/dummy_table_am.control     |   5 +
 .../expected/dummy_table_am.out               | 207
 src/test/modules/dummy_table_am/meson.build   |  33 ++
 .../dummy_table_am/sql/dummy_table_am.sql     |  55 ++
 src/test/modules/meson.build                  |   1 +
 11 files changed, 844 insertions(+)
 create mode 100644 src/test/modules/dummy_table_am/.gitignore
 create mode 100644 src/test/modules/dummy_table_am/Makefile
 create mode 100644 src/test/modules/dummy_table_am/README
 create mode 100644 src/test/modules/dummy_table_am/dummy_table_am--1.0.sql
 create mode 100644 src/test/modules/dummy_table_am/dummy_table_am.c
 create mode 100644 src/test/modules/dummy_table_am/dummy_table_am.control
 create mode
100644 src/test/modules/dummy_table_am/expected/dummy_table_am.out create mode 100644 src/test/modules/dummy_table_am/meson.build create mode 100644 src/test/modules/dummy_table_am/sql/dummy_table_am.sql diff --git a/src/test/modules/Makefile b/src/test/modules/Makefile index e81873cb5a..cb7a5b970a 100644 --- a/src/test/modules/Makefile +++ b/src/test/modules/Makefile @@ -10,6 +10,7 @@ SUBDIRS = \ delay_execution \ dummy_index_am \ dummy_seclabel \ + dummy_table_am \ libpq_pipeline \ plsample \ spgist_name_ops \ diff --git a/src/test/modules/dummy_table_am/.gitignore b/src/test/modules/dummy_table_am/.gitignore new file mode 100644 index 00..44d119cfcc --- /dev/null +++ b/src/test/modules/dummy_table_am/.gitignore @@ -0,0 +1,3 @@ +# Generated subdirectories +/log/ +/results/ diff --git a/src/test/modules/dummy_table_am/Makefile b/src/test/modules/dummy_table_am/Makefile new file mode 100644 index 00..9ea4a590c6 --- /dev/null +++ b/src/test/modules/dummy_table_am/Makefile @@ -0,0 +1,20 @@ +# src/test/modules/dummy_table_am/Makefile + +MODULES = dummy_table_am + +EXTENSION = dummy_table_am +DATA = dummy_table_am--1.0.sql +PGFILEDESC = "dummy_table_am - table access method template" + +REGRESS = dummy_table_am + +ifdef USE_PGXS +PG_CONFIG = pg_config +PGXS := $(shell $(PG_CONFIG) --pgxs) +include $(PGXS) +else +subdir = src/test/modules/dummy_table_am +top_builddir = ../../../.. +include $(top_builddir)/src/Makefile.global +include $(top_srcdir)/contrib/contrib-global.mk +endif diff --git a/src/test/modules/dummy_table_am/README b/src/test/modules/dummy_table_am/README new file mode 100644 index 00..35211554b0 --- /dev/null +++ b/src/test/modules/dummy_table_am/README @@ -0,0 +1,5 @@ +Dummy Table AM +== + +Dummy table AM is a module for testing any facility usable by an table +access method, whose code is kept a maximum simple. 
diff --git a/src/test/modules/dummy_table_am/dummy_table_am--1.0.sql b/src/test/modules/dummy_table_am/dummy_table_am--1.0.sql new file mode 100644 index 00..aa0fd82e61 --- /dev/null +++ b/src/test/modules/dummy_table_am/dummy_table_am--1.0.sql @@ -0,0 +1,14 @@ +/* src/test/modules/dummy_table_am/dummy_table_am--1.0.sql */ + +-- complain if script is sourced in psql, rather than via CREATE EXTENSION +\echo Use "CREATE EXTENSION dummy_table_am" to load this file. \quit + +CREATE FUNCTION dummy_table_am_handler(internal) +RETURNS table_am_handler +AS 'MODULE_PATHNAME' +LANGUAGE C; + +-- Access method +CREATE ACCESS METHOD dummy_table_am TYPE TABLE HANDLER dummy_table_am_handler; +COMMENT ON ACCESS METHOD dummy_table_am IS 'dummy table access method'; + diff --git a/src/test/modules/dummy_table_am/dummy_table_am.c b/src/test/modules/dummy_table_am/dummy_table_am.c new file mode 100644 index
Re: Avoid a possible out-of-bounds access (src/backend/optimizer/util/relnode.c)
On Wed, 27 Sept 2023 at 01:31, Ranier Vilela wrote: > It seems to me that it adds a LEA instruction. > https://godbolt.org/z/b4jK3PErE There's a fairly significant difference in the optimisability of a comparison with a compile-time constant vs a variable. For example, would you expect the compiler to emit assembly for each side of the boolean AND in: if (a > 12 && a > 20), or how about if (a > 12 && a > y)? No need to answer. Just consider it. I suggest keeping your experiments as close to the target code as practical. You might be surprised by what the compiler can optimise. David
Re: Avoid a possible out-of-bounds access (src/backend/optimizer/util/relnode.c)
On Tue, Sep 26, 2023 at 07:34, Ashutosh Bapat <ashutosh.bapat@gmail.com> wrote:

> On Tue, Sep 26, 2023 at 3:32 PM David Rowley wrote:
> >
> > find_base_rel() could be made more robust for free by just casting the
> > relid and simple_rel_array_size to uint32 while checking that relid <
> > root->simple_rel_array_size. The 0th element should be NULL anyway,
> > so "if (rel)" should let relid==0 calls through and allow that to
> > ERROR still. I see that just changes a "jle" to "jnb" vs adding an
> > additional jump for Ranier's version. [1]
>
> That's a good suggestion.
>
> I am fine with find_base_rel() as it is today as well. But
> future-proofing it seems to be fine too.
>
> > It seems worth not making find_base_rel() more expensive than it is
> > today as commonly we just reference root->simple_rel_array[n] directly
> > anyway because it's cheaper. It would be nice if we didn't add further
> > overhead to find_base_rel() as this would make the case for using
> > PlannerInfo.simple_rel_array directly even stronger.
>
> I am curious, is the overhead in find_base_rel() impacting overall
> performance?

It seems to me that it adds a LEA instruction. https://godbolt.org/z/b4jK3PErE

Although it doesn't seem like much, I believe the solution (casting to unsigned) seems better. So +1.

Best regards, Ranier Vilela
Re: Opportunistically pruning page before update
On Tue, Sep 5, 2023 at 1:40 PM Melanie Plageman wrote: > > On Wed, Jun 21, 2023 at 8:51 AM James Coleman wrote: > > While at PGCon I was chatting with Andres (and I think Peter G. and a > > few others who I can't remember at the moment, apologies) and Andres > > noted that while we opportunistically prune a page when inserting a > > tuple (before deciding we need a new page) we don't do the same for > > updates. > > > > Attached is a patch series to do the following: > > > > 0001: Make it possible to call heap_page_prune_opt already holding an > > exclusive lock on the buffer. > > 0002: Opportunistically prune pages on update when the current tuple's > > page has no free space. If this frees up enough space, then we > > continue to put the new tuple on that page; if not, then we take the > > existing code path and get a new page. > > I've reviewed these patches and have questions. > > Under what conditions would this be exercised for UPDATE? Could you > provide an example? > > With your patch applied, when I create a table, the first time I update > it heap_page_prune_opt() will return before actually doing any pruning > because the page prune_xid hadn't been set (it is set after pruning as > well as later in heap_update() after RelationGetBufferForTuple() is > called). > > I actually added an additional parameter to heap_page_prune() and > heap_page_prune_opt() to identify if heap_page_prune() was called from > RelationGetBufferForTuple() and logged a message when this was true. > Running the test suite, I didn't see any UPDATEs executing > heap_page_prune() from RelationGetBufferForTuple(). I did, however, see > other statement types doing so (see RelationGetBufferForTuple()'s other > callers). Was that intended? > > > I started to work on benchmarking this, but haven't had time to devote > > properly to that, so I'm wondering if there's anyone who might be > > interested in collaborating on that part. 
> > I'm interested in this feature and in helping with it/helping with > benchmarking it, but I don't yet understand the design in its current > form. Hi Melanie, Thanks for taking a look at this! Apologies for the long delay in replying: I started to take a look at your questions earlier, and it turned into more of a rabbit hole than I'd anticipated. I've since been distracted by other things. So -- I don't have any conclusions here yet, but I'm hoping at or after PGConf NYC that I'll be able to dedicate the time this deserves. Thanks, James Coleman
Re: [PoC] pg_upgrade: allow to upgrade publisher node
On Tue, Sep 26, 2023 at 10:51 AM Hayato Kuroda (Fujitsu) wrote:
>
> Again, thank you for reviewing! PSA a new version.

Thanks for the new patch. Here's a comment on v46:

1.
+Datum
+binary_upgrade_validate_wal_logical_end(PG_FUNCTION_ARGS)

+{ oid => '8046', descr => 'for use by pg_upgrade',
+  proname => 'binary_upgrade_validate_wal_logical_end', proisstrict => 'f',
+  provolatile => 'v', proparallel => 'u', prorettype => 'bool',
+  proargtypes => 'pg_lsn',
+  prosrc => 'binary_upgrade_validate_wal_logical_end' },

I think this patch can avoid catalog changes by turning binary_upgrade_validate_wal_logical_end into a FRONTEND-only function sitting in xlogreader.c after making InitXLogReaderState() and ReadNextXLogRecord() FRONTEND-friendly (replace elog/ereport with pg_fatal or such). With this change and back-porting of commit e0b2eed0 to save logical slots at shutdown, the patch can help support upgrading logical replication slots on PG versions < 17.

-- Bharath Rupireddy PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
Re: Build the docs if there are changes in docs and don't run other tasks if the changes are only in docs
On 25.09.23 12:56, Nazir Bilal Yavuz wrote:

+  # Only run if a specific OS is not requested and if there are changes in docs
+  # or in the CI files.
+  skip: >
+    $CIRRUS_CHANGE_MESSAGE =~ '.*\nci-os-only:.*' ||
+    !changesInclude('doc/**',
+    '.cirrus.yml',
+    '.cirrus.tasks.yml',
+    'src/backend/catalog/sql_feature_packages.txt',
+    'src/backend/catalog/sql_features.txt',
+    'src/backend/utils/errcodes.txt',
+    'src/backend/utils/activity/wait_event_names.txt',
+    'src/backend/utils/activity/generate-wait_event_types.pl',
+    'src/include/parser/kwlist.h')

This is kind of annoying. Now we need to maintain yet another list of these dependencies and keep it in sync with the build systems. I think meson can produce a dependency tree from a build. Maybe we could use that somehow and have Cirrus cache it between runs? Also note that there are also dependencies in the other direction. For example, the psql help is compiled from XML DocBook sources. So your other patch would also need to include similar changesInclude() clauses.
Re: Avoid a possible out-of-bounds access (src/backend/optimizer/util/relnode.c)
On Tue, Sep 26, 2023 at 3:32 PM David Rowley wrote: > > find_base_rel() could be made more robust for free by just casting the > relid and simple_rel_array_size to uint32 while checking that relid < > root->simple_rel_array_size. The 0th element should be NULL anyway, > so "if (rel)" should let relid==0 calls through and allow that to > ERROR still. I see that just changes a "jle" to "jnb" vs adding an > additional jump for Ranier's version. [1] That's a good suggestion. I am fine with find_base_rel() as it is today as well. But future-proofing it seems to be fine too. > > It seems worth not making find_base_rel() more expensive than it is > today as commonly we just reference root->simple_rel_array[n] directly > anyway because it's cheaper. It would be nice if we didn't add further > overhead to find_base_rel() as this would make the case for using > PlannerInfo.simple_rel_array directly even stronger. I am curious, is the overhead in find_base_rel() impacting overall performance? -- Best Wishes, Ashutosh Bapat
Re: Add const qualifiers
On 09.09.23 21:03, David Steele wrote:
On 9/1/23 11:39, David Steele wrote:

Hackers, I noticed that there was a mismatch between the const qualifiers for excludeDirContents in src/backend/backup/basebackup.c and src/bin/pg_rewind/file_map.c, and that led me to use ^static const.*\*.*= to do a quick search for similar cases. I think at the least we should make excludeDirContents match, but the rest of the changes seem like a good idea as well.

Added to 2023-11 CF.

committed
Re: Avoid a possible out-of-bounds access (src/backend/optimizer/util/relnode.c)
On Tue, 26 Sept 2023 at 21:45, Ashutosh Bapat wrote: > However, I agree that changing find_base_rel() the way you have done > in your patch is fine and mildly future-proof. +1 to that idea > irrespective of what bitmapset functions do. I'm not a fan of adding additional run-time overhead for this theoretical problem. find_base_rel() could be made more robust for free by just casting the relid and simple_rel_array_size to uint32 while checking that relid < root->simple_rel_array_size. The 0th element should be NULL anyway, so "if (rel)" should let relid==0 calls through and allow that to ERROR still. I see that just changes a "jle" to "jnb" vs adding an additional jump for Ranier's version. [1] It seems worth not making find_base_rel() more expensive than it is today as commonly we just reference root->simple_rel_array[n] directly anyway because it's cheaper. It would be nice if we didn't add further overhead to find_base_rel() as this would make the case for using PlannerInfo.simple_rel_array directly even stronger. David [1] https://godbolt.org/z/qrxKTbvva
Re: POC, WIP: OR-clause support for indexes
Sorry for the duplicates; I received a letter saying that my letter did not reach the addressee, so I thought the design was incorrect.

On 26.09.2023 12:21, a.rybakina wrote:

I'm sorry I didn't write for a long time, but I really had a very difficult month; now I'm fully back to work. I was able to implement the patches to the end and moved the transformation of "OR" expressions to ANY. I haven't seen a big difference between them yet: one has the transformation before calculating selectivity (v7.1-Replace-OR-clause-to-ANY.patch), the other after (v7.2-Replace-OR-clause-to-ANY.patch). Regression tests are passing, I don't see any problems with selectivity, and nothing has fallen into a coredump, but I found some incorrect transformations. I have not found the reason for these inaccuracies but, to be honest, they look unusual. The errors are given below. In the patch, I don't like that I had to drag in three libraries from parsing until I found a way around it. The advantage of this approach compared to the other ([1]) is that at this stage all possible OR transformations are performed, compared to the patch where the transformation was done at the parsing stage. That is, here, for example, there are such optimizations in the transformation: I took the common element out of the bracket and the rest is converted to ANY, so, as noted by Peter Geoghegan, we did not have several bitmapscans, but only one scan through the array.
postgres=# explain analyze SELECT p1.oid, p1.proname FROM pg_proc as p1 WHERE prolang = 13 AND prolang=1 OR prolang = 13 AND prolang = 2 OR prolang = 13 AND prolang = 3;
                                              QUERY PLAN
---------------------------------------------------------------------------------------------------------
 Seq Scan on pg_proc p1  (cost=0.00..151.66 rows=1 width=68) (actual time=1.167..1.168 rows=0 loops=1)
   Filter: ((prolang = '13'::oid) AND (prolang = ANY (ARRAY['1'::oid, '2'::oid, '3'::oid])))
   Rows Removed by Filter: 3302
 Planning Time: 0.146 ms
 Execution Time: 1.191 ms
(5 rows)

While I was testing, I found some transformations that don't work, although in my opinion they should:

1. First case:

explain analyze SELECT p1.oid, p1.proname FROM pg_proc as p1 WHERE prolang = 13 AND prolang=1 OR prolang = 2 AND prolang = 2 OR prolang = 13 AND prolang = 13;
                                              QUERY PLAN
---------------------------------------------------------------------------------------------------------
 Seq Scan on pg_proc p1  (cost=0.00..180.55 rows=2 width=68) (actual time=2.959..3.335 rows=89 loops=1)
   Filter: (((prolang = '13'::oid) AND (prolang = '1'::oid)) OR ((prolang = '2'::oid) AND (prolang = '2'::oid)) OR ((prolang = '13'::oid) AND (prolang = '13'::oid)))
   Rows Removed by Filter: 3213
 Planning Time: 1.278 ms
 Execution Time: 3.486 ms
(5 rows)

Should have left only prolang = '13'::oid:

                                              QUERY PLAN
---------------------------------------------------------------------------------------------------------
 Seq Scan on pg_proc p1  (cost=0.00..139.28 rows=1 width=68) (actual time=2.034..2.034 rows=0 loops=1)
   Filter: ((prolang = '13'::oid))
   Rows Removed by Filter: 3302
 Planning Time: 0.181 ms
 Execution Time: 2.079 ms
(5 rows)

2. Also does not work:

postgres=# explain analyze SELECT p1.oid, p1.proname FROM pg_proc as p1 WHERE prolang = 13 OR prolang = 2 AND prolang = 2 OR prolang = 13;
                                              QUERY PLAN
---------------------------------------------------------------------------------------------------------
 Seq Scan on pg_proc p1  (cost=0.00..164.04 rows=176 width=68) (actual time=2.422..2.686 rows=89 loops=1)
   Filter: ((prolang = '13'::oid) OR ((prolang = '2'::oid) AND (prolang = '2'::oid)) OR (prolang = '13'::oid))
   Rows Removed by Filter: 3213
 Planning Time: 1.370 ms
 Execution Time: 2.799 ms
(5 rows)

Should have left: Filter: ((prolang = '13'::oid) OR (prolang = '2'::oid))

3.
Or another:

explain analyze SELECT p1.oid, p1.proname FROM pg_proc as p1 WHERE prolang = 13 OR prolang=13 OR prolang = 2 AND prolang = 2;
                                              QUERY PLAN
---------------------------------------------------------------------------------------------------------
 Seq Scan on pg_proc p1  (cost=0.00..164.04 rows=176 width=68) (actual time=2.350..2.566 rows=89 loops=1)
   Filter: ((prolang = '13'::oid) OR (prolang = '13'::oid) OR ((prolang = '2'::oid) AND (prolang = '2'::oid)))
   Rows Removed by Filter: 3213
 Planning Time: 0.215 ms
 Execution Time: 2.624 ms
(5 rows)

Should have left: Filter: ((prolang = '13'::oid) OR (prolang = '2'::oid))

Falls into a coredump for me:

explain analyze SELECT p1.oid, p1.proname FROM pg_proc as p1 WHERE prolang = 13 OR prolang = 2 AND prolang = 2 OR prolang = 13;

explain analyze SELECT p1.oid,
Re: OpenSSL v3 performance regressions
> On 26 Sep 2023, at 10:36, Adrien Nayrat wrote:
> Should Postgres support alternative SSL libraries as HAProxy does?

PostgreSQL can be built with LibreSSL instead of OpenSSL, which may or may not be a better option performance-wise for a particular application. Benchmarking your workload is key to understanding performance; a lot of the regressions pointed to in that blog post won't affect postgres.

-- Daniel Gustafsson
Re: Move global variables of pgoutput to plugin private scope.
On Tue, Sep 19, 2023 at 12:48 PM Zhijie Hou (Fujitsu) wrote:
>
> - static bool publish_no_origin;
>
> This flag is also local to the pgoutput instance, and we didn't reset the flag in
> the output shutdown callback, so if we consume changes from different slots, then
> the second call would reuse the flag value that was set in the first call, which
> is unexpected. To completely avoid this issue, we think we'd better move this
> flag to the output plugin's private data structure.
>
> Example:
> SELECT data FROM pg_logical_slot_peek_binary_changes('isolation_slot_1',
> NULL, NULL, 'proto_version', '1', 'publication_names', 'pub', 'origin',
> 'none'); -- Set origin in this call.
> SELECT data FROM pg_logical_slot_peek_binary_changes('isolation_slot_2',
> NULL, NULL, 'proto_version', '1', 'publication_names', 'pub'); -- Didn't set
> origin, but will reuse the origin flag from the first call.

  char  *origin;
+ bool   publish_no_origin;
} PGOutputData;

Do we really need a new parameter in the above structure? Can't we just use the existing origin in the same structure? Please remember that if this needs to be backpatched then it may not be a good idea to add a new parameter to the structure, but apart from that, having two members to represent similar information doesn't seem advisable to me. I feel for backbranches we can just use PGOutputData->origin for comparison, and for HEAD, we can remove origin and just use a boolean to avoid any extra cost of comparisons for each change. Can we add a test case to cover this case?

-- With Regards, Amit Kapila.
OpenSSL v3 performance regressions
Hello,

I read this article from HAProxy; they noticed OpenSSL v3 has huge performance regressions: https://github.com/haproxy/wiki/wiki/SSL-Libraries-Support-Status#openssl

This is a known issue: https://github.com/openssl/openssl/issues/17627#issuecomment-1060123659

Unfortunately, v3 is shipped with many distributions (Debian, Red Hat, Rocky Linux...): https://pkgs.org/search/?q=openssl

I am afraid users will face performance issues once they update their distro. Has someone already encountered problems? Should Postgres support alternative SSL libraries as HAProxy does?

Regards,
-- Adrien NAYRAT
Re: Regression in COPY FROM caused by 9f8377f7a2
Here is an improved version of the patch with regression tests. Yours, Laurenz Albe From 71744ada1e2c8cfdbb57e03018572a1af623b09e Mon Sep 17 00:00:00 2001 From: Laurenz Albe Date: Tue, 26 Sep 2023 10:09:49 +0200 Subject: [PATCH] Evaluate defaults in COPY FROM only if necessary Since commit 9f8377f7a2, we evaluate the column default values in COPY FROM for all columns except generated ones, because they could be needed if the input value matches the DEFAULT option. This can cause a surprising regression: CREATE TABLE boom (t character varying(5) DEFAULT 'a long string'); COPY boom FROM STDIN; ERROR: value too long for type character varying(5) This worked before 9f8377f7a2, since default values were only evaluated for columns that were not specified in the column list. To fix, fetch the default values only if the DEFAULT option was specified or for columns not specified in the column list. --- src/backend/commands/copyfrom.c| 9 - src/test/regress/expected/copy.out | 17 + src/test/regress/sql/copy.sql | 15 +++ 3 files changed, 40 insertions(+), 1 deletion(-) diff --git a/src/backend/commands/copyfrom.c b/src/backend/commands/copyfrom.c index 70871ed819..3f3e631dee 100644 --- a/src/backend/commands/copyfrom.c +++ b/src/backend/commands/copyfrom.c @@ -1567,7 +1567,14 @@ BeginCopyFrom(ParseState *pstate, /* Get default info if available */ defexprs[attnum - 1] = NULL; - if (!att->attgenerated) + /* + * We need the default values only for columns that do not appear in the + * column list. But if the DEFAULT option was given, we may need all + * column default values. We never need defaults for generated columns. 
+ */ + if ((cstate->opts.default_print != NULL || + !list_member_int(cstate->attnumlist, attnum)) && + !att->attgenerated) { Expr *defexpr = (Expr *) build_column_default(cstate->rel, attnum); diff --git a/src/test/regress/expected/copy.out b/src/test/regress/expected/copy.out index 8a8bf43fde..a5912c13a8 100644 --- a/src/test/regress/expected/copy.out +++ b/src/test/regress/expected/copy.out @@ -240,3 +240,20 @@ SELECT * FROM header_copytest ORDER BY a; (5 rows) drop table header_copytest; +-- test COPY with overlong column defaults +create temp table oversized_column_default ( +col1 varchar(5) DEFAULT 'more than 5 chars', +col2 varchar(5)); +-- normal COPY should work +copy oversized_column_default from stdin; +-- error if the column is excluded +copy oversized_column_default (col2) from stdin; +ERROR: value too long for type character varying(5) +\. +invalid command \. +-- error if the DEFAULT option is given +copy oversized_column_default from stdin (default ''); +ERROR: value too long for type character varying(5) +\. +invalid command \. +drop table oversized_column_default; diff --git a/src/test/regress/sql/copy.sql b/src/test/regress/sql/copy.sql index f9da7b1508..7fdb26d14f 100644 --- a/src/test/regress/sql/copy.sql +++ b/src/test/regress/sql/copy.sql @@ -268,3 +268,18 @@ a c b SELECT * FROM header_copytest ORDER BY a; drop table header_copytest; + +-- test COPY with overlong column defaults +create temp table oversized_column_default ( +col1 varchar(5) DEFAULT 'more than 5 chars', +col2 varchar(5)); +-- normal COPY should work +copy oversized_column_default from stdin; +\. +-- error if the column is excluded +copy oversized_column_default (col2) from stdin; +\. +-- error if the DEFAULT option is given +copy oversized_column_default from stdin (default ''); +\. +drop table oversized_column_default; -- 2.41.0
Re: logfmt and application_context
Hi Daniel,

Thanks for the feedback.

On Tuesday, 5 September 2023 at 11:35 +0200, Daniel Gustafsson wrote:
> > On 30 Aug 2023, at 14:36, Étienne BERSAC wrote:
> >
> > ..what do you think of having logfmt output along json and CSV?
>
> Less ideal is that there is no official formal definition of what
> logfmt is [...]  If we add support for it, can we reasonably expect
> that what we emit is what consumers of it assume it will look like?

I didn't know logfmt had variations.  Do you have a case of
incompatibility?

Anyway, I think that logfmt would be better handled inside Postgres
rather than in an extension, due to limitations in syslogger
extensibility.  I could send a patch if more people are interested in
this.

What do you think about application_context as a way to render e.g. a
web request UUID in all backend log messages?

Regards,
Étienne
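Since there is no formal logfmt specification, the format is easiest to pin down by example. The following Python sketch shows one common convention (keys and values joined by "=", values quoted when they contain spaces, quotes, or "="); the field names, including application_context, are made up for the example:

```python
# Minimal logfmt serializer following one common convention; real
# implementations differ on quoting details, which is the
# compatibility question raised above.
def logfmt(pairs):
    parts = []
    for key, value in pairs.items():
        value = str(value)
        if value == "" or any(c in value for c in ' ="'):
            value = '"' + value.replace('\\', '\\\\').replace('"', '\\"') + '"'
        parts.append(f"{key}={value}")
    return " ".join(parts)

line = logfmt({
    "ts": "2023-09-05T11:35:00Z",
    "level": "LOG",
    "application_context": "req-5c7e",   # hypothetical web request UUID
    "msg": "connection received",
})
print(line)
# ts=2023-09-05T11:35:00Z level=LOG application_context=req-5c7e msg="connection received"
```

A consumer that disagrees on when quoting is required would parse such lines differently, which is one concrete form the "variations" could take.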
Fail hard if xlogreader.c fails on out-of-memory
Hi all,
(Thomas in CC.)

Now that becfbdd6c1c9 has improved the situation to detect the
difference between out-of-memory and invalid WAL data, I guess that it
is time to tackle the problem of what we should do when reading a WAL
record fails on out-of-memory.

To summarize, currently the WAL reader APIs fail the same way if we
detect an incorrect WAL record or if a memory allocation fails: an
error is generated and returned back to the caller to consume.  For
WAL replay, not being able to tell the difference between an OOM and
the end-of-WAL is a problem in some cases.  For example, in crash
recovery, a failed internal allocation will be detected as the
end-of-WAL, causing recovery to stop prematurely.  In the worst cases,
this silently corrupts clusters because not all the records generated
in the local pg_wal/ have been replayed.  Oops.

When in standby mode, things are a bit better, because we'd just loop
and wait for the next record.  But even in this case, if the startup
process does crash recovery while standby mode is set up, we may
finish by attempting recovery from a different source than the local
pg_wal/.  Not strictly critical, but less optimal in some cases, as we
could switch to archive recovery earlier than necessary.

In a different thread, I have proposed to extend the WAL reader
facility so that an error code is returned, to make the difference
between an OOM and the end of WAL with an incorrect record:
https://www.postgresql.org/message-id/ZRJ-p1dLUY0uoChc%40paquier.xyz

However, this requires some ABI changes, so that's not backpatchable.
This leaves out what we can do for the existing back branches, and one
option is to do the simplest thing I can think of: if an allocation
fails, just fail *hard*.  The allocations of the WAL reader rely on
palloc_extended(), so I'd like to suggest that we switch to palloc()
instead.
If we do so, an ERROR is promoted to a FATAL during WAL replay, which
makes sure that we will never stop recovery earlier than we should,
FATAL-ing before things go wrong.  Note that the WAL prefetching
already relies on a palloc() on HEAD in XLogReadRecordAlloc(), which
would fail hard the same way on OOM.

So, attached is a proposal of patch to do something down to 12.

Thoughts and/or comments are welcome.
--
Michael

diff --git a/src/backend/access/transam/xlogreader.c b/src/backend/access/transam/xlogreader.c
index a17263df20..a1363e3b8f 100644
--- a/src/backend/access/transam/xlogreader.c
+++ b/src/backend/access/transam/xlogreader.c
@@ -43,7 +43,7 @@
 static void report_invalid_record(XLogReaderState *state, const char *fmt,...)
 			pg_attribute_printf(2, 3);
-static bool allocate_recordbuf(XLogReaderState *state, uint32 reclength);
+static void allocate_recordbuf(XLogReaderState *state, uint32 reclength);
 static int	ReadPageInternal(XLogReaderState *state, XLogRecPtr pageptr,
 							 int reqLen);
 static void XLogReaderInvalReadState(XLogReaderState *state);
@@ -155,14 +155,7 @@ XLogReaderAllocate(int wal_segment_size, const char *waldir,
 	 * Allocate an initial readRecordBuf of minimal size, which can later be
 	 * enlarged if necessary.
 	 */
-	if (!allocate_recordbuf(state, 0))
-	{
-		pfree(state->errormsg_buf);
-		pfree(state->readBuf);
-		pfree(state);
-		return NULL;
-	}
-
+	allocate_recordbuf(state, 0);
 	return state;
 }
@@ -184,7 +177,6 @@ XLogReaderFree(XLogReaderState *state)
 /*
  * Allocate readRecordBuf to fit a record of at least the given length.
- * Returns true if successful, false if out of memory.
  *
  * readRecordBufSize is set to the new buffer size.
  *
@@ -196,7 +188,7 @@ XLogReaderFree(XLogReaderState *state)
  * Note: This routine should *never* be called for xl_tot_len until the header
  * of the record has been fully validated.
  */
-static bool
+static void
 allocate_recordbuf(XLogReaderState *state, uint32 reclength)
 {
 	uint32		newSize = reclength;
@@ -206,15 +198,8 @@ allocate_recordbuf(XLogReaderState *state, uint32 reclength)
 	if (state->readRecordBuf)
 		pfree(state->readRecordBuf);
-	state->readRecordBuf =
-		(char *) palloc_extended(newSize, MCXT_ALLOC_NO_OOM);
-	if (state->readRecordBuf == NULL)
-	{
-		state->readRecordBufSize = 0;
-		return false;
-	}
+	state->readRecordBuf = (char *) palloc(newSize);
 	state->readRecordBufSize = newSize;
-	return true;
 }
 /*
@@ -505,9 +490,7 @@ XLogReadRecordAlloc(XLogReaderState *state, size_t xl_tot_len, bool allow_oversi
 	/* Not enough space in the decode buffer.  Are we allowed to allocate? */
 	if (allow_oversized)
 	{
-		decoded = palloc_extended(required_space, MCXT_ALLOC_NO_OOM);
-		if (decoded == NULL)
-			return NULL;
+		decoded = palloc(required_space);
 		decoded->oversized = true;
 		return decoded;
 	}
@@ -815,13 +798,7 @@ restart:
 		Assert(gotlen <= lengthof(save_copy));
 		Assert(gotlen <= state->readRecordBufSize);
 		memcpy(save_copy, state->readRecordBuf, gotlen);
-		if (!allocate_recordbuf(state, total_len))
-		{
-			/* We treat this as
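The behavioral difference between the two allocation styles can be sketched outside of C. In this illustrative Python model (not PostgreSQL code), the "soft" mode mimics palloc_extended(MCXT_ALLOC_NO_OOM), where the caller gets a null result that can be misread as end-of-WAL, while the "hard" mode mimics palloc(), where the failure is raised immediately so replay can never report a premature success:

```python
# Toy allocator: 'budget' stands in for the memory still available.
class OutOfMemory(Exception):
    pass

def allocate(size, budget, hard):
    if size > budget:
        if hard:
            raise OutOfMemory(f"cannot allocate {size} bytes")  # FATAL-like
        return None                                             # NO_OOM-like
    return bytearray(size)

def read_record(size, budget, hard):
    buf = allocate(size, budget, hard)
    if buf is None:
        # With the soft behavior, an OOM is indistinguishable from
        # "no more valid records": recovery stops here, too early.
        return "end-of-wal"
    return "record"

print(read_record(100, budget=64, hard=False))  # -> end-of-wal (silently wrong)
```

With hard=True the same call raises instead of returning a bogus "end-of-wal", which is the promotion from ERROR to FATAL that the patch is after.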
Re: Invalidate the subscription worker in cases where a user loses their superuser status
Here are some comments for patch v2-0001.

==
src/backend/replication/logical/worker.c

1. maybe_reread_subscription

  ereport(LOG,
          (errmsg("logical replication worker for subscription \"%s\" will restart because of a parameter change",
                  MySubscription->name)));

Is this really a "parameter change" though?  It might be a stretch to
call the user role change a subscription parameter change.  Perhaps
this case needs its own LOG message?

==
src/include/catalog/pg_subscription.h

2.
  char	   *origin;			/* Only publish data originating from the
							 * specified origin */
+ bool		isownersuperuser;	/* Is subscription owner superuser? */
} Subscription;

Is a new Subscription field member really necessary?  The Subscription
already has 'owner' -- why doesn't function
maybe_reread_subscription() just check:

(!superuser_arg(sub->owner) && superuser_arg(MySubscription->owner))

==
src/test/subscription/t/027_nosuperuser.pl

3.
# The apply worker should get restarted after the superuser prvileges are
# revoked for subscription owner alice.

typo /prvileges/privileges/

~

4.
+# After the user becomes non-superuser the apply worker should be restarted and
+# it should fail with 'password is required' error as password option is not
+# part of the connection string.

/as password option/because the password option/

==
Kind Regards,
Peter Smith.
Fujitsu Australia.
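The design question in comment 2 is about comparing against a snapshot rather than the current catalog value. A small Python sketch of the idea under discussion (names are hypothetical, not the actual C symbols): the worker records whether its owner was a superuser at start, and a later re-read of the subscription compares against that snapshot to decide whether to restart.

```python
# Sketch of the invalidation logic: restart when the owner's
# superuser status differs from the value captured at worker start.
class ApplyWorker:
    def __init__(self, owner_is_superuser: bool):
        # snapshot taken when the worker starts
        self.was_superuser = owner_is_superuser

    def must_restart(self, owner_is_superuser_now: bool) -> bool:
        return self.was_superuser != owner_is_superuser_now

w = ApplyWorker(owner_is_superuser=True)
print(w.must_restart(owner_is_superuser_now=False))  # -> True
```

This is why checking only the current owner's status is not enough: without the stored value there is nothing to compare the current status against.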
Re: Regression in COPY FROM caused by 9f8377f7a2
On Mon, 2023-09-25 at 17:49 -0400, Tom Lane wrote:
> Andrew Dunstan writes:
> > On 2023-09-25 Mo 11:06, Andrew Dunstan wrote:
> > > On 2023-09-25 Mo 04:59, Laurenz Albe wrote:
> > > > CREATE TABLE boom (t character varying(5) DEFAULT 'a long string');
> >
> > Thinking about this a little more, wouldn't it be better if we checked
> > at the time we set the default that the value is actually valid for the
> > given column?  This is only one manifestation of a problem you could run
> > into given this table definition.
>
> I dunno, it seems at least possible that someone would do this
> deliberately as a means of preventing the column from being defaulted.
> In any case, the current behavior has stood for a very long time and
> no one has complained that an error should be thrown sooner.

Moreover, this makes restoring a pg_dump from v15 to v16 fail, which
should never happen.  This is how I got that bug report.

Yours,
Laurenz Albe
Re: Incorrect handling of OOM in WAL replay leading to data loss
On Thu, Aug 10, 2023 at 02:59:07PM +0900, Michael Paquier wrote:
> My apologies if I sounded unclear here.  It seems to me that we should
> wrap the patch on [1] first, and get it backpatched.  At least that
> makes for less conflicts when 0001 gets merged for HEAD when we are
> able to set a proper error code.  (Was looking at it, actually.)

Now that Thomas Munro has addressed the original problem to be able to
trust correctly xl_tot_len with bae868caf22, I am coming back to this
thread.  First, attached is a rebased set:
- 0001 to introduce the new error infra for xlogreader.c with an error
  code, so as callers can make the difference between an OOM and an
  invalid record.
- 0002 to tweak the startup process.  Once again, I've taken the
  approach to make the startup process behave like a standby on crash
  recovery: each time that an OOM is found, we loop and retry.
- 0003 to emulate an OOM failure, that can be used with the script
  attached to see that we don't stop recovery too early.

>>> I guess that it depends on how much responsiveness one may want.
>>> Forcing a failure on OOM is at least something that users would be
>>> immediately able to act on when we don't run a standby but just
>>> recover from a crash, while a standby would do what it is designed to
>>> do, aka continue to replay what it can see.  One issue with the
>>> wait-and-continue is that a standby may loop continuously on OOM,
>>> which could be also bad if there's a replication slot retaining WAL on
>>> the primary.  Perhaps that's just OK to keep doing that for a
>>> standby.  At least this makes the discussion easier for the sake of
>>> this thread: just consider the case of crash recovery when we don't
>>> have a standby.
>>
>> Yeah, I'm with you on focusing on crash recovery cases; that's what I
>> intended.  Sorry for any confusion.
>
> Okay, so we're on the same page here, keeping standbys as they are and
> do something for the crash recovery case.
For the crash recovery case, one argument that stood out in my mind is
that causing a hard failure has the disadvantage of forcing users to do
WAL replay again from the last redo position, which may be far away
even if the checkpointer now runs during crash recovery.  What I am
proposing on this thread has the merit of avoiding that.  Anyway, let's
discuss more before settling this point for the crash recovery case.

By the way, anything that I am proposing here cannot be backpatched
because of the infrastructure changes required in walreader.c, so I am
going to create a second thread with something that could be
backpatched (yeah, likely FATALs on OOM to stop recovery from doing
something bad)..
--
Michael

From 3d7bb24fc2f9f070273b63208819c4e54e428d18 Mon Sep 17 00:00:00 2001
From: Michael Paquier
Date: Tue, 26 Sep 2023 15:40:05 +0900
Subject: [PATCH v4 1/3] Add infrastructure to report error codes in WAL
 reader

This commit moves the error state coming from WAL readers into a new
structure that includes the existing pointer to the error message
buffer, but also gains an error code that is fed back to the callers
of the following routines:
XLogPrefetcherReadRecord()
XLogReadRecord()
XLogNextRecord()
DecodeXLogRecord()

This will help in improving the decisions taken during recovery
depending on the failure mode reported.
---
 src/include/access/xlogprefetcher.h            |   2 +-
 src/include/access/xlogreader.h                |  33 +++-
 src/backend/access/transam/twophase.c          |   8 +-
 src/backend/access/transam/xlog.c              |   6 +-
 src/backend/access/transam/xlogprefetcher.c    |   4 +-
 src/backend/access/transam/xlogreader.c        | 170 ++++++++++-------
 src/backend/access/transam/xlogrecovery.c      |  14 +-
 src/backend/access/transam/xlogutils.c         |   2 +-
 src/backend/replication/logical/logical.c      |   9 +-
 .../replication/logical/logicalfuncs.c         |   9 +-
 src/backend/replication/slotfuncs.c            |   8 +-
 src/backend/replication/walsender.c            |   8 +-
 src/bin/pg_rewind/parsexlog.c                  |  24 +--
 src/bin/pg_waldump/pg_waldump.c                |  10 +-
 contrib/pg_walinspect/pg_walinspect.c          |  11 +-
 src/tools/pgindent/typedefs.list               |   2 +
 16 files changed, 201 insertions(+), 119 deletions(-)

diff --git a/src/include/access/xlogprefetcher.h b/src/include/access/xlogprefetcher.h
index 7dd7f20ad0..5563ad1a67 100644
--- a/src/include/access/xlogprefetcher.h
+++ b/src/include/access/xlogprefetcher.h
@@ -48,7 +48,7 @@ extern void XLogPrefetcherBeginRead(XLogPrefetcher *prefetcher,
 									XLogRecPtr recPtr);
 
 extern XLogRecord *XLogPrefetcherReadRecord(XLogPrefetcher *prefetcher,
-											char **errmsg);
+											XLogReaderError *errordata);
 
 extern void XLogPrefetcherComputeStats(XLogPrefetcher *prefetcher);
 
diff --git a/src/include/access/xlogreader.h b/src/include/access/xlogreader.h
index da32c7db77..06664dc6fb 100644
--- a/src/include/access/xlogreader.h
+++
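The shape of the new error reporting can be sketched in a few lines. This is an illustrative Python model, not PostgreSQL code, and the names are invented for the example: instead of returning only an error message, the reader hands back a structured error whose code lets the caller distinguish "retry" (OOM) from "legitimate end of WAL" (invalid record).

```python
# Model of the idea behind the patch: an error structure that carries a
# code alongside the message, so recovery can react per failure mode.
from dataclasses import dataclass
from enum import Enum, auto

class ReaderErrorCode(Enum):
    NO_ERROR = auto()
    OUT_OF_MEMORY = auto()
    INVALID_DATA = auto()

@dataclass
class ReaderError:
    code: ReaderErrorCode
    message: str = ""

def decide_recovery_action(err: ReaderError) -> str:
    if err.code is ReaderErrorCode.OUT_OF_MEMORY:
        return "retry"          # do not treat an OOM as end-of-WAL
    if err.code is ReaderErrorCode.INVALID_DATA:
        return "end-of-wal"     # legitimate stopping point
    return "continue"

print(decide_recovery_action(ReaderError(ReaderErrorCode.OUT_OF_MEMORY, "out of memory")))
# -> retry
```

With only a message string, as in the current API, the caller cannot make this distinction, which is exactly the gap the 0001 patch fills.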
Re: Improve the log message output of basic_archive when basic_archive.archive_directory parameter is not set
> On 26 Sep 2023, at 00:20, Nathan Bossart wrote:
>
> On Thu, Sep 21, 2023 at 11:18:00AM +0900, bt23nguyent wrote:
>> -basic_archive_configured(ArchiveModuleState *state)
>> +basic_archive_configured(ArchiveModuleState *state, char **logdetail)
>
> Could we do something more like GUC_check_errdetail() instead to maintain
> backward compatibility with v16?

We'd still need something exported to call into which isn't in 16, so it
wouldn't be more than optically backwards compatible, since a module
written for 17 won't compile for 16, or am I missing something?

--
Daniel Gustafsson
Re: [PGdocs] fix description for handling pf non-ASCII characters
Hello Hayato and Jian,

On Tue, 4 Jul 2023 01:30:56 + "Hayato Kuroda (Fujitsu)" wrote:
> Dear Jian,
>
> > looks fine.  Do you need to add to commitfest?
>
> Thank you for your confirmation!  I registered to following:
>
> https://commitfest.postgresql.org/44/4437/

The way the Postgres commitfest process works is that someone has to
update the page to mark "reviewed" and the reviewer has to use the
commitfest website to pass the patches to a committer.

I see a few problems with the English and style of the patches, am
commenting below, and have signed up as a reviewer.  At
commitfest.postgresql.org I have marked the thread as needing author
attention.  Hayato, you will need to mark the thread as needing review
when you have replied to this message.  Jian, you might want to sign on
as a reviewer as well.  It can be nice to have that record of your
participation.

Now, to reviewing the patch:

First, it is now best practice in the PG docs to put a line break at
the end of each sentence, at least for the sentences on the lines you
change.  (No need to update the whole document!)  Please do this in the
next version of your patch.  I don't know if this is a requirement for
acceptance by a committer, but it won't hurt.

diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index e700782d3c..a4ce99ba4d 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -7040,9 +7040,8 @@
         The name will be displayed in the pg_stat_activity view and
         included in CSV log entries.  It can also be included in regular
         log entries via the parameter.
-        Only printable ASCII characters may be used in the
-        application_name value.  Other characters will be
-        replaced with question marks (?).
+        Non-ASCII characters used in the application_name
+        will be replaced with hexadecimal strings.

Don't use the future tense to describe the system's behavior.  Instead
of "will be" write "are".  (Yes, this change would be an improvement on
the original text.
We should fix it while we're working on it and our attention is
focused.)

It is more accurate, if I understand the issue, to say that characters
are replaced with hexadecimal _representations_ of the input bytes.

Finally, it would be good to know what representation we're talking
about.  Perhaps just give the \xhh example and say: the Postgres
C-style escaped hexadecimal byte value.  And hyperlink to
https://www.postgresql.org/docs/16/sql-syntax-lexical.html#SQL-SYNTAX-STRINGS-ESCAPE
(The docbook would be, depending on the text you want to link:
C-style escaped hexadecimal byte value.  I think.  You link to
id="someidvalue" attribute values.)

@@ -8037,10 +8036,9 @@
 COPY postgres_log FROM '/full/path/to/logfile.csv' WITH csv;
         The name can be any string of less than NAMEDATALEN
         characters (64 characters in a standard
-        build).  Only printable ASCII characters may be used in the
-        cluster_name value.  Other characters will be
-        replaced with question marks (?).  No name is shown
-        if this parameter is set to the empty string '' (which is
+        build).  Non-ASCII characters used in the cluster_name
+        will be replaced with hexadecimal strings.  No name is shown if this
+        parameter is set to the empty string '' (which is
         the default).  This parameter can only be set at server start.

Same review as for the first patch hunk.

diff --git a/doc/src/sgml/postgres-fdw.sgml b/doc/src/sgml/postgres-fdw.sgml
index 5062d712e7..98785e87ea 100644
--- a/doc/src/sgml/postgres-fdw.sgml
+++ b/doc/src/sgml/postgres-fdw.sgml
@@ -1067,9 +1067,8 @@ postgres=# SELECT postgres_fdw_disconnect_all();
       of any length and contain even non-ASCII characters.  However when
       it's passed to and used as application_name in a foreign server,
       note that it will be truncated to less than
-      NAMEDATALEN characters and anything other than
-      printable ASCII characters will be replaced with question
-      marks (?).
+      NAMEDATALEN characters and non-ASCII characters will be
+      replaced with hexadecimal strings.
       See for details.
Same review as for the first patch hunk.

Since both of you have looked and confirmed that the actual behavior
matches the new documentation, I have not done this.  But have either
of you checked that we're really talking about replacing everything
outside the 7-bit ASCII encodings?  My reading of the commit referenced
in the first email of this thread says that it's everything outside of
the _printable_ ASCII encodings, that is, ASCII values outside of the
range 32 to 127, inclusive.  Please check.  The docs should probably
say "printable ASCII" or "non-printable ASCII", depending.

I think the meaning of "printable ASCII" is widely enough known to be
32-127.  So "printable" is good enough, right?

Regards,
Karl

Free Software: "You don't pay back, you pay
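The replacement being documented can be made concrete with a short sketch. This Python model is only an illustration of the documented behavior, and it assumes "printable ASCII" means byte values 32 through 126; whether the upper boundary is 126 or 127 is precisely the question the review asks to verify against the commit.

```python
# Sketch of replacing non-printable-ASCII input with \xhh hexadecimal
# representations of the input bytes, as the proposed doc text describes.
def clean_ascii(s: str) -> str:
    out = []
    for byte in s.encode("utf-8"):
        if 32 <= byte <= 126:           # assumed "printable ASCII" range
            out.append(chr(byte))
        else:
            out.append(f"\\x{byte:02x}")
    return "".join(out)

print(clean_ascii("app_é"))   # the two UTF-8 bytes of 'é' become \xc3\xa9
```

Note that a multi-byte character yields one escape per input byte, which is why "hexadecimal representations of the input bytes" is the more accurate wording suggested above.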