Re: Synchronizing slots from primary to standby
On Mon, Apr 1, 2024 at 10:01 AM Bharath Rupireddy wrote:
>
> On Thu, Mar 28, 2024 at 10:08 AM Zhijie Hou (Fujitsu) wrote:
> >
> > [2] The steps to reproduce the data miss issue on a primary->standby setup:
>
> I'm trying to reproduce the problem with [1], but I can see the
> changes after the standby is promoted. Am I missing anything here?
>
> ubuntu:~/postgres/pg17/bin$ ./psql -d postgres -p 5433 -c "select *
> from pg_logical_slot_get_changes('lrep_sync_slot', NULL, NULL);"
>     lsn    | xid |                    data
> -----------+-----+---------------------------------------------
>  0/3B0     | 738 | BEGIN 738
>  0/3017FC8 | 738 | COMMIT 738
>  0/3017FF8 | 739 | BEGIN 739
>  0/3019A38 | 739 | COMMIT 739
>  0/3019A38 | 740 | BEGIN 740
>  0/3019A38 | 740 | table public.dummy1: INSERT: a[integer]:999
>  0/3019AA8 | 740 | COMMIT 740
> (7 rows)
>
> [1]
> -#define LOG_SNAPSHOT_INTERVAL_MS 15000
> +#define LOG_SNAPSHOT_INTERVAL_MS 150
>
> ./initdb -D db17
> echo "archive_mode = on
> archive_command='cp %p /home/ubuntu/postgres/pg17/bin/archived_wal/%f'
> wal_level='logical'
> autovacuum = off
> checkpoint_timeout='1h'" | tee -a db17/postgresql.conf
>
> ./pg_ctl -D db17 -l logfile17 start
>
> rm -rf sbdata logfilesbdata
> ./pg_basebackup -D sbdata
>
> ./psql -d postgres -p 5432 -c "SELECT
> pg_create_logical_replication_slot('lrep_sync_slot', 'test_decoding',
> false, false, true);"
> ./psql -d postgres -p 5432 -c "SELECT
> pg_create_physical_replication_slot('phy_repl_slot', true, false);"
>
> echo "port=5433
> primary_conninfo='host=localhost port=5432 dbname=postgres user=ubuntu'
> primary_slot_name='phy_repl_slot'
> restore_command='cp /home/ubuntu/postgres/pg17/bin/archived_wal/%f %p'
> hot_standby_feedback=on
> sync_replication_slots=on" | tee -a sbdata/postgresql.conf
>
> touch sbdata/standby.signal
>
> ./pg_ctl -D sbdata -l logfilesbdata start
> ./psql -d postgres -p 5433 -c "SELECT pg_is_in_recovery();"
>
> ./psql -d postgres
>
> SESSION1, TXN1
> BEGIN;
> create table dummy1(a int);
>
> SESSION2, TXN2
> SELECT pg_log_standby_snapshot();
>
> SESSION1, TXN1
> COMMIT;
>
> SESSION1, TXN1
> BEGIN;
> create table dummy2(a int);
>
> SESSION2, TXN2
> SELECT pg_log_standby_snapshot();
>
> SESSION1, TXN1
> COMMIT;
>
> ./psql -d postgres -p 5432 -c "SELECT
> pg_replication_slot_advance('lrep_sync_slot', pg_current_wal_lsn());"

After this step and before the next, did you ensure that the slot sync has
synced the latest confirmed_flush/restart LSNs? You can query:
"select slot_name, restart_lsn, confirmed_flush_lsn from pg_replication_slots;"
to ensure the same on both the primary and standby.

--
With Regards,
Amit Kapila.
Re: Synchronizing slots from primary to standby
On Thu, Mar 28, 2024 at 10:08 AM Zhijie Hou (Fujitsu) wrote:
>
> [2] The steps to reproduce the data miss issue on a primary->standby setup:

I'm trying to reproduce the problem with [1], but I can see the
changes after the standby is promoted. Am I missing anything here?

ubuntu:~/postgres/pg17/bin$ ./psql -d postgres -p 5433 -c "select *
from pg_logical_slot_get_changes('lrep_sync_slot', NULL, NULL);"
    lsn    | xid |                    data
-----------+-----+---------------------------------------------
 0/3B0     | 738 | BEGIN 738
 0/3017FC8 | 738 | COMMIT 738
 0/3017FF8 | 739 | BEGIN 739
 0/3019A38 | 739 | COMMIT 739
 0/3019A38 | 740 | BEGIN 740
 0/3019A38 | 740 | table public.dummy1: INSERT: a[integer]:999
 0/3019AA8 | 740 | COMMIT 740
(7 rows)

[1]
-#define LOG_SNAPSHOT_INTERVAL_MS 15000
+#define LOG_SNAPSHOT_INTERVAL_MS 150

./initdb -D db17
echo "archive_mode = on
archive_command='cp %p /home/ubuntu/postgres/pg17/bin/archived_wal/%f'
wal_level='logical'
autovacuum = off
checkpoint_timeout='1h'" | tee -a db17/postgresql.conf

./pg_ctl -D db17 -l logfile17 start

rm -rf sbdata logfilesbdata
./pg_basebackup -D sbdata

./psql -d postgres -p 5432 -c "SELECT
pg_create_logical_replication_slot('lrep_sync_slot', 'test_decoding',
false, false, true);"
./psql -d postgres -p 5432 -c "SELECT
pg_create_physical_replication_slot('phy_repl_slot', true, false);"

echo "port=5433
primary_conninfo='host=localhost port=5432 dbname=postgres user=ubuntu'
primary_slot_name='phy_repl_slot'
restore_command='cp /home/ubuntu/postgres/pg17/bin/archived_wal/%f %p'
hot_standby_feedback=on
sync_replication_slots=on" | tee -a sbdata/postgresql.conf

touch sbdata/standby.signal

./pg_ctl -D sbdata -l logfilesbdata start
./psql -d postgres -p 5433 -c "SELECT pg_is_in_recovery();"

./psql -d postgres

SESSION1, TXN1
BEGIN;
create table dummy1(a int);

SESSION2, TXN2
SELECT pg_log_standby_snapshot();

SESSION1, TXN1
COMMIT;

SESSION1, TXN1
BEGIN;
create table dummy2(a int);

SESSION2, TXN2
SELECT pg_log_standby_snapshot();

SESSION1, TXN1
COMMIT;

./psql -d postgres -p 5432 -c "SELECT
pg_replication_slot_advance('lrep_sync_slot', pg_current_wal_lsn());"
./psql -d postgres -p 5432 -c "INSERT INTO dummy1 VALUES(999);"
./psql -d postgres -p 5433 -c "SELECT pg_promote();"
./psql -d postgres -p 5433 -c "SELECT pg_is_in_recovery();"
./psql -d postgres -p 5433 -c "select * from
pg_logical_slot_get_changes('lrep_sync_slot', NULL, NULL);"

--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
Re: remaining sql/json patches
typedef struct JsonTableExecContext
{
	int			magic;
	JsonTablePlanState *rootplanstate;
	JsonTablePlanState **colexprplans;
} JsonTableExecContext;

IMHO, this naming is inconsistent: "state" and "plan" are mixed together.
Maybe:

typedef struct JsonTableExecContext
{
	int			magic;
	JsonTablePlanState *rootplanstate;
	JsonTablePlanState **colexprstates;
} JsonTableExecContext;

+	cxt->colexprplans = palloc(sizeof(JsonTablePlanState *) *
+							   list_length(tf->colvalexprs));
+
 	/* Initialize plan */
-	cxt->rootplanstate = JsonTableInitPlan(cxt, rootplan, args,
+	cxt->rootplanstate = JsonTableInitPlan(cxt, (Node *) rootplan, NULL, args,
 										   CurrentMemoryContext);

I think the comment "Initialize plan" is not right: here we initialize the
rootplanstate (a JsonTablePlanState), and for each (non-ordinality) column we
also initialize its specific JsonTablePlanState.

 static void JsonTableRescan(JsonTablePlanState *planstate);
@@ -331,6 +354,9 @@ static Datum JsonTableGetValue(TableFuncScanState *state, int colnum,
 							   Oid typid, int32 typmod, bool *isnull);
 static void JsonTableDestroyOpaque(TableFuncScanState *state);
 static bool JsonTablePlanNextRow(JsonTablePlanState *planstate);
+static bool JsonTablePlanPathNextRow(JsonTablePlanState *planstate);
+static void JsonTableRescan(JsonTablePlanState *planstate);

JsonTableRescan is declared twice?
Re: Improve eviction algorithm in ReorderBuffer
On Mon, Apr 1, 2024 at 11:26 AM Masahiko Sawada wrote:
>
> On Fri, Mar 29, 2024 at 7:37 PM Amit Kapila wrote:
> >
> > On Fri, Mar 29, 2024 at 12:13 PM Masahiko Sawada wrote:
> > >
> > > On Fri, Mar 29, 2024 at 2:09 PM vignesh C wrote:
> > > >
> > > > On Tue, 26 Mar 2024 at 10:05, Masahiko Sawada wrote:
> > > > >
> > > > > On Thu, Mar 14, 2024 at 12:02 PM Masahiko Sawada wrote:
> > > > > >
> > > > > > I've attached new version patches.
> > > > >
> > > > > Since the previous patch conflicts with the current HEAD, I've
> > > > > attached the rebased patches.
> > > >
> > > > Thanks for the updated patch.
> > > > One comment:
> > > > I felt we can mention the improvement where we update memory
> > > > accounting info at transaction level instead of per change level which
> > > > is done in ReorderBufferCleanupTXN, ReorderBufferTruncateTXN, and
> > > > ReorderBufferSerializeTXN also in the commit message:
> > >
> > > Agreed.
> > >
> > > I think the patch is in good shape. I'll push the patch with the
> > > suggestion next week, barring any objections.
> >
> > Few minor comments:
> > 1.
> > @@ -3636,6 +3801,8 @@ ReorderBufferCheckMemoryLimit(ReorderBuffer *rb)
> >  			Assert(txn->nentries_mem == 0);
> >  		}
> >
> > +	ReorderBufferMaybeResetMaxHeap(rb);
> > +
> >
> > Can we write a comment about why this reset is required here?
> > Otherwise, the reason is not apparent.
>
> Yes, added.
>
> > 2.
> >  Although using max-heap to select the largest
> > + * transaction is effective when there are many transactions being decoded,
> > + * there is generally no need to use it as long as all transactions being
> > + * decoded are top-level transactions. Therefore, we use MaxConnections as the
> > + * threshold so we can prevent switching to the state unless we use
> > + * subtransactions.
> > + */
> > +#define MAX_HEAP_TXN_COUNT_THRESHOLD	MaxConnections
> >
> > Isn't using max-heap equally effective in finding the largest
> > transaction whether there are top-level or top-level plus
> > subtransactions? This comment indicates it is only effective when
> > there are subtransactions.
>
> You're right. Updated the comment.
>
> I've attached the updated patches.

While reviewing the patches, I realized the comment of
binaryheap_allocate() should also be updated. So I've attached the new
patches.

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com

v12-0001-Make-binaryheap-enlargeable.patch
Description: Binary data

v12-0002-Add-functions-to-binaryheap-for-efficient-key-re.patch
Description: Binary data

v12-0003-Improve-eviction-algorithm-in-Reorderbuffer-usin.patch
Description: Binary data
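The threshold-gated max-heap scheme discussed in this thread can be sketched
outside PostgreSQL. The following Python stand-in (all names hypothetical,
not the patch's C API) tracks per-transaction memory, uses a plain linear
scan while the transaction count is small, and switches to a lazily-maintained
max-heap — built in one O(n) pass — once the count crosses a threshold,
discarding stale heap entries on access:

```python
import heapq

THRESHOLD = 4  # stand-in for MAX_HEAP_TXN_COUNT_THRESHOLD (MaxConnections in the patch)

class ReorderBufferSketch:
    def __init__(self):
        self.txn_size = {}   # txn id -> accumulated change size
        self.heap = None     # lazily built max-heap of (-size, txn) entries

    def update_memory(self, txn, delta):
        self.txn_size[txn] = self.txn_size.get(txn, 0) + delta
        if self.heap is None and len(self.txn_size) > THRESHOLD:
            # switch to heap state: build it in one pass, O(n)
            self.heap = [(-s, t) for t, s in self.txn_size.items()]
            heapq.heapify(self.heap)
        elif self.heap is not None:
            # push the updated entry; the old one becomes stale
            heapq.heappush(self.heap, (-self.txn_size[txn], txn))

    def largest_txn(self):
        if self.heap is None:
            # few transactions: a linear scan is cheap enough
            return max(self.txn_size, key=self.txn_size.get)
        # pop stale entries whose recorded size no longer matches
        while True:
            size, txn = self.heap[0]
            if txn in self.txn_size and self.txn_size[txn] == -size:
                return txn
            heapq.heappop(self.heap)
```

This mirrors only the shape of the idea (threshold, lazy switch, largest-first
selection); the committed code updates the binaryheap keys in place rather
than tolerating stale entries.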
Re: Introduce XID age and inactive timeout based replication slot invalidation
On Fri, Mar 29, 2024 at 6:17 PM Bertrand Drouvot wrote:
>
> On Fri, Mar 29, 2024 at 03:03:01PM +0530, Amit Kapila wrote:
> > On Fri, Mar 29, 2024 at 11:49 AM Bertrand Drouvot wrote:
> > >
> > > On Fri, Mar 29, 2024 at 09:39:31AM +0530, Amit Kapila wrote:
> > > >
> > > > Commit message states: "why we can't just update inactive_since for
> > > > synced slots on the standby with the value received from remote slot
> > > > on the primary. This is consistent with any other slot parameter i.e.
> > > > all of them are synced from the primary."
> > > >
> > > > The inactive_since is not consistent with other slot parameters which
> > > > we copy. We don't perform anything related to those other parameters,
> > > > like, say, two_phase, which can change that property. However, we
> > > > do acquire the slot, advance the slot (as per recent discussion [1]),
> > > > and release it. Since these operations can impact inactive_since, it
> > > > seems to me that inactive_since is not the same as other parameters.
> > > > It can have a different value than the primary. Why would anyone want
> > > > to know the value of inactive_since from the primary after the
> > > > standby is promoted?
> > >
> > > I think it can be useful "before" it is promoted and in case the primary
> > > is down.
> >
> > It is not clear to me what a user is going to do by checking the
> > inactivity time for slots when the corresponding server is down.
>
> Say a failover needs to be done; then it could be useful to know for which
> slots the activity needs to be resumed (thinking about external logical
> decoding plugins, not about pub/sub here). If one sees an inactive slot
> (since long "enough") then he can start to reason about what to do with it.
>
> > I thought the idea was to check such slots and see if they need to be
> > dropped or enabled again to avoid excessive disk usage, etc.
>
> Yeah, that's the case, but it does not mean inactive_since can't be useful
> in other ways.
>
> Also, say the slot has been invalidated on the primary (due to inactivity
> timeout), the primary is down and there is a failover. By keeping the
> inactive_since from the primary, one could know when the inactivity that
> led to the timeout started.

So, this means at promotion, we won't set the current_time for
inactive_since, which is not what the currently proposed patch is doing.
Moreover, doing the invalidation on the promoted standby based on
inactive_since of the primary node sounds debatable because the
inactive_timeout could be different on the new node (promoted standby).

> Again, more concerned about external logical decoding plugins than
> pub/sub here.
>
> > > I agree that tracking the activity time of a synced slot can be
> > > useful; why not create a dedicated field for that purpose (and keep
> > > inactive_since a perfect "copy" of the primary)?
> >
> > We can have a separate field for this but not sure if it is worth it.
>
> OTOH I'm not sure that erasing this information from the primary is
> useful. I think that 2 fields would be the best option and would be less
> subject to misinterpretation.
>
> > > > Now, the other concern is that calling GetCurrentTimestamp()
> > > > could be costly when the values for the slot are not going to be
> > > > updated but if that happens we can optimize such that before
> > > > acquiring the slot we can have some minimal pre-checks to ensure
> > > > whether we need to update the slot or not.
> > >
> > > Right, but for a very active slot it is likely that we call
> > > GetCurrentTimestamp() during almost each sync cycle.
> >
> > True, but if we have to save a slot to disk each time to persist the
> > changes (for an active slot) then probably GetCurrentTimestamp()
> > shouldn't be costly enough to matter.
>
> Right, persisting the changes to disk would be even more costly.

The point I was making is that currently, after copying the remote node's
values, we always persist the slots to disk, so the cost of current_time
shouldn't be much. Now, if the values won't change then probably there is
some cost, but in most cases (active slots), the values will always change.
Also, if all the slots are inactive then we will slow down the speed of
sync. We also need to consider if we want to copy the value of
inactive_since from the primary and, if that is the only value changed,
then shall we persist the slot or not?

--
With Regards,
Amit Kapila.
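The trade-off being debated — persist a synced slot only when something
actually changed, and decide whether a changed inactive_since alone should
count — can be illustrated with a toy sync step. This is a conceptual sketch
with hypothetical field names, not the slotsync.c logic:

```python
def sync_slot(local, remote, copy_inactive_since=False):
    """Copy the remote slot's values into the local synced slot.

    Returns True when anything tracked changed, i.e. the slot should be
    marked dirty and persisted to disk; returns False when nothing
    changed and the disk write could be skipped.
    """
    tracked = ["restart_lsn", "confirmed_flush_lsn", "catalog_xmin"]
    if copy_inactive_since:
        # the contested choice: treating inactive_since as just another
        # synced parameter makes frequently-released remote slots dirty
        # the local copy on every cycle
        tracked.append("inactive_since")
    dirty = False
    for field in tracked:
        if local.get(field) != remote[field]:
            local[field] = remote[field]
            dirty = True
    return dirty
```

With `copy_inactive_since=True`, a slot whose only change is inactive_since
still comes back dirty, which is exactly the extra sync/persist work Amit
questions above.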
Re: Introduce XID age and inactive timeout based replication slot invalidation
On Fri, Mar 29, 2024 at 9:39 AM Amit Kapila wrote:
>
> Commit message states: "why we can't just update inactive_since for
> synced slots on the standby with the value received from remote slot
> on the primary. This is consistent with any other slot parameter i.e.
> all of them are synced from the primary."
>
> The inactive_since is not consistent with other slot parameters which
> we copy. We don't perform anything related to those other parameters,
> like, say, two_phase, which can change that property. However, we
> do acquire the slot, advance the slot (as per recent discussion [1]),
> and release it. Since these operations can impact inactive_since, it
> seems to me that inactive_since is not the same as other parameters.
> It can have a different value than the primary. Why would anyone want
> to know the value of inactive_since from the primary after the standby
> is promoted?

After thinking about it for a while now, it feels to me that the synced
slots (slots on the standby that are being synced from the primary) can
have their own inactive_since value. Fundamentally, inactive_since is set
to 0 when a slot is acquired and set to the current time when the slot is
released, no matter who acquires and releases it - be it walsenders for
replication, or backends for slot advance, or backends for slot sync using
pg_sync_replication_slots, or backends for other slot functions, or the
background sync worker. Remember, the earlier patch was updating
inactive_since just for walsenders, but then the suggestion was to update
it unconditionally -
https://www.postgresql.org/message-id/CAJpy0uD64X%3D2ENmbHaRiWTKeQawr-rbGoy_GdhQQLVXzUSKTMg%40mail.gmail.com.
Whoever syncs the slot *actually* acquires the slot i.e. makes it theirs,
syncs it from the primary, and releases it. IMO, no differentiation is to
be made for synced slots.

There was a suggestion on using inactive_since of the synced slot on the
standby to know the inactivity of the slot on the primary. If one wants to
do that, they'd better look at/monitor the primary's slot
info/logs/pg_replication_slots/whatever. I really don't see a point in
having two different meanings for a single property of a replication slot
- inactive_since for a regular slot tells since when this slot has become
inactive, and for a synced slot since when the corresponding remote slot
has become inactive. I think this will confuse users for sure.

Also, if inactive_since is being changed on the primary so frequently, and
none of the other parameters are changing, then if we copy inactive_since
to the synced slots, the standby will just be doing *sync* work (mark the
slots dirty and save to disk) for updating inactive_since. I think this is
unnecessary behaviour for sure.

Coming to a future patch for inactive timeout based slot invalidation, we
can either allow invalidation without any differentiation for synced slots
or restrict invalidation to avoid more sync work. For instance, if the
inactive timeout is kept low on the standby, the sync worker will be doing
more work as it drops and recreates a slot repeatedly if it keeps getting
invalidated. Another thing is that the standby takes independent
invalidation decisions for synced slots. AFAICS, invalidation due to
wal_removal is the only reason (out of all available invalidation reasons)
for a synced slot to get invalidated independently of the primary. Check
https://www.postgresql.org/message-id/CAA4eK1JXBwTaDRD_%3D8t6UB1fhRNjC1C%2BgH4YdDxj_9U6djLnXw%40mail.gmail.com
for the suggestion on why we'd better not differentiate invalidation
decisions for synced slots.

The assumption of letting synced slots have their own inactive_since not
only simplifies the code, but also looks less confusing and more
meaningful to the user. The only code that we put in on top of the
committed code is to use InRecovery in place of RecoveryInProgress() in
RestoreSlotFromDisk() to fix the issue raised by Shveta upthread.

> Now, the other concern is that calling GetCurrentTimestamp()
> could be costly when the values for the slot are not going to be
> updated but if that happens we can optimize such that before acquiring
> the slot we can have some minimal pre-checks to ensure whether we need
> to update the slot or not.
>
> [1] - https://www.postgresql.org/message-id/OS0PR01MB571615D35F486080616CA841943A2%40OS0PR01MB5716.jpnprd01.prod.outlook.com

A quick test with a function to measure the cost of GetCurrentTimestamp
[1] on my Ubuntu dev system (an AWS EC2 c5.4xlarge instance) gives me [2].
It took 0.388 ms, 2.269 ms, 21.144 ms, 209.333 ms, 2091.174 ms, and
20908.942 ms for 10K, 100K, 1 million, 10 million, 100 million, and 1
billion calls respectively. Costs might be different on various systems
with different OSes, but it gives us a rough idea.

If we are too concerned about the cost of GetCurrentTimestamp(), a
possible approach is just don't set inactive_since for slots being synced
on the standby. Just let the first acquisition and release after the
promotion do that job. We
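The micro-benchmark methodology above (time a tight loop of clock reads,
divide nothing out, report wall-clock totals) translates directly to other
languages. A minimal Python analogue — the absolute numbers will of course
differ from the C GetCurrentTimestamp() figures quoted:

```python
import time

def cost_of_clock_reads(n):
    """Return the wall-clock cost, in milliseconds, of n successive
    clock reads, mirroring the GetCurrentTimestamp() measurement above."""
    start = time.perf_counter()
    for _ in range(n):
        time.time()  # stand-in for GetCurrentTimestamp()
    return (time.perf_counter() - start) * 1000.0

for n in (10_000, 100_000):
    print(f"{n:>7} calls: {cost_of_clock_reads(n):.3f} ms")
```

As in the thread's numbers, the cost scales roughly linearly with the call
count, which is why per-sync-cycle clock reads only matter when nothing else
(like a disk write) dominates.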
Re: [PoC] Improve dead tuple storage for lazy vacuum
On Fri, Mar 29, 2024 at 4:21 PM John Naylor wrote:
>
> On Thu, Mar 28, 2024 at 12:55 PM Masahiko Sawada wrote:
> > I think the patch is in good shape. Do you have other comments or
> > suggestions, John?
>
> --- a/doc/src/sgml/config.sgml
> +++ b/doc/src/sgml/config.sgml
> @@ -1918,11 +1918,6 @@ include_dir 'conf.d'
>  too high. It may be useful to control for this by separately
>  setting .
>
> -
> -Note that for the collection of dead tuple identifiers,
> -VACUUM is only able to utilize up to a maximum of
> -1GB of memory.
> -
>
> This is mentioned twice for two different GUCs -- need to remove the
> other one, too.

Good catch, removed.

> Other than that, I just have minor nits:
>
> - * The major space usage for vacuuming is storage for the array of dead TIDs
> + * The major space usage for vacuuming is TID store, a storage for dead TIDs
>
> I think I've helped edit this sentence before, but I still don't quite
> like it. I'm thinking now "is storage for the dead tuple IDs".
>
> - * set upper bounds on the number of TIDs we can keep track of at once.
> + * set upper bounds on the maximum memory that can be used for keeping track
> + * of dead TIDs at once.
>
> I think "maximum" is redundant with "upper bounds".

Fixed.

> I also feel the commit message needs more "meat" -- we need to clearly
> narrate the features and benefits. I've attached how I would write it,
> but feel free to use what you like to match your taste.

Well, that's much better than mine.

> I've marked it Ready for Committer.

Thank you! I've attached the patch that I'm going to push tomorrow.

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com

v82-0001-Use-TidStore-for-dead-tuple-TIDs-storage-during-.patch
Description: Binary data
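Conceptually, the TID store this patch introduces maps a heap block number to
the set of dead offsets on that block, replacing the old flat array of TIDs.
A toy Python stand-in shows the interface shape only — the real TidStore uses
a radix tree keyed by block number with offsets stored as bitmaps, not a dict
of sets:

```python
class TidStoreSketch:
    """Toy stand-in for vacuum's dead-TID storage: block number -> set
    of dead line-pointer offsets.  A TID here is a (blkno, offset) pair."""

    def __init__(self):
        self.blocks = {}

    def add_tids(self, blkno, offsets):
        # collect dead item offsets for one heap block
        self.blocks.setdefault(blkno, set()).update(offsets)

    def lookup(self, tid):
        # the index-vacuum phase asks: is this heap TID dead?
        blkno, off = tid
        return off in self.blocks.get(blkno, ())

    def num_tids(self):
        return sum(len(s) for s in self.blocks.values())
```

Grouping by block is what lets the real structure bound memory by total usage
rather than by a fixed 1GB array, which is the limit the doc hunk above
removes.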
Re: POC PATCH: copy from ... exceptions to: (was Re: VLDB Features)
On Fri, Mar 29, 2024 at 11:54 AM torikoshia wrote:
>
> On 2024-03-28 21:54, Masahiko Sawada wrote:
> > On Thu, Mar 28, 2024 at 9:38 PM torikoshia wrote:
> >>
> >> On 2024-03-28 10:20, Masahiko Sawada wrote:
> >> > Hi,
> >> >
> >> > On Thu, Jan 18, 2024 at 5:33 PM Masahiko Sawada wrote:
> >> >>
> >> >> On Thu, Jan 18, 2024 at 4:59 PM Alexander Korotkov wrote:
> >> >> >
> >> >> > On Thu, Jan 18, 2024 at 4:16 AM torikoshia wrote:
> >> >> > > On 2024-01-18 10:10, jian he wrote:
> >> >> > > > On Thu, Jan 18, 2024 at 8:57 AM Masahiko Sawada wrote:
> >> >> > > >> On Thu, Jan 18, 2024 at 6:38 AM Tom Lane wrote:
> >> >> > > >> > Kyotaro-san's suggestion isn't bad, though I might shorten it to
> >> >> > > >> > error_action {error|ignore|log} (or perhaps "stop" instead of
> >> >> > > >> > "error")?
> >> >> > > >> > You will need a separate parameter anyway to specify the
> >> >> > > >> > destination of "log", unless "none" became an illegal table
> >> >> > > >> > name when I wasn't looking. I don't buy that one parameter
> >> >> > > >> > that has some special values while other values could be
> >> >> > > >> > names will be a good design. Moreover, what if we want to
> >> >> > > >> > support (say) log-to-file along with log-to-table? Trying to
> >> >> > > >> > distinguish a file name from a table name without any other
> >> >> > > >> > context seems impossible.
> >> >> > > >>
> >> >> > > >> I've been thinking we can add more values to this option to
> >> >> > > >> log errors not only to the server logs but also to the error
> >> >> > > >> table (not sure details but I imagined an error table is
> >> >> > > >> created for each table on error), without an additional option
> >> >> > > >> for the destination name. The values would be like
> >> >> > > >> error_action {error|ignore|save-logs|save-table}.
> >> >> > > >>
> >> >> > > > another idea:
> >> >> > > > on_error {error|ignore|other_future_option}
> >> >> > > > if not specified then by default ERROR.
> >> >> > > > You can also specify ERROR or IGNORE for now.
> >> >> > > >
> >> >> > > > I agree, the parameter "error_action" is better than "location".
> >> >> > >
> >> >> > > I'm not sure whether error_action or on_error is better, but
> >> >> > > either way "error_action error" and "on_error error" seems a bit
> >> >> > > odd to me. I feel "stop" is better for both cases as Tom
> >> >> > > suggested.
> >> >> >
> >> >> > OK. What about this?
> >> >> > on_error {stop|ignore|other_future_option}
> >> >> > where other_future_option might be compound like "file 'copy.log'"
> >> >> > or "table 'copy_log'".
> >> >>
> >> >> +1
> >> >
> >> > I realized that the ON_ERROR syntax synopsis in the documentation is
> >> > not correct. The option doesn't require the value to be quoted and
> >> > the value can be omitted. The attached patch fixes it.
> >> >
> >> > Regards,
> >>
> >> Thanks!
> >>
> >> The attached patch fixes the doc, but I'm wondering if perhaps it
> >> might be better to modify the code to prohibit abbreviation of the
> >> value.
> >>
> >> When seeing a query which abbreviates the ON_ERROR value, I feel it's
> >> not obvious what happens, compared to other options which tolerate
> >> abbreviation of the value, such as FREEZE or HEADER.
> >>
> >>    COPY t1 FROM stdin WITH (ON_ERROR);
> >>
> >> What do you think?
> >
> > Indeed. Looking at options of other commands such as VACUUM and
> > EXPLAIN, I can see that we can omit a boolean value, but non-boolean
> > parameters require a value. The HEADER option is not a pure boolean
> > parameter but we can omit the value. It seems to be for backward
> > compatibility; it used to be a boolean parameter. I agree that the
> > above example would confuse users.
> >
> > Regards,
>
> Thanks for your comment!
>
> Attached a patch which modifies the code to prohibit omission of its
> value.
>
> I was a little unsure about adding a regression test for this, but I
> have not added it since other COPY options don't test the omission of
> their values.

Probably we should change the doc as well, since the ON_ERROR value
doesn't necessarily need to be single-quoted? The rest looks good to me.
Alexander, what do you think about this change, as you're the committer
of this feature?

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com
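The behaviour being settled here — boolean options like FREEZE may omit their
value and default to true, while an enumerated option like ON_ERROR must be
given one — can be sketched as a small option parser. This is an illustrative
model, not PostgreSQL's actual defGetString()/copy.c logic:

```python
BOOLEAN_OPTS = {"freeze", "header"}            # value may be omitted -> true
VALUED_OPTS = {"on_error": {"stop", "ignore"}}  # value is mandatory

def parse_copy_option(name, value=None):
    """Resolve one COPY-style option, rejecting an omitted value for
    non-boolean options, as the patch under discussion does for ON_ERROR."""
    name = name.lower()
    if name in BOOLEAN_OPTS:
        return name, True if value is None else value
    if name in VALUED_OPTS:
        if value is None:
            raise ValueError(f'option "{name.upper()}" requires a value')
        if value not in VALUED_OPTS[name]:
            raise ValueError(f'invalid value "{value}" for option "{name.upper()}"')
        return name, value
    raise ValueError(f'unrecognized option "{name}"')
```

Under this model `WITH (ON_ERROR)` fails fast instead of silently picking a
default, which is the point torikoshia raises about the abbreviated form
being non-obvious.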
Re: Improve eviction algorithm in ReorderBuffer
On Fri, Mar 29, 2024 at 8:48 PM Hayato Kuroda (Fujitsu) wrote:
>
> Dear Sawada-san,
>
> > Agreed.
> >
> > I think the patch is in good shape. I'll push the patch with the
> > suggestion next week, barring any objections.
>
> Thanks for working on this. Agreed it is committable.
> Few minor comments:

Thank you for the comments!

> ```
> + * Either txn or change must be non-NULL at least. We update the memory
> + * counter of txn if it's non-NULL, otherwise change->txn.
> ```
>
> IIUC no one checks the restriction. Should we add an Assert() for it,
> e.g.: Assert(txn || change)?

Agreed to add it.

> ```
> +/* make sure enough space for a new node */
> ...
> +/* make sure enough space for a new node */
> ```
>
> Should it start with upper case?

I don't think we need to change it. There are other comments in the same
file that are one line and start with lowercase.

I've just submitted the updated patches [1].

Regards,

[1] https://www.postgresql.org/message-id/CAD21AoA6%3D%2BtL%3DbtB_s9N%2BcZK7tKz1W%3DPQyNq72nzjUcdyE%2BwZw%40mail.gmail.com

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com
Re: Improve eviction algorithm in ReorderBuffer
On Fri, Mar 29, 2024 at 7:37 PM Amit Kapila wrote:
>
> On Fri, Mar 29, 2024 at 12:13 PM Masahiko Sawada wrote:
> >
> > On Fri, Mar 29, 2024 at 2:09 PM vignesh C wrote:
> > >
> > > On Tue, 26 Mar 2024 at 10:05, Masahiko Sawada wrote:
> > > >
> > > > On Thu, Mar 14, 2024 at 12:02 PM Masahiko Sawada wrote:
> > > > >
> > > > > I've attached new version patches.
> > > >
> > > > Since the previous patch conflicts with the current HEAD, I've
> > > > attached the rebased patches.
> > >
> > > Thanks for the updated patch.
> > > One comment:
> > > I felt we can mention the improvement where we update memory
> > > accounting info at transaction level instead of per change level which
> > > is done in ReorderBufferCleanupTXN, ReorderBufferTruncateTXN, and
> > > ReorderBufferSerializeTXN also in the commit message:
> >
> > Agreed.
> >
> > I think the patch is in good shape. I'll push the patch with the
> > suggestion next week, barring any objections.
>
> Few minor comments:
> 1.
> @@ -3636,6 +3801,8 @@ ReorderBufferCheckMemoryLimit(ReorderBuffer *rb)
>  			Assert(txn->nentries_mem == 0);
>  		}
>
> +	ReorderBufferMaybeResetMaxHeap(rb);
> +
>
> Can we write a comment about why this reset is required here?
> Otherwise, the reason is not apparent.

Yes, added.

> 2.
>  Although using max-heap to select the largest
> + * transaction is effective when there are many transactions being decoded,
> + * there is generally no need to use it as long as all transactions being
> + * decoded are top-level transactions. Therefore, we use MaxConnections as the
> + * threshold so we can prevent switching to the state unless we use
> + * subtransactions.
> + */
> +#define MAX_HEAP_TXN_COUNT_THRESHOLD	MaxConnections
>
> Isn't using max-heap equally effective in finding the largest
> transaction whether there are top-level or top-level plus
> subtransactions? This comment indicates it is only effective when
> there are subtransactions.

You're right. Updated the comment.

I've attached the updated patches.

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com

v11-0002-Add-functions-to-binaryheap-for-efficient-key-re.patch
Description: Binary data

v11-0001-Make-binaryheap-enlargeable.patch
Description: Binary data

v11-0003-Improve-eviction-algorithm-in-Reorderbuffer-usin.patch
Description: Binary data
Re: [HACKERS] make async slave to wait for lsn to be replayed
On Mon, Apr 1, 2024 at 5:54 AM Alexander Korotkov wrote:
>
> > 9. To me the following query blocks even though I didn't mention timeout.
> > CALL pg_wal_replay_wait('0/fff');
>
> If your primary server is freshly initialized, you need to do quite a
> lot of data modifications to reach this LSN.

Right, but why does pg_wal_replay_wait block without a timeout? It must
return an error saying it can't reach the target LSN, no?

Did you forget to attach the new patch?

--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
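The semantics in question — wait until replay reaches a target LSN, block
indefinitely when no timeout is given, error once a timeout expires — can be
modelled as a polling loop. This is a hypothetical sketch of the behaviour
being discussed, not the actual pg_wal_replay_wait implementation:

```python
import time

def wait_for_lsn(get_replay_lsn, target_lsn, timeout_ms=None, poll_s=0.01):
    """Block until get_replay_lsn() reaches target_lsn.

    With timeout_ms=None this loops forever when the target is never
    reached (the behaviour Bharath questions above); with a timeout it
    raises once the deadline passes instead of blocking indefinitely.
    """
    deadline = None if timeout_ms is None else time.monotonic() + timeout_ms / 1000.0
    while get_replay_lsn() < target_lsn:
        if deadline is not None and time.monotonic() >= deadline:
            raise TimeoutError(f"timed out waiting for LSN {target_lsn}")
        time.sleep(poll_s)
```

The sketch makes the trade-off concrete: without a deadline there is no exit
path when the target LSN is unreachable, so an error (or mandatory timeout)
is the only way to avoid an indefinite block.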
Re: Extract numeric field in JSONB more effectively
Here is latest version, nothing changed besides the rebase to the latest master. The most recent 3 questions should be addressed. - The error message compatible issue [1] and the Peter's answer at [2]. - Peter's new question at [2] and my answer at [3]. Any effrot to move this patch ahead is welcome and thanks all the people who provided high quaility feedback so far, especially chapman! [1] https://www.postgresql.org/message-id/87r0hmvuvr@163.com [2] https://www.postgresql.org/message-id/8102ff5b-b156-409e-a48f-e53e63a39b36%40eisentraut.org [3] https://www.postgresql.org/message-id/8734t6c5rh.fsf%40163.com -- Best Regards Andy Fan >From fb38be5addb93d7c0b8c1a3e8376751c9b3be5f5 Mon Sep 17 00:00:00 2001 From: "yizhi.fzh" Date: Mon, 1 Apr 2024 09:36:08 +0800 Subject: [PATCH v17 1/1] Improve the performance of Jsonb numeric/bool extraction. JSONB object uses a binary compatible numeric format with the numeric data type in SQL. However in the past, extracting a numeric value from a JSONB field still needs to find the corresponding JsonbValue first, then convert the JsonbValue to Jsonb, and finally use the cast system to convert the Jsonb to a Numeric data type. This approach was very inefficient in terms of performance. In the current patch, It is handled that the JsonbValue is converted to numeric data type directly. This is done by the planner support function which detects the above case and simplify the expression. Because the boolean type and numeric type share certain similarities in their attributes, we have implemented the same optimization approach for both. In the ideal test case, the performance can be 2x than before. The optimized functions and operators includes: 1. jsonb_object_field / -> 2. jsonb_array_element / -> 3. jsonb_extract_path / #> 4. jsonb_path_query 5. 
jsonb_path_query_first --- src/backend/utils/adt/jsonb.c | 206 ++ src/backend/utils/adt/jsonbsubs.c | 4 +- src/backend/utils/adt/jsonfuncs.c | 123 ++- src/backend/utils/adt/jsonpath_exec.c | 32 +++- src/include/catalog/pg_proc.dat | 46 +- src/include/utils/jsonb.h | 11 +- src/test/regress/expected/jsonb.out | 112 +- src/test/regress/sql/jsonb.sql| 66 - src/tools/pgindent/typedefs.list | 1 + 9 files changed, 542 insertions(+), 59 deletions(-) diff --git a/src/backend/utils/adt/jsonb.c b/src/backend/utils/adt/jsonb.c index a5e48744ac..6e93b34fd6 100644 --- a/src/backend/utils/adt/jsonb.c +++ b/src/backend/utils/adt/jsonb.c @@ -16,9 +16,15 @@ #include "catalog/pg_proc.h" #include "catalog/pg_type.h" #include "funcapi.h" +#include "nodes/makefuncs.h" +#include "nodes/supportnodes.h" +#include "parser/parse_coerce.h" #include "libpq/pqformat.h" #include "miscadmin.h" #include "utils/builtins.h" +#include "utils/date.h" +#include "utils/datetime.h" +#include "utils/fmgroids.h" #include "utils/json.h" #include "utils/jsonb.h" #include "utils/jsonfuncs.h" @@ -2035,6 +2041,206 @@ cannotCastJsonbValue(enum jbvType type, const char *sqltype) elog(ERROR, "unknown jsonb type: %d", (int) type); } + +/* + * jsonb_cast_support() + * + * Planner support function for extracting numeric or bool data type more + * effectively. After finding out the corresponding JsonbValue, instead of + * casting it to Jsonb as an intermediate type, we covert it to the desired + * data type directly. 
+ */ +Datum +jsonb_cast_support(PG_FUNCTION_ARGS) +{ + Node *rawreq = (Node *) PG_GETARG_POINTER(0); + + if (IsA(rawreq, SupportRequestSimplify)) + { + SupportRequestSimplify *req = (SupportRequestSimplify *) rawreq; + FuncExpr *fexpr = req->fcall; + FuncExpr *jsonb_start_func = NULL, + *jsonb_finish_func = NULL, + *final_func = NULL; + Node *input; + Oid new_func_id = InvalidOid; + List *args; + Oid input_func_id, + collid, + inputcollid; + bool retset = false, + variadic = false; + + Assert(list_length(fexpr->args) == 1); + input = (Node *) linitial(fexpr->args); + + if (IsA(input, OpExpr)) + { + OpExpr *opExpr = castNode(OpExpr, input); + + input_func_id = opExpr->opfuncid; + collid = opExpr->opcollid; + inputcollid = opExpr->inputcollid; + args = opExpr->args; + } + else if (IsA(input, FuncExpr)) + { + FuncExpr *funcExpr = castNode(FuncExpr, input); + + input_func_id = funcExpr->funcid; + collid = funcExpr->funccollid; + inputcollid = funcExpr->inputcollid; + args = funcExpr->args; + } + else + /* not the desired pattern. */ + PG_RETURN_POINTER(NULL); + + /* build a function to return the JsonbValue directly. */ + switch (input_func_id) + { + case F_JSONB_OBJECT_FIELD: +new_func_id = F_JSONB_OBJECT_FIELD_START; +break; + case F_JSONB_ARRAY_ELEMENT: +new_func_id = F_JSONB_ARRAY_ELEMENT_START; +break; + case F_JSONB_EXTRACT_PATH: +new_func_id = F_JSONB_EXTRACT_PATH_START; +variadic = true; +
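In terms of user-visible queries, the casts this support function rewrites look like the following. This is an illustrative sketch only; the table and data are assumptions, not taken from the patch's regression tests:

```sql
-- Illustrative only: extracting numeric/bool fields through a cast.
CREATE TABLE orders (js jsonb);
INSERT INTO orders VALUES ('{"price": 9.99, "paid": true}');

-- Before the patch, each cast found the JsonbValue, re-wrapped it as a
-- Jsonb datum, and then went through the cast system. With the patch, the
-- planner support function rewrites the cast to produce the numeric/bool
-- value directly.
SELECT (js -> 'price')::numeric   AS price,
       (js #> '{price}')::numeric AS price_by_path,
       (js -> 'paid')::boolean    AS paid
FROM orders;
```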
Re: Statistics Import and Export
> > IIRC, "variadic any" requires having at least one variadic parameter. > But that seems fine --- what would be the point, or even the > semantics, of calling pg_set_attribute_stats with no data fields? > If my pg_dump run emitted a bunch of stats that could never be imported, I'd want to know. With silent failures, I don't. > Perhaps we could > invent a new backend function that extracts the actual element type > of a non-null anyarray argument. > A backend function that we can't guarantee exists on the source system. :( > Another way we could get to no-coercions is to stick with your > signature but declare the relevant parameters as anyarray instead of > text. I still think though that we'd be better off to leave the > parameter matching to runtime, so that we-don't-recognize-that-field > can be a warning not an error. > I'm a bit confused here. AFAIK we can't construct an anyarray in SQL: # select '{1,2,3}'::anyarray; ERROR: cannot accept a value of type anyarray > I think you missed my point: you're doing that inefficiently, > and maybe even with race conditions. Use the relcache's copy > of the pg_class row. > Roger Wilco. > Well, I'm here to debate it if you want, but I'll just note that *one* > error will be enough to abort a pg_upgrade entirely, and most users > these days get scared by errors during manual dump/restore too. So we > had better not be throwing errors except for cases that we don't think > pg_dump could ever emit. > That's pretty persuasive. It also means that we need to trap for error in the array_in() calls, as that function does not yet have a _safe() mode.
Re: Popcount optimization using AVX512
On Sat, Mar 30, 2024 at 03:03:29PM -0500, Nathan Bossart wrote: > My current plan is to add some new tests for > pg_popcount() with many bytes, and then I'll give it a few more days for > any additional feedback before committing. Here is a v18 with a couple of new tests. Otherwise, it is the same as v17. -- Nathan Bossart Amazon Web Services: https://aws.amazon.com >From 86a571721ed3ed4ca7e04134b9541fc3ac43b9f1 Mon Sep 17 00:00:00 2001 From: Nathan Bossart Date: Wed, 27 Mar 2024 16:39:24 -0500 Subject: [PATCH v18 1/1] AVX512 popcount support --- config/c-compiler.m4 | 58 ++ configure| 252 +++ configure.ac | 51 ++ meson.build | 87 + src/Makefile.global.in | 5 + src/include/pg_config.h.in | 12 ++ src/include/port/pg_bitutils.h | 15 ++ src/makefiles/meson.build| 4 +- src/port/Makefile| 11 ++ src/port/meson.build | 6 +- src/port/pg_bitutils.c | 56 +++--- src/port/pg_popcount_avx512.c| 49 ++ src/port/pg_popcount_avx512_choose.c | 71 src/test/regress/expected/bit.out| 24 +++ src/test/regress/sql/bit.sql | 4 + 15 files changed, 666 insertions(+), 39 deletions(-) create mode 100644 src/port/pg_popcount_avx512.c create mode 100644 src/port/pg_popcount_avx512_choose.c diff --git a/config/c-compiler.m4 b/config/c-compiler.m4 index 3268a780bb..5fb60775ca 100644 --- a/config/c-compiler.m4 +++ b/config/c-compiler.m4 @@ -694,3 +694,61 @@ if test x"$Ac_cachevar" = x"yes"; then fi undefine([Ac_cachevar])dnl ])# PGAC_LOONGARCH_CRC32C_INTRINSICS + +# PGAC_XSAVE_INTRINSICS +# - +# Check if the compiler supports the XSAVE instructions using the _xgetbv +# intrinsic function. +# +# An optional compiler flag can be passed as argument (e.g., -mxsave). If the +# intrinsic is supported, sets pgac_xsave_intrinsics and CFLAGS_XSAVE. 
+AC_DEFUN([PGAC_XSAVE_INTRINSICS], +[define([Ac_cachevar], [AS_TR_SH([pgac_cv_xsave_intrinsics_$1])])dnl +AC_CACHE_CHECK([for _xgetbv with CFLAGS=$1], [Ac_cachevar], +[pgac_save_CFLAGS=$CFLAGS +CFLAGS="$pgac_save_CFLAGS $1" +AC_LINK_IFELSE([AC_LANG_PROGRAM([#include ], + [return _xgetbv(0) & 0xe0;])], + [Ac_cachevar=yes], + [Ac_cachevar=no]) +CFLAGS="$pgac_save_CFLAGS"]) +if test x"$Ac_cachevar" = x"yes"; then + CFLAGS_XSAVE="$1" + pgac_xsave_intrinsics=yes +fi +undefine([Ac_cachevar])dnl +])# PGAC_XSAVE_INTRINSICS + +# PGAC_AVX512_POPCNT_INTRINSICS +# - +# Check if the compiler supports the AVX512 POPCNT instructions using the +# _mm512_setzero_si512, _mm512_loadu_si512, _mm512_popcnt_epi64, +# _mm512_add_epi64, and _mm512_reduce_add_epi64 intrinsic functions. +# +# An optional compiler flag can be passed as argument +# (e.g., -mavx512vpopcntdq). If the intrinsics are supported, sets +# pgac_avx512_popcnt_intrinsics and CFLAGS_POPCNT. +AC_DEFUN([PGAC_AVX512_POPCNT_INTRINSICS], +[define([Ac_cachevar], [AS_TR_SH([pgac_cv_avx512_popcnt_intrinsics_$1])])dnl +AC_CACHE_CHECK([for _mm512_popcnt_epi64 with CFLAGS=$1], [Ac_cachevar], +[pgac_save_CFLAGS=$CFLAGS +CFLAGS="$pgac_save_CFLAGS $1" +AC_LINK_IFELSE([AC_LANG_PROGRAM([#include ], + [const char buf@<:@sizeof(__m512i)@:>@; + PG_INT64_TYPE popcnt = 0; + __m512i accum = _mm512_setzero_si512(); + const __m512i val = _mm512_loadu_si512((const __m512i *) buf); + const __m512i cnt = _mm512_popcnt_epi64(val); + accum = _mm512_add_epi64(accum, cnt); + popcnt = _mm512_reduce_add_epi64(accum); + /* return computed value, to prevent the above being optimized away */ + return popcnt == 0;])], + [Ac_cachevar=yes], + [Ac_cachevar=no]) +CFLAGS="$pgac_save_CFLAGS"]) +if test x"$Ac_cachevar" = x"yes"; then + CFLAGS_POPCNT="$1" + pgac_avx512_popcnt_intrinsics=yes +fi +undefine([Ac_cachevar])dnl +])# PGAC_AVX512_POPCNT_INTRINSICS diff --git a/configure b/configure index 36feeafbb2..b48ed7f271 100755 --- a/configure +++ b/configure @@ 
-647,6 +647,9 @@ MSGFMT_FLAGS MSGFMT PG_CRC32C_OBJS CFLAGS_CRC +PG_POPCNT_OBJS +CFLAGS_POPCNT +CFLAGS_XSAVE LIBOBJS OPENSSL ZSTD @@ -17404,6 +17407,40 @@ $as_echo "#define HAVE__GET_CPUID 1" >>confdefs.h fi +{ $as_echo "$as_me:${as_lineno-$LINENO}: checking for __get_cpuid_count" >&5 +$as_echo_n "checking for __get_cpuid_count... " >&6; } +if ${pgac_cv__get_cpuid_count+:} false; then : + $as_echo_n "(cached) " >&6 +else + cat confdefs.h - <<_ACEOF >conftest.$ac_ext +/* end confdefs.h. */ +#include +int +main () +{ +unsigned int exx[4] = {0, 0, 0, 0}; + __get_cpuid_count(7, 0, [0], [1], [2], [3]); + + ; + return 0; +} +_ACEOF +if ac_fn_c_try_link "$LINENO"; then : + pgac_cv__get_cpuid_count="yes" +else + pgac_cv__get_cpuid_count="no" +fi +rm -f core conftest.err conftest.$ac_objext \ +conftest$ac_exeext conftest.$ac_ext +fi +{ $as_echo
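For reference, the quantity pg_popcount() returns, the total number of set bits in a byte buffer, can be stated as a scalar sketch; the AVX512 path in the patch computes the same total 64 bytes at a time with _mm512_popcnt_epi64 plus a final _mm512_reduce_add_epi64. The Python below is only an illustration of that invariant, not the patch's code:

```python
def popcount_bytes(buf: bytes) -> int:
    """Scalar reference: total number of set bits in a byte buffer."""
    return sum(bin(b).count("1") for b in buf)

def popcount_chunked(buf: bytes, chunk: int = 64) -> int:
    """Process the buffer in fixed-size chunks (the shape of the AVX512
    loop) and accumulate per-chunk counts; must agree with the scalar
    reference for any input."""
    return sum(popcount_bytes(buf[i:i + chunk])
               for i in range(0, len(buf), chunk))

print(popcount_bytes(b"\xff\x0f"))  # 12
```

Tests like the ones added in v18 exercise pg_popcount() over buffers larger than one vector register, which is exactly where the chunked and scalar paths must agree.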
Re: Add new error_action COPY ON_ERROR "log"
On Sat, Mar 30, 2024 at 11:05 PM Bharath Rupireddy wrote: > > On Thu, Mar 28, 2024 at 6:31 PM Masahiko Sawada wrote: > > > > That is, > > since the LOG_VERBOSITY option is an enum parameter, it might make > > more sense to require the value, instead of making the value optional. > > For example, the following command could not be obvious for users: > > > > COPY test FROM stdin (ON_ERROR ignore, LOG_VERBOSITY); > > Agreed. Please see the attached v14 patch. Thank you for updating the patch! > The LOG_VERBOSITY now needs > a value to be specified. Note that I've not added any test for this > case as there seemed to be no such tests so far generating "ERROR: > <> requires a parameter". I don't mind adding one for > LOG_VERBOSITY though. +1 One minor point: ENCODING 'encoding_name' +LOG_VERBOSITY [ mode ] '[' and ']' are not necessary because the value is no longer optional. I've attached the updated patch. I'll push it, barring any objections. Regards, -- Masahiko Sawada Amazon Web Services: https://aws.amazon.com v15-0001-Add-new-COPY-option-LOG_VERBOSITY.patch Description: Binary data
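With the v15 change, the option always takes an explicit value. A sketch of a valid invocation (the 'verbose' mode name is the one discussed in the thread):

```sql
-- LOG_VERBOSITY now requires a value; the bare form is rejected.
COPY test FROM stdin (ON_ERROR ignore, LOG_VERBOSITY verbose);
```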
RE: Synchronizing slots from primary to standby
On Friday, March 29, 2024 2:50 PM Amit Kapila wrote: > > On Fri, Mar 29, 2024 at 6:36 AM Zhijie Hou (Fujitsu) > wrote: > > > > > > Attach a new version patch which fixed an un-initialized variable > > issue and added some comments. > > > > The other approach to fix this issue could be that the slotsync worker get the > serialized snapshot using pg_read_binary_file() corresponding to restart_lsn > and writes those at standby. But there are cases when we won't have such a > file > like (a) when we initially create the slot and reach the consistent_point, or > (b) > also by the time the slotsync worker starts to read the remote snapshot file, > the > snapshot file could have been removed by the checkpointer on the primary (if > the restart_lsn of the remote has been advanced in this window). So, in such > cases, we anyway need to advance the slot. I think these could be > optimizations > that we could do in the future. > > Few comments: Thanks for the comments. > = > 1. > - if (slot->data.database != MyDatabaseId) > + if (slot->data.database != MyDatabaseId && !fast_forward) > ereport(ERROR, > (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE), > errmsg("replication slot \"%s\" was not created in this database", @@ -526,7 > +527,7 @@ CreateDecodingContext(XLogRecPtr start_lsn, > * Do not allow consumption of a "synchronized" slot until the standby > * gets promoted. > */ > - if (RecoveryInProgress() && slot->data.synced) > + if (RecoveryInProgress() && slot->data.synced && > + !IsSyncingReplicationSlots()) > > > Add comments at both of the above places. Added. > > > 2. > +extern XLogRecPtr pg_logical_replication_slot_advance(XLogRecPtr moveto, > + bool *found_consistent_point); > + > > This API looks a bit awkward as the functionality doesn't match the name. 
How > about having a function with name > LogicalSlotAdvanceAndCheckReadynessForDecoding(moveto, > ready_for_decoding) with the same functionality as your patch has for > pg_logical_replication_slot_advance() and then invoke it both from > pg_logical_replication_slot_advance and slotsync.c. The function name is too > big, we can think of a shorter name. Any ideas? How about LogicalSlotAdvanceAndCheckDecodingState() Or just LogicalSlotAdvanceAndCheckDecoding()? (I used the suggested LogicalSlotAdvanceAndCheckReadynessForDecoding in this version, It can be renamed in next version if we agree). Attach the V3 patch which addressed above comments and Kuroda-san's comments[1]. I also adjusted the tap-test to only check the confirmed_flush_lsn after syncing, as the restart_lsn could be different from the remote one due to the new slot_advance() call. I am also testing some optimization idea locally and will share if ready. [1] https://www.postgresql.org/message-id/TYCPR01MB1207757BB2A32B6815CE1CCE7F53A2%40TYCPR01MB12077.jpnprd01.prod.outlook.com Best Regards, Hou zj v3-0001-advance-the-restart_lsn-of-synced-slots-using-log.patch Description: v3-0001-advance-the-restart_lsn-of-synced-slots-using-log.patch
Cant link libpq with a pg extension using cmake
I am getting linking issues when trying to link libpq with my PG extension. I am using PG's libpq, i.e., libpq is built along with PG, so I did this in my extension's CMakeLists.txt:

file(GLOB storage_SRC CONFIGURE_DEPENDS "*.cpp")
add_library(storage OBJECT ${storage_SRC})
target_link_libraries(storage PRIVATE pq)

(All required include dirs are included in my top-level CMakeLists.txt.) Then I got "undefined symbol: pqsecure_write", but I don't know why. If I give the pg_config --libdir/libpq.a path in target_link_libraries instead of the library name, then it links, but the walreceiver process can't start and I get: FATAL: could not connect to the primary server: libpq is incorrectly linked to backend functions
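One common way to avoid both failure modes above is to link the shared libpq from the installed libdir rather than the static libpq.a. This is a sketch only; the pg_config-derived paths and target names here are assumptions, not taken from the reporter's tree:

```cmake
# Locate the installed libpq via pg_config (assumes pg_config is on PATH).
execute_process(COMMAND pg_config --includedir
                OUTPUT_VARIABLE PG_INCLUDEDIR OUTPUT_STRIP_TRAILING_WHITESPACE)
execute_process(COMMAND pg_config --libdir
                OUTPUT_VARIABLE PG_LIBDIR OUTPUT_STRIP_TRAILING_WHITESPACE)

file(GLOB storage_SRC CONFIGURE_DEPENDS "*.cpp")
add_library(storage OBJECT ${storage_SRC})
target_include_directories(storage PRIVATE ${PG_INCLUDEDIR})
target_link_directories(storage PRIVATE ${PG_LIBDIR})
# Linking the shared libpq.so (not libpq.a) avoids duplicating frontend
# symbols such as pqsecure_write inside a backend process.
target_link_libraries(storage PRIVATE pq)
```

Statically linking libpq.a into a module loaded by the server is what typically produces the "libpq is incorrectly linked to backend functions" failure, since the embedded frontend symbols shadow or clash with the backend's own.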
Re: Statistics Import and Export
Corey Huinker writes: >> and I really think that we need to provide >> the source server's major version number --- maybe we will never >> need that, but if we do and we don't have it we will be sad. > The JSON had it, and I never did use it. Not against having it again. Well, you don't need it now seeing that the definition of pg_stats columns hasn't changed in the past ... but there's no guarantee we won't want to change them in the future. >> So this leads me to suggest that we'd be best off with a VARIADIC >> ANY signature, where the variadic part consists of alternating >> parameter labels and values: >> pg_set_attribute_stats(table regclass, attribute name, >>inherited bool, source_version int, >>variadic "any") returns void > I'm not aware of how strict works with variadics. Would the lack of any > variadic parameters trigger it? IIRC, "variadic any" requires having at least one variadic parameter. But that seems fine --- what would be the point, or even the semantics, of calling pg_set_attribute_stats with no data fields? > Also going with strict means that an inadvertent explicit NULL in one > parameter would cause the entire attribute import to fail silently. I'd > rather fail loudly. Not really convinced that that is worth any trouble... > * We can require the calling statement to cast arguments, particularly >> arrays, to the proper type, removing the need for conversions within >> the stats-setting function. (But instead, it'd need to check that the >> next "any" argument is the type it ought to be based on the type of >> the target column.) > So, that's tricky. The type of the values is not always the attribute type, Hmm. You would need to have enough smarts in pg_set_attribute_stats to identify the appropriate array type in any case: as coded, it needs that for coercion, whereas what I'm suggesting would only require it for checking, but either way you need it. 
I do concede that pg_dump (or other logic generating the calls) needs to know more under my proposal than before. I had been thinking that it would not need to hard-code that because it could look to see what the actual type is of the array it's dumping. However, I see that pg_typeof() doesn't work for that because it just returns anyarray. Perhaps we could invent a new backend function that extracts the actual element type of a non-null anyarray argument. Another way we could get to no-coercions is to stick with your signature but declare the relevant parameters as anyarray instead of text. I still think though that we'd be better off to leave the parameter matching to runtime, so that we-don't-recognize-that-field can be a warning not an error. >> * why is check_relation_permissions looking up the pg_class row? >> There's already a copy of that in the Relation struct. > To prove that the caller is the owner (or better) of the table. I think you missed my point: you're doing that inefficiently, and maybe even with race conditions. Use the relcache's copy of the pg_class row. >> * I'm dubious that we can fully vet the contents of these arrays, >> and even a little dubious that we need to try. > A lot of the feedback I got on this patch over the months concerned giving > inaccurate, nonsensical, or malicious data to the planner. Surely the > planner does do *some* defensive programming when fetching these values, > but this is the first time those values were potentially set by a user, not > by our own internal code. We can try to match types, collations, etc from > source to dest, but even that would fall victim to another glibc-level > collation change. That sort of concern is exactly why I think the planner has to, and does, defend itself. Even if you fully vet the data at the instant of loading, we might have the collation change under us later. It could be argued that feeding bogus data to the planner for testing purposes is a valid use-case for this feature. 
(Of course, as superuser we could inject bogus data into pg_statistic manually, so it's not necessary to have this feature for that purpose.) I guess I'm a great deal more sanguine than other people about the planner's ability to tolerate inconsistent data; but in any case I don't have a lot of faith in relying on checks in pg_set_attribute_stats to substitute for that ability. That idea mainly leads to having a whole lot of code that has to be kept in sync with other code that's far away from it and probably isn't coded in a parallel fashion either. >> * There's a lot of ERROR cases that maybe we ought to downgrade >> to WARN-and-press-on, in the service of not breaking the restore >> completely in case of trouble. > All cases were made error precisely to spark debate about which cases we'd > want to continue from and which we'd want to error from. Well, I'm here to debate it if you want, but I'll just note that *one* error will be enough to abort a pg_upgrade entirely, and most users these
Re: [HACKERS] make async slave to wait for lsn to be replayed
Hi Bharath, Thank you for your feedback. On Sun, Mar 31, 2024 at 8:44 AM Bharath Rupireddy wrote: > On Sun, Mar 31, 2024 at 7:41 AM Alexander Korotkov > wrote: > Thanks for the patch. I have a few comments on the v16 patch. > > 1. Is there any specific reason for pg_wal_replay_wait() being a > procedure rather than a function? I haven't read the full thread, but > I vaguely noticed the discussion on the new wait mechanism holding up > a snapshot or some other resource. Is that the reason to use a stored > procedure over a function? If yes, can we specify it somewhere in the > commit message and just before the procedure definition in > system_functions.sql? Surely, there is a reason. A function should be executed in a snapshot, which can prevent WAL records from being replayed. See [1] for a particular test scenario. In a procedure we may enforce non-atomic context and release the snapshot. I've mentioned that in the commit message and in the procedure code. I don't think system_functions.sql is the place for this type of comment. We only use system_functions.sql to push the default values. > 2. Is the pg_wal_replay_wait first procedure that postgres provides > out of the box? Yes, it appears first. I see nothing wrong about that. > 3. Defining a procedure for the first time in system_functions.sql > which is supposed to be for functions seems a bit unusual to me. From the scope of DDL and the system catalogue, a procedure is just another kind of function (prokind == 'p'). So, I don't feel wrong about that. > 4. > + > +endtime = TimestampTzPlusMilliseconds(GetCurrentTimestamp(), timeout); > + > > +if (timeout > 0) > +{ > +delay_ms = (endtime - GetCurrentTimestamp()) / 1000; > +latch_events |= WL_TIMEOUT; > +if (delay_ms <= 0) > +break; > +} > > Why is endtime calculated even for timeout <= 0 only to just skip it > later? Can't we just do a fastpath exit if timeout = 0 and targetLSN < OK, fixed. > 5. 
> Parameter > +timeout is the time in milliseconds to wait > +for the target_lsn > +replay. When timeout value equals to zero no > +timeout is applied. > +replay. When timeout value equals to zero no > +timeout is applied. > > It turns out to be "When timeout value equals to zero no timeout is > applied." I guess, we can just say something like the following which > I picked up from pg_terminate_backend timeout parameter description. > > > If timeout is not specified or zero, this > function returns if the WAL upto > target_lsn is replayed. > If the timeout is specified (in > milliseconds) and greater than zero, the function waits until the > server actually replays the WAL upto target_lsn or > until the given time has passed. On timeout, an error is emitted. > Applied as you suggested with some edits from me. > 6. > +ereport(ERROR, > +(errcode(ERRCODE_QUERY_CANCELED), > + errmsg("canceling waiting for LSN due to timeout"))); > > We can be a bit more informative here and say targetLSN and currentLSN > something like - "timed out while waiting for target LSN %X/%X to be > replayed; current LSN %X/%X"? Done this way. Adjusted other ereport()'s as well. > 7. > +if (context->atomic) > +ereport(ERROR, > +(errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE), > + errmsg("pg_wal_replay_wait() must be only called in > non-atomic context"))); > + > > Can we say what a "non-atomic context '' is in a user understandable > way like explicit txn or whatever that might be? "non-atomic context' > might not sound great to the end -user. Added errdetail() to this ereport(). > 8. > +the movie table and get the lsn > after > +changes just made. This example uses > pg_current_wal_insert_lsn > +to get the lsn given that > synchronous_commit > +could be set to off. > > Can we just mention that run pg_current_wal_insert_lsn on the primary? The mention is added. > 9. To me the following query blocks even though I didn't mention timeout. 
> CALL pg_wal_replay_wait('0/fff'); If your primary server is freshly initialized, you need to do quite a lot of data modifications to reach this LSN. > 10. Can't we do some input validation on the timeout parameter and > emit an error for negative values just like pg_terminate_backend? > CALL pg_wal_replay_wait('0/ff', -100); Reasonable, added. > 11. > + > +if (timeout > 0) > +{ > +delay_ms = (endtime - GetCurrentTimestamp()) / 1000; > +latch_events |= WL_TIMEOUT; > +if (delay_ms <= 0) > +break; > +} > + > > Can we avoid calling GetCurrentTimestamp in a for loop which can be > costly at times especially when pg_wal_replay_wait is called with > larger timeouts on multiple backends? Can't we reuse >
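The timeout handling under discussion, computing the deadline once up front and then deriving only the remaining delay on each iteration, can be sketched outside the server like this (the function names and the polling interval are illustrative assumptions, not the patch's code):

```python
import time

def wait_for(predicate, timeout_ms: int) -> bool:
    """Wait until predicate() is true or timeout_ms elapses.

    Mirrors the loop shape in the patch: the deadline ('endtime') is
    computed once; each iteration recomputes only the remaining delay
    and breaks out when it drops to zero.
    """
    if timeout_ms < 0:
        raise ValueError("timeout must be a non-negative integer")
    endtime = time.monotonic() + timeout_ms / 1000.0
    while not predicate():
        if timeout_ms > 0:
            delay_s = endtime - time.monotonic()
            if delay_s <= 0:
                return False          # timed out; the caller reports an error
            time.sleep(min(delay_s, 0.01))
        else:
            time.sleep(0.01)          # zero timeout: wait indefinitely
    return True
```

In the server the sleep is a latch wait, so the backend wakes as soon as the startup process replays past the target LSN rather than polling on a fixed interval.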
Re: Statistics Import and Export
> > > I concur with the plan of extracting data from pg_stats not > pg_statistics, and with emitting a single "set statistics" > call per attribute. (I think at one point I'd suggested a call > per stakind slot, but that would lead to a bunch of UPDATEs on > existing pg_attribute tuples and hence a bunch of dead tuples > at the end of an import, so it's not the way to go. A series > of UPDATEs would likely also play poorly with a background > auto-ANALYZE happening concurrently.) > That was my reasoning as well. > I do not like the current design for pg_set_attribute_stats' API > though: I don't think it's at all future-proof. What happens when > somebody adds a new stakind (and hence new pg_stats column)? > You could try to add an overloaded pg_set_attribute_stats > version with more parameters, but I'm pretty sure that would > lead to "ambiguous function call" failures when trying to load > old dump files containing only the original parameters. I don't think we'd overload, we'd just add new parameters to the function signature. > The > present design is also fragile in that an unrecognized parameter > will lead to a parse-time failure and no function call happening, > which is less robust than I'd like. There was a lot of back-and-forth about what sorts of failures were error-worthy, and which were warn-worthy. I'll discuss further below. > As lesser points, > the relation argument ought to be declared regclass not oid for > convenience of use, +1 > and I really think that we need to provide > the source server's major version number --- maybe we will never > need that, but if we do and we don't have it we will be sad. > The JSON had it, and I never did use it. Not against having it again. 
> > So this leads me to suggest that we'd be best off with a VARIADIC > ANY signature, where the variadic part consists of alternating > parameter labels and values: > > pg_set_attribute_stats(table regclass, attribute name, >inherited bool, source_version int, >variadic "any") returns void > > where a call might look like > > SELECT pg_set_attribute_stats('public.mytable', 'mycolumn', > false, -- not inherited > 16, -- source server major version > -- pairs of labels and values follow > 'null_frac', 0.4, > 'avg_width', 42, > 'histogram_bounds', > array['a', 'b', 'c']::text[], > ...); > > Note a couple of useful things here: > > * AFAICS we could label the function strict and remove all those ad-hoc > null checks. If you don't have a value for a particular stat, you > just leave that pair of arguments out (exactly as the existing code > in 0002 does, just using a different notation). This also means that > we don't need any default arguments and so no need for hackery in > system_functions.sql. > I'm not aware of how strict works with variadics. Would the lack of any variadic parameters trigger it? Also going with strict means that an inadvertent explicit NULL in one parameter would cause the entire attribute import to fail silently. I'd rather fail loudly. > * If we don't recognize a parameter label at runtime, we can treat > that as a warning rather than a hard error, and press on. This case > would mostly be useful in major version downgrades I suppose, but > that will be something people will want eventually. > Interesting. * We can require the calling statement to cast arguments, particularly > arrays, to the proper type, removing the need for conversions within > the stats-setting function. (But instead, it'd need to check that the > next "any" argument is the type it ought to be based on the type of > the target column.) > So, that's tricky. 
The type of the values is not always the attribute type, for expression indexes, we do call exprType() and exprCollation(), in which case we always use the expression type over the attribute type, but only use the collation type if the attribute had no collation. This mimics the behavior of ANALYZE. Then, for the MCELEM and DECHIST stakinds we have to find the type's element type, and that has special logic for tsvectors, ranges, and other non-scalars, borrowing from the various *_typanalyze() functions. For that matter, the existing typanalyze functions don't grab the < operator, which I need for later data validations, so using examine_attribute() was simultaneously overkill and insufficient. None of this functionality is accessible from a client program, so we'd have to pull in a lot of backend stuff to pg_dump to make it resolve the typecasts correctly. Text and array_in() was just easier. > pg_set_relation_stats is simpler in that the set of stats values > to be set will probably remain fairly static, and there seems little > reason to allow
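The alternating label/value calling convention proposed upthread, with unrecognized labels demoted to warnings rather than hard errors, can be sketched as follows (the label set and function name are a small assumed subset for illustration, not an actual API):

```python
import warnings

# Hypothetical subset of recognized stat labels and their expected types.
KNOWN_STATS = {"null_frac": float, "avg_width": int, "histogram_bounds": list}

def parse_stat_pairs(*args):
    """Pair up alternating label/value variadic arguments.

    Unrecognized labels produce a warning and are skipped rather than
    failing the whole call -- the warn-and-press-on behavior discussed,
    which matters when restoring a dump from a different major version.
    """
    if len(args) % 2 != 0:
        raise ValueError("stat arguments must come in label/value pairs")
    stats = {}
    for label, value in zip(args[0::2], args[1::2]):
        if label not in KNOWN_STATS:
            warnings.warn(f"unrecognized stat label {label!r}, skipping")
            continue
        stats[label] = value
    return stats
```

The analogous runtime check in the real function would also verify that each value's type matches what the target column's stakind requires, since declared-type matching is no longer done by the parser under a VARIADIC "any" signature.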
Re: remaining sql/json patches
FAILED: src/interfaces/ecpg/test/sql/sqljson_jsontable.c /home/jian/postgres/buildtest6/src/interfaces/ecpg/preproc/ecpg --regression -I../../Desktop/pg_src/src6/postgres/src/interfaces/ecpg/test/sql -I../../Desktop/pg_src/src6/postgres/src/interfaces/ecpg/include/ -o src/interfaces/ecpg/test/sql/sqljson_jsontable.c ../../Desktop/pg_src/src6/postgres/src/interfaces/ecpg/test/sql/sqljson_jsontable.pgc ../../Desktop/pg_src/src6/postgres/src/interfaces/ecpg/test/sql/sqljson_jsontable.pgc:21: WARNING: unsupported feature will be passed to server ../../Desktop/pg_src/src6/postgres/src/interfaces/ecpg/test/sql/sqljson_jsontable.pgc:32: ERROR: syntax error at or near ";" need an extra closing parenthesis? The rows produced by JSON_TABLE are laterally joined to the row that generated them, so you do not have to explicitly join the constructed view with the original table holding JSON - data. need closing para. SELECT * FROM JSON_TABLE('[]', 'strict $.a' COLUMNS (js2 text PATH '$' error on empty error on error) EMPTY ON ERROR); should i expect it return one row? is there any example to make it return one row from top level "EMPTY ON ERROR"? + { + JsonTablePlan *scan = (JsonTablePlan *) plan; + + JsonTableInitPathScan(cxt, planstate, args, mcxt); + + planstate->nested = scan->child ? + JsonTableInitPlan(cxt, scan->child, planstate, args, mcxt) : NULL; + } first line seems strange, do we just simply change from "plan" to "scan"? + case JTC_REGULAR: + typenameTypeIdAndMod(pstate, rawc->typeName, , ); + + /* + * Use implicit FORMAT JSON for composite types (arrays and + * records) or if a non-default WRAPPER / QUOTES behavior is + * specified. + */ + if (typeIsComposite(typid) || + rawc->quotes != JS_QUOTES_UNSPEC || + rawc->wrapper != JSW_UNSPEC) + rawc->coltype = JTC_FORMATTED; per previous discussion, should we refactor the above comment? 
+/* Recursively set 'reset' flag of planstate and its child nodes */ +static void +JsonTablePlanReset(JsonTablePlanState *planstate) +{ + if (IsA(planstate->plan, JsonTableSiblingJoin)) + { + JsonTablePlanReset(planstate->left); + JsonTablePlanReset(planstate->right); + planstate->advanceRight = false; + } + else + { + planstate->reset = true; + planstate->advanceNested = false; + + if (planstate->nested) + JsonTablePlanReset(planstate->nested); + } per coverage, the first part of the IF branch never executed. i also found out that JsonTablePlanReset is quite similar to JsonTableRescan, i don't fully understand these two functions though. SELECT * FROM JSON_TABLE(jsonb'{"a": {"z":[]}, "b": 1,"c": 2, "d": 91}', '$' COLUMNS ( c int path '$.c', d int path '$.d', id1 for ordinality, NESTED PATH '$.a.z[*]' columns (z int path '$', id for ordinality) )); doc seems to say that duplicated ordinality columns in different nest levels are not allowed? "currentRow" naming seems misleading, generally, when we think of "row", we think of several (not one) datums, or several columns. but here, we only have one datum. I don't have good optional naming though. 
+ case JTC_FORMATTED: + case JTC_EXISTS: + { + Node *je; + CaseTestExpr *param = makeNode(CaseTestExpr); + + param->collation = InvalidOid; + param->typeId = contextItemTypid; + param->typeMod = -1; + + je = transformJsonTableColumn(rawc, (Node *) param, + NIL, errorOnError); + + colexpr = transformExpr(pstate, je, EXPR_KIND_FROM_FUNCTION); + assign_expr_collations(pstate, colexpr); + + typid = exprType(colexpr); + typmod = exprTypmod(colexpr); + break; + } + + default: + elog(ERROR, "unknown JSON_TABLE column type: %d", rawc->coltype); + break; + } + + tf->coltypes = lappend_oid(tf->coltypes, typid); + tf->coltypmods = lappend_int(tf->coltypmods, typmod); + tf->colcollations = lappend_oid(tf->colcollations, get_typcollation(typid)); + tf->colvalexprs = lappend(tf->colvalexprs, colexpr); why not use exprCollation(colexpr) for tf->colcollations, similar to exprType(colexpr)? +-- Should fail (JSON arguments are not passed to column paths) +SELECT * +FROM JSON_TABLE( + jsonb '[1,2,3]', + '$[*] ? (@ < $x)' + PASSING 10 AS x + COLUMNS (y text FORMAT JSON PATH '$ ? (@ < $x)') + ) jt; +ERROR: could not find jsonpath variable "x" the error message does not correspond to the comments intention. also "y text FORMAT JSON" should be fine? only the second last example really using the PASSING clause. should the following query work just fine in this context? create table s(js jsonb); insert into s select '{"a":{"za":[{"z1": [11,]},{"z21": [22, 234,2345]}]},"c": 3}'; SELECT sub.* FROM s,JSON_TABLE(js, '$' passing 11 AS "b c", 1 + 2 as y COLUMNS (xx int path '$.c ? (@ == $y)')) sub; I thought the json and text data type were quite similar. should these following two queries return the same result? SELECT sub.* FROM s, JSON_TABLE(js, '$' COLUMNS( xx int path '$.c', nested PATH '$.a.za[1]' columns (NESTED PATH '$.z21[*]' COLUMNS (a12 jsonb path '$')) ))sub; SELECT sub.* FROM s,JSON_TABLE(js, '$'
Re: Catalog domain not-null constraints
On Tue, Mar 26, 2024 at 2:28 AM Dean Rasheed wrote: > > On Fri, 22 Mar 2024 at 08:28, jian he wrote: > > > > On Thu, Mar 21, 2024 at 7:23 PM Peter Eisentraut > > wrote: > > > > > > Hmm. CREATE DOMAIN uses column constraint syntax, but ALTER DOMAIN uses > > > table constraint syntax. Attached is a patch to try to sort this out. > > > > also you should also change src/backend/utils/adt/ruleutils.c? > > > > src6=# \dD > > List of domains > > Schema |Name | Type | Collation | Nullable | Default | > > Check > > +-+-+---+--+-+-- > > public | domain_test | integer | | not null | | > > CHECK (VALUE > 0) NOT NULL VALUE > > (1 row) > > > > probably change to CHECK (VALUE IS NOT NULL) > > I'd say it should just output "NOT NULL", since that's the input > syntax that created the constraint. But then again, why display NOT > NULL constraints in that column at all, when there's a separate > "Nullable" column? > create table sss(a int not null); SELECT pg_get_constraintdef(oid) FROM pg_constraint WHERE conname = 'sss_a_not_null'; returns " NOT NULL a" I think just outputting "NOT NULL" is ok for the domain, given the table constraint is "NOT NULL" + table column, per above example. yech, we already have a "Nullable" column, so we don't need to display NOT NULL constraints.
Re: Statistics Import and Export
Corey Huinker writes:
>> I can't quibble with that view of what has priority. I'm just
>> suggesting that redesigning what pg_upgrade does in this area
>> should come later than doing something about extended stats.

> I mostly agree, with the caveat that pg_upgrade's existing message saying
> that optimizer stats were not carried over wouldn't be 100% true anymore.

I think we can tweak the message wording. I just don't want to be doing
major redesign of the behavior, nor adding fundamentally new monitoring
capabilities.

			regards, tom lane
Re: Psql meta-command conninfo+
> However, I didn't complete item 4. I'm not sure, but I believe that linking it
> to the documentation could confuse the user a bit. I chose to keep the
> descriptions as they were. However, if you have any ideas on how we could
> outline it, let me know and perhaps we can implement it.

That is fair. After spending some time thinking about this, it is better to
make the documentation as crystal clear as possible. I do have some rewording
suggestions for the following fields.

Database: The database name of the connection.
Authenticated User: The authenticated database user of the connection.
Socket Directory: The Unix socket directory of the connection. NULL if host or hostaddr are specified.
Host: The host name or address of the connection. NULL if a Unix socket is used.
Server Port: The port number of the connection. NULL if a Unix socket is used.
Server Address: The IP address of the host name. NULL if a Unix socket is used.
Client Address: The IP address of the client connected to this backend. NULL if a Unix socket is used.
Client Port: The port of the client connected to this backend. NULL if a Unix socket is used.
Backend PID: The process id of the backend for the connection.
Application name: psql is the default for a psql connection unless otherwise specified in the configuration parameter.

For System User, you should use the full definition here:
https://github.com/postgres/postgres/blob/master/doc/src/sgml/func.sgml#L24251-L24259
The current patch is missing the full description.

Regards,

Sami
Re: Statistics Import and Export
> I wonder if the right answer to that is "let's enhance the I/O
> functions for those types". But whether that helps or not, it's
> v18-or-later material for sure.

That was Stephen's take as well, and I agreed, given that I had to throw the
kitchen sink of source-side oid mappings (attname, types, collations,
operators) into the JSON to work around the limitation.

> I can't quibble with that view of what has priority. I'm just
> suggesting that redesigning what pg_upgrade does in this area
> should come later than doing something about extended stats.

I mostly agree, with the caveat that pg_upgrade's existing message saying
that optimizer stats were not carried over wouldn't be 100% true anymore.
Re: Statistics Import and Export
Corey Huinker writes:
> On Sun, Mar 31, 2024 at 2:41 PM Tom Lane wrote:
>> There's a bigger issue though: AFAICS this patch set does nothing
>> about dumping extended statistics. I surely don't want to hold up
>> the patch insisting that that has to happen before we can commit the
>> functionality proposed here. But we cannot rip out pg_upgrade's
>> support for post-upgrade ANALYZE processing before we do something
>> about extended statistics, and that means it's premature to be
>> designing any changes to how that works. So I'd set that whole
>> topic on the back burner.

> So Extended Stats _were_ supported by earlier versions where the medium of
> communication was JSON. However, there were several problems with adapting
> that to the current model where we match params to stat types:
> * Several of the column types do not have functional input functions, so we
> must construct the data structure internally and pass them to
> statext_store().
> * The output functions for some of those column types have lists of
> attnums, with negative values representing positional expressions in the
> stat definition. This information is not translatable to another system
> without also passing along the attnum/attname mapping of the source system.

I wonder if the right answer to that is "let's enhance the I/O
functions for those types". But whether that helps or not, it's
v18-or-later material for sure.

> At least three people told me "nobody uses extended stats" and to just drop
> that from the initial version.

I can't quibble with that view of what has priority. I'm just
suggesting that redesigning what pg_upgrade does in this area
should come later than doing something about extended stats.

			regards, tom lane
Re: Statistics Import and Export
On Sun, Mar 31, 2024 at 2:41 PM Tom Lane wrote:
> Corey Huinker writes:
>> Having given this some thought, I'd be inclined to create a view,
>> pg_stats_missing, with the same security barrier as pg_stats, but looking
>> for tables that lack stats on at least one column, or lack stats on an
>> extended statistics object.
>
> The week before feature freeze is no time to be designing something
> like that, unless you've abandoned all hope of getting this into v17.

It was a response to the suggestion that there be some way for
tools/automation to read the status of stats. I would view it as a separate
patch, as such a view would be useful now for knowing which tables to
ANALYZE, regardless of whether this patch goes in or not.

> There's a bigger issue though: AFAICS this patch set does nothing
> about dumping extended statistics. I surely don't want to hold up
> the patch insisting that that has to happen before we can commit the
> functionality proposed here. But we cannot rip out pg_upgrade's
> support for post-upgrade ANALYZE processing before we do something
> about extended statistics, and that means it's premature to be
> designing any changes to how that works. So I'd set that whole
> topic on the back burner.

So Extended Stats _were_ supported by earlier versions where the medium of
communication was JSON. However, there were several problems with adapting
that to the current model where we match params to stat types:

* Several of the column types do not have functional input functions, so we
must construct the data structure internally and pass them to
statext_store().

* The output functions for some of those column types have lists of attnums,
with negative values representing positional expressions in the stat
definition. This information is not translatable to another system without
also passing along the attnum/attname mapping of the source system.
At least three people told me "nobody uses extended stats" and to just drop
that from the initial version. Unhappy with this assessment, I inquired as to
whether my employer (AWS) had some internal databases that used extended
stats so that I could get good test data, and came up with nothing, nor did
anyone know of customers who used the feature. So when the fourth person told
me that nobody uses extended stats, and not to let a rarely-used feature get
in the way of a feature that would benefit nearly 100% of users, I dropped it.

> It's possible that we could drop the analyze-in-stages recommendation,
> figuring that this functionality will get people to the
> able-to-limp-along level immediately and that all that is needed is a
> single mop-up ANALYZE pass. But I think we should leave that till we
> have a bit more than zero field experience with this feature.

It may be that we leave the recommendation exactly as it is. Perhaps we
enhance the error messages in pg_set_*_stats() to indicate what command would
remediate the issue.
RE: Psql meta-command conninfo+
Note that, in the patch as posted, the column names are not translatable. In
order to be translatable, the code needs to do something like

appendPQExpBuffer(&buf,
                  "  NULL AS \"%s\",\n"
                  "  NULL AS \"%s\",\n"
                  "  NULL AS \"%s\",\n"
                  "  NULL AS \"%s\",\n",
                  _("SSL Connection"),
                  _("SSL Protocol"),
                  _("SSL Cipher"),
                  _("SSL Compression"));

instead of

appendPQExpBuffer(&buf,
                  "  NULL AS \"SSL Connection\",\n"
                  "  NULL AS \"SSL Protocol\",\n"
                  "  NULL AS \"SSL Cipher\",\n"
                  "  NULL AS \"SSL Compression\",\n");

Please list me as reviewer for this patch, as I provided significant guidance
before it was even posted.

-//-

Hello! (v23)

Yes Álvaro, I have already listed you as the patch reviewer. It's true that,
even before publishing it on Commitfest, you had already provided good ideas
and guidance.

I adjusted the translation syntax for the SSL and GSSAPI columns. After your
validation, that is, an okay confirmation that it's fine, I'll proceed with
the others.

Test:

[postgres@localhost bin]$ ./psql -x -p 5000 -h 127.0.0.1
psql (17devel)
Type "help" for help.

postgres=# \conninfo
You are connected to database "postgres" as user "postgres" on host "127.0.0.1" at port "5000".

postgres=# \conninfo+
Current Connection Information
-[ RECORD 1 ]----------------+----------
Database                     | postgres
Authenticated User           | postgres
Socket Directory             |
Host                         | 127.0.0.1
Server Port                  | 5000
Server Address               | 127.0.0.1
Client Address               | 127.0.0.1
Client Port                  | 52966
Backend PID                  | 1693
System User                  |
Current User                 | postgres
Session User                 | postgres
Application Name             | psql
SSL Connection               | f
SSL Protocol                 |
SSL Cipher                   |
SSL Compression              |
GSSAPI Authenticated         | f
GSSAPI Principal             |
GSSAPI Encrypted             | f
GSSAPI Credentials Delegated | f

Regards,
Maiquel Grassi.

Attachment: v23-0001-psql-meta-command-conninfo-plus.patch
Re: Security lessons from liblzma
Hi,

On 2024-03-31 12:18:29 +0200, Michael Banck wrote:
> If you ask where they are maintained, the answer is here:
>
> https://salsa.debian.org/postgresql/postgresql/-/tree/17/debian/patches?ref_type=heads
>
> the other major versions have their own branch.

Luckily these are all quite small, leaving little space to hide stuff. I'd
still like to get rid of at least some of them.

I've previously proposed a patch to make pkglibdir configurable; I think we
should just go for that.

For the various defines, ISTM we should just make them #ifndef guarded, then
they could be overridden by defining them at configure time. Some of them,
like DEFAULT_PGSOCKET_DIR, seem to be overridden by just about every distro.
And others would be nice to easily override anyway; I e.g. dislike the
default DEFAULT_PAGER value.

On 2024-03-31 16:55:26 +0100, Devrim Gündüz wrote:
> Here are the v17 patches:
>
> https://git.postgresql.org/gitweb/?p=pgrpms.git;a=tree;f=rpm/redhat/main/non-common/postgresql-17/main

A bit bigger/more patches, but still not too bad.

postgresql-17-conf.patch
  Uncomments a few values to their default, that's a bit odd.

postgresql-17-ecpg_config.h:
postgresql-17-pg_config.h:
  Hm, wonder if we should make this easier somehow. Perhaps we ought to
  support installing the various *config.h headers into a different directory
  than the architecture-independent stuff? At least on debian-based systems
  it seems we ought to support installing pg_config.h etc into /usr/include/
  or something along those lines.

postgresql-17-rpm-pgsql.patch:
  We should probably make this stuff properly configurable. The current logic
  with inferring whether to add /postgresql is just weird. Perhaps a
  configure option that defaults to the current logic when set to an empty
  string but can be overridden?

Greetings,

Andres Freund
Re: Security lessons from liblzma
Michael Banck writes:
> On Sun, Mar 31, 2024 at 01:05:40PM -0400, Joe Conway wrote:
>> But it has always bothered me how many patches get applied to the upstream
>> tarballs by the OS package builders.

> I think this is more an artifact of how RHEL development works, i.e. trying
> to ship the same major version of glibc for 10 years, but still fix lots
> of bugs and possibly some performance improvements your larger customers
> ask for. So I guess a lot of those 1000 patches are just cherry-picks /
> backports of upstream commits from newer releases.

Yeah. Also, precisely because they keep supporting versions that are
out-of-support according to upstream, the idea that all the patches can be
moved upstream isn't going to work for them, and they're unlikely to be
excited about partial solutions.

The bigger problem though is: if we do this, are we going to take patches
that we fundamentally don't agree with? For instance, if a packager chooses
to rip out the don't-run-server-as-root check. (Pretty sure I've heard of
people doing that.) That would look like blessing things we don't think are
good ideas, and it would inevitably lead to long arguments with packagers
about why-dont-you-do-this-some-other-way. I'm not excited about that
prospect.

			regards, tom lane
Re: Add column name to error description
Erik Wienhold writes:
> On 2024-03-31 15:22 +0200, Marcos Pegoraro wrote:
>> This is my first patch, so sorry if I miss something.

> Please make sure that tests are passing by running make check:

check-world, in fact.

> The format "%d-%s" is not ideal. I suggest "%d (%s)".

I didn't like that either, for two reasons: if we have a column name it
ought to be the prominent label, but we might not have one if the TupleDesc
came from some anonymous source (possibly that case explains the test crash?
Although I think the attname would be an empty string rather than missing
entirely in such cases).

I think it'd be worth providing two distinct message strings:

"Returned type %s does not match expected type %s in column \"%s\" (position %d)."
"Returned type %s does not match expected type %s in column position %d."

I'd suggest dropping the column number entirely in the first case, were it
not that the attnames might well not be unique if we're dealing with an
anonymous record type such as a SELECT result.

			regards, tom lane
Re: Security lessons from liblzma
Hi,

On Sun, Mar 31, 2024 at 01:05:40PM -0400, Joe Conway wrote:
> But it has always bothered me how many patches get applied to the upstream
> tarballs by the OS package builders. Some of them, e.g. glibc on RHEL 7,
> include more than 1000 patches that you would have to manually vet if you
> cared enough and had the skills. Last time I looked at the openssl package
> sources it was similar in volume and complexity. They might as well be
> called forks if everyone were being honest about it...

I think this is more an artifact of how RHEL development works, i.e. trying
to ship the same major version of glibc for 10 years, but still fix lots of
bugs and possibly add some performance improvements your larger customers
ask for. So I guess a lot of those 1000 patches are just cherry-picks /
backports of upstream commits from newer releases.

I guess it would be useful to maybe have another look at the patches that
are being applied for apt/yum.postgresql.org for the 18 release cycle, but I
do not think those are a security problem.

Not sure about RPM builds, but at least in theory the APT builds should be
reproducible. What would be a significant gain in security/trust is an easy
service/recipe on how to verify the reproducibility (i) by independently
building packages (and maybe the more popular extensions) and comparing them
to the {apt,yum}.postgresql.org repository packages, and (ii) by being able
to build the release tarballs reproducibly.

Michael
Re: Add column name to error description
On 2024-03-31 15:22 +0200, Marcos Pegoraro wrote:
> This is my first patch, so sorry if I miss something.

Please make sure that tests are passing by running make check:
https://www.postgresql.org/docs/current/regress-run.html#REGRESS-RUN-TEMP-INST

The patch breaks src/test/regress/sql/plpgsql.sql at:

-- this does not work currently (no implicit casting)
create or replace function compos() returns compostype as $$
begin
  return (1, 'hello');
end;
$$ language plpgsql;
select compos();
server closed the connection unexpectedly
        This probably means the server terminated abnormally
        before or while processing the request.
connection to server was lost

> If I have a function which returns lots of columns and any of these columns
> returns a wrong type it's not easy to see which one is that column because
> it points me only to its position, not its name. So, this patch only adds
> that column name, just that.

+1 for this improvement.

> create function my_f(a integer, b integer) returns table(first_col integer,
> lots_of_cols_later numeric) language plpgsql as $function$
> begin
> return query select a, b;
> end;$function$;
>
> select * from my_f(1,1);
> --ERROR: structure of query does not match function result type
> --Returned type integer does not match expected type numeric in column 2.
>
> For a function which has just 2 columns it is easy, but if it returns a
> hundred columns, which one is that 66th column?
>
> My patch just adds the column name to that description message.
> --ERROR: structure of query does not match function result type
> --Returned type integer does not match expected type numeric in column 2-
> lots_of_cols_later.
> diff --git a/src/backend/access/common/attmap.c
> b/src/backend/access/common/attmap.c
> index b0fe27ef57..85f7c0cb8c 100644
> --- a/src/backend/access/common/attmap.c
> +++ b/src/backend/access/common/attmap.c
> @@ -118,12 +118,13 @@ build_attrmap_by_position(TupleDesc indesc,
>  			ereport(ERROR,
>  					(errcode(ERRCODE_DATATYPE_MISMATCH),
>  					 errmsg_internal("%s", _(msg)),
> -					 errdetail("Returned type %s does not match expected type %s in column %d.",
> +					 errdetail("Returned type %s does not match expected type %s in column %d-%s.",

The format "%d-%s" is not ideal. I suggest "%d (%s)".

>  							   format_type_with_typemod(att->atttypid,
>  														att->atttypmod),
>  							   format_type_with_typemod(atttypid,
>  														atttypmod),
> -							   noutcols)));
> +							   noutcols,
> +							   att->attname)));

Must be NameStr(att->attname) for use with printf's %s. GCC even prints a
warning:

attmap.c:121:60: warning: format ‘%s’ expects argument of type ‘char *’, but
argument 5 has type ‘NameData’ {aka ‘struct nameData’} [-Wformat=]

--
Erik
Re: Statistics Import and Export
My apologies for having paid so little attention to this thread for months.
I got around to reading the v15 patches today, and while I think they're
going in more or less the right direction, there's a long way to go IMO.

I concur with the plan of extracting data from pg_stats not pg_statistic,
and with emitting a single "set statistics" call per attribute. (I think at
one point I'd suggested a call per stakind slot, but that would lead to a
bunch of UPDATEs on existing pg_attribute tuples and hence a bunch of dead
tuples at the end of an import, so it's not the way to go. A series of
UPDATEs would likely also play poorly with a background auto-ANALYZE
happening concurrently.)

I do not like the current design for pg_set_attribute_stats' API though:
I don't think it's at all future-proof. What happens when somebody adds a
new stakind (and hence new pg_stats column)? You could try to add an
overloaded pg_set_attribute_stats version with more parameters, but I'm
pretty sure that would lead to "ambiguous function call" failures when
trying to load old dump files containing only the original parameters. The
present design is also fragile in that an unrecognized parameter will lead
to a parse-time failure and no function call happening, which is less
robust than I'd like. As lesser points, the relation argument ought to be
declared regclass not oid for convenience of use, and I really think that
we need to provide the source server's major version number --- maybe we
will never need that, but if we do and we don't have it we will be sad.
So this leads me to suggest that we'd be best off with a VARIADIC ANY
signature, where the variadic part consists of alternating parameter labels
and values:

pg_set_attribute_stats(table regclass, attribute name, inherited bool,
                       source_version int, variadic "any") returns void

where a call might look like

SELECT pg_set_attribute_stats('public.mytable', 'mycolumn',
                              false,  -- not inherited
                              16,     -- source server major version
                              -- pairs of labels and values follow
                              'null_frac', 0.4,
                              'avg_width', 42,
                              'histogram_bounds',
                                  array['a', 'b', 'c']::text[],
                              ...);

Note a couple of useful things here:

* AFAICS we could label the function strict and remove all those ad-hoc
null checks. If you don't have a value for a particular stat, you just
leave that pair of arguments out (exactly as the existing code in 0002
does, just using a different notation). This also means that we don't need
any default arguments and so no need for hackery in system_functions.sql.

* If we don't recognize a parameter label at runtime, we can treat that as
a warning rather than a hard error, and press on. This case would mostly be
useful in major version downgrades I suppose, but that will be something
people will want eventually.

* We can require the calling statement to cast arguments, particularly
arrays, to the proper type, removing the need for conversions within the
stats-setting function. (But instead, it'd need to check that the next
"any" argument is the type it ought to be based on the type of the target
column.) If we write the labels as undecorated string literals as I show
above, I think they will arrive at the function as "unknown"-type
constants, which is a little weird but doesn't seem like it's really a big
problem. The alternative is to cast them all to text explicitly, but
that's adding notation to no great benefit IMO.
pg_set_relation_stats is simpler in that the set of stats values to be set
will probably remain fairly static, and there seems little reason to allow
only part of them to be supplied (so personally I'd drop the business about
accepting nulls there too). If we do grow another value or values for it to
set, there shouldn't be much problem with overloading it with another
version with more arguments. Still needs to take regclass not oid though...

I've not read the patches in great detail, but I did note a few low-level
issues:

* Why is check_relation_permissions looking up the pg_class row? There's
already a copy of that in the Relation struct. Likewise for the other
caller of can_modify_relation (but why is that caller not using
check_relation_permissions?) That all looks overly complicated and
duplicative. I think you don't need two layers of function there.

* I find the stuff with enums and "const char *param_names" to be way too
cute and unlike anything we do elsewhere. Please don't invent your own
notations for coding patterns that have hundreds of existing instances.
pg_set_relation_stats, for example, has absolutely no reason not to look
like the usual

	Oid		relid = PG_GETARG_OID(0);
	float4	relpages = PG_GETARG_FLOAT4(1);
	... etc ...

* The
Re: Statistics Import and Export
Corey Huinker writes:
> Having given this some thought, I'd be inclined to create a view,
> pg_stats_missing, with the same security barrier as pg_stats, but looking
> for tables that lack stats on at least one column, or lack stats on an
> extended statistics object.

The week before feature freeze is no time to be designing something like
that, unless you've abandoned all hope of getting this into v17.

There's a bigger issue though: AFAICS this patch set does nothing about
dumping extended statistics. I surely don't want to hold up the patch
insisting that that has to happen before we can commit the functionality
proposed here. But we cannot rip out pg_upgrade's support for post-upgrade
ANALYZE processing before we do something about extended statistics, and
that means it's premature to be designing any changes to how that works.
So I'd set that whole topic on the back burner.

It's possible that we could drop the analyze-in-stages recommendation,
figuring that this functionality will get people to the able-to-limp-along
level immediately and that all that is needed is a single mop-up ANALYZE
pass. But I think we should leave that till we have a bit more than zero
field experience with this feature.

			regards, tom lane
Re: Security lessons from liblzma
On 3/31/24 11:49, Tom Lane wrote:
> Joe Conway writes:
>> I am saying maybe those patches should be eliminated in favor of our
>> tree including build options that would produce the same result.
>
> I don't really see how that can be expected to work sanely. It turns the
> responsibility for platform-specific build issues on its head, and it
> doesn't work at all for issues discovered after we make a release.
>
> The normal understanding of how you can vet a distro's package is that
> you look at the package contents (the SRPM in Red Hat world and whatever
> the equivalent concept is elsewhere), check that the contained tarball
> matches upstream and that the patches and build instructions look sane,
> and then build it locally and check for a match to the distro's binary
> package. Even if we could overcome the obstacles to putting the patch
> files into the upstream tarball, we're surely not going to include the
> build instructions, so we'd not have moved the needle very far in terms
> of whether the packager could do something malicious.

True enough I guess. But it has always bothered me how many patches get
applied to the upstream tarballs by the OS package builders. Some of them,
e.g. glibc on RHEL 7, include more than 1000 patches that you would have to
manually vet if you cared enough and had the skills. Last time I looked at
the openssl package sources it was similar in volume and complexity. They
might as well be called forks if everyone were being honest about it...

I know our PGDG packages are no big deal compared to that, fortunately.

--
Joe Conway
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
Re: Security lessons from liblzma
Hi,

On Sat, 2024-03-30 at 21:52 -0400, Bruce Momjian wrote:
> How would someone access the out-of-tree patches?

Here are the v17 patches:

https://git.postgresql.org/gitweb/?p=pgrpms.git;a=tree;f=rpm/redhat/main/non-common/postgresql-17/main

You can replace -17 with -16 (etc.) for the other major releases. Please
note that both the Debian folks and I build about 300 other packages to
support the ecosystem. Just saying.

Regards,

--
Devrim Gündüz
Open Source Solution Architect, PostgreSQL Major Contributor
Twitter: @DevrimGunduz , @DevrimGunduzTR
Re: Security lessons from liblzma
Joe Conway writes:
> I am saying maybe those patches should be eliminated in favor of our
> tree including build options that would produce the same result.

I don't really see how that can be expected to work sanely. It turns the
responsibility for platform-specific build issues on its head, and it
doesn't work at all for issues discovered after we make a release.

The normal understanding of how you can vet a distro's package is that you
look at the package contents (the SRPM in Red Hat world and whatever the
equivalent concept is elsewhere), check that the contained tarball matches
upstream and that the patches and build instructions look sane, and then
build it locally and check for a match to the distro's binary package.
Even if we could overcome the obstacles to putting the patch files into the
upstream tarball, we're surely not going to include the build instructions,
so we'd not have moved the needle very far in terms of whether the packager
could do something malicious.

			regards, tom lane
Re: New Table Access Methods for Multi and Single Inserts
On Wed, Mar 27, 2024 at 1:42 PM Jeff Davis wrote:
> On Wed, 2024-03-27 at 01:19 +0530, Bharath Rupireddy wrote:
>> Similarly, with this new AM, the onus lies on the table AM
>> implementers to provide an implementation for these new AMs even if
>> they just do single inserts.
>
> Why not fall back to using the plain tuple_insert? Surely some table
> AMs might be simple and limited, and we shouldn't break them just
> because they don't implement the new APIs.

Hm. That might complicate table_modify_begin, table_modify_buffer_insert
and table_modify_end a bit. What do we put in TableModifyState then? Do we
create the bulk insert state (BulkInsertStateData) outside?

I think, to give a better interface, can we let TAM implementers support
these new APIs in their own way? If this sounds rather intrusive, we can
just implement the fallback to tuple_insert if these new APIs are not
supported in the caller, for example, do something like below in createas.c
and matview.c. Thoughts?

if (table_modify_buffer_insert() is defined)
    table_modify_buffer_insert(...);
else
{
    myState->bistate = GetBulkInsertState();
    table_tuple_insert(...);
}

>> table_multi_insert needs to be there for sure as COPY ... FROM uses
>> it.
>
> After we have these new APIs fully in place and used by COPY, what will
> happen to those other APIs? Will they be deprecated or will there be a
> reason to keep them?

Deprecated, perhaps?

Please find the attached v16 patches for further review.
--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com

From 85410b429917cf388c4b58883ddc304118c73143 Mon Sep 17 00:00:00 2001
From: Bharath Rupireddy
Date: Sun, 31 Mar 2024 15:34:16 +
Subject: [PATCH v16 1/2] Introduce new table modify access methods

---
 src/backend/access/heap/heapam.c         | 189 ++-
 src/backend/access/heap/heapam_handler.c |   5 +
 src/backend/access/table/tableamapi.c    |   4 +
 src/include/access/heapam.h              |  41 +
 src/include/access/tableam.h             |  58 +++
 src/tools/pgindent/typedefs.list         |   3 +
 6 files changed, 299 insertions(+), 1 deletion(-)

diff --git a/src/backend/access/heap/heapam.c b/src/backend/access/heap/heapam.c
index b661d9811e..69f8c597d8 100644
--- a/src/backend/access/heap/heapam.c
+++ b/src/backend/access/heap/heapam.c
@@ -64,6 +64,7 @@
 #include "storage/standby.h"
 #include "utils/datum.h"
 #include "utils/inval.h"
+#include "utils/memutils.h"
 #include "utils/relcache.h"
 #include "utils/snapmgr.h"
 #include "utils/spccache.h"
@@ -107,7 +108,8 @@ static int bottomup_sort_and_shrink(TM_IndexDeleteOp *delstate);
 static XLogRecPtr log_heap_new_cid(Relation relation, HeapTuple tup);
 static HeapTuple ExtractReplicaIdentity(Relation relation, HeapTuple tp,
 					bool key_required, bool *copy);
-
+static void heap_modify_buffer_flush(TableModifyState *state);
+static void heap_modify_insert_end(TableModifyState *state);
 /*
  * Each tuple lock mode has a corresponding heavyweight lock, and one or two
@@ -2441,6 +2443,191 @@ heap_multi_insert(Relation relation, TupleTableSlot **slots, int ntuples,
 	*insert_indexes = true;
 }
+/*
+ * Initialize heap modify state.
+ */
+TableModifyState *
+heap_modify_begin(Relation rel, int modify_flags, CommandId cid,
+				  int options)
+{
+	TableModifyState *state;
+	MemoryContext context;
+	MemoryContext oldcontext;
+
+	context = AllocSetContextCreate(CurrentMemoryContext,
+									"heap_modify memory context",
+									ALLOCSET_DEFAULT_SIZES);
+
+	oldcontext = MemoryContextSwitchTo(context);
+	state = palloc0(sizeof(TableModifyState));
+	state->rel = rel;
+	state->modify_flags = modify_flags;
+	state->mctx = context;
+	state->cid = cid;
+	state->options = options;
+	state->insert_indexes = false;
+	state->modify_end_cb = NULL;	/* To be installed lazily */
+	MemoryContextSwitchTo(oldcontext);
+
+	return state;
+}
+
+/*
+ * Store passed-in tuple into in-memory buffered slots. When full, insert
+ * multiple tuples from the buffers into heap.
+ */
+void
+heap_modify_buffer_insert(TableModifyState *state,
+						  TupleTableSlot *slot)
+{
+	TupleTableSlot *dstslot;
+	HeapInsertState *istate;
+	HeapMultiInsertState *mistate;
+	MemoryContext oldcontext;
+
+	oldcontext = MemoryContextSwitchTo(state->mctx);
+
+	/* First time through, initialize heap insert state */
+	if (state->data == NULL)
+	{
+		istate = (HeapInsertState *) palloc0(sizeof(HeapInsertState));
+		istate->bistate = NULL;
+		istate->mistate = NULL;
+		state->data = istate;
+
+		if ((state->modify_flags & TM_FLAG_MULTI_INSERTS) != 0)
+		{
+			mistate = (HeapMultiInsertState *) palloc0(sizeof(HeapMultiInsertState));
+			mistate->slots = (TupleTableSlot **) palloc0(sizeof(TupleTableSlot *) * HEAP_MAX_BUFFERED_SLOTS);
+			istate->mistate = mistate;
+		}
+
+		if ((state->modify_flags & TM_FLAG_BAS_BULKWRITE) != 0)
+			istate->bistate = GetBulkInsertState();
+
Re: Security lessons from liblzma
Hi,

On Sun, 2024-03-31 at 08:15 -0400, Joe Conway wrote:
> I am saying maybe those patches should be eliminated in favor of our
> tree including build options that would produce the same result.

Works for me, as long as I can commit them, and upcoming potential
patches, to the PostgreSQL git repo. OTOH, we also carry non-patch files
like README files, systemd unit files, pam files, setup scripts, etc.,
which are very RPM specific.

Regards,

--
Devrim Gündüz
Open Source Solution Architect, PostgreSQL Major Contributor
Twitter: @DevrimGunduz , @DevrimGunduzTR
Re: Building with musl in CI and the build farm
About building one of the CI tasks with musl:

Andres Freund:
> I'd rather adapt one of the existing tasks, to avoid increasing CI
> costs unduly.

I looked into this and I think the only task that could be changed is
SanityCheck, because it builds without any additional features enabled. I
guess that makes sense, because otherwise those dependencies would first
have to be built with musl-gcc as well.

FWIW, except for one small issue, building postgres against musl works on
debian and the tests pass if I install first. After the fix for
LD_LIBRARY_PATH this now works as expected without installing first. I
confirmed it works on debian with CC=musl-gcc.

The small problem mentioned above is that on debian linux/fs.h isn't
available when building with musl, which in turn causes
src/bin/pg_upgrade/file.c to fail to compile. According to [1], this can
be worked around by linking some folders:

ln -s /usr/include/linux /usr/include/x86_64-linux-musl/
ln -s /usr/include/asm-generic /usr/include/x86_64-linux-musl/
ln -s /usr/include/x86_64-linux-gnu/asm /usr/include/x86_64-linux-musl/

Please find a patch to use musl-gcc in SanityCheck attached. Logs from
the CI run are in [2]. It has this in the configure phase:

[13:19:52.712] Using 'CC' from environment with value: 'ccache musl-gcc'
[13:19:52.712] C compiler for the host machine: ccache musl-gcc (gcc 10.2.1 "cc (Debian 10.2.1-6) 10.2.1 20210110")
[13:19:52.712] C linker for the host machine: musl-gcc ld.bfd 2.35.2
[13:19:52.712] Using 'CC' from environment with value: 'ccache musl-gcc'

So meson picks up musl-gcc properly. I also double checked that without
the links above, the build does indeed fail with the linux/fs.h error.

I assume the installation of musl-tools should be done in the
pg-vm-images repo instead of in the additional script here?
Best,

Wolfgang

[1]: https://debian-bugs-dist.debian.narkive.com/VlFkLigg/bug-789789-musl-fails-to-compile-stuff-that-depends-on-kernel-headers
[2]: https://cirrus-ci.com/task/5741892590108672

From 4a69d9851e7bbd7cd521d236847af9ebf5e6253b Mon Sep 17 00:00:00 2001
From: Wolfgang Walther
Date: Sun, 31 Mar 2024 15:17:43 +0200
Subject: [PATCH] Build SanityCheck against musl

---
 .cirrus.tasks.yml | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/.cirrus.tasks.yml b/.cirrus.tasks.yml
index 72f553e69f4..5815c51abbe 100644
--- a/.cirrus.tasks.yml
+++ b/.cirrus.tasks.yml
@@ -69,6 +69,7 @@ task:
     CCACHE_DIR: ${CIRRUS_WORKING_DIR}/ccache_dir
     # no options enabled, should be small
     CCACHE_MAXSIZE: "150M"
+    CC: ccache musl-gcc

   # While containers would start up a bit quicker, building is a bit
   # slower. This way we don't have to maintain a container image.
@@ -77,6 +78,14 @@ task:
   ccache_cache:
     folder: $CCACHE_DIR

+  setup_additional_packages_script: |
+    apt-get update
+    DEBIAN_FRONTEND=noninteractive apt-get -y install musl-tools
+    ln -s -t /usr/include/x86_64-linux-musl/ \
+      /usr/include/linux \
+      /usr/include/asm-generic \
+      /usr/include/x86_64-linux-gnu/asm
+
   create_user_script: |
     useradd -m postgres
     chown -R postgres:postgres .
--
2.44.0
Re: pgbnech: allow to cancel queries during benchmark
On Sat, 30 Mar 2024 14:35:37 +0100
Alvaro Herrera wrote:
> Due to commit 61461a300c1c, this patch needs to be reworked.

Thank you for pointing this out. Although cfbot doesn't report any
failure, PQcancel is now deprecated and insecure. I'll address that too
while fixing a problem I found in [1].

[1] https://www.postgresql.org/message-id/20240207101903.b5846c25808f64a91ee2e7a2%40sraoss.co.jp

Regards,
Yugo Nagata

> --
> Álvaro Herrera        Breisgau, Deutschland — https://www.EnterpriseDB.com/
> "La fuerza no está en los medios físicos
> sino que reside en una voluntad indomable" (Gandhi)

--
Yugo NAGATA
Add column name to error description
This is my first patch, so sorry if I missed something.

If I have a function which returns lots of columns and any of these
columns has the wrong type, it's not easy to see which column it is,
because the error points only to its position, not its name. So, this
patch just adds the column name to the message.

create function my_f(a integer, b integer)
returns table(first_col integer, lots_of_cols_later numeric)
language plpgsql as $function$
begin
return query select a, b;
end;$function$;

select * from my_f(1,1);
--ERROR: structure of query does not match function result type
--Returned type integer does not match expected type numeric in column 2.

For a function which has just 2 columns it's easy, but if it returns a
hundred columns, which one is the 66th? My patch just adds the column
name to that description message:

--ERROR: structure of query does not match function result type
--Returned type integer does not match expected type numeric in column 2- lots_of_cols_later.

regards

Marcos

Add column name to error description.patch
Description: Binary data
Re: Security lessons from liblzma
On 3/30/24 21:52, Bruce Momjian wrote:
> On Sat, Mar 30, 2024 at 07:54:00PM -0400, Joe Conway wrote:
> > Virtually every RPM source, including ours, contains out of tree patches
> > that get applied on top of the release tarball. At least for the PGDG
> > packages, it would be nice to integrate them into our git repo as build
> > options or whatever so that the packages could be built without any
> > patches applied to it. Add a tarball that is signed and traceable back
> > to the git tag, and we would be in a much better place than we are now.
>
> How would someone access the out-of-tree patches? I think Debian
> includes the patches in its source tarball.

I am saying maybe those patches should be eliminated in favor of our tree
including build options that would produce the same result.

For example, these patches are applied to our release tarball files when
the RPM is being built for pg16 on RHEL 9:

https://git.postgresql.org/gitweb/?p=pgrpms.git;a=blob;f=rpm/redhat/main/non-common/postgresql-16/main/postgresql-16-rpm-pgsql.patch;h=d9b6d12b7517407ac81352fa325ec91b05587641;hb=HEAD
https://git.postgresql.org/gitweb/?p=pgrpms.git;a=blob;f=rpm/redhat/main/non-common/postgresql-16/main/postgresql-16-var-run-socket.patch;h=f2528efaf8f4681754b20283463eff3e14eedd39;hb=HEAD
https://git.postgresql.org/gitweb/?p=pgrpms.git;a=blob;f=rpm/redhat/main/non-common/postgresql-16/main/postgresql-16-conf.patch;h=da28ed793232316dd81fdcbbe59a6479b054a364;hb=HEAD
https://git.postgresql.org/gitweb/?p=pgrpms.git;a=blob;f=rpm/redhat/main/non-common/postgresql-16/main/postgresql-16-perl-rpath.patch;h=748c42f0ec2c9730af3143e90e5b205c136f40d9;hb=HEAD

Nothing too crazy, but wouldn't it be better if no patches were required
at all? Ideally we should have reproducible builds so that starting with
our tarball (which is traceable back to the git release tag) one can
easily obtain the same binary as what the RPMs/DEBs deliver.
-- Joe Conway PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
Re: Statistics Import and Export
> That will make it *really* hard for any form of automation or drivers of
> this. The information needs to go somewhere where such tools can easily
> consume it, and an informational message during runtime (which is also
> likely to be translated in many environments) is the exact opposite of
> that.

Having given this some thought, I'd be inclined to create a view,
pg_stats_missing, with the same security barrier as pg_stats, but looking
for tables that lack stats on at least one column, or lack stats on an
extended statistics object. Table structure would be:

schemaname name
tablename  name
attnames   text[]
ext_stats  text[]

The informational message, if it changes at all, could reference this new
view as the place to learn about how well the stats import went.

vacuumdb might get a --missing-only option in addition to
--analyze-in-stages. For that matter, we could add --analyze-missing
options to pg_restore and pg_upgrade to do the mopping up themselves.
Re: ALTER TABLE SET ACCESS METHOD on partitioned tables
On Sun, Mar 31, 2024 at 12:00:00PM +0300, Alexander Lakhin wrote:
> Hello Alvaro,
>
> 28.03.2024 18:58, Alvaro Herrera wrote:
> > Grumble. I don't like initialization at declare time, so far from the
> > code that depends on the value. But the alternative would have been to
> > assign right where this block starts, an additional line. I pushed it
> > like you had it.
>
> I've stumbled upon a test failure caused by the test query added in that
> commit:
> +ERROR: deadlock detected
> +DETAIL: Process 3076180 waits for AccessShareLock on relation 1259 of
> database 16386; blocked by process 3076181.
> +Process 3076181 waits for AccessShareLock on relation 2601 of database
> 16386; blocked by process 3076180.

I think this means that, although it was cute to use pg_am in the
reproducer given in the problem report, it's not a good choice to use
here in the sql regression tests.

--
Justin
Re: Security lessons from liblzma
On Sat, Mar 30, 2024 at 09:52:47PM -0400, Bruce Momjian wrote:
> On Sat, Mar 30, 2024 at 07:54:00PM -0400, Joe Conway wrote:
> > Virtually every RPM source, including ours, contains out of tree patches
> > that get applied on top of the release tarball. At least for the PGDG
> > packages, it would be nice to integrate them into our git repo as build
> > options or whatever so that the packages could be built without any
> > patches applied to it. Add a tarball that is signed and traceable back
> > to the git tag, and we would be in a much better place than we are now.
>
> How would someone access the out-of-tree patches? I think Debian
> includes the patches in its source tarball.

If you ask where they are maintained, the answer is here:

https://salsa.debian.org/postgresql/postgresql/-/tree/17/debian/patches?ref_type=heads

The other major versions have their own branches.

Michael
Re: [PATCH] kNN for btree
> On 15 Jan 2024, at 13:11, Anton A. Melnikov wrote:
>
> If there are any ideas pro and contra would be glad to discuss them.

Hi, Anton!

This is a rather ancient thread. I've marked CF entry [0] as "Needs
review" and you as an author (it seems you are going to be the point of
contact for this feature). At this point it's obvious that the feature
won't make it into 17, so let's move it to the July CF.

Given the 7 years of history in this thread, you might want to start a
new one with a summary of this thread. This will attract more reviewers,
who otherwise have to dig through it on their own.

Thanks!

Best regards, Andrey Borodin.

[0] https://commitfest.postgresql.org/47/4871/
Re: logicalrep_worker_launch -- counting/checking the worker limits
> On 15 Aug 2023, at 07:38, Peter Smith wrote:
>
> A rebase was needed due to a recent push [1].
>
> PSA v3.

> On 14 Jan 2024, at 10:43, vignesh C wrote:
>
> I have changed the status of the patch to "Waiting on Author" as
> Amit's queries at [1] have not been verified and concluded. Please
> feel free to address them and change the status back again.

Hi Peter! Are you still interested in this thread? If so, please post an
answer to Amit's question. If not, please withdraw the CF entry [0].

Thanks!

Best regards, Andrey Borodin.

[0] https://commitfest.postgresql.org/47/4499/
Re: Wrong results with grouping sets
> On 11 Jan 2024, at 20:10, vignesh C wrote:
>
> I have changed the status of the patch to "Waiting on Author" as Tom
> Lane's comments have not yet been addressed, feel free to address them
> and update the commitfest entry accordingly.

This CF entry seems to be a fix for genuinely unexpected behaviour, but
it looks like we need a different fix. Richard, Alena, what do you think?
Should we mark CF entry [0] "RwF" or leave it to wait for a better fix?

Best regards, Andrey Borodin.

[0] https://commitfest.postgresql.org/47/4583/
Re: ALTER TABLE SET ACCESS METHOD on partitioned tables
Hello Alvaro,

28.03.2024 18:58, Alvaro Herrera wrote:
> Grumble. I don't like initialization at declare time, so far from the
> code that depends on the value. But the alternative would have been to
> assign right where this block starts, an additional line. I pushed it
> like you had it.

I've stumbled upon a test failure caused by the test query added in that
commit:
--- .../src/test/regress/expected/create_am.out	2024-03-28 12:14:11.700764888 -0400
+++ .../src/test/recovery/tmp_check/results/create_am.out	2024-03-31 03:10:28.172244122 -0400
@@ -549,7 +549,10 @@
 ERROR:  access method "btree" is not of type TABLE
 -- Other weird invalid cases that cause problems
 CREATE FOREIGN TABLE fp PARTITION OF pg_am DEFAULT SERVER x;
-ERROR:  "pg_am" is not partitioned
+ERROR:  deadlock detected
+DETAIL:  Process 3076180 waits for AccessShareLock on relation 1259 of database 16386; blocked by process 3076181.
+Process 3076181 waits for AccessShareLock on relation 2601 of database 16386; blocked by process 3076180.
+HINT:  See server log for query details.
 -- Drop table access method, which fails as objects depends on it
 DROP ACCESS METHOD heap2;
 ERROR:  cannot drop access method heap2 because other objects depend on it

027_stream_regress_primary.log contains:
2024-03-31 03:10:26.728 EDT [3076181] pg_regress/vacuum LOG:  statement: VACUUM FULL pg_class;
...
2024-03-31 03:10:26.797 EDT [3076180] pg_regress/create_am LOG:  statement: CREATE FOREIGN TABLE fp PARTITION OF pg_am DEFAULT SERVER x;
...
2024-03-31 03:10:28.183 EDT [3076181] pg_regress/vacuum LOG:  statement: VACUUM FULL pg_database;

This simple demo confirms the issue:
for ((i=1;i<=20;i++)); do
echo "iteration $i"
echo "VACUUM FULL pg_class;" | psql >psql-1.log &
echo "CREATE FOREIGN TABLE fp PARTITION OF pg_am DEFAULT SERVER x;" | psql >psql-2.log &
wait
done
...
iteration 15
ERROR:  "pg_am" is not partitioned
iteration 16
ERROR:  deadlock detected
DETAIL:  Process 2556377 waits for AccessShareLock on relation 1259 of database 16384; blocked by process 2556378.
Process 2556378 waits for AccessShareLock on relation 2601 of database 16384; blocked by process 2556377.
HINT:  See server log for query details.
...

Best regards,
Alexander