Re: Remove MSVC scripts from the tree
On 20.12.23 02:14, Michael Paquier wrote:
> Hmm. Interesting. So this basically comes down to the fact that GZIP and TAR are required in ./configure because distcheck has a hard dependency on both, but we don't support this target in meson. Is that right?

No, the issue is that gzip and tar are not required by configure (it will proceed if they are not found), but they are currently required by meson.build (it will error if they are not found).

They are used in two different areas. One is for "make dist", but that doesn't affect meson anyway. The other is various test suites. The test suites are already set up to skip tests when gzip and tar are not found.
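For readers following along, the direction implied here is to make the programs optional at setup time. A minimal sketch of what that might look like in the top-level meson.build (this is an illustration, not the actual patch):

    # Sketch only: let setup proceed without gzip/tar; the affected test
    # suites already skip themselves when the programs are missing.
    gzip = find_program('gzip', required: false, native: true)
    tar = find_program('tar', required: false, native: true)

The test definitions would then pass an empty string to the environment when the program was not found, as in Peter's patch quoted later in this digest.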
Re: Remove MSVC scripts from the tree
On 19.12.23 17:44, Nazir Bilal Yavuz wrote:
> I think we need to require sed when dtrace or selinux is found, not by looking at the return value of the get_option().enabled().

Right. I think the correct condition would be

    sed = find_program(get_option('SED'), 'sed', native: true, required: dtrace.found() or selinux.found())

I was trying to avoid that because it would require moving the find_program() to somewhere later in the top-level meson.build, but I suppose we have to do it that way.
Re: Function to get invalidation cause of a replication slot.
Hi,

On 12/20/23 7:01 AM, shveta malik wrote:
> Hello hackers, Attached is a patch which attempts to implement a new system function pg_get_slot_invalidation_cause('slot_name') to get the invalidation cause of a replication slot.

Thanks! +1 for the idea to display this information through an SQL API. I wonder if we could add a new field to pg_replication_slots instead of creating a new function.

> One of the use case scenarios for this function is another ongoing thread "Synchronizing slots from primary to standby" at [1]. This function is needed there to replicate the invalidation cause of a logical replication slot from the primary server to the hot standby. But this is an independent requirement in itself and thus makes sense to have it implemented separately.

Agreed. My thoughts about it:

1 ===

"Currently, users do not have a way to know the invalidation cause"

I'm not sure about that; I think one could see the reasons in the log file.

2 ===

"This function returns NULL if the replication slot is not found"

Why not return an error instead (like pg_drop_replication_slot() does, for example)? FWIW, we'd not need to cover this case if the description were added to pg_replication_slots.

3 ===

+ 3 = wal_level insufficient for the slot

"wal_level insufficient on the primary", maybe?

4 ===

+ * Returns ReplicationSlotInvalidationCause enum value for valid slot_name;

I think it would be more user-friendly to return a description instead of an enum value.

5 ===

doc/src/sgml/func.sgml | 33 +
src/backend/replication/slotfuncs.c | 27 +++
src/include/catalog/pg_proc.dat | 4

Worth adding some coverage in 019_replslot_limit.pl and 035_standby_logical_decoding.pl?

Regards,

--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
Re: Add new for_each macros for iterating over a List that do not require ListCell pointer
On Tue, 19 Dec 2023 at 21:22, Nathan Bossart wrote:
> On Tue, Dec 19, 2023 at 03:44:43PM +0100, Jelte Fennema-Nio wrote:
> > On Tue, 19 Dec 2023 at 11:59, vignesh C wrote:
> >> I noticed that this change can be done in several other places too.
> >
> > My guess would be that ~90% of all existing foreach loops in the codebase can be easily rewritten (and simplified) using these new macros. So converting all of those would likely be quite a bit of work. In patch 0003 I only converted a few of them to get some coverage of the new macros and show how much simpler the usage of them is.
>
> I'm not sure we should proceed with rewriting most/all eligible foreach loops. I think it's fine to use the new macros in new code or to update existing loops in passing when changing nearby code, but rewriting everything likely just introduces back-patching pain in return for little discernible gain.

+1 for this. Let's just provide the for_each macros to be used for new code. This means that the 0003-Use-new-foreach_xyz-macros-in-a-few-places.patch will not be present in the final patch, right?

Regards, Vignesh
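To make the difference concrete, here is a sketch of the two styles (the macro names follow the in-flight patch and may still change before commit):

    /* current style: an explicit ListCell is required */
    ListCell   *lc;

    foreach(lc, my_list)
    {
        Node       *n = lfirst(lc);

        do_something(n);
    }

    /* proposed style: the loop variable is declared by the macro */
    foreach_ptr(Node, n, my_list)
    {
        do_something(n);
    }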
Re: Change GUC hashtable to use simplehash?
On Wed, Dec 20, 2023 at 3:23 AM Jeff Davis wrote: > > On Tue, 2023-12-19 at 16:23 +0700, John Naylor wrote: > > That wasn't the next place I thought to look (that would be the > > strcmp > > call), but something like this could be worthwhile. > > The reason I looked here is that the inner while statement (to find the > chunk size) looked out of place and possibly slow, and there's a > bitwise trick we can use instead. There are other bit tricks we can use. In v11-0005 Just for fun, I translated a couple more into C from https://github.com/openbsd/src/blob/master/lib/libc/arch/amd64/string/strlen.S From aecbc51780ada4634855727c50e3b85a8f7f Mon Sep 17 00:00:00 2001 From: John Naylor Date: Sat, 9 Dec 2023 16:24:56 +0700 Subject: [PATCH v11 3/5] Use fasthash32 for pgstat_hash_hash_key Currently this calls the 32-bit Murmur finalizer on the three elements, then joined with hash_combine(). This is simpler and has better collision guarantees. WIP: Make sure performance is at least comparable. WIP: We may not need the full 32-bit finalizer reducing step. It would be slightly cheaper to just use fasthash64 and then take the lower 32 bits. Discussion: (none yet, buried in a related patchset) --- src/include/utils/pgstat_internal.h | 9 ++--- 1 file changed, 2 insertions(+), 7 deletions(-) diff --git a/src/include/utils/pgstat_internal.h b/src/include/utils/pgstat_internal.h index 60fbf9394b..ecc46bef04 100644 --- a/src/include/utils/pgstat_internal.h +++ b/src/include/utils/pgstat_internal.h @@ -14,7 +14,7 @@ #define PGSTAT_INTERNAL_H -#include "common/hashfn.h" +#include "common/hashfn_unstable.h" #include "lib/dshash.h" #include "lib/ilist.h" #include "pgstat.h" @@ -777,15 +777,10 @@ static inline uint32 pgstat_hash_hash_key(const void *d, size_t size, void *arg) { const PgStat_HashKey *key = (PgStat_HashKey *) d; - uint32 hash; Assert(size == sizeof(PgStat_HashKey) && arg == NULL); - hash = murmurhash32(key->kind); - hash = hash_combine(hash, murmurhash32(key->dboid)); - hash = hash_combine(hash, murmurhash32(key->objoid)); - - return hash; + return fasthash32((const char *) key, size, 0); } /* -- 2.43.0 From c7bd727b24a8935343df6fb24d10948fa6d4d57c Mon Sep 17 00:00:00 2001 From: John Naylor Date: Mon, 18 Dec 2023 11:10:28 +0700 Subject: [PATCH v11 2/5] Use fasthash for the search path cache This serves to demonstrate the incremental API, allowing inlined hash calculation without a strlen call. This brings the general case performance closer to the optimization done in commit a86c61c9ee. WIP: roleid should be mixed in normally, unless we have reason to just use it as a seed. 
Jeff Davis, with switch to chunked interface by me Discussion: https://www.postgresql.org/message-id/b40292c99e623defe5eadedab1d438cf51a4107c.camel%40j-davis.com --- src/backend/catalog/namespace.c | 24 +++- 1 file changed, 19 insertions(+), 5 deletions(-) diff --git a/src/backend/catalog/namespace.c b/src/backend/catalog/namespace.c index 5027efc91d..7fe2fd1fd4 100644 --- a/src/backend/catalog/namespace.c +++ b/src/backend/catalog/namespace.c @@ -41,7 +41,7 @@ #include "catalog/pg_ts_template.h" #include "catalog/pg_type.h" #include "commands/dbcommands.h" -#include "common/hashfn.h" +#include "common/hashfn_unstable.h" #include "funcapi.h" #include "mb/pg_wchar.h" #include "miscadmin.h" @@ -247,11 +247,25 @@ static bool MatchNamedCall(HeapTuple proctup, int nargs, List *argnames, static inline uint32 spcachekey_hash(SearchPathCacheKey key) { - const unsigned char *bytes = (const unsigned char *) key.searchPath; - int blen = strlen(key.searchPath); + const char *const start = key.searchPath; + const char *buf = key.searchPath; + fasthash_state hs; - return hash_combine(hash_bytes(bytes, blen), - hash_uint32(key.roleid)); + /* WIP: maybe roleid should be mixed in normally */ + fasthash_init(&hs, FH_UNKNOWN_LENGTH, key.roleid); + while (*buf) + { + int chunk_len = 0; + + while (chunk_len < FH_SIZEOF_ACCUM && buf[chunk_len] != '\0') + chunk_len++; + + fasthash_accum(&hs, buf, chunk_len); + buf += chunk_len; + } + + /* pass the length to tweak the final mix */ + return fasthash_final32(&hs, buf - start); } static inline bool -- 2.43.0 From ec447cc9a9718421883d9619e9dde1b5df3ada9c Mon Sep 17 00:00:00 2001 From: John Naylor Date: Wed, 20 Dec 2023 13:08:46 +0700 Subject: [PATCH v11 5/5] Optimize tail with inspiration from OpenBSD This only works on little endian, so add guard for that and for 64-bit. Word-at-a-time NUL checks are not worth the extra complexity for 32-bit platforms. There is an algorithm that works for big-endian, but this is all just demonstration anyway. --- src/backend/catalog/namespace.c | 24 +--- 1 file changed, 21 insertions(+), 3 deletions(-) diff --git a/src/backend/catalog/namespace.c b/src/backend/catalog/namespace.c index cb840ce9dd..2046d6788d 100644 --- a/src/backend/catalog/namespace.c +++ b/
Re: Synchronizing slots from primary to standby
Here are some comments for the patch v50-0002.

==
GENERAL

(I made a short study of all the ereports in this patch -- here are some findings)

~~~

0.1 Don't need the parentheses.

Checking all the ereports I see that half of them have the redundant parentheses and half of them do not; you might as well make them all use the new style where the extra parentheses are not needed. (A short before/after sketch follows this review.)

e.g.
+ ereport(LOG,
+ (errmsg("skipping slot synchronization"),
+ errdetail("enable_syncslot is disabled.")));

e.g.
+ ereport(ERROR,
+ (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+ errmsg("cannot drop replication slot \"%s\"", name),
+ errdetail("This slot is being synced from the primary server.")));

and many more like this. Search for all the ereports.

~~~

0.2

+ ereport(LOG,
+ (errmsg("dropped replication slot \"%s\" of dbid %d as it "
+ "was not sync-ready", NameStr(s->data.name),
+ s->data.database)));

I felt maybe that could be:

errmsg("dropped replication slot \"%s\" of dbid %d", ...
errdetail("It was not sync-ready.")

(now this shares the same errmsg with another ereport)

~~~

0.3

+ ereport(ERROR,
+ (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+ errmsg("skipping sync of slot \"%s\" as it is a user created"
+ " slot", remote_slot->name),
+ errdetail("This slot has failover enabled on the primary and"
+ " thus is sync candidate but user created slot with"
+ " the same name already exists on the standby.")));

This seemed too wordy. Can't it be shortened (maybe like below) without losing any of the vital information?

errmsg("skipping sync of slot \"%s\"", ...)
errdetail("A user-created slot with the same name already exists on the standby.")

~~~

0.4

+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("exiting from slot synchronization due to bad configuration"),
+ /* translator: second %s is a GUC variable name */
+ errdetail("The primary slot \"%s\" specified by %s is not valid.",
+ PrimarySlotName, "primary_slot_name")));

/The primary slot/The primary server slot/

~~~

0.5

+ ereport(ERROR,
+ (errmsg("could not fetch primary_slot_name \"%s\" info from the "
+ "primary: %s", PrimarySlotName, res->err)));

/primary:/primary server:/

~~~

0.6

The continuations for long lines are inconsistent. Sometimes there are trailing spaces and sometimes there are leading spaces. And sometimes there are both at the same time, which would cause double-spacing in the message! Please make them all the same. I think using leading spaces is easier, but YMMV.

e.g.
+ elog(ERROR,
+ "not synchronizing local slot \"%s\" LSN(%X/%X)"
+ " to remote slot's LSN(%X/%X) as synchronization "
+ " would move it backwards", remote_slot->name,
+ LSN_FORMAT_ARGS(slot->data.restart_lsn),
+ LSN_FORMAT_ARGS(remote_slot->restart_lsn));

==
src/backend/replication/logical/slotsync.c

1.
check_primary_info

+ /* No need to check further, return that we are cascading standby */
+ if (remote_in_recovery)
+ {
+ *am_cascading_standby = true;
+ ExecClearTuple(tupslot);
+ walrcv_clear_result(res);
+ CommitTransactionCommand();
+ return;
+ }
+
+ valid = DatumGetBool(slot_getattr(tupslot, 2, &isnull));
+ Assert(!isnull);
+
+ if (!valid)
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("exiting from slot synchronization due to bad configuration"),
+ /* translator: second %s is a GUC variable name */
+ errdetail("The primary slot \"%s\" specified by %s is not valid.",
+ PrimarySlotName, "primary_slot_name")));
+ ExecClearTuple(tupslot);
+ walrcv_clear_result(res);
+ CommitTransactionCommand();
+}

Now that there is common cleanup/return code, this function can be reduced further like below:

SUGGESTION
if (remote_in_recovery)
{
  /* No need to check further, return that we are cascading standby */
  *am_cascading_standby = true;
}
else
{
  /* We are a normal standby. */
  valid = DatumGetBool(slot_getattr(tupslot, 2, &isnull));
  Assert(!isnull);
  if (!valid)
    ...
}
ExecClearTuple(tupslot);
walrcv_clear_result(res);
CommitTransactionCommand();
}

~~~

2. ReplSlotSyncWorkerMain

+ /*
+ * One can promote the standby and we can no longer be a cascading
+ * standby. So recheck here.
+ */
+ if (am_cascading_standby)
+ check_primary_info(wrconn, &am_cascading_standby);

Minor rewording of that new comment.

SUGGESTION
If the standby was promoted then what was previously a cascading standby might no longer be one, so recheck each time.

==
src/test/recovery/t/050_verify_slot_order.pl

3.

+##
+# Test that a synchronized slot can not be decoded, altered and dropped by the user
+##

/and dropped/or dropped/

~~~

4.

+
+($result, $stdout, $stderr) = $standby1->psql(
+'postgres',
+qq[ALTER_REPLICATION_SLOT lsub1_slot (failover);],
+replication => 'database');
+ok($stderr =~ /ERROR: cannot alter replication slot "lsub1_slot"/,
+ "synced slot on standby cannot be altered");
+

Add a commen
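As a concrete illustration of review point 0.1 above, the newer ereport style simply omits the auxiliary set of parentheses; a minimal before/after sketch:

    /* old style, with redundant parentheses */
    ereport(LOG,
            (errmsg("skipping slot synchronization"),
             errdetail("enable_syncslot is disabled.")));

    /* new style */
    ereport(LOG,
            errmsg("skipping slot synchronization"),
            errdetail("enable_syncslot is disabled."));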
Function to get invalidation cause of a replication slot.
Hello hackers,

Attached is a patch which attempts to implement a new system function pg_get_slot_invalidation_cause('slot_name') to get the invalidation cause of a replication slot.

Currently, users do not have a way to know the invalidation cause. One can only find out whether the slot is invalidated or not by querying the 'conflicting' field of pg_replication_slots. This new function provides a way to query the invalidation cause as well.

One of the use case scenarios for this function is another ongoing thread, "Synchronizing slots from primary to standby" at [1]. This function is needed there to replicate the invalidation cause of a logical replication slot from the primary server to the hot standby. But this is an independent requirement in itself and thus makes sense to have it implemented separately.

Please review and provide your feedback.

[1]: https://www.postgresql.org/message-id/514f6f2f-6833-4539-39f1-96cd1e011f23%40enterprisedb.com

thanks Shveta

v1-0001-Function-to-get-invalidation-cause-of-a-replicati.patch Description: Binary data
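Usage would look like this (the slot name is just an example; per the patch, the function returns the invalidation cause for the given slot, or NULL if the slot is not found):

    SELECT pg_get_slot_invalidation_cause('test_slot');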
Re: Synchronizing slots from primary to standby
On Tue, Dec 19, 2023 at 6:35 PM Hayato Kuroda (Fujitsu) wrote: > > > walsender.c > > 01. WalSndWaitForStandbyConfirmation > > ``` > +sleeptime = WalSndComputeSleeptime(GetCurrentTimestamp()); > ``` > > It works well, but I'm not sure whether we should use WalSndComputeSleeptime() > because the function won't be called by walsender. > I don't think it is correct to use this function because it is walsender specific, for example, it uses 'last_reply_timestamp' which won't be even initialized in the backend environment. We need to probably use a different logic for sleep here or need to use a hard-coded value. I think we should change the name of functions like WalSndWaitForStandbyConfirmation() as they are no longer used by walsender. IIRC, earlier, we had a common logic to wait from both walsender and SQL APIs which led to this naming but that is no longer true with the latest patch. > 02.WalSndWaitForStandbyConfirmation > > ``` > +ConditionVariableTimedSleep(&WalSndCtl->wal_confirm_rcv_cv, > sleeptime, > + > WAIT_EVENT_WAL_SENDER_WAIT_FOR_STANDBY_CONFIRMATION) > ``` > > Hmm, is it OK to use the same event as WalSndWaitForWal()? IIUC it should be > avoided. > Agreed. So, how about using WAIT_EVENT_WAIT_FOR_STANDBY_CONFIRMATION so that we can use it both from the backend and walsender? -- With Regards, Amit Kapila.
Re: POC PATCH: copy from ... exceptions to: (was Re: VLDB Features)
On Tue, Dec 19, 2023 at 9:14 AM Masahiko Sawada wrote:
> The error table hub idea is still unclear to me. I assume that there are error tables at least on each database. And an error table can have error data that happened during COPY FROM, including malformed lines. Do the error tables grow without bounds and the users have to delete rows at some point? If so, who can do that? How can we achieve that the users can see only errored rows they generated? And the issue with logical replication also needs to be resolved. Anyway, if we go this direction, we need to discuss the overall design.
>
> Regards,
>
> --
> Masahiko Sawada
> Amazon Web Services: https://aws.amazon.com

Please check my latest attached POC. The main content is building the SPI query, executing the SPI query, and the regress test and output.

There is one copy_errors table per schema. foo.copy_errors will be owned by the owner of schema foo. If you can insert into a table in a specific schema (say, foo), then you get the privilege to INSERT/DELETE/SELECT on foo.copy_errors. If you are not a superuser, you are only allowed to INSERT/DELETE/SELECT the foo.copy_errors rows where USERID = current_user::regrole::oid. This is done via row-level security.

Since foo.copy_errors mostly receives INSERTs, if copy_errors grows too much, that means your source file has many errors, and it will take a very long time to finish the whole COPY; maybe we can capture how many errors were encountered from another client. I don't know how to deal with logical replication; looking for ideas.

From 9affaf6d94eb4afe26fc7181e38e53eed14e0216 Mon Sep 17 00:00:00 2001
From: pgaddict
Date: Wed, 20 Dec 2023 11:26:25 +0800
Subject: [PATCH v12 1/1] Make COPY FROM more error tolerant

Currently COPY FROM has 3 types of error while processing the source file:
* extra data after last expected column
* missing data for column \"%s\"
* data type conversion error

Instead of throwing errors while copying, the save_error specifier will save errors to the table copy_errors for all COPY FROM operations in the same schema. We check the existing copy_errors table definition by column name and column data type; if the table already exists and meets the criteria, then errors will be saved to it. If the table does not exist, then create one.

copy_errors is per schema, owned by the schema's owner. For non-superusers, if you can insert in that schema, then you can insert into copy_errors, but you are only allowed to select/delete your own rows, which is judged by comparing copy_errors's userid column with current_user. Privilege restriction is implemented via ROW LEVEL SECURITY.

This only works for COPY FROM in non-BINARY mode. While copying, if no error ever happened, the error saving table will be dropped at the end of COPY FROM. If the error saving table exists, meaning that COPY FROM errors have happened at least once, then all future errors will be saved to that table. We save the error-related meta info to the error saving table using SPI, that is, we construct a query string, then execute the query.
--- contrib/file_fdw/file_fdw.c | 4 +- doc/src/sgml/ref/copy.sgml | 122 ++- src/backend/commands/copy.c | 12 ++ src/backend/commands/copyfrom.c | 168 - src/backend/commands/copyfromparse.c | 179 +-- src/backend/parser/gram.y| 8 +- src/bin/psql/tab-complete.c | 3 +- src/include/commands/copy.h | 3 +- src/include/commands/copyfrom_internal.h | 5 + src/include/parser/kwlist.h | 1 + src/test/regress/expected/copy2.out | 160 src/test/regress/sql/copy2.sql | 142 ++ 12 files changed, 787 insertions(+), 20 deletions(-) diff --git a/contrib/file_fdw/file_fdw.c b/contrib/file_fdw/file_fdw.c index 2189be8a..2d3eb34f 100644 --- a/contrib/file_fdw/file_fdw.c +++ b/contrib/file_fdw/file_fdw.c @@ -751,7 +751,7 @@ fileIterateForeignScan(ForeignScanState *node) */ oldcontext = MemoryContextSwitchTo(GetPerTupleMemoryContext(estate)); found = NextCopyFrom(festate->cstate, econtext, - slot->tts_values, slot->tts_isnull); + slot->tts_values, slot->tts_isnull, NULL); if (found) ExecStoreVirtualTuple(slot); @@ -1183,7 +1183,7 @@ file_acquire_sample_rows(Relation onerel, int elevel, MemoryContextReset(tupcontext); MemoryContextSwitchTo(tupcontext); - found = NextCopyFrom(cstate, NULL, values, nulls); + found = NextCopyFrom(cstate, NULL, values, nulls, NULL); MemoryContextSwitchTo(oldcontext); diff --git a/doc/src/sgml/ref/copy.sgml b/doc/src/sgml/ref/copy.sgml index 18ecc69c..3dbf70ee 100644 --- a/doc/src/sgml/ref/copy.sgml +++ b/doc/src/sgml/ref/copy.sgml @@ -44,6 +44,7 @@ COPY { table_name [ ( column_name [, ...] ) | * } FORCE_NULL { ( column_name [, ...] ) | * } ENCODING 'encoding_name' +SAVE_ERROR [ boolean ] @@ -411,6 +412,18 @@ WHERE condition +
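A usage sketch of the proposed option (the option name follows the patch's synopsis; the table, schema, and file path are hypothetical):

    CREATE TABLE foo.t (a int, b text);

    -- proposed: record malformed rows instead of aborting the COPY
    COPY foo.t FROM '/tmp/data.csv' WITH (FORMAT csv, SAVE_ERROR);

    -- errors land in a per-schema table; non-superusers see only their own rows
    SELECT * FROM foo.copy_errors;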
Re: Synchronizing slots from primary to standby
On Tue, Dec 19, 2023 at 5:30 PM shveta malik wrote: > > Thanks for reviewing. I have addressed these in v50. > I was looking at this patch to see if something smaller could be independently committable. I think we can extract pg_get_slot_invalidation_cause() and commit it as that function could be independently useful as well. What do you think? -- With Regards, Amit Kapila.
Re: Add a perl function in Cluster.pm to generate WAL
On Tue, Dec 19, 2023, at 8:00 PM, Michael Paquier wrote: > On Tue, Dec 19, 2023 at 11:25:50AM +0530, Bharath Rupireddy wrote: > > I used pg_logical_emit_message() in non-transactional mode without > > needing an explicit WAL flush as the pg_switch_wal() does a WAL flush > > at the end [1]. > > Indeed, that should be enough to answer my comment. > > > Attached v4 patch. > > LGTM, thanks. Euler, what do you think? > LGTM. -- Euler Taveira EDB https://www.enterprisedb.com/
Re: Improve eviction algorithm in ReorderBuffer
On Wed, Dec 20, 2023 at 6:49 AM Masahiko Sawada wrote: > > On Tue, Dec 19, 2023 at 8:02 PM Amit Kapila wrote: > > > > On Tue, Dec 19, 2023 at 8:31 AM Masahiko Sawada > > wrote: > > > > > > On Sun, Dec 17, 2023 at 11:40 AM Amit Kapila > > > wrote: > > > > > > > > > > > > The individual transactions shouldn't cross > > > > 'logical_decoding_work_mem'. I got a bit confused by your proposal to > > > > maintain the lists: "...splitting it into two lists: transactions > > > > consuming 5% < and 5% >= of the memory limit, and checking the 5% >= > > > > list preferably.". In the previous sentence, what did you mean by > > > > transactions consuming 5% >= of the memory limit? I got the impression > > > > that you are saying to maintain them in a separate transaction list > > > > which doesn't seems to be the case. > > > > > > I wanted to mean that there are three lists in total: the first one > > > maintain the transactions consuming more than 10% of > > > logical_decoding_work_mem, > > > > > > > How can we have multiple transactions in the list consuming more than > > 10% of logical_decoding_work_mem? Shouldn't we perform serialization > > before any xact reaches logical_decoding_work_mem? > > Well, suppose logical_decoding_work_mem is set to 64MB, transactions > consuming more than 6.4MB are added to the list. So for example, it's > possible that the list has three transactions each of which are > consuming 10MB while the total memory usage in the reorderbuffer is > still 30MB (less than logical_decoding_work_mem). > Thanks for the clarification. I misunderstood the list to have transactions greater than 70.4 MB (64 + 6.4) in your example. But one thing to note is that maintaining these lists by default can also have some overhead unless the list of open transactions crosses a certain threshold. -- With Regards, Amit Kapila.
Re: Transaction timeout
On Wed, Dec 20, 2023 at 9:58 AM Thomas wen wrote:
>
> Hi Junwang Zhao
> #should we invalidate lock_timeout? Or maybe just document this.
> I think you mean when lock_timeout is greater than transaction_timeout, lock_timeout should be invalidated or needs to be logged?
>

I mean the interleaving of the GUCs, namely lock_timeout and the newly introduced transaction_timeout: if lock_timeout >= transaction_timeout, there seems to be no need to enable lock_timeout.

> Best wishes
>
> From: Junwang Zhao
> Sent: December 20, 2023 9:48
> To: Andrey M. Borodin
> Cc: Japin Li; 邱宇航; Fujii Masao; Andrey Borodin; Andres Freund; Michael Paquier; Nikolay Samokhvalov; pgsql-hackers; pgsql-hackers@lists.postgresql.org
> Subject: Re: Transaction timeout
>
> On Tue, Dec 19, 2023 at 10:51 PM Junwang Zhao wrote:
> >
> > On Tue, Dec 19, 2023 at 6:27 PM Andrey M. Borodin wrote:
> > >
> > > > On 19 Dec 2023, at 13:26, Andrey M. Borodin wrote:
> > > >
> > > > I don't have Windows machine, so I hope CF bot will pick this.
> > >
> > > I used Github CI to produce a version of tests that seems to be stable on Windows.
> > > Sorry for the noise.
> > >
> > > Best regards, Andrey Borodin.
> >
> > +
> > +If transaction_timeout is shorter than
> > +idle_in_transaction_session_timeout or statement_timeout
> > +transaction_timeout will invalidate longer timeout.
> > +
> >
> > When transaction_timeout is *equal* to idle_in_transaction_session_timeout or statement_timeout, idle_in_transaction_session_timeout and statement_timeout will also be invalidated; the logic in the code seems right, though this document is a little bit inaccurate.
>
> Unlike statement_timeout, this timeout can only occur while waiting for locks. Note that if statement_timeout is nonzero, it is rather pointless to set lock_timeout to the same or larger value, since the statement timeout would always trigger first. If log_min_error_statement is set to ERROR or lower, the statement that timed out will be logged.
>
> There is a note about statement_timeout and lock_timeout: setting both with lock_timeout >= statement_timeout is pointless, but this logic seems not implemented in the code. I am wondering, if lock_timeout >= transaction_timeout, should we invalidate lock_timeout? Or maybe just document this.
>
> --
> Regards
> Junwang Zhao

--
Regards
Junwang Zhao
Re: "pgoutput" options missing on documentation
On Tue, Dec 19, 2023 at 12:55 PM Amit Kapila wrote: > > I think we should move to 0002 patch now. In that, I suggest preparing > separate back branch patches. > Emre, are you planning to share back-branch patches? -- With Regards, Amit Kapila.
Re: pgsql: Move src/bin/pg_verifybackup/parse_manifest.c into src/common.
On Tue, Dec 19, 2023 at 8:01 PM Michael Paquier wrote: > On Tue, Dec 19, 2023 at 08:29:18PM +, Robert Haas wrote: > > Move src/bin/pg_verifybackup/parse_manifest.c into src/common. > > > > This makes it possible for the code to be easily reused by other > > client-side tools, and/or by the server. > > > > Patch by me. Review of this patch in particular by at least Peter > > Eisentraut; reviewers for the patch series in general include Dilip > > Kumar, Andres Fruend, David Steele, Álvaro Herrera, and Jakub Wartak. > > Worth noting that this has forgotten to update @pgcommonallfiles in > Mkvcbuild.pm so this would have failed a build with src/build/msvc/. > Or you counted on me here, relying on the scripts to be gone? ;p To be honest, I counted on CI to tell me whether it still builds everywhere. -- Robert Haas EDB: http://www.enterprisedb.com
Re: Transaction timeout
Hi Junwang Zhao

#should we invalidate lock_timeout? Or maybe just document this.

I think you mean when lock_timeout is greater than transaction_timeout, lock_timeout should be invalidated or needs to be logged?

Best wishes

From: Junwang Zhao
Sent: December 20, 2023 9:48
To: Andrey M. Borodin
Cc: Japin Li; 邱宇航; Fujii Masao; Andrey Borodin; Andres Freund; Michael Paquier; Nikolay Samokhvalov; pgsql-hackers; pgsql-hackers@lists.postgresql.org
Subject: Re: Transaction timeout

On Tue, Dec 19, 2023 at 10:51 PM Junwang Zhao wrote:
>
> On Tue, Dec 19, 2023 at 6:27 PM Andrey M. Borodin wrote:
> >
> > > On 19 Dec 2023, at 13:26, Andrey M. Borodin wrote:
> > >
> > > I don't have Windows machine, so I hope CF bot will pick this.
> >
> > I used Github CI to produce a version of tests that seems to be stable on Windows.
> > Sorry for the noise.
> >
> > Best regards, Andrey Borodin.
>
> +
> +If transaction_timeout is shorter than
> +idle_in_transaction_session_timeout or statement_timeout
> +transaction_timeout will invalidate longer timeout.
> +
>
> When transaction_timeout is *equal* to idle_in_transaction_session_timeout or statement_timeout, idle_in_transaction_session_timeout and statement_timeout will also be invalidated; the logic in the code seems right, though this document is a little bit inaccurate.

Unlike statement_timeout, this timeout can only occur while waiting for locks. Note that if statement_timeout is nonzero, it is rather pointless to set lock_timeout to the same or larger value, since the statement timeout would always trigger first. If log_min_error_statement is set to ERROR or lower, the statement that timed out will be logged.

There is a note about statement_timeout and lock_timeout: setting both with lock_timeout >= statement_timeout is pointless, but this logic seems not implemented in the code. I am wondering, if lock_timeout >= transaction_timeout, should we invalidate lock_timeout? Or maybe just document this.

--
Regards
Junwang Zhao
Re: Transaction timeout
On Tue, Dec 19, 2023 at 10:51 PM Junwang Zhao wrote:
>
> On Tue, Dec 19, 2023 at 6:27 PM Andrey M. Borodin wrote:
> >
> > > On 19 Dec 2023, at 13:26, Andrey M. Borodin wrote:
> > >
> > > I don't have Windows machine, so I hope CF bot will pick this.
> >
> > I used Github CI to produce a version of tests that seems to be stable on Windows.
> > Sorry for the noise.
> >
> > Best regards, Andrey Borodin.
>
> +
> +If transaction_timeout is shorter than
> +idle_in_transaction_session_timeout or statement_timeout
> +transaction_timeout will invalidate longer timeout.
> +
>
> When transaction_timeout is *equal* to idle_in_transaction_session_timeout or statement_timeout, idle_in_transaction_session_timeout and statement_timeout will also be invalidated; the logic in the code seems right, though this document is a little bit inaccurate.

Unlike statement_timeout, this timeout can only occur while waiting for locks. Note that if statement_timeout is nonzero, it is rather pointless to set lock_timeout to the same or larger value, since the statement timeout would always trigger first. If log_min_error_statement is set to ERROR or lower, the statement that timed out will be logged.

There is a note about statement_timeout and lock_timeout: setting both with lock_timeout >= statement_timeout is pointless, but this logic seems not implemented in the code. I am wondering, if lock_timeout >= transaction_timeout, should we invalidate lock_timeout? Or maybe just document this.

--
Regards
Junwang Zhao
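A sketch of the interaction being discussed (transaction_timeout here is the GUC proposed by this patch, not an existing setting):

    SET statement_timeout = '5s';
    SET lock_timeout = '10s';        -- documented as pointless: the statement timeout always fires first

    SET transaction_timeout = '8s';  -- proposed GUC
    SET lock_timeout = '10s';        -- the open question: should this be invalidated, or just documented?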
Re: Add isCatalogRel in rmgrdesc
On Tue, Dec 19, 2023 at 6:27 PM Drouvot, Bertrand wrote: > > Hi, > > On 12/19/23 9:00 AM, Masahiko Sawada wrote: > > On Tue, Dec 12, 2023 at 6:15 PM Michael Paquier wrote: > >> > >> On Tue, Dec 12, 2023 at 09:23:46AM +0100, Drouvot, Bertrand wrote: > >>> Please find attached a patch to add this field in the related rmgrdesc > >>> (i.e > >>> all the ones that already provide the snapshotConflictHorizon except the > >>> one > >>> related to xl_heap_visible: indeed a new bit was added in its flag field > >>> in 6af1793954 > >>> instead of adding the isCatalogRel bool). > >>> > >>> I think it's worth it, as this new field could help diagnose conflicts > >>> issues (if any). > > > > +1 > > Thanks for looking at it! > > > > > - appendStringInfo(buf, "rel %u/%u/%u; blk %u; snapshotConflictHorizon > > %u:%u", > > + appendStringInfo(buf, "rel %u/%u/%u; blk %u; > > snapshotConflictHorizon %u:%u, isCatalogRel %u", > > xlrec->locator.spcOid, xlrec->locator.dbOid, > > xlrec->locator.relNumber, xlrec->block, > > > > EpochFromFullTransactionId(xlrec->snapshotConflictHorizon), > > - > > XidFromFullTransactionId(xlrec->snapshotConflictHorizon)); > > + > > XidFromFullTransactionId(xlrec->snapshotConflictHorizon), > > +xlrec->isCatalogRel); > > > > The patch prints isCatalogRel, a bool field, as %u. But other rmgrdesc > > implementations seem to use different ways. For instance in spgdesc.c, > > we print flag name if it's set: (newPage and postfixBlkSame are bool > > fields): > > > > appendStringInfo(buf, "prefixoff: %u, postfixoff: %u", > > xlrec->offnumPrefix, > > xlrec->offnumPostfix); > > if (xlrec->newPage) > > appendStringInfoString(buf, " (newpage)"); > > if (xlrec->postfixBlkSame) > > appendStringInfoString(buf, " (same)"); > > > > whereas in hashdesc.c, we print either 'T' of 'F': > > > > appendStringInfo(buf, "clear_dead_marking %c, is_primary > > %c", > > xlrec->clear_dead_marking ? 'T' : 'F', > > xlrec->is_primary_bucket_page ? 'T' : > > 'F'); > > > > Is it probably worth considering such formats? > > Good point, let's not add another format. > > > I prefer to always > > print the field name like the current patch and hashdesc.c since it's > > easier to parse it. > > I like this format too, so done that way in v2 attached. > > BTW, I noticed that sometimes the snapshotConflictHorizon is displayed > as "snapshotConflictHorizon:" and sometimes as "snapshotConflictHorizon". > > So v2 is doing the same, means using "isCatalogRel:" if > "snapshotConflictHorizon:" > is being used or using "isCatalogRel" if not. Agreed. Thank you for updating the patch. The v2 patch looks good to me. I'll push it, barring any objections. Regards, -- Masahiko Sawada Amazon Web Services: https://aws.amazon.com
Re: index prefetching
On 12/18/23 22:00, Robert Haas wrote: > On Sat, Dec 9, 2023 at 1:08 PM Tomas Vondra > wrote: >> But there's a layering problem that I don't know how to solve - I don't >> see how we could make indexam.c entirely oblivious to the prefetching, >> and move it entirely to the executor. Because how else would you know >> what to prefetch? > > Yeah, that seems impossible. > > Some thoughts: > > * I think perhaps the subject line of this thread is misleading. It > doesn't seem like there is any index prefetching going on here at all, > and there couldn't be, unless you extended the index AM API with new > methods. What you're actually doing is prefetching heap pages that > will be needed by a scan of the index. I think this confusing naming > has propagated itself into some parts of the patch, e.g. > index_prefetch() reads *from the heap* which is not at all clear from > the comment saying "Prefetch the TID, unless it's sequential or > recently prefetched." You're not prefetching the TID: you're > prefetching the heap tuple to which the TID points. That's not an > academic distinction IMHO -- the TID would be stored in the index, so > if we were prefetching the TID, we'd have to be reading index pages, > not heap pages. Yes, that's a fair complaint. I think the naming is mostly obsolete - the prefetching initially happened way way lower - in the index AMs. It was prefetching the heap pages, ofc, but it kinda seemed reasonable to call it "index prefetching". And even now it's called from indexam.c where most functions start with "index_". But I'll think about some better / cleared name. > > * Regarding layering, my first thought was that the changes to > index_getnext_tid() and index_getnext_slot() are sensible: read ahead > by some number of TIDs, keep the TIDs you've fetched in an array > someplace, use that to drive prefetching of blocks on disk, and return > the previously-read TIDs from the queue without letting the caller > know that the queue exists. I think that's the obvious design for a > feature of this type, to the point where I don't really see that > there's a viable alternative design. I agree. > Driving something down into the individual index AMs would make sense > if you wanted to prefetch *from the indexes*, but it's unnecessary > otherwise, and best avoided. > Right. In fact, the patch moved exactly in the opposite direction - it was originally done at the AM level, and moved up. First to indexam.c, then even more to the executor. > * But that said, the skip_all_visible flag passed down to > index_prefetch() looks like a VERY strong sign that the layering here > is not what it should be. Right now, when some code calls > index_getnext_tid(), that function does not need to know or care > whether the caller is going to fetch the heap tuple or not. But with > this patch, the code does need to care. So knowledge of the executor > concept of an index-only scan trickles down into indexam.c, which now > has to be able to make decisions that are consistent with the ones > that the executor will make. That doesn't seem good at all. > I agree the all_visible flag is a sign the abstraction is not quite right. I did that mostly to quickly verify whether the duplicate VM checks are causing for the perf regression (and they are). Whatever the right abstraction is, it probably needs to do these VM checks only once. > * I think it might make sense to have two different prefetching > schemes. Ideally they could share some structure. 
If a caller is using > index_getnext_slot(), then it's easy for prefetching to be fully > transparent. The caller can just ask for TIDs and the prefetching > distance and TID queue can be fully under the control of something > that is hidden from the caller. But when using index_getnext_tid(), > the caller needs to have an opportunity to evaluate each TID and > decide whether we even want the heap tuple. If yes, then we feed that > TID to the prefetcher; if no, we don't. That way, we're not > replicating executor logic in lower-level code. However, that also > means that the IOS logic needs to be aware that this TID queue exists > and interact with whatever controls the prefetch distance. Perhaps > after calling index_getnext_tid() you call > index_prefetcher_put_tid(prefetcher, tid, bool fetch_heap_tuple) and > then you call index_prefetcher_get_tid() to drain the queue. Perhaps > also the prefetcher has a "fill" callback that gets invoked when the > TID queue isn't as full as the prefetcher wants it to be. Then > index_getnext_slot() can just install a trivial fill callback that > says index_prefetecher_put_tid(prefetcher, index_getnext_tid(...), > true), but IOS can use a more sophisticated callback that checks the > VM to determine what to pass for the third argument. > Yeah, after you pointed out the "leaky" abstraction, I also started to think about customizing the behavior using a callback. Not sure what exactly you mean by "fully transparent" but as I e
Re: Improve eviction algorithm in ReorderBuffer
On Tue, Dec 19, 2023 at 8:02 PM Amit Kapila wrote: > > On Tue, Dec 19, 2023 at 8:31 AM Masahiko Sawada wrote: > > > > On Sun, Dec 17, 2023 at 11:40 AM Amit Kapila > > wrote: > > > > > > > > > The individual transactions shouldn't cross > > > 'logical_decoding_work_mem'. I got a bit confused by your proposal to > > > maintain the lists: "...splitting it into two lists: transactions > > > consuming 5% < and 5% >= of the memory limit, and checking the 5% >= > > > list preferably.". In the previous sentence, what did you mean by > > > transactions consuming 5% >= of the memory limit? I got the impression > > > that you are saying to maintain them in a separate transaction list > > > which doesn't seems to be the case. > > > > I wanted to mean that there are three lists in total: the first one > > maintain the transactions consuming more than 10% of > > logical_decoding_work_mem, > > > > How can we have multiple transactions in the list consuming more than > 10% of logical_decoding_work_mem? Shouldn't we perform serialization > before any xact reaches logical_decoding_work_mem? Well, suppose logical_decoding_work_mem is set to 64MB, transactions consuming more than 6.4MB are added to the list. So for example, it's possible that the list has three transactions each of which are consuming 10MB while the total memory usage in the reorderbuffer is still 30MB (less than logical_decoding_work_mem). Regards, -- Masahiko Sawada Amazon Web Services: https://aws.amazon.com
Re: Remove MSVC scripts from the tree
On Tue, Dec 19, 2023 at 04:24:02PM +0100, Peter Eisentraut wrote:
> Here are patches for these two issues. More testing would be appreciated.
>
> --- a/contrib/basebackup_to_shell/meson.build
> +++ b/contrib/basebackup_to_shell/meson.build
> @@ -24,7 +24,7 @@ tests += {
>  'tests': [
>    't/001_basic.pl',
>  ],
> -'env': {'GZIP_PROGRAM': gzip.path(),
> -        'TAR': tar.path()},
> +'env': {'GZIP_PROGRAM': gzip.found() ? gzip.path() : '',
> +        'TAR': tar.found() ? tar.path() : '' },
>  },

Hmm. Interesting. So this basically comes down to the fact that GZIP and TAR are required in ./configure because distcheck has a hard dependency on both, but we don't support this target in meson. Is that right?

--
Michael
Re: Remove MSVC scripts from the tree
On Mon, Nov 20, 2023 at 05:03:28PM +0900, Michael Paquier wrote: > Your suggestion to create a new sect2 for "Windows" as much as Andres' > suggestion are OK by as an intermediate step, and I suspect that the > end result will likely not be that. It took me some time to get back to this one, and just applied the patch removing the scripts. At the end, I have gone with the addition of a subsection named "Visual" for now in the platform-specific notes, keeping all the information originally in install-windows.sgml the same. A proposal of patch to clean up the docs is on my TODO list for the next CF. -- Michael
Re: Use of additional index columns in rows filtering
On 8/19/23 02:49, Peter Geoghegan wrote: > On Tue, Aug 8, 2023 at 11:36 AM Peter Geoghegan wrote: >>> The only thing the patch does is it looks at clauses we decided not to >>> treat as index quals, and do maybe still evaluate them on index. And I >>> don't think I want to move these goalposts much further. >> >> Avoiding the need for visibility checks completely (in at least a >> subset of cases) was originally your idea. I'm not going to insist on >> it, or anything like that. It just seems like something that'll add a >> great deal of value over what you have already. > > Another idea in this area occurred to me today: it would be cool if > non-key columns from INCLUDE indexes could completely avoid visibility > checks (not just avoid heap accesses using the visibility map) in > roughly the same way that we'd expect with an equivalent key column > already, today (if it was an index filter index qual). Offhand I think > that it would make sense to do that outside of index AMs, by extending > the mechanism from Tomas' patch to this special class of expression. > We'd invent some other class of index filter that "outranks" > conventional index filters when the optimizer can safely determine > that they're "index filters with the visibility characteristics of > true index quals". I am mostly thinking of simple, common cases here > (e.g., an equality operator + constant). > > This might require the involvement of the relevant btree opclass, > since that's where the no-visibility-check-safety property actually > comes from. However, it wouldn't need to be limited to INCLUDE B-Tree > indexes. You could for example do this with a GiST INCLUDE index that > had no opclass information about the datatype/operator. That'd be a > natural advantage of an approach based on index filters. > I haven't really thought about INCLUDE columns at all. I agree it seems doable and possibly quite useful, and doing it outside the index AM seems about right. The index does not know about the opclass for these columns, it just stores them, why/how should it be responsible to do something with it? And as you said, it seems like a (fairly natural?) extension of the patch discussed in this thread. That being said, I don't intend to make this work in v1. Once the other issues get solved in some way, I may try hacking a WIP, but mostly to try if there's not some design issue that'd make it hard in the future. regards -- Tomas Vondra EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
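To make the scenario concrete, this is the kind of case being discussed (the table and index are hypothetical):

    CREATE INDEX t_a_incl_b ON t (a) INCLUDE (b);

    -- "b = 42" can never be a B-Tree index qual here, because b is a non-key
    -- column, but it could still be evaluated as a filter on the index tuple
    SELECT a FROM t WHERE a BETWEEN 1 AND 100 AND b = 42;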
Re: Update the comment in nodes.h to cover Cardinality
On Tue, Dec 19, 2023 at 10:50 PM Peter Eisentraut wrote: > On 19.12.23 07:23, Richard Guo wrote: > > By chance I discovered that the comment for the typedefs of "double"s > > does not cover Cardinality. Should we update that comment accordingly, > > maybe something like below? > > > > - * Typedefs for identifying qualifier selectivities and plan costs as > such. > > - * These are just plain "double"s, but declaring a variable as > Selectivity > > - * or Cost makes the intent more obvious. > > + * Typedefs for identifying qualifier selectivities, plan costs and > > + * estimated rows or other count as such. These are just plain > "double"s, > > + * but declaring a variable as Selectivity, Cost or Cardinality makes > the > > + * intent more obvious. > > Fixed, thanks. Thanks for the fix! Thanks Richard
Re: Use of additional index columns in rows filtering
Hi, It took me a while but I finally got back to working on this patch. I reread the summary of principles Peter shared in August, and I do agree with all the main points - the overall qual hierarchy and the various examples and counter-examples. I'll respond to a couple things in this sub-thread, and then I plan to post an updated version of the patch with a discussion of a couple problems I still haven't managed to solve (not necessarily new ones). On 8/19/23 00:19, Peter Geoghegan wrote: > On Thu, Aug 17, 2023 at 4:29 PM Jeff Davis wrote: >> On Wed, 2023-08-09 at 17:14 -0700, Peter Geoghegan wrote: >>> + Index quals are better than equivalent index filters because bitmap >>> index scans can only use index quals >> >> It seems there's consensus that: >> >> * Index Filters (Tomas's patch and the topic of this thread) are more >> general, because they can work on any expression, e.g. 1/x, which can >> throw an error if x=0. > > Agreed, but with one small caveat: they're not more general when it > comes to bitmap index scans, which seem like they'll only ever be able > to support index quals. But AFAICT that's the one and only exception. > Yeah. Some conditions are never gonna be translated into proper index quals, either because it's not really possible or because no one just implemented that. FWIW it's not immediately obvious to me if bitmap index scans are unable to support index filters because of some inherent limitation, or whether it's something we could implement. Some index AMs simply can't support index filters, because they don't know the "full" index tuple tuple (e.g. GIN), but maybe other AMs could support that ... >> * Index quals are more optimizable, because such operators are not >> supposed to throw errors or have side effects, so we can evaluate them >> before determining visibility. > > Right. Though there is a second reason: the index AM can use them to > navigate the index more efficiently with true index quals. At least in > the case of SAOPs, and perhaps in other cases in the future. > >> I wouldn't describe one as "better" than the other, but I assume you >> meant "more optimizable". > > The use of the term "better" is actually descriptive of Tomas' patch. > It is structured in a way that conditions the use of index filters on > the absence of equivalent index quals for eligible/indexed clauses. So > it really does make sense to think of it as a "qual hierarchy" (to use > Tomas' term). > > It's also true that it will always be fundamentally impossible to use > index quals for a significant proportion of all clause types, so > "better when feasible at all" is slightly more accurate. > Yeah, I agree with all of this too. I think that in all practical cases an index qual is strictly better than an index filter, for exactly these reasons. >> Is there any part of the design here that's preventing this patch from >> moving forward? If not, what are the TODOs for the current patch, or is >> it just waiting on more review? > > Practically speaking, I have no objections to moving ahead with this. > I believe that the general structure of the patch makes sense, and I > don't expect Tomas to do anything that wasn't already expected before > I chimed in. > I agree the general idea is sound. The main "problem" in the original patch was about costing changes and the implied risk of picking the wrong plan (instead of e.g. a bitmap index scan). That's pretty much what the "risk" in example (4) in the principles.txt is about, AFAICS. 
I think the consensus is that if we don't change the cost, this issue mostly goes away - we won't pick an index scan in cases where we'd pick a different plan without the patch, and for index scans it's (almost) always correct to replace a "table filter" with an "index filter".

If that issue goes away, I think there are two smaller ones - picking which conditions to promote to index filters, and actually tweaking the index scan code to do that.

The first issue seems similar to the costing issue - in fact, it really is a costing problem. The goal is to minimize the amount of work needed to evaluate the conditions on all rows, and if we knew selectivity+cost for each condition, it'd be a fairly simple matter. But in practice we don't know this - the selectivity estimates are rather shaky (and doubly so for correlated conditions), and the procedure cost estimates are quite crude too ...

The risk is we might move a very expensive condition "forward". Imagine a condition with procost=100, which would normally be evaluated last, after all other clauses. But if we promote it to an index filter, that'd no longer be true.

What Jeff proposed last week was roughly this (see the sketch after this message):

- find min(procost) for clauses that can't be index filters
- consider all clauses with procost <= min(procost) as index filters

But after looking at this I realized order_qual_clauses() does a very similar thing for regular clauses, and the proposed logic contradicts the heuristics order_qua
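A sketch of that selection rule, for illustration only (the struct and field names are hypothetical; real code would presumably live near order_qual_clauses() in the planner):

    /* Hypothetical illustration of the proposed heuristic, not planner code. */
    typedef struct QualInfo
    {
        Cost        cost;           /* per-row evaluation cost (procost-based) */
        bool        can_be_index_filter;
        bool        use_as_index_filter;
    } QualInfo;

    static void
    choose_index_filters(QualInfo *quals, int nquals)
    {
        Cost        min_table_cost = -1;

        /* 1. find the cheapest clause that must remain a table filter */
        for (int i = 0; i < nquals; i++)
        {
            if (!quals[i].can_be_index_filter &&
                (min_table_cost < 0 || quals[i].cost < min_table_cost))
                min_table_cost = quals[i].cost;
        }

        /* 2. promote only clauses no more expensive than that */
        for (int i = 0; i < nquals; i++)
        {
            if (quals[i].can_be_index_filter &&
                (min_table_cost < 0 || quals[i].cost <= min_table_cost))
                quals[i].use_as_index_filter = true;
        }
    }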
Re: Built-in CTYPE provider
On Tue, 2023-12-19 at 15:59 -0500, Robert Haas wrote: > FWIW, the idea that we're going to develop a built-in provider seems > to be solid, for the reasons Jeff mentions: it can be stable, and > under our control. But it seems like we might need built-in providers > for everything rather than just CTYPE to get those advantages, and I > fear we'll get sucked into needing a lot of tailoring rather than > just > being able to get by with one "vanilla" implementation. For the database default collation, I suspect a lot of users would jump at the chance to have "vanilla" semantics. Tailoring is more important for individual collation objects than for the database-level collation. There are reasons you might select a tailored database collation, like if the set of users accessing it are mostly from a single locale, or if the application connected to the database is expecting it in a certain form. But there are a lot of users for whom neither of those things are true, and it makes zero sense to order all of the text indexes in the database according to any one particular locale. I think these users would prioritize stability and performance for the database collation, and then use COLLATE clauses with ICU collations where necessary. The question for me is how good the "vanilla" semantics need to be to be useful as a database-level collation. Most of the performance and stability problems come from collation, so it makes sense to me to provide a fast and stable memcmp collation paired with richer ctype semantics (as proposed here). Users who want something more probably want the Unicode "root" collation, which can be provided by ICU today. I am also still concerned that we have the wrong defaults. Almost nobody thinks libc is a great provider, but that's the default, and there were problems trying to change that default to ICU in 16. If we had a builtin provider, that might be a better basis for a default (safe, fast, always available, and documentable). Then, at least if someone picks a different locale at initdb time, they would be doing so intentionally, rather than implicitly accepting index corruption risks based on an environment variable. Regards, Jeff Davis
Re: introduce dynamic shared memory registry
On Tue, Dec 19, 2023 at 10:14:44AM -0600, Nathan Bossart wrote: > On Tue, Dec 19, 2023 at 10:49:23AM -0500, Robert Haas wrote: >> On Mon, Dec 18, 2023 at 3:32 AM Andrei Lepikhov >> wrote: >>> 2. I think a separate file for this feature looks too expensive. >>> According to the gist of that code, it is a part of the DSA module. >> >> -1. I think this is a totally different thing than DSA. More files >> aren't nearly as expensive as the confusion that comes from smushing >> unrelated things together. > > Agreed. I think there's a decent chance that more functionality will be > added to this registry down the line, in which case it will be even more > important that this stuff stays separate from the tools it is built with. +1 for keeping a clean separation between both. -- Michael
Re: introduce dynamic shared memory registry
On Tue, Dec 19, 2023 at 10:19:11AM -0600, Nathan Bossart wrote: > On Fri, Dec 08, 2023 at 04:36:52PM +0900, Michael Paquier wrote: >> Yes, tracking that in a more central way can have many usages, so your >> patch sounds like a good idea. Note that we have one case in core >> that be improved and make use of what you have here: autoprewarm.c. > > I'll add a patch for autoprewarm.c. Even if we don't proceed with that > change, it'll be a good demonstration. Cool, thanks. It could just be a separate change on top of the main one. > > +dsm_registry_init_or_attach(const char *key, void **ptr, size_t size, > > + void (*init_callback) (void *ptr)) > > > > This is shaped around dshash_find_or_insert(), but it looks like you'd > want an equivalent for dshash_find(), as well. > > What is the use-case for only verifying the existence of a segment? One case I was thinking about is parallel aggregates that can define combining and serial/deserial functions, where some of the operations could happen in shared memory, requiring a DSM, and where each process doing some aggregate combining would expect a DSM to exist before making use of it. So a registry wrapper for dshash_find() could be used as a way to perform sanity checks with what's stored in the registry. -- Michael
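Going by the signature quoted above, extension usage might look roughly like this (the struct, key, and callback are hypothetical, purely to show the shape of the API):

    typedef struct MyState
    {
        int         counter;
    } MyState;

    static MyState *my_state = NULL;

    /* run only by the process that first creates the segment */
    static void
    my_state_init(void *ptr)
    {
        MyState    *state = (MyState *) ptr;

        state->counter = 0;
    }

    ...
        dsm_registry_init_or_attach("my_extension", (void **) &my_state,
                                    sizeof(MyState), my_state_init);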
Re: Remove MSVC scripts from the tree
On Mon, Dec 18, 2023 at 02:52:41PM +0100, Peter Eisentraut wrote: > On 18.12.23 11:49, vignesh C wrote: >> Few comments: >> 1) Now that the MSVC build scripts are removed, should we have the >> reference to "MSVC build scripts" here? >> ltree.h: > > I think this note is correct and can be kept, as it explains the historical > context. Yeah, that's something I was pondering about for a bit a few weeks ago but keeping the historical context is still the most important piece to me. >> 2) I had seen that if sed/gzip is not available meson build will fail: >> 2.a) >> Program gsed sed found: NO >> meson.build:334:6: ERROR: Program 'gsed sed' not found or not executable > > Yes, this would need to be improved. Currently, sed is only required if > either selinux or dtrace is enabled, which isn't supported on Windows. But > we should adjust the build scripts to not fail the top-level setup run > unless those options are enabled. > >> 2.b) >> Program gzip found: NO >> meson.build:337:7: ERROR: Program 'gzip' not found or not executable > > gzip is only required for certain test suites, so again we should adjust the > build scripts to not fail the build but instead skip the tests as > appropriate. Oops. -- Michael
Re: Fix a comment in basic_archive about NO_INSTALLCHECK
On Mon, Dec 18, 2023 at 11:28:46AM +0530, Bharath Rupireddy wrote: > Reworded the comment a bit and attached the v2 patch. Forgot about this one, thanks! I've simplified it a bit and applied it. -- Michael
Re: "pgoutput" options missing on documentation
On Tue, Dec 19, 2023 at 6:25 PM Amit Kapila wrote: > > On Tue, Dec 19, 2023 at 12:07 PM Peter Smith wrote: > > > > On Tue, Dec 19, 2023 at 1:35 AM Emre Hasegeli wrote: > > > > > > > Fair enough. I think we should push your first patch only in HEAD as > > > > this is a minor improvement over the current behaviour. What do you > > > > think? > > > > > > I agree. > > > > Patch 0001 > > > > AFAICT parse_output_parameters possible errors are never tested. For > > example, there is no code coverage [1] touching any of these ereports. > > > > IMO there should be some simple test cases -- I am happy to create > > some tests if you agree they should exist. > > > > I don't think having tests for all sorts of error checking will add > much value as compared to the overhead they bring. > > > ~~~ > > > > While looking at the function parse_output_parameters() I noticed that > > if an unrecognised option is passed the function emits an elog instead > > of an ereport > > > > We don't expect unrecognized option here and for such a thing, we use > elog in the code. See the similar usage in > parseCreateReplSlotOptions(). > IIUC the untranslated elog should be used for internal/sanity errors, debugging, or stuff that cannot happen under any normal circumstances. While that may be the case for parseCreateReplSlotOptions() mentioned, IMO the scenario in the parse_output_parameters() is very different, because these options can come directly from user input so any user typo can cause this error. Indeed, this is probably one of the more likely reasons for getting any error in parse_output_parameters() function. I thought any errors that can be easily caused by some user actions ought to be translated. For example, the user accidentally misspells 'proto_version': test_pub=# SELECT * FROM pg_logical_slot_get_changes('test_slot_v1', NULL, NULL, 'protocol_version', '1'); ERROR: unrecognized pgoutput option: protocol_version CONTEXT: slot "test_slot_v1", output plugin "pgoutput", in the startup callback == Kind Regards, Peter Smith. Fujitsu Australia
Re: Add a perl function in Cluster.pm to generate WAL
On Tue, Dec 19, 2023 at 11:25:50AM +0530, Bharath Rupireddy wrote: > I used pg_logical_emit_message() in non-transactional mode without > needing an explicit WAL flush as the pg_switch_wal() does a WAL flush > at the end [1]. Indeed, that should be enough to answer my comment. > Attached v4 patch. LGTM, thanks. Euler, what do you think? -- Michael
Re: Built-in CTYPE provider
On Mon, Dec 18, 2023 at 2:46 PM Jeff Davis wrote: > The whole concept of "providers" is that they aren't consistent with > each other. ICU, libc, and the builtin provider will all be based on > different versions of Unicode. That's by design. > > The built-in provider will be a bit better in the sense that it's > consistent with the normalization functions, and the other providers > aren't. FWIW, the idea that we're going to develop a built-in provider seems to be solid, for the reasons Jeff mentions: it can be stable, and under our control. But it seems like we might need built-in providers for everything rather than just CTYPE to get those advantages, and I fear we'll get sucked into needing a lot of tailoring rather than just being able to get by with one "vanilla" implementation. -- Robert Haas EDB: http://www.enterprisedb.com
Add support for __attribute__((returns_nonnull))
Here is a patch which adds support for the returns_nonnull attribute alongside all the other attributes we optionally support. I recently wound up in a situation where I was checking for NULL return values of a function that couldn't ever return NULL because the inability to allocate memory was always elog(ERROR)ed (aborted). I didn't go through and mark anything, but I feel like it could be useful for people going forward, including myself. -- Tristan Partin Neon (https://neon.tech) From 15a36d68519b332e7ae970708399744cbc69c6c3 Mon Sep 17 00:00:00 2001 From: Tristan Partin Date: Tue, 19 Dec 2023 14:39:03 -0600 Subject: [PATCH v1] Add support for __attribute__((returns_nonnull)) Allows for marking functions that can't possibly return NULL, like those that always elog(ERROR) for instance in the case of failures. --- src/include/c.h | 12 1 file changed, 12 insertions(+) diff --git a/src/include/c.h b/src/include/c.h index 26bf7ec16e..e3a127f954 100644 --- a/src/include/c.h +++ b/src/include/c.h @@ -285,6 +285,18 @@ #define pg_unreachable() abort() #endif +/* + * Place on functions which return a pointer but can't return NULL. When used, + * it can allow the compiler to warn if a NULL check occurs in the parent + * function because that NULL check would always fail. It is also an opportunity + * to help the compiler with optimizations. + */ +#if __has_attribute (returns_nonnull) +#define pg_returns_nonnull __attribute__((returns_nonnull)) +#else +#define pg_returns_nonnull +#endif + /* * Hints to the compiler about the likelihood of a branch. Both likely() and * unlikely() return the boolean value of the contained expression. -- Tristan Partin Neon (https://neon.tech)
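As a hypothetical illustration of the allocate-or-elog(ERROR) pattern described above (the function below is invented, not part of the patch):

```
#include "postgres.h"

#include <stdlib.h>

/*
 * Invented example: allocation failure always elog(ERROR)s, so this
 * function can never return NULL and the attribute is safe to apply.
 * A compiler that honors it may then drop, or warn about, NULL checks
 * on the result in callers.
 */
pg_returns_nonnull
static void *
alloc_or_error(Size size)
{
	void	   *ptr = malloc(size);

	if (ptr == NULL)
		elog(ERROR, "out of memory");
	return ptr;
}
```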
Re: Change GUC hashtable to use simplehash?
On Tue, 2023-12-19 at 16:23 +0700, John Naylor wrote: > That wasn't the next place I thought to look (that would be the > strcmp > call), but something like this could be worthwhile. The reason I looked here is that the inner while statement (to find the chunk size) looked out of place and possibly slow, and there's a bitwise trick we can use instead. My original test case is a bit too "macro" of a benchmark at this point, so I'm not sure it's a good guide for these individual micro- optimizations. Regards, Jeff Davis
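For readers following along, the kind of bitwise trick alluded to is presumably the classic branch-free zero-byte test; this is a generic sketch rather than the actual patch, and it assumes a little-endian machine:

```
#include "postgres.h"
#include "port/pg_bitutils.h"

/* nonzero iff some byte of v is zero */
static inline uint64
haszero64(uint64 v)
{
	return (v - UINT64CONST(0x0101010101010101)) &
		~v & UINT64CONST(0x8080808080808080);
}

/* index of the first zero byte; caller must ensure haszero64(v) != 0 */
static inline int
first_zero_byte64(uint64 v)
{
	return pg_rightmost_one_pos64(haszero64(v)) / 8;
}
```

Something like this can replace a byte-at-a-time loop when computing the length of the final, possibly partial chunk of a NUL-terminated string.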
Re: pgsql: Prevent tuples to be marked as dead in subtransactions on standb
On Tue, Dec 12, 2023 at 11:06 AM Michael Paquier wrote: > Prevent tuples to be marked as dead in subtransactions on standbys I don't think this is a good commit message. It's totally unclear what it means, and when I opened up the diff to try to see what was changed, it looked nothing like what I expected. I think a better message would have been something like "startedInRecovery flag must be propagated to subtransactions". And I think there should have been some analysis in the commit message or the comments within the commit itself of whether it was intended that this be propagated to subtransactions or not. It's hard to understand why the flag would have been placed in the TransactionState if it applied globally to the transaction and all subtransactions, but maybe that's what happened. Instead of discussing that issue, your commit message focuses on the user-visible consequences, but in a sort of baffling way. The statement that "Dead tuples are ignored and are not marked as dead during recovery," for example, is clearly false on its face. If recovery didn't mark dead tuples as dead, it would be completely broken. -- Robert Haas EDB: http://www.enterprisedb.com
Re: backtrace_on_internal_error
On Tue, Dec 19, 2023 at 11:29 AM Tom Lane wrote: > IMO, we aren't really going to get a massive payoff from this with > the current backtrace output; it's just not detailed enough. It's > better than nothing certainly, but to really move the goalposts > we'd need something approaching gdb's "bt full" output. I wonder > if it'd be sane to try to auto-invoke gdb. That's just blue sky > for now, though. In the meantime, I agree with the proposal as it > stands (that is, auto-backtrace on any XX000 error). We'll soon find > out whether it's useless, or needs more detail to be really helpful, > or is just right as it is. Once we have some practical experience > with it, we can course-correct as needed. That all seems fair to me. I'm more optimistic than you are about getting something useful out of the current backtrace output, but (1) I could be wrong, (2) I'd still like to have something better, and (3) improving the backtrace output is a separate project from including backtraces more frequently. -- Robert Haas EDB: http://www.enterprisedb.com
Re: libpq compression (part 3)
> I believe this patch series is ready for detailed review/testing, with one > caveat: as can be seen here https://cirrus-ci.com/build/6732518292979712 , > the build is passing on all platforms and all tests except for the primary > SSL test on Windows. One correction: I apparently missed a Kerberos timeout failure on FreeBSD with compression enabled (being color blind, the checkmark and still-running colors are awfully similar, and I misread what I saw). I haven't yet successfully reproduced that one, so I may or may not need some pointers to sort it out, but I think whatever it is the fix will be small enough that the patch overall is still reviewable.
Re: Add --check option to pgindent
On Tue Dec 19, 2023 at 10:36 AM CST, Jelte Fennema-Nio wrote: On Mon, 18 Dec 2023 at 22:18, Tristan Partin wrote: > Here is an additional patch which implements the behavior you described. > The first patch is just Daniel's squashed version of my patches. I think we'd still want the early exit behaviour when only --check is provided. No need to spend time checking more files if you're not doing anything else. Good point. Patch looks good. On Mon, 18 Dec 2023 at 22:34, Tristan Partin wrote: > To me, the two options seem at odds, like you either check or write. How > would you feel about just capturing the diffs that are printed and > patch(1)-ing them? I tried capturing the diffs and patching them, but simply piping the pgindent output to patch(1) didn't work because the pipe loses the exit code of pgindent. -o pipefail would help with this, but it's not available in all shells. Also then it's suddenly unclear if pgindent failed or if patch failed. I was envisioning something along the lines of: pgindent --check --diff > patches.txt; status=$?; patch -p0 < patches.txt; exit $status -- Tristan Partin Neon (https://neon.tech)
Re: Add new for_each macros for iterating over a List that do not require ListCell pointer
On Tue, 19 Dec 2023 at 16:52, Nathan Bossart wrote: > I'm not sure we should proceed with rewriting most/all eligible foreach > loops. I think it's fine to use the new macros in new code or to update > existing loops in passing when changing nearby code, but rewriting > everything likely just introduces back-patching pain in return for little > discernible gain. To clarify: I totally agree that if we're not backpatching this we shouldn't do bulk changes on existing loops to avoid pain when backpatching other patches. > Unless there's some way to argue this is a bug, security issue, or data > corruption problem [0], I seriously doubt we will back-patch this. In the past some tooling changes have been backpatched, e.g. isolationtester has received various updates over the years (I know because this broke Citus's isolationtester tests a few times because the output files changed slightly). In some sense this patch could be considered tooling too. Again: not saying we should back-patch this, but we could only realistically bulk update loops if we do.
Re: Remove MSVC scripts from the tree
Hi, On Tue, 19 Dec 2023 at 18:24, Peter Eisentraut wrote: > > On 18.12.23 14:52, Peter Eisentraut wrote: > >> 2) I had seen that if sed/gzip is not available meson build will fail: > >> 2.a) > >> Program gsed sed found: NO > >> meson.build:334:6: ERROR: Program 'gsed sed' not found or not executable > > > > Yes, this would need to be improved. Currently, sed is only required if > > either selinux or dtrace is enabled, which isn't supported on Windows. > > But we should adjust the build scripts to not fail the top-level setup > > run unless those options are enabled. > > > >> 2.b) > >> Program gzip found: NO > >> meson.build:337:7: ERROR: Program 'gzip' not found or not executable > > > > gzip is only required for certain test suites, so again we should adjust > > the build scripts to not fail the build but instead skip the tests as > > appropriate. > > Here are patches for these two issues. More testing would be appreciated. 0001-meson-Require-sed-only-when-needed: +sed = find_program(get_option('SED'), 'sed', native: true, + required: get_option('dtrace').enabled() or get_option('selinux').enabled()) dtrace is disabled as default but selinux is set to auto. So, meson could find selinux ( because of the auto ) and fail to find sed, then compilation will fail with: contrib/sepgsql/meson.build:34:19: ERROR: Tried to use not-found external program in "command" I think we need to require sed when dtrace or selinux is found, not by looking at the return value of the get_option().enabled(). Second patch looks good to me. -- Regards, Nazir Bilal Yavuz Microsoft
Re: add non-option reordering to in-tree getopt_long
On Mon, Dec 18, 2023 at 09:31:54PM -0500, Tom Lane wrote: > Agreed, if it actually is 19 years old. I'm wondering a little bit > if there could be some moderately-recent glibc behavior change > involved. I'm not excited enough about it to go trawl their change > log, but we should keep our ears cocked for similar reports. From a brief glance, I believe this is long-standing behavior. Even though we advance optind at the bottom of the loop, the next getopt_long() call seems to reset it to the first non-option (which was saved in a previous call). -- Nathan Bossart Amazon Web Services: https://aws.amazon.com
Re: Add --check option to pgindent
On Mon, 18 Dec 2023 at 22:18, Tristan Partin wrote: > Here is an additional patch which implements the behavior you described. > The first patch is just Daniel's squashed version of my patches. I think we'd still want the early exit behaviour when only --check is provided. No need to spend time checking more files if you're not doing anything else. On Mon, 18 Dec 2023 at 22:34, Tristan Partin wrote: > To me, the two options seem at odds, like you either check or write. How > would you feel about just capturing the diffs that are printed and > patch(1)-ing them? I tried capturing the diffs and patching them, but simply piping the pgindent output to patch(1) didn't work because the pipe loses the exit code of pgindent. -o pipefail would help with this, but it's not available in all shells. Also then it's suddenly unclear if pgindent failed or if patch failed. Attached are two additional patches (in addition to your unchanged previous ones). One which adds the --write flag and one which early exits with --check when neither --write nor --diff are provided. The additional code I added for the --write flag is really minimal IMO. On Tue, 19 Dec 2023 at 14:51, Andrew Dunstan wrote: > Not sold on this. I don't want pgindent applied automatically to my > uncommitted changes, but I do want a reminder that I forgot to run it. > In any case, as you say it's a different topic. To be clear, with this patch just passing --check won't apply the changes automatically. Only when passing both --write and --check will it write the files. To clarify, my commit workflow is as follows: git add -p # interactively select all the changes that I want in my commit git commit # pre-commit hook fails because of pgindent and "fixes" my files in place but doesn't add them to the commit yet git add -p # I look at all the changes that pgindent makes to see if they are sensible and accept them, if not I find some other way to fix them git commit # now commit succeeded On Tue, 19 Dec 2023 at 14:58, Daniel Gustafsson wrote: > I think we risk making the tool confusing if we add too many options which can > all be combined to suit every need. The posted v5 seems like a good > compromise > I reckon. I personally think these options are completely independent. So it's more confusing to me that they cannot be combined. I updated the help message in 0003 as well, to describe them as completely independent: --diff show the changes that need to be made --check exit with status 2 if any changes need to be made --write rewrites files that need changes (default if neither --check/--diff/--write are provided) PS. prettier (javascript formatter) allows both --check and --write at the same time to do exactly this PPS. the help message didn't contain anything about pgindent writing files by default (i.e. when --check or --diff are not provided) PPPS. I attached my new pre-commit hook for reference. Attachments: v6-0001-Rename-non-destructive-modes-in-pgindent.patch, v6-0002-Print-all-diffs-with-pgindent-check.patch, v6-0003-Add-write-flag-to-pgindent.patch, v6-0004-Early-exit-with-pgindent-check-when-possible.patch, pre-commit
Re: backtrace_on_internal_error
Robert Haas writes: > The last change we made in this area that, at least for me, massively > improved debuggability was the change to log the current query string > when a backend crashes. That's such a huge help; I can't imagine going > back to the old way where you had basically no idea what made things > go boom. I think doing something like this can have a similarly > positive impact. It is going to take some work - from us and from > extension authors - to tidy things up so that it doesn't produce a > bunch of unwanted output, but the payoff will be the ability to > actually find and fix the bugs instead of just saying to a customer > "hey, sucks that you hit a bug, let us know if you find a reproducer." IMO, we aren't really going to get a massive payoff from this with the current backtrace output; it's just not detailed enough. It's better than nothing certainly, but to really move the goalposts we'd need something approaching gdb's "bt full" output. I wonder if it'd be sane to try to auto-invoke gdb. That's just blue sky for now, though. In the meantime, I agree with the proposal as it stands (that is, auto-backtrace on any XX000 error). We'll soon find out whether it's useless, or needs more detail to be really helpful, or is just right as it is. Once we have some practical experience with it, we can course-correct as needed. regards, tom lane
Re: [DOC] Introducing Quick Start Guide to PL/pgSQL and PL/Python Documentation
Subject: Clarification on the Purpose of the Patch Hi Peter, The intention was to address the challenge faced by newcomers in understanding how to write an extension for PostgreSQL. The existing documentation, while comprehensive, lacks a consolidated and easy-to-follow tutorial that serves as a quick start guide. The goal was to create a beginner-friendly resource that assumes only knowledge of Postgres and the target language, making it accessible for new contributors, since the barrier to entry is otherwise prohibitive. There are various third-party blog posts focusing on different areas, some of them contradictory. Specifically: 1. The new section titled "Quick Start Guide" aims to provide step-by-step instructions for users to get started with writing extensions in PL/pgSQL and PL/Python. 2. Explanations, code snippets, and examples were included to illustrate key concepts, making it easier for users to follow along. I understand your point about the basics of setting up an extension file structure being repeated. The intention was to provide a self-contained guide within each language's documentation, ensuring that users who directly access a specific language's documentation get a complete guide without having to navigate between sections. If there are areas where the existing documentation is already sufficient or if there are ways to improve the approach, I am open to suggestions. The primary aim is to enhance the accessibility of extension development information for newcomers. -- Best regards, Ishaan Adarsh On Tue, Dec 19, 2023 at 9:18 PM Peter Eisentraut wrote: > On 16.12.23 11:49, Ishaan Adarsh wrote: > > 1. Added a new section titled "Quick Start Guide" to both PL/pgSQL and > > PL/Python documentation. > > 2. Included step-by-step instructions for users to get started with > > these procedural languages. > > 3. Provided explanations, code snippets, and examples to illustrate key > > concepts. > > The way I read it, that's not really what your patch does. Your patch > explains how to write an extension in PL/pgSQL and PL/Python, > respectively. Which is okay, I guess, but a bit unusual. But I > wouldn't call that an unqualified "quick start" in the respective > languages. Also, it repeats the very basics of setting up an > extension file structure etc. in each chapter. > > The existing documentation has "explanations, code snippets, and > examples". Are they not good? Do you have more? Better ones? Why are > yours separate from the existing ones? > > I think it would be useful to take a step back here and define the > purpose of this a bit more clearly. >
Re: introduce dynamic shared memory registry
On Fri, Dec 08, 2023 at 04:36:52PM +0900, Michael Paquier wrote: > Yes, tracking that in a more central way can have many usages, so your > patch sounds like a good idea. Note that we have one case in core > that can be improved to make use of what you have here: autoprewarm.c. I'll add a patch for autoprewarm.c. Even if we don't proceed with that change, it'll be a good demonstration. > +dsm_registry_init_or_attach(const char *key, void **ptr, size_t size, > + void (*init_callback) (void *ptr)) > > This is shaped around dshash_find_or_insert(), but it looks like you'd > want an equivalent for dshash_find(), as well. What is the use-case for only verifying the existence of a segment? -- Nathan Bossart Amazon Web Services: https://aws.amazon.com
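To make the shape of the proposed API concrete, here is a hypothetical caller based only on the signature quoted above; the header path, struct, key name, and init logic are all invented for illustration:

```
#include "postgres.h"
#include "storage/dsm_registry.h"	/* header name assumed */
#include "storage/spin.h"

typedef struct MyExtState
{
	slock_t		mutex;
	int			counter;
} MyExtState;

/* runs only in the backend that first creates the segment */
static void
myext_init(void *ptr)
{
	MyExtState *state = (MyExtState *) ptr;

	SpinLockInit(&state->mutex);
	state->counter = 0;
}

static MyExtState *
myext_attach(void)
{
	MyExtState *state;

	dsm_registry_init_or_attach("my_extension", (void **) &state,
								sizeof(MyExtState), myext_init);
	return state;
}
```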
Re: introduce dynamic shared memory registry
On Tue, Dec 19, 2023 at 10:49:23AM -0500, Robert Haas wrote: > On Mon, Dec 18, 2023 at 3:32 AM Andrei Lepikhov > wrote: >> 2. I think a separate file for this feature looks too expensive. >> According to the gist of that code, it is a part of the DSA module. > > -1. I think this is a totally different thing than DSA. More files > aren't nearly as expensive as the confusion that comes from smushing > unrelated things together. Agreed. I think there's a decent chance that more functionality will be added to this registry down the line, in which case it will be even more important that this stuff stays separate from the tools it is built with. -- Nathan Bossart Amazon Web Services: https://aws.amazon.com
Re: backtrace_on_internal_error
On Fri, Dec 8, 2023 at 1:34 PM Andres Freund wrote: > Oh, very much agreed. But I suspect we won't quickly do the same for > out-of-core extensions... I feel like this is a problem that will sort itself out just fine. The rules about using ereport() and elog() could probably be better documented than they are, but doing that won't cause people to follow the practices any more rigorously than they have been. However, a change like this just might. If we make this policy change in core, then extension authors will probably get pressure from users to clean up any calls that are emitting excessively verbose log output, and that seems like a good thing. It's impossible to make an omelet without breaking some eggs, but the thing we're talking about here is, IMHO, extremely important. Users are forever hitting weird errors in production that aren't easy to reproduce on test systems, and because most elog() calls are written with the expectation that they won't be hit, they often contain minimal information, which IME makes it really difficult to understand what went wrong. A lot of these are things like - oh, this function expected a valid value of some sort, say a relkind, and it got some nonsense value, say a zero byte. But where did that nonsense value originate? That elog message can't tell you that, but a stack trace will. The last change we made in this area that, at least for me, massively improved debuggability was the change to log the current query string when a backend crashes. That's such a huge help; I can't imagine going back to the old way where you had basically no idea what made things go boom. I think doing something like this can have a similarly positive impact. It is going to take some work - from us and from extension authors - to tidy things up so that it doesn't produce a bunch of unwanted output, but the payoff will be the ability to actually find and fix the bugs instead of just saying to a customer "hey, sucks that you hit a bug, let us know if you find a reproducer." -- Robert Haas EDB: http://www.enterprisedb.com
Re: Remove MSVC scripts from the tree
On Tue Dec 19, 2023 at 9:24 AM CST, Peter Eisentraut wrote: On 18.12.23 14:52, Peter Eisentraut wrote: >> 2) I had seen that if sed/gzip is not available meson build will fail: >> 2.a) >> Program gsed sed found: NO >> meson.build:334:6: ERROR: Program 'gsed sed' not found or not executable > > Yes, this would need to be improved. Currently, sed is only required if > either selinux or dtrace is enabled, which isn't supported on Windows. > But we should adjust the build scripts to not fail the top-level setup > run unless those options are enabled. > >> 2.b) >> Program gzip found: NO >> meson.build:337:7: ERROR: Program 'gzip' not found or not executable > > gzip is only required for certain test suites, so again we should adjust > the build scripts to not fail the build but instead skip the tests as > appropriate. Here are patches for these two issues. More testing would be appreciated. Meson looks good to me! -- Tristan Partin Neon (https://neon.tech)
Re: introduce dynamic shared memory registry
On Mon, Dec 18, 2023 at 12:05:28PM +0300, Nikita Malakhov wrote: > Just a suggestion - maybe it is worth adding a function for detaching the > segment, > for cases when we unload and/or re-load the extension? Hm. We don't presently have a good way to unload a library, but you can certainly DROP EXTENSION, in which case you might expect the segment to go away or at least be reset. But even today, once a preloaded library is loaded, it stays loaded and its shared memory remains regardless of whether you CREATE/DROP extension. Can you think of problems with keeping the segment attached? -- Nathan Bossart Amazon Web Services: https://aws.amazon.com
Re: meson: Stop using deprecated way getting path of files
On Mon Dec 18, 2023 at 12:43 AM CST, John Naylor wrote: On Tue, Dec 5, 2023 at 3:27 AM Tristan Partin wrote: > > On Mon Dec 4, 2023 at 2:10 PM CST, Tom Lane wrote: > > Not sure what you were using, but are you aware that SQL access to the > > buildfarm database is available to project members? My own stock > > approach to checking on this sort of thing is like > > > > select * from > > (select sysname, snapshot, unnest(string_to_array(log_text, E'\n')) as l > > from build_status_log join snapshots using (sysname, snapshot) > > where log_stage = 'configure.log') ss > > where l like 'checking for builtin %' > > This looks useful. I had no idea about this. Can you send me any > resources for setting this up? My idea was just to do some web scraping. +1 -- I was vaguely aware of this, but can't find any mention of specifics in the buildfarm how-to, or other places I thought to look. From my off-list conversations with Andrew, database access to the buildfarm is for trusted contributors. I do not meet current criteria. I've thought about building a web-scraper to get at some of this information for non-trusted contributors. If that interests you, let me know, and maybe I can build it out over the holiday. Or maybe you meet the criteria! :) -- Tristan Partin Neon (https://neon.tech)
Re: introduce dynamic shared memory registry
On Mon, Dec 18, 2023 at 03:32:08PM +0700, Andrei Lepikhov wrote: > 3. The dsm_registry_init_or_attach routine allocates a DSM segment. Why not > create dsa_area for a requestor and return it? My assumption is that most modules just need a fixed-size segment, and if they really needed a DSA segment, the handle, tranche ID, etc. could just be stored in the DSM segment. Maybe that assumption is wrong, though... -- Nathan Bossart Amazon Web Services: https://aws.amazon.com
Re: Add new for_each macros for iterating over a List that do not require ListCell pointer
On Tue, Dec 19, 2023 at 03:44:43PM +0100, Jelte Fennema-Nio wrote: > On Tue, 19 Dec 2023 at 11:59, vignesh C wrote: >> I noticed that this change can be done in several other places too. > > My guess would be that ~90% of all existing foreach loops in the > codebase can be easily rewritten (and simplified) using these new > macros. So converting all of those would likely be quite a bit of > work. In patch 0003 I only converted a few of them to get some > coverage of the new macros and show how much simpler the usage of them > is. I'm not sure we should proceed with rewriting most/all eligible foreach loops. I think it's fine to use the new macros in new code or to update existing loops in passing when changing nearby code, but rewriting everything likely just introduces back-patching pain in return for little discernible gain. > And even once these patches are merged to master, I think we should > only do any bulk changes if/when we backport these macros to all > supported PG versions. Backporting to PG12 is probably the hardest, > since List's internal layout got heavily changed in PG13. Probably > not too hard though, in Citus we've had similar macros work since > PG11. I'm also not sure what the policy is for backporting patches > that introduce new functions/macros in public headers. Unless there's some way to argue this is a bug, security issue, or data corruption problem [0], I seriously doubt we will back-patch this. [0] https://www.postgresql.org/support/versioning/ -- Nathan Bossart Amazon Web Services: https://aws.amazon.com
Re: introduce dynamic shared memory registry
On Mon, Dec 18, 2023 at 3:32 AM Andrei Lepikhov wrote: > 2. I think a separate file for this feature looks too expensive. > According to the gist of that code, it is a part of the DSA module. -1. I think this is a totally different thing than DSA. More files aren't nearly as expensive as the confusion that comes from smushing unrelated things together. -- Robert Haas EDB: http://www.enterprisedb.com
Re: [DOC] Introducing Quick Start Guide to PL/pgSQL and PL/Python Documentation
On 16.12.23 11:49, Ishaan Adarsh wrote: 1. Added a new section titled "Quick Start Guide" to both PL/pgSQL and PL/Python documentation. 2. Included step-by-step instructions for users to get started with these procedural languages. 3. Provided explanations, code snippets, and examples to illustrate key concepts. The way I read it, that's not really what your patch does. Your patch explains how to write an extension in PL/pgSQL and PL/Python, respectively. Which is okay, I guess, but a bit unusual. But I wouldn't call that an unqualified "quick start" in the respective languages. Also, it repeats the very basics of setting up an extension file structure etc. in each chapter. The existing documentation has "explanations, code snippets, and examples". Are they not good? Do you have more? Better ones? Why are yours separate from the existing ones? I think it would be useful to take a step back here and define the purpose of this a bit more clearly.
Re: Add --check option to pgindent
On Mon Dec 18, 2023 at 3:18 PM CST, Tristan Partin wrote: On Mon Dec 18, 2023 at 10:50 AM CST, Tristan Partin wrote: > On Mon Dec 18, 2023 at 10:14 AM CST, Jelte Fennema-Nio wrote: > > On Mon, 18 Dec 2023 at 13:42, Daniel Gustafsson wrote: > > > I think this is pretty much ready to go, the attached v4 squashes the changes > > > and fixes the man-page which also needed an update. The referenced Wiki page > > > will need an edit or two after this goes in, but that's easy enough. > > > > One thing I'm wondering: When both --check and --diff are passed, > > should pgindent still early exit with 2 on the first incorrectly > > formatted file? Or should it show diffs for all failing files? I'm > > leaning towards the latter making more sense. > > Makes sense. Let me iterate on the squashed version of the patch. Here is an additional patch which implements the behavior you described. The first patch is just Daniel's squashed version of my patches. Assuming all this is good, I now have access to edit the Wiki. The PR for buildfarm client code is up, and hopefully that PR is correct. -- Tristan Partin Neon (https://neon.tech)
Re: Remove MSVC scripts from the tree
On 18.12.23 14:52, Peter Eisentraut wrote: 2) I had seen that if sed/gzip is not available meson build will fail: 2.a) Program gsed sed found: NO meson.build:334:6: ERROR: Program 'gsed sed' not found or not executable Yes, this would need to be improved. Currently, sed is only required if either selinux or dtrace is enabled, which isn't supported on Windows. But we should adjust the build scripts to not fail the top-level setup run unless those options are enabled. 2.b) Program gzip found: NO meson.build:337:7: ERROR: Program 'gzip' not found or not executable gzip is only required for certain test suites, so again we should adjust the build scripts to not fail the build but instead skip the tests as appropriate. Here are patches for these two issues. More testing would be appreciated.From 8c51d31676b8a5b9ee17ab3a04be142076f2a48f Mon Sep 17 00:00:00 2001 From: Peter Eisentraut Date: Tue, 19 Dec 2023 16:05:54 +0100 Subject: [PATCH 1/2] meson: Require sed only when needed sed is required only if dtrace or selinux is enabled. Otherwise, we don't need it. This avoids making sed a hard requirement on Windows. --- meson.build | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/meson.build b/meson.build index d3771690f0..b8c7ed8bf2 100644 --- a/meson.build +++ b/meson.build @@ -331,7 +331,8 @@ perl = find_program(get_option('PERL'), required: true, native: true) python = find_program(get_option('PYTHON'), required: true, native: true) flex = find_program(get_option('FLEX'), native: true, version: '>= 2.5.35') bison = find_program(get_option('BISON'), native: true, version: '>= 2.3') -sed = find_program(get_option('SED'), 'sed', native: true) +sed = find_program(get_option('SED'), 'sed', native: true, + required: get_option('dtrace').enabled() or get_option('selinux').enabled()) prove = find_program(get_option('PROVE'), native: true, required: false) tar = find_program(get_option('TAR'), native: true) gzip = find_program(get_option('GZIP'), native: true) -- 2.43.0 From d95033439f60ea19c5a316f658f0016b6691a232 Mon Sep 17 00:00:00 2001 From: Peter Eisentraut Date: Tue, 19 Dec 2023 16:20:41 +0100 Subject: [PATCH 2/2] meson: Make gzip and tar optional They are only used for some tests. The tests are already set to skip as appropriate if they are not available. --- contrib/basebackup_to_shell/meson.build | 4 ++-- meson.build | 4 ++-- src/bin/pg_basebackup/meson.build | 4 ++-- src/bin/pg_dump/meson.build | 2 +- src/bin/pg_verifybackup/meson.build | 4 ++-- 5 files changed, 9 insertions(+), 9 deletions(-) diff --git a/contrib/basebackup_to_shell/meson.build b/contrib/basebackup_to_shell/meson.build index a5488c3023..331ee1c9be 100644 --- a/contrib/basebackup_to_shell/meson.build +++ b/contrib/basebackup_to_shell/meson.build @@ -24,7 +24,7 @@ tests += { 'tests': [ 't/001_basic.pl', ], -'env': {'GZIP_PROGRAM': gzip.path(), -'TAR': tar.path()}, +'env': {'GZIP_PROGRAM': gzip.found() ? gzip.path() : '', +'TAR': tar.found() ? 
tar.path() : '' }, }, } diff --git a/meson.build b/meson.build index b8c7ed8bf2..c8a2630db8 100644 --- a/meson.build +++ b/meson.build @@ -334,8 +334,8 @@ bison = find_program(get_option('BISON'), native: true, version: '>= 2.3') sed = find_program(get_option('SED'), 'sed', native: true, required: get_option('dtrace').enabled() or get_option('selinux').enabled()) prove = find_program(get_option('PROVE'), native: true, required: false) -tar = find_program(get_option('TAR'), native: true) -gzip = find_program(get_option('GZIP'), native: true) +tar = find_program(get_option('TAR'), native: true, required: false) +gzip = find_program(get_option('GZIP'), native: true, required: false) program_lz4 = find_program(get_option('LZ4'), native: true, required: false) openssl = find_program(get_option('OPENSSL'), native: true, required: false) program_zstd = find_program(get_option('ZSTD'), native: true, required: false) diff --git a/src/bin/pg_basebackup/meson.build b/src/bin/pg_basebackup/meson.build index c426173db3..5445903a5b 100644 --- a/src/bin/pg_basebackup/meson.build +++ b/src/bin/pg_basebackup/meson.build @@ -80,8 +80,8 @@ tests += { 'sd': meson.current_source_dir(), 'bd': meson.current_build_dir(), 'tap': { -'env': {'GZIP_PROGRAM': gzip.path(), -'TAR': tar.path(), +'env': {'GZIP_PROGRAM': gzip.found() ? gzip.path() : '', +'TAR': tar.found() ? tar.path() : '', 'LZ4': program_lz4.found() ? program_lz4.path() : '', }, 'tests': [ diff --git a/src/bin/pg_dump/meson.build b/src/bin/pg_dump/meson.build index b6603e26a5..77d162cad4 100644 --- a/src/bin/pg_dump/meson.build +++ b/src/bin/pg_dump/meson.build @@ -90,7 +90,7 @@ tests += { 'bd': meson.current_build_dir(), 'tap': { 'env': { - 'GZIP_PROGRAM': gzip.path(), + 'GZIP_P
Re: Allow custom parameters with more than one dot in config files.
Alexander Kukushkin writes: > At the moment the only requirement for custom parameter names is that they > should have one or more dots. > ... > But, Postgres fails to start if such parameters are set in the > configuration file with the following error: Hmm. > In my opinion it would be fair to make parsing of config files with the > rest of the code responsible for GUC handling by allowing custom parameters > containing more than one dot. I wonder if we wouldn't be better advised to require exactly one dot. This isn't a feature that we really encourage users to use, and the further we move the goalposts for it, the harder it will be to replace it. In particular I doubt the long-stalled session-variables patch could support such names, since it needs the names to conform to normal SQL rules. regards, tom lane
Re: pg_waldump
Ok, thanks for all these clarifications. Regards, Fabrice On Tue, Dec 19, 2023 at 2:00 PM Matthias van de Meent < boekewurm+postg...@gmail.com> wrote: > On Tue, 19 Dec 2023, 12:27 Fabrice Chapuis, > wrote: > > > > Hi, > > Is it possible to visualize the DDL with the pg_waldump tool. I created > a postgres user but I cannot find the creation command in the wals > > Not really, no. PostgreSQL does not log DDL or DML as such in WAL. > Essentially all catalog updates are logged only as changes on a > certain page in some file: a new user getting inserted would be > approximately "Insert tuple [user's pg_role row data] on page X in > file [the file corresponding to the pg_role table]". > > You could likely derive most DDL commands from Heap/Insert, > Heap/Delete, and Heap/Update records (after cross-referencing the > database's relfilemap), as most DDL is "just" a lot of in-memory > operations plus some record insertions/updates/deletes in catalog > tables. You'd also need to keep track of any relfilemap changes while > processing the WAL, as VACUUM FULL on the catalog tables would change > the file numbering of catalog tables... > > Kind regards, > > Matthias van de Meent > Neon (https://neon.tech) >
Re: Transaction timeout
On Tue, Dec 19, 2023 at 6:27 PM Andrey M. Borodin wrote: > > > > > On 19 Dec 2023, at 13:26, Andrey M. Borodin wrote: > > > > I don’t have a Windows machine, so I hope CF bot will pick this. > > I used Github CI to produce a version of tests that seems to be stable on > Windows. > Sorry for the noise. > > > Best regards, Andrey Borodin. + +If transaction_timeout is shorter than +idle_in_transaction_session_timeout or statement_timeout +transaction_timeout will invalidate longer timeout. + When transaction_timeout is *equal* to idle_in_transaction_session_timeout or statement_timeout, idle_in_transaction_session_timeout and statement_timeout will also be invalidated. The logic in the code seems right, though the documentation is a little bit inaccurate. -- Regards Junwang Zhao
Re: Update the comment in nodes.h to cover Cardinality
On 19.12.23 07:23, Richard Guo wrote: By chance I discovered that the comment for the typedefs of "double"s does not cover Cardinality. Should we update that comment accordingly, maybe something like below? - * Typedefs for identifying qualifier selectivities and plan costs as such. - * These are just plain "double"s, but declaring a variable as Selectivity - * or Cost makes the intent more obvious. + * Typedefs for identifying qualifier selectivities, plan costs and + * estimated rows or other count as such. These are just plain "double"s, + * but declaring a variable as Selectivity, Cost or Cardinality makes the + * intent more obvious. Fixed, thanks.
Re: Add new for_each macros for iterating over a List that do not require ListCell pointer
On Tue, 19 Dec 2023 at 11:59, vignesh C wrote: > I noticed that this change can be done in several other places too. My guess would be that ~90% of all existing foreach loops in the codebase can be easily rewritten (and simplified) using these new macros. So converting all of those would likely be quite a bit of work. In patch 0003 I only converted a few of them to get some coverage of the new macros and show how much simpler the usage of them is. > Should we start doing these changes too now? I think we should at least wait until this patchset is merged before we start changing other places. If there's some feedback on the macros and we decide to change how they get called, then it would be a waste of time to have to change all the call sites. And even once these patches are merged to master, I think we should only do any bulk changes if/when we backport these macros to all supported PG versions. Backporting to PG12 is probably the hardest, since List's internal layout got heavily changed in PG13. Probably not too hard though, in Citus we've had similar macros work since PG11. I'm also not sure what the policy is for backporting patches that introduce new functions/macros in public headers. We probably even want to consider some automatic rewriting script (for the obvious cases) and/or timing the merge, to avoid having to do many rebases of the patch.
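For illustration, the kind of mechanical rewrite being discussed (macro name per the patch series; the callee is invented):

```
#include "postgres.h"
#include "nodes/pg_list.h"

extern void process_node(Node *n);	/* invented callee */

/* before: explicit ListCell bookkeeping */
static void
process_all_old(List *nodes)
{
	ListCell   *lc;

	foreach(lc, nodes)
	{
		Node	   *n = (Node *) lfirst(lc);

		process_node(n);
	}
}

/* after: the cell variable disappears */
static void
process_all_new(List *nodes)
{
	foreach_ptr(Node, n, nodes)
		process_node(n);
}
```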
Re: [DOC] Introducing Quick Start Guide to PL/pgSQL and PL/Python Documentation
On Tue, 19 Dec 2023 at 17:57, Ishaan Adarsh wrote: > I've addressed the points you raised: > > 1. *Missing `<filename>` Tag:* > I reviewed the "Create the Makefile" section, and it seems that <filename> > tags are appropriately used for filenames. If there's a specific instance > where you observed a missing tag, please provide more details, and I'll > ensure it's addressed. > Thanks. > 2. *Use `CREATE EXTENSION` in "extension--version.sql":* > Considering that there's already a CREATE EXTENSION step in the quick > start guide, I can include a note in the general documentation to explain > the rationale without repeating it in the individual script. What do you > think? Agreed. -- Regards, Japin Li ChengDu WenWu Information Technology Co., Ltd.
Re: Transaction timeout
On Tue, 19 Dec 2023 at 18:27, Andrey M. Borodin wrote: >> On 19 Dec 2023, at 13:26, Andrey M. Borodin wrote: >> >> I don’t have a Windows machine, so I hope CF bot will pick this. > > I used Github CI to produce a version of tests that seems to be stable on > Windows. It still failed on Windows Server 2019 [1]. diff -w -U3 C:/cirrus/src/test/isolation/expected/timeouts.out C:/cirrus/build/testrun/isolation/isolation/results/timeouts.out --- C:/cirrus/src/test/isolation/expected/timeouts.out 2023-12-19 10:34:30.354721100 + +++ C:/cirrus/build/testrun/isolation/isolation/results/timeouts.out 2023-12-19 10:38:25.877981600 + @@ -100,7 +100,7 @@ step stt3_check_stt2: SELECT count(*) FROM pg_stat_activity WHERE application_name = 'isolation/timeouts/stt2' count - -0 +1 (1 row) step itt4_set: SET idle_in_transaction_session_timeout = '1ms'; SET statement_timeout = '10s'; SET lock_timeout = '10s'; SET transaction_timeout = '10s'; [1] https://api.cirrus-ci.com/v1/artifact/task/4707530400595968/testrun/build/testrun/isolation/isolation/regression.diffs -- Regards, Japin Li ChengDu WenWu Information Technology Co., Ltd.
Re: Add --check option to pgindent
> On 19 Dec 2023, at 14:51, Andrew Dunstan wrote: > > On 2023-12-18 Mo 11:14, Jelte Fennema-Nio wrote: >> On Mon, 18 Dec 2023 at 13:42, Daniel Gustafsson wrote: >>> I think this is pretty much ready to go, the attached v4 squashes the >>> changes >>> and fixes the man-page which also needed an update. The referenced Wiki >>> page >>> will need an edit or two after this goes in, but that's easy enough. >> One thing I'm wondering: When both --check and --diff are passed, >> should pgindent still early exit with 2 on the first incorrectly >> formatted file? Or should it show diffs for all failing files? I'm >> leaning towards the latter making more sense. > > It should show them all. Agreed. >> Related (but not required for this patch): For my pre-commit hook I >> would very much like it if there was an option to have pgindent write >> the changes to disk, but still exit non-zero, e.g. a --write flag that >> could be combined with --check just like --diff and --check can be >> combined with this patch. Currently my pre-commit hook need two >> separate calls to pgindent to both abort the commit and write the >> required changes to disk. > > Not sold on this. I don't want pgindent applied automatically to my > uncommitted changes, but I do want a reminder that I forgot to run it. In any > case, as you say it's a different topic. I think we risk making the tool confusing if we add too many options which can all be combined to suit every need. The posted v5 seems like a good compromise I reckon. Andrew: When applying this, how do we synchronize with the buildfarm to avoid false negatives due to the BF using the wrong options? -- Daniel Gustafsson
Re: Add --check option to pgindent
On 2023-12-18 Mo 11:14, Jelte Fennema-Nio wrote: On Mon, 18 Dec 2023 at 13:42, Daniel Gustafsson wrote: I think this is pretty much ready to go, the attached v4 squashes the changes and fixes the man-page which also needed an update. The referenced Wiki page will need an edit or two after this goes in, but that's easy enough. One thing I'm wondering: When both --check and --diff are passed, should pgindent still early exit with 2 on the first incorrectly formatted file? Or should it show diffs for all failing files? I'm leaning towards the latter making more sense. It should show them all. Related (but not required for this patch): For my pre-commit hook I would very much like it if there was an option to have pgindent write the changes to disk, but still exit non-zero, e.g. a --write flag that could be combined with --check just like --diff and --check can be combined with this patch. Currently my pre-commit hook need two separate calls to pgindent to both abort the commit and write the required changes to disk. Not sold on this. I don't want pgindent applied automatically to my uncommitted changes, but I do want a reminder that I forgot to run it. In any case, as you say it's a different topic. cheers andrew -- Andrew Dunstan EDB: https://www.enterprisedb.com
Correct the documentation of extension building infrastructure
Hi hackers, I noticed that the documentation of extension building infrastructure [1] has a paragraph about isolation tests since v12, but they actually can be run since v14 only. Actual isolation tests support for extensions built with PGXS was added in [2] and it was agreed there that it should not be backpatched. How about fixing the documentation for v12 and v13 then? [1]: https://www.postgresql.org/docs/12/extend-pgxs.html [2]: https://www.postgresql.org/message-id/flat/CAMsr%2BYFsCMH3B4uOPFE%2B2qWM6k%3Do%3Dhf9LGiPNCfwqKdUPz_BsQ%40mail.gmail.com Best regards, Karina Litskevich Postgres Professional: http://postgrespro.com/
Re: Synchronizing slots from primary to standby
On Tue, Dec 19, 2023 at 5:17 PM shveta malik wrote: > > On Mon, Dec 18, 2023 at 4:22 PM Amit Kapila wrote: > > > > On Fri, Dec 15, 2023 at 11:03 AM shveta malik > > wrote: > > > > > > Sorry, I missed attaching the patch. PFA v48. > > > > > > > Few comments on v48_0002 > > > > Thanks for reviewing. These are addressed in v50. > I was still reviewing the v48 version and have a few comments as below. If some of these are already addressed or not relevant, feel free to ignore them. 1. + /* + * Slot sync worker can be stopped at any time. + * Use exit status 1 so the background worker is restarted. We don't need to start the second line of comment in a separate line. 2. + * The assumption is that these dropped local invalidated slots will get + * recreated in next sync-cycle and it is okay to drop and recreate such slots In the above line '.. local invalidated ..' sounds redundant. Shall we remove it? 3. + if (remote_slot->confirmed_lsn > WalRcv->latestWalEnd) + { + SpinLockRelease(&WalRcv->mutex); + elog(ERROR, "skipping sync of slot \"%s\" as the received slot sync " This error message looks odd to me. At least, it should be exiting instead of skipping because we won't continue after this. 4. + /* User created slot with the same name exists, raise ERROR. */ + if (sync_state == SYNCSLOT_STATE_NONE) + ereport(ERROR, + (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE), + errmsg("skipping sync of slot \"%s\" as it is a user created" + " slot", remote_slot->name), + errdetail("This slot has failover enabled on the primary and" +" thus is sync candidate but user created slot with" +" the same name already exists on the standby"))); Same problem as above. The skipping in error message doesn't seem to be suitable for the purpose. Additionally, errdetail message should end with a full stop. 5. + /* Slot ready for sync, so sync it. */ + if (sync_state == SYNCSLOT_STATE_READY) + { + /* + * Sanity check: With hot_standby_feedback enabled and + * invalidations handled appropriately as above, this should never + * happen. + */ + if (remote_slot->restart_lsn < slot->data.restart_lsn) + elog(ERROR, + "not synchronizing local slot \"%s\" LSN(%X/%X)" + " to remote slot's LSN(%X/%X) as synchronization " The start of the error message sounds odd. Shall we say 'cannot synchronize ...'? 6. All except one of the callers of local_slot_update() marks the slot dirty and the same is required as well. I think the remaining caller should also mark it dirty and we should move ReplicationSlotMarkDirty() in the caller space. 7. +synchronize_one_slot(WalReceiverConn *wrconn, RemoteSlot *remote_slot) { ... + /* + * Copy the invalidation cause from remote only if local slot is not + * invalidated locally, we don't want to overwrite existing one. + */ + if (slot->data.invalidated == RS_INVAL_NONE) + { + SpinLockAcquire(&slot->mutex); + slot->data.invalidated = remote_slot->invalidated; + SpinLockRelease(&slot->mutex); + } ... It doesn't seem that after changing the invalidated flag, we always mark the slot dirty. Am I missing something? 8. + /* + * Drop local slots that no longer need to be synced. Do it before + * synchronize_one_slot to allow dropping of slots before actual sync + * which are invalidated locally while still valid on the primary server. + */ + drop_obsolete_slots(remote_slot_list); The second part of the above comment seems redundant as that is obvious. 9.
+static WalReceiverConn * +remote_connect(void) +{ + WalReceiverConn *wrconn = NULL; + char*err; + + wrconn = walrcv_connect(PrimaryConnInfo, true, false, + cluster_name[0] ? cluster_name : "slotsyncworker", + &err); + if (wrconn == NULL) + ereport(ERROR, + (errcode(ERRCODE_CONNECTION_FAILURE), + errmsg("could not connect to the primary server: %s", err))); + return wrconn; +} Do we need a function for this? It appears to be called from just one place, so not sure if it is helpful to have a function for this. -- With Regards, Amit Kapila.
RE: Synchronizing slots from primary to standby
Dear Shveta, I have resumed reviewing the patch. I will play with it more, but I can post some cosmetic comments now. walsender.c 01. WalSndWaitForStandbyConfirmation ``` +sleeptime = WalSndComputeSleeptime(GetCurrentTimestamp()); ``` It works well, but I'm not sure whether we should use WalSndComputeSleeptime() because the function won't be called by walsender. 02. WalSndWaitForStandbyConfirmation ``` +ConditionVariableTimedSleep(&WalSndCtl->wal_confirm_rcv_cv, sleeptime, + WAIT_EVENT_WAL_SENDER_WAIT_FOR_STANDBY_CONFIRMATION) ``` Hmm, is it OK to use the same event as WalSndWaitForWal()? IIUC it should be avoided. 03. WalSndShmemInit() ``` + +ConditionVariableInit(&WalSndCtl->wal_confirm_rcv_cv); ``` Unnecessary blank? ~ 050_standby_failover_slots_sync.pl 04. General My pgperltidy modified your test. Please check. 05. ``` # Create publication on the primary ``` Missing "a" before publication? 06. ``` $subscriber1->init(allows_streaming => 'logical'); ... $subscriber2->init(allows_streaming => 'logical'); ``` IIUC, these settings are not needed. 07. ``` my $primary_insert_time = time(); ``` The variable is not used. 08. ``` # Stop the standby associated with the specified physical replication slot so # that the logical replication slot won't receive changes until the standby # slot's restart_lsn is advanced or the slot is removed from the # standby_slot_names list ``` Missing comma? 09. ``` $back_q->query_until(qr//, "SELECT pg_logical_slot_get_changes('test_slot', NULL, NULL);\n"); ``` Not sure, but do we have to close the back_q connection? 10. ``` # Remove the standby from the standby_slot_names list and reload the # configuration $primary->adjust_conf('postgresql.conf', 'standby_slot_names', "''"); $primary->psql('postgres', "SELECT pg_reload_conf()"); ``` a. Missing comma? b. I counted, and the reload function in perl (e.g., `$primary->reload;`) is used more often. Do you have a reason to use pg_reload_conf()? 11. ``` # Now that the standby lsn has advanced, the primary must send the decoded # changes to the subscription. $publisher->wait_for_catchup('regress_mysub1'); ``` Is the comment correct? I think the primary sends data because the GUC is modified. 12. ``` # Put the standby back on the primary_slot_name for the rest of the tests $primary->adjust_conf('postgresql.conf', 'standby_slot_names', 'sb1_slot'); $primary->restart(); ``` Just to confirm - you used restart() here because we must ensure the GUC change is propagated to all backends, right? ~ wait_event_names.txt 13. ``` +WAL_SENDER_WAIT_FOR_STANDBY_CONFIRMATION "Waiting for the WAL to be received by physical standby in WAL sender process." ``` But there is a possibility that backend processes may wait with the event, right? Best Regards, Hayato Kuroda FUJITSU LIMITED
Re: pg_waldump
On Tue, 19 Dec 2023, 12:27 Fabrice Chapuis, wrote: > > Hi, > Is it possible to visualize the DDL with the pg_waldump tool. I created a > postgres user but I cannot find the creation command in the wals Not really, no. PostgreSQL does not log DDL or DML as such in WAL. Essentially all catalog updates are logged only as changes on a certain page in some file: a new user getting inserted would be approximately "Insert tuple [user's pg_role row data] on page X in file [the file corresponding to the pg_role table]". You could likely derive most DDL commands from Heap/Insert, Heap/Delete, and Heap/Update records (after cross-referencing the database's relfilemap), as most DDL is "just" a lot of in-memory operations plus some record insertions/updates/deletes in catalog tables. You'd also need to keep track of any relfilemap changes while processing the WAL, as VACUUM FULL on the catalog tables would change the file numbering of catalog tables... Kind regards, Matthias van de Meent Neon (https://neon.tech)
Re: logical decoding and replication of sequences, take 2
Hi, I wanted to hop in here on one particular issue: > On Dec 12, 2023, at 02:01, Tomas Vondra wrote: > - desirability of the feature: Random IDs (UUIDs etc.) are likely a much > better solution for distributed (esp. active-active) systems. But there > are important use cases that are likely to keep using regular sequences > (online upgrades of single-node instances, existing systems, ...). +1. Right now, the lack of sequence replication is a rather large foot-gun on logical replication upgrades. Copying the sequences over during the cutover period is doable, of course, but: (a) There's no out-of-the-box tooling that does it, so everyone has to write some scripts just for that one function. (b) It's one more thing that extends the cutover window. I don't think it is a good idea to make it mandatory: for example, there's a strong use case for replicating a table but not a sequence associated with it. But it's definitely a missing feature in logical replication.
Allow custom parameters with more than one dot in config files.
Hi hacker, At the moment the only requirement for custom parameter names is that they should have one or more dots. For example: test5=# set a.b.c.d = 1; SET test5=# show a.b.c.d; a.b.c.d - 1 (1 row) But, Postgres fails to start if such parameters are set in the configuration file with the following error: LOG: syntax error in file "/tmp/cluster/postgresql.auto.conf" line 8, near token "a.b.c.d" FATAL: configuration file "postgresql.auto.conf" contains errors What is more fun, ALTER SYSTEM allows writing such parameters to the postgresql.auto.conf file if they are defined in a session: test5=# show a.b.c.d; ERROR: unrecognized configuration parameter "a.b.c.d" test5=# alter system set a.b.c.d = 1; ERROR: unrecognized configuration parameter "a.b.c.d" test5=# set a.b.c.d = 1; SET test5=# alter system set a.b.c.d = 1; ALTER SYSTEM In my opinion it would be fair to make parsing of config files consistent with the rest of the code responsible for GUC handling by allowing custom parameters containing more than one dot. The fix is rather simple, please find the patch attached. Regards, -- Alexander Kukushkin From 755b7d9a44901f3dc5f86f83c39c98c269a98c85 Mon Sep 17 00:00:00 2001 From: Alexander Kukushkin Date: Tue, 19 Dec 2023 13:21:43 +0100 Subject: [PATCH] Allow custom parameters with more than one dot in config files. To make it consistent with the rest of the code responsible for GUC handling. --- src/backend/utils/misc/guc-file.l | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/src/backend/utils/misc/guc-file.l b/src/backend/utils/misc/guc-file.l index 41d62a9f23..22250d12a1 100644 --- a/src/backend/utils/misc/guc-file.l +++ b/src/backend/utils/misc/guc-file.l @@ -82,7 +82,8 @@ LETTER [A-Za-z_\200-\377] LETTER_OR_DIGIT [A-Za-z_0-9\200-\377] ID{LETTER}{LETTER_OR_DIGIT}* -QUALIFIED_ID {ID}"."{ID} +SUB_ID "."{ID} +QUALIFIED_ID {ID}{SUB_ID}+ UNQUOTED_STRING {LETTER}({LETTER_OR_DIGIT}|[-._:/])* STRING \'([^'\\\n]|\\.|\'\')*\' -- 2.34.1
Re: pg_waldump
Hello It is something like: rmgr: Heap len (rec/tot): 143/143, tx: 748, lsn: 0/01530AF0, prev 0/01530AB8, desc: INSERT off: 17, flags: 0x00, blkref #0: rel 1664/0/1260 blk 0, which is just an insertion into pg_authid (oid=1260) regards, Sergei
Re: Synchronizing slots from primary to standby
On Tue, Dec 19, 2023 at 6:58 AM Peter Smith wrote: > > Here are some comments for the patch v49-0002. > Thanks for reviewing. I have addressed these in v50. > (This is in addition to my review comments for v48-0002 [1]) > > == > src/backend/access/transam/xlogrecovery.c > > > 1. FinishWalRecovery > > + * > + * We do not update the sync_state from READY to NONE here, as any failed > + * update could leave some slots in the 'NONE' state, causing issues during > + * slot sync after restarting the server as a standby. While updating after > + * switching to the new timeline is an option, it does not simplify the > + * handling for both READY and NONE state slots. Therefore, we retain the > + * READY state slots after promotion as they can provide useful information > + * about their origin. > + */ > > Do you know if that wording is correct? e.g., If you were updating > from READY to NONE and there was a failed update, that would leave > some slots still in a READY state, right? So why does the comment say > "could leave some slots in the 'NONE' state"? > yes, it the comment is correct as stated in [1] [1]: https://www.postgresql.org/message-id/CAA4eK1LoJSbFJwa%3D97_5qHNAVfOkmfc40W_SFMVBbm6r0%3DPXHQ%40mail.gmail.com > == > src/backend/replication/slot.c > > 2. ReplicationSlotAlter > > + /* > + * Do not allow users to drop the slots which are currently being synced > + * from the primary to the standby. > + */ > + if (RecoveryInProgress() && > + MyReplicationSlot->data.sync_state != SYNCSLOT_STATE_NONE) > + ereport(ERROR, > + (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE), > + errmsg("cannot alter replication slot \"%s\"", name), > + errdetail("This slot is being synced from the primary server."))); > + > > The comment looks wrong -- should say "Do not allow users to alter..." > > == > > 3. > +## > +# Test that synchronized slot can neither be decoded nor dropped by the user > +## > + > > 3a, > /Test that synchronized slot/Test that a synchronized slot/ > > 3b. > Isn't there a missing test? Should this part also check that it cannot > ALTER the replication slot being synced? e.g. test for the new v49 > error message that was added in ReplicationSlotAlter() > > ~~~ > > 4. > +# Disable hot_standby_feedback > +$standby1->safe_psql('postgres', 'ALTER SYSTEM SET > hot_standby_feedback = off;'); > +$standby1->restart; > + > > Can there be a comment added to explain why you are doing the > 'hot_standby_feedback' toggle? > > ~~~ > > 5. > +## > +# Promote the standby1 to primary. Confirm that: > +# a) the sync-ready('r') slot 'lsub1_slot' is retained on the new primary > +# b) the initiated('i') slot 'logical_slot' is dropped on promotion > +# c) logical replication for regress_mysub1 is resumed succesfully > after failover > +## > > /succesfully/successfully/ > > ~~~ > > 6. > + > +# Confirm that data in tab_mypub3 replicated on subscriber > +is( $subscriber1->safe_psql('postgres', q{SELECT count(*) FROM tab_int;}), > + "$primary_row_count", > + 'data replicated from the new primary'); > > The comment is wrong -- it names a different table ('tab_mypub3' ?) to > what the SQL says. > > == > [1] My v48-0002 review comments. > https://www.postgresql.org/message-id/CAHut%2BPsyZQZ1A4XcKw-D%3DvcTg16pN9Dw0PzE8W_X7Yz_bv00rQ%40mail.gmail.com > > Kind Regards, > Peter Smith. > Fujitsu Australia
Re: Synchronizing slots from primary to standby
On Tue, Dec 19, 2023 at 4:51 AM Peter Smith wrote: > > Here are some review comments for v48-0002 > Thanks for reviewing. Most of these are addressed in v50. Please find my comments for the rest. > == > doc/src/sgml/config.sgml > > 1. > + If slot synchronization is enabled then it is also necessary to > + specify dbname in the > + primary_conninfo string. This will only > be used for > + slot synchronization. It is ignored for streaming. > > I felt the "If slot synchronization is enabled" part should also > include an xref to the enable_slotsync GUC, otherwise there is no > information here about how to enable it. > > SUGGESTION > If slot synchronization is enabled (see XXX) > > == > doc/src/sgml/logicaldecoding.sgml > > 2. > + > + The ability to resume logical replication after failover depends upon > the > + linkend="view-pg-replication-slots">pg_replication_slots.sync_state > + value for the synchronized slots on the standby at the time of failover. > + Only slots that have attained "ready" sync_state ('r') on the standby > + before failover can be used for logical replication after failover. > Slots > + that have not yet reached 'r' state (they are still 'i') will be > dropped, > + therefore logical replication for those slots cannot be resumed. For > + example, if the synchronized slot could not become sync-ready on standby > + due to a disabled subscription, then the subscription cannot be resumed > + after failover even when it is enabled. > + If the primary is idle, then the synchronized slots on the standby may > + take a noticeable time to reach the ready ('r') sync_state. This can > + be sped up by calling the > + pg_log_standby_snapshot function on the primary. > + > > 2a. > /sync-ready on standby/sync-ready on the standby/ > > ~ > > 2b. > Should "If the primary is idle" be in a new paragraph? > > == > doc/src/sgml/system-views.sgml > > 3. > + > + The hot standby can have any of these sync_state values for the slots > but > + on a hot standby, the slots with state 'r' and 'i' can neither be used > + for logical decoding nor dropped by the user. > + The sync_state has no meaning on the primary server; the primary > + sync_state value is default 'n' for all slots but may (if leftover > + from a promoted standby) also be 'r'. > + > > I still feel we are exposing too much useless information about the > primary server values. > > Isn't it sufficient to just say "The sync_state values have no meaning > on a primary server.", and not bother to mention what those > meaningless values might be -- e.g. if they are meaningless then who > cares what they are or how they got there? > I am retaining the original info till we find a better place for it as suggested by Amit in [1] > == > src/backend/replication/logical/slotsync.c > > 4. synchronize_one_slot > > + /* Slot ready for sync, so sync it. */ > + if (sync_state == SYNCSLOT_STATE_READY) > + { > + /* > + * Sanity check: With hot_standby_feedback enabled and > + * invalidations handled appropriately as above, this should never > + * happen. 
> + */ > + if (remote_slot->restart_lsn < slot->data.restart_lsn) > + elog(ERROR, > + "not synchronizing local slot \"%s\" LSN(%X/%X)" > + " to remote slot's LSN(%X/%X) as synchronization " > + " would move it backwards", remote_slot->name, > + LSN_FORMAT_ARGS(slot->data.restart_lsn), > + LSN_FORMAT_ARGS(remote_slot->restart_lsn)); > + > + if (remote_slot->confirmed_lsn != slot->data.confirmed_flush || > + remote_slot->restart_lsn != slot->data.restart_lsn || > + remote_slot->catalog_xmin != slot->data.catalog_xmin) > + { > + /* Update LSN of slot to remote slot's current position */ > + local_slot_update(remote_slot); > + ReplicationSlotSave(); > + slot_updated = true; > + } > + } > + /* Slot not ready yet, let's attempt to make it sync-ready now. */ > + else if (sync_state == SYNCSLOT_STATE_INITIATED) > + { > + /* > + * Wait for the primary server to catch-up. Refer to the comment > + * atop the file for details on this wait. > + */ > + if (remote_slot->restart_lsn < slot->data.restart_lsn || > + TransactionIdPrecedes(remote_slot->catalog_xmin, > + slot->data.catalog_xmin)) > + { > + if (!wait_for_primary_slot_catchup(wrconn, remote_slot, NULL)) > + { > + ReplicationSlotRelease(); > + return false; > + } > + } > + > + /* > + * Wait for primary is over, update the lsns and mark the slot as > + * READY for further syncs. > + */ > + local_slot_update(remote_slot); > + SpinLockAcquire(&slot->mutex); > + slot->data.sync_state = SYNCSLOT_STATE_READY; > + SpinLockRelease(&slot->mutex); > + > + /* Save the changes */ > + ReplicationSlotMarkDirty(); > + ReplicationSlotSave(); > + slot_updated = true; > + > + ereport(LOG, > + errmsg("newly locally created slot \"%s\" is sync-ready now", > +remote_slot->name)); > + } > > 4a. > It would be more natur
Re: Synchronizing slots from primary to standby
On Mon, Dec 18, 2023 at 4:22 PM Amit Kapila wrote: > > On Fri, Dec 15, 2023 at 11:03 AM shveta malik wrote: > > > > Sorry, I missed attaching the patch. PFA v48. > > > > Few comments on v48_0002 > Thanks for reviewing. These are addressed in v50. Please find my comments inline for some of these. > 1. > +static void > +slotsync_reread_config(WalReceiverConn *wrconn) > { > ... > + pfree(old_primary_conninfo); > + pfree(old_primary_slotname); > + > + if (restart) > + { > + char*msg = "slot sync worker will restart because of a parameter > change"; > + > + /* > + * The exit code 1 will make postmaster restart the slot sync worker. > + */ > + slotsync_worker_exit(msg, 1 /* proc_exit code */ ); > + } > ... > > I don't see the need to explicitly pfree in case we are already > exiting the process because anyway the memory will be released. We can > avoid using the 'restart' variable for this. I have moved pfree to the end where we do not exit the worker. Removed restart variable. >Also, probably, directly > exiting here makes sense and at another place where this function is > used. I see that in maybe_reread_subscription(), we exit with a 0 code > and still apply worker restarts, so why use a different exit code > here? > Logical rep worker is started by logical rep launcher and it has different logic of restarting it. OTOH, slot-sync worker is started by the postmaster and the postmaster starts any of its bgworkers only if the worker had an abnormal exit and restart_time is given during registration of the worker. Thus we need exit_code here. I have removed the new function added though. > 2. > +static void > +check_primary_info(WalReceiverConn *wrconn, bool *am_cascading_standby) > { > ... > + remote_in_recovery = DatumGetBool(slot_getattr(tupslot, 1, &isnull)); > + Assert(!isnull); > + > + /* No need to check further, return that we are cascading standby */ > + if (remote_in_recovery) > + { > + *am_cascading_standby = true; > + CommitTransactionCommand(); > + return; > ... > } > > Don't we need to clear the result and tuple in case of early return? Yes, it was needed. Modified. > > 3. It would be a good idea to mention about requirements like a > physical slot on primary, hot_standby_feedback, etc. in the commit > message. > > 4. > +static bool > +wait_for_primary_slot_catchup(WalReceiverConn *wrconn, RemoteSlot > *remote_slot, > + bool *wait_attempts_exceeded) > { > ... > + tupslot = MakeSingleTupleTableSlot(res->tupledesc, &TTSOpsMinimalTuple); > + if (!tuplestore_gettupleslot(res->tuplestore, true, false, tupslot)) > + { > + ereport(WARNING, > + (errmsg("slot \"%s\" creation aborted", remote_slot->name), > + errdetail("This slot was not found on the primary server"))); > ... > + /* > + * It is possible to get null values for LSN and Xmin if slot is > + * invalidated on the primary server, so handle accordingly. > + */ > + new_invalidated = DatumGetBool(slot_getattr(tupslot, 1, &isnull)); > + Assert(!isnull); > + > + new_restart_lsn = DatumGetLSN(slot_getattr(tupslot, 2, &isnull)); > + if (new_invalidated || isnull) > + { > + ereport(WARNING, > + (errmsg("slot \"%s\" creation aborted", remote_slot->name), > + errdetail("This slot was invalidated on the primary server"))); > ... > } > > a. The errdetail message should end with a full stop. Please check all > other errdetail messages in the patch to follow the same guideline. > b. I think saying slot creation aborted is not completely correct > because we would have created the slot especially when it is in 'i' > state. 
Can we change it to something like: "aborting initial sync for > slot \"%s\""? > c. Also, if the remote_slot is invalidated, ideally, we can even drop > the local slot but it seems that the patch will drop the same before > the next-sync cycle with any other slot that needs to be dropped. If > so, can we add the comment to indicate the same? I have added comments. Basically, it will be dropped in caller only if it is 'RS_EPHEMERAL' state else if it is already persisted, then will be maintained as is but marked as invalidated in caller and its sync will be skipped next time onwards. > > 5. > +static void > +local_slot_update(RemoteSlot *remote_slot) > +{ > + Assert(MyReplicationSlot->data.invalidated == RS_INVAL_NONE); > + > + LogicalConfirmReceivedLocation(remote_slot->confirmed_lsn); > + LogicalIncreaseXminForSlot(remote_slot->confirmed_lsn, > +remote_slot->catalog_xmin); > + LogicalIncreaseRestartDecodingForSlot(remote_slot->confirmed_lsn, > + remote_slot->restart_lsn); > + > + SpinLockAcquire(&MyReplicationSlot->mutex); > + MyReplicationSlot->data.invalidated = remote_slot->invalidated; > + SpinLockRelease(&MyReplicationSlot->mutex); > ... > ... > > If required, the invalidated flag is updated in the caller as well, so > why do we need to update it here as well? > It was needed by the part where the slot is not existing and we need to create a new slot. I have now moved the invalidation che
Re: Fixing backslash dot for COPY FROM...CSV
vignesh C wrote: > I noticed that these tests are passing without applying patch too: > +insert into copytest2(test) values('line1'), ('\.'), ('line2'); > +copy (select test from copytest2 order by test collate "C") to :'filename' > csv; > +-- get the data back in with copy > +truncate copytest2; > +copy copytest2(test) from :'filename' csv; > +select test from copytest2 order by test collate "C"; > > I was not sure if this was intentional. Can we add a test which fails > in HEAD and passes with the patch applied. Thanks for checking this out. Indeed, that was not intentional. I've been using static files in my tests and forgot that if the data was produced with COPY OUT, it would quote backslash-dot so that COPY IN could reload it without problem. PFA an updated version that uses \qecho to produce the data instead of COPY OUT. This test on unpatched HEAD shows that copytest2 is missing 2 rows after COPY IN. Best regards, -- Daniel Vérité https://postgresql.verite.pro/ Twitter: @DanielVerite From 0137963134ac88b009a8e93d6a39cabfaefe43f5 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Daniel=20V=C3=A9rit=C3=A9?= Date: Tue, 19 Dec 2023 11:49:35 +0100 Subject: [PATCH v2] Support backslash-dot on a line by itself as valid data in COPY FROM...CSV --- doc/src/sgml/ref/psql-ref.sgml | 4 src/backend/commands/copyfromparse.c | 13 ++- src/bin/psql/copy.c | 32 src/test/regress/expected/copy.out | 18 src/test/regress/sql/copy.sql| 12 +++ 5 files changed, 55 insertions(+), 24 deletions(-) diff --git a/doc/src/sgml/ref/psql-ref.sgml b/doc/src/sgml/ref/psql-ref.sgml index cc7d797159..d94e3cacfc 100644 --- a/doc/src/sgml/ref/psql-ref.sgml +++ b/doc/src/sgml/ref/psql-ref.sgml @@ -1119,10 +1119,6 @@ INSERT INTO tbl1 VALUES ($1, $2) \bind 'first value' 'second value' \g destination, because all data must pass through the client/server connection. For large amounts of data the SQL command might be preferable. -Also, because of this pass-through method, \copy -... from in CSV mode will erroneously -treat a \. data value alone on a line as an -end-of-input marker. diff --git a/src/backend/commands/copyfromparse.c b/src/backend/commands/copyfromparse.c index f553734582..b4c291b25b 100644 --- a/src/backend/commands/copyfromparse.c +++ b/src/backend/commands/copyfromparse.c @@ -1137,7 +1137,6 @@ CopyReadLineText(CopyFromState cstate) boolresult = false; /* CSV variables */ - boolfirst_char_in_line = true; boolin_quote = false, last_was_esc = false; charquotec = '\0'; @@ -1335,10 +1334,9 @@ CopyReadLineText(CopyFromState cstate) } /* -* In CSV mode, we only recognize \. alone on a line. This is because -* \. is a valid CSV data value. +* In CSV mode, \. is a valid CSV data value anywhere in the line. */ - if (c == '\\' && (!cstate->opts.csv_mode || first_char_in_line)) + if (c == '\\' && !cstate->opts.csv_mode) { charc2; @@ -1442,14 +1440,7 @@ CopyReadLineText(CopyFromState cstate) } } - /* -* This label is for CSV cases where \. appears at the start of a -* line, but there is more text after it, meaning it was a data value. -* We are more strict for \. in CSV mode because \. could be a data -* value, while in non-CSV mode, \. cannot be a data value. -*/ not_end_of_copy: - first_char_in_line = false; } /* end of outer loop */ /* diff --git a/src/bin/psql/copy.c b/src/bin/psql/copy.c index dbbbdb8898..f31a7acbb6 100644 --- a/src/bin/psql/copy.c +++ b/src/bin/psql/copy.c @@ -620,20 +620,34 @@ handleCopyIn(PGconn *conn, FILE *copystream, bool isbinary, PGresult **res) /* current line is done? 
*/ if (buf[buflen - 1] == '\n') { - /* check for EOF marker, but not on a partial line */ - if (at_line_begin) + /* +* When at the beginning of the line, check for EOF marker. +* If the marker is found and the data is inlined, +* we must stop at this point. If not, the \. line can be +* sent to the server, and we let it decide whether +
pg_waldump
Hi, Is it possible to visualize the DDL with the pg_waldump tool? I created a postgres user but I cannot find the creation command in the WAL files. Thanks for help, Fabrice
Re: Improve eviction algorithm in ReorderBuffer
On Tue, Dec 19, 2023 at 8:31 AM Masahiko Sawada wrote: > > On Sun, Dec 17, 2023 at 11:40 AM Amit Kapila wrote: > > > > > > The individual transactions shouldn't cross > > 'logical_decoding_work_mem'. I got a bit confused by your proposal to > > maintain the lists: "...splitting it into two lists: transactions > > consuming 5% < and 5% >= of the memory limit, and checking the 5% >= > > list preferably.". In the previous sentence, what did you mean by > > transactions consuming 5% >= of the memory limit? I got the impression > > that you are saying to maintain them in a separate transaction list, > > which doesn't seem to be the case. > > I wanted to mean that there are three lists in total: the first one > maintains the transactions consuming more than 10% of > logical_decoding_work_mem, > How can we have multiple transactions in the list consuming more than 10% of logical_decoding_work_mem? Shouldn't we perform serialization before any xact reaches logical_decoding_work_mem? -- With Regards, Amit Kapila.
Re: Add new for_each macros for iterating over a List that do not require ListCell pointer
On Mon, 18 Dec 2023 at 19:00, Jelte Fennema-Nio wrote: > > The more I think about it and look at the code, the more I like the > usage of the loop style proposed in the previous 0003 patch (which > automatically declares a loop variable for the scope of the loop using > a second for loop). > > I did some testing on godbolt.org and both versions of the macros > result in the same assembly when compiling with -O2 (and even -O1) > when compiling with ancient versions of gcc (5.1) and clang (3.0): > https://godbolt.org/z/WqfTbhe4e > > So attached is now an updated patchset that only includes these even > easier-to-use foreach macros. I also updated some of the comments and > moved modifying foreach_delete_current and foreach_current_index to > their own commit. > > On Thu, 14 Dec 2023 at 16:54, Jelte Fennema-Nio wrote: > > > > On Fri, 1 Dec 2023 at 05:20, Nathan Bossart > > wrote: > > > Could we simplify it with something like the following? > > > > Great suggestion! Updated the patchset accordingly. > > > > This also made it easy to change the final patch to include the > > automatic scoped declaration logic for all of the new macros. I quite > > like how the calling code changes to not have to declare the variable. > > But it's definitely a larger divergence from the status quo than > > without patch 0003. So I'm not sure if it's desired. > > > > Finally, I also renamed the functions to use foreach instead of > > for_each, since based on this thread that seems to be the generally > > preferred naming.

Thanks for working on this; it simplifies foreach further. I noticed that this change can be made in several other places too. For instance, the following parts of the logical replication code could be changed:

1) This loop in pa_detach_all_error_mq() can be changed to foreach_ptr:

    foreach(lc, ParallelApplyWorkerPool)
    {
        shm_mq_result res;
        Size nbytes;
        void *data;
        ParallelApplyWorkerInfo *winfo = (ParallelApplyWorkerInfo *) lfirst(lc);

2) This loop in logicalrep_worker_detach() can be changed to foreach_ptr:

    foreach(lc, workers)
    {
        LogicalRepWorker *w = (LogicalRepWorker *) lfirst(lc);

        if (isParallelApplyWorker(w))
            logicalrep_worker_stop_internal(w, SIGTERM);
    }

3) This loop in ApplyLauncherMain() can be changed to foreach_ptr:

    /* Start any missing workers for enabled subscriptions. */
    sublist = get_subscription_list();
    foreach(lc, sublist)
    {
        Subscription *sub = (Subscription *) lfirst(lc);
        LogicalRepWorker *w;
        TimestampTz last_start;
        TimestampTz now;
        long elapsed;

        if (!sub->enabled)
            continue;

4) This loop in pa_launch_parallel_worker() can be changed to foreach_ptr:

    ListCell *lc;

    /* Try to get an available parallel apply worker from the worker pool. */
    foreach(lc, ParallelApplyWorkerPool)
    {
        winfo = (ParallelApplyWorkerInfo *) lfirst(lc);

        if (!winfo->in_use)
            return winfo;
    }

Should we start making these changes now? Regards, Vignesh
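To make the benefit visible, here is what case 4 would look like after conversion, going by the foreach_ptr(type, var, list) shape proposed in this thread (a sketch against the patchset, not committed code):

    /* Try to get an available parallel apply worker from the worker pool. */
    foreach_ptr(ParallelApplyWorkerInfo, winfo, ParallelApplyWorkerPool)
    {
        if (!winfo->in_use)
            return winfo;
    }

The ListCell variable and the lfirst() cast disappear, which is most of the boilerplate in each of the four examples above.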
Re: Transaction timeout
> On 19 Dec 2023, at 13:26, Andrey M. Borodin wrote: > > I don’t have a Windows machine, so I hope the CF bot will pick this up. I used GitHub CI to produce a version of the tests that seems to be stable on Windows. Sorry for the noise. Best regards, Andrey Borodin. v13-0001-Introduce-transaction_timeout.patch Description: Binary data
Re: Simplify newNode()
On 18/12/2023 16:28, Heikki Linnakangas wrote: I think we have consensus on patch v2. It's simpler and not less performant than what we have now, at least on modern compilers. Barring objections, I'll commit that. Committed that. -- Heikki Linnakangas Neon (https://neon.tech)
Re: [DOC] Introducing Quick Start Guide to PL/pgSQL and PL/Python Documentation
I've addressed the points you raised: 1. *Missing `` Tag:* I reviewed the "Create the Makefile" section, and it seems that tags are appropriately used for filenames. If there's a specific instance where you observed a missing tag, please provide more details, and I'll ensure it's addressed. 2. *Use `CREATE EXTENSION` in "extension--version.sql":* Considering that there's already a CREATE EXTENSION step in the quick start guide, I can include a note in the general documentation to explain the rationale without repeating it in the individual script. What do you think? -- Best regards, Ishaan Adarsh On Tue, Dec 19, 2023 at 12:28 PM Japin Li wrote: > > 1. > It seems you miss tag in plpgsql "Create the Makefile": > > + > +Create the Makefile > + > + > + Create a Makefile in the pg_plpgsql_ext > directory with the following content: > + > > 2. > We expected use CREATE EXTENSION to load the extension, should we add the > following in extension--version.sql? > > -- complain if script is sourced in psql, rather than via CREATE EXTENSION > \echo Use "CREATE EXTENSION pair" to load this file. \quit > > -- > Regrads, > Japin Li > ChengDu WenWu Information Technology Co., Ltd. >
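For reference, the Makefile that section describes is the standard PGXS boilerplate; a minimal sketch (the extension name and version here are assumptions based on this thread) is:

    EXTENSION = pg_plpgsql_ext
    DATA = pg_plpgsql_ext--1.0.sql

    PG_CONFIG = pg_config
    PGXS := $(shell $(PG_CONFIG) --pgxs)
    include $(PGXS)

and the guard Japin quotes would go at the top of pg_plpgsql_ext--1.0.sql, so the script bails out when sourced directly in psql instead of via CREATE EXTENSION.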
Re: Add isCatalogRel in rmgrdesc
Hi, On 12/19/23 9:00 AM, Masahiko Sawada wrote: On Tue, Dec 12, 2023 at 6:15 PM Michael Paquier wrote: On Tue, Dec 12, 2023 at 09:23:46AM +0100, Drouvot, Bertrand wrote: Please find attached a patch to add this field in the related rmgrdesc (i.e all the ones that already provide the snapshotConflictHorizon except the one related to xl_heap_visible: indeed a new bit was added in its flag field in 6af1793954 instead of adding the isCatalogRel bool). I think it's worth it, as this new field could help diagnose conflicts issues (if any). +1 Thanks for looking at it! - appendStringInfo(buf, "rel %u/%u/%u; blk %u; snapshotConflictHorizon %u:%u", + appendStringInfo(buf, "rel %u/%u/%u; blk %u; snapshotConflictHorizon %u:%u, isCatalogRel %u", xlrec->locator.spcOid, xlrec->locator.dbOid, xlrec->locator.relNumber, xlrec->block, EpochFromFullTransactionId(xlrec->snapshotConflictHorizon), -XidFromFullTransactionId(xlrec->snapshotConflictHorizon)); +XidFromFullTransactionId(xlrec->snapshotConflictHorizon), +xlrec->isCatalogRel); The patch prints isCatalogRel, a bool field, as %u. But other rmgrdesc implementations seem to use different ways. For instance in spgdesc.c, we print flag name if it's set: (newPage and postfixBlkSame are bool fields): appendStringInfo(buf, "prefixoff: %u, postfixoff: %u", xlrec->offnumPrefix, xlrec->offnumPostfix); if (xlrec->newPage) appendStringInfoString(buf, " (newpage)"); if (xlrec->postfixBlkSame) appendStringInfoString(buf, " (same)"); whereas in hashdesc.c, we print either 'T' of 'F': appendStringInfo(buf, "clear_dead_marking %c, is_primary %c", xlrec->clear_dead_marking ? 'T' : 'F', xlrec->is_primary_bucket_page ? 'T' : 'F'); Is it probably worth considering such formats? Good point, let's not add another format. I prefer to always print the field name like the current patch and hashdesc.c since it's easier to parse it. I like this format too, so done that way in v2 attached. BTW, I noticed that sometimes the snapshotConflictHorizon is displayed as "snapshotConflictHorizon:" and sometimes as "snapshotConflictHorizon". So v2 is doing the same, means using "isCatalogRel:" if "snapshotConflictHorizon:" is being used or using "isCatalogRel" if not. 
Regards, -- Bertrand Drouvot PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.comFrom 9f7856fa007a2bec2c4c579e4b0730a476950a47 Mon Sep 17 00:00:00 2001 From: Bertrand Drouvot Date: Mon, 11 Dec 2023 21:10:30 + Subject: [PATCH v2] adding isCatalogRel to rmgrdesc --- src/backend/access/rmgrdesc/gistdesc.c | 10 ++ src/backend/access/rmgrdesc/hashdesc.c | 5 +++-- src/backend/access/rmgrdesc/heapdesc.c | 10 ++ src/backend/access/rmgrdesc/nbtdesc.c | 10 ++ src/backend/access/rmgrdesc/spgdesc.c | 5 +++-- 5 files changed, 24 insertions(+), 16 deletions(-) 100.0% src/backend/access/rmgrdesc/ diff --git a/src/backend/access/rmgrdesc/gistdesc.c b/src/backend/access/rmgrdesc/gistdesc.c index 5dc6e1dcee..651e645b85 100644 --- a/src/backend/access/rmgrdesc/gistdesc.c +++ b/src/backend/access/rmgrdesc/gistdesc.c @@ -26,18 +26,20 @@ out_gistxlogPageUpdate(StringInfo buf, gistxlogPageUpdate *xlrec) static void out_gistxlogPageReuse(StringInfo buf, gistxlogPageReuse *xlrec) { - appendStringInfo(buf, "rel %u/%u/%u; blk %u; snapshotConflictHorizon %u:%u", + appendStringInfo(buf, "rel %u/%u/%u; blk %u; snapshotConflictHorizon %u:%u, isCatalogRel %c", xlrec->locator.spcOid, xlrec->locator.dbOid, xlrec->locator.relNumber, xlrec->block, EpochFromFullTransactionId(xlrec->snapshotConflictHorizon), - XidFromFullTransactionId(xlrec->snapshotConflictHorizon)); + XidFromFullTransactionId(xlrec->snapshotConflictHorizon), +xlrec->isCatalogRel ? 'T' : 'F'); } static void out_gistxlogDelete(StringInfo buf, gistxlogDelete *xlrec) { - appendStringInfo(buf, "delete: snapshotConflictHorizon %u, nitems: %u", -xlrec->snapshotConflictHorizon, xlrec->ntodelete); + appendStringInfo(buf, "delete: snapshotConflictHorizon %u, nitems: %u, isCatalogRel %c", +xlrec->snapshotConflictHorizon, xlrec->ntodelete, +xlrec->isCatalogRel ? 'T' : 'F'); } static void diff --git a/src/backend/access/rmgrdesc/hashde
Re: Change GUC hashtable to use simplehash?
On Tue, Dec 19, 2023 at 2:32 PM Jeff Davis wrote: > > On Mon, 2023-12-18 at 13:39 +0700, John Naylor wrote: > > For now just two: > > v10-0002 is Jeff's change to the search path cache, but with the > > chunked interface that I found to be faster. > > Did you consider specializing for the case of an aligned pointer? If > it's a string (C string or byte string) it's almost always going to be > aligned, right? That wasn't the next place I thought to look (that would be the strcmp call), but something like this could be worthwhile. If we went this far, I'd like to get more use out of it than one call site. I think a few other places have as their hash key a string along with other values, so maybe we can pass an initialized hash state for strings separately from combining in the other values. Dynahash will still need to deal with truncation, so it would need duplicate coding, but I'm guessing that truncation check makes an optimization like you propose even more worthwhile. > I hacked up a patch (attached). I lost track of which benchmark we're > using to test the performance, but when I test in a loop it seems > substantially faster. That's interesting. Note that there is no need for a new fasthash_accum64(), since we can do

    fasthash_accum(&hs, buf, FH_SIZEOF_ACCUM);

...and the compiler should elide the switch statement. > It reads past the NUL byte, but only to the next alignment boundary, > which I think is OK (though I think I'd need to fix the patch for when > maxalign < 8). Seems like it, on both counts.
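For anyone following along, the aligned-read idea rests on the usual word-at-a-time zero-byte test; a minimal sketch of the core check (this is the textbook bit trick, not the code from the attached patch):

    #include <stdint.h>

    /* True if any byte of v is zero; exact, no false positives. */
    static inline int
    has_zero_byte(uint64_t v)
    {
        return ((v - UINT64_C(0x0101010101010101)) &
                ~v &
                UINT64_C(0x8080808080808080)) != 0;
    }

With the string pointer aligned to 8 bytes, the loop can load one uint64_t at a time and stop at the first word for which has_zero_byte() is true; any bytes read past the NUL stay within that aligned word, which is why the read never crosses into an unmapped page.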
Re: Fixing backslash dot for COPY FROM...CSV
On Tue, 19 Dec 2023 at 02:06, Daniel Verite wrote: > > Hi, > > PFA a patch that attempts to fix the bug that \. on a line > by itself is handled incorrectly by COPY FROM ... CSV. > This issue has been discussed several times previously, > for instance in [1] and [2], and mentioned in the > doc for \copy in commit 42d3125. > > There's one case that works today: when > the line is part of a multi-line quoted section, > and the data is read from a file, not from the client. > In other situations, an error is raised or the data is cut at > the point of \. without an error. > > The patch addresses that issue in the server and in psql, > except for the case of inlined data, where \. cannot be > both valid data and an EOF marker at the same time, so > it keeps treating it as an EOF marker. I noticed that these tests are passing without applying patch too: +++ b/src/test/regress/sql/copy.sql @@ -38,6 +38,17 @@ copy copytest2 from :'filename' csv quote escape E'\\'; select * from copytest except select * from copytest2; +--- test unquoted .\ as data inside CSV + +truncate copytest2; + +insert into copytest2(test) values('line1'), ('\.'), ('line2'); +copy (select test from copytest2 order by test collate "C") to :'filename' csv; +-- get the data back in with copy +truncate copytest2; +copy copytest2(test) from :'filename' csv; +select test from copytest2 order by test collate "C"; I was not sure if this was intentional. Can we add a test which fails in HEAD and passes with the patch applied. Regards, Vignesh
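For what it's worth, a standalone reproduction outside the regression framework (a sketch; the path is just an example) is to write the file without COPY TO, so that nothing quotes the marker:

    printf 'line1\n\\.\nline2\n' > /tmp/copytest.csv

and then in psql (as superuser, since this uses server-side COPY):

    CREATE TABLE copytest2 (test text);
    COPY copytest2 FROM '/tmp/copytest.csv' CSV;
    SELECT count(*) FROM copytest2;  -- 1 on unpatched HEAD, 3 with the patch

On unpatched HEAD the \. line is taken as the end-of-data marker and the rows at and after it are silently dropped, which is exactly what such a test needs to exercise.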
Re: Transaction timeout
> On 19 Dec 2023, at 06:25, Japin Li wrote: > > On Windows, there is still an error: Uhhmm, yes. Connection termination looks different on Windows machines. I’ve checked how this looks in the replication slot tests and removed the SELECT that was observing the connection failure. I don’t have a Windows machine, so I hope the CF bot will pick this up. Best regards, Andrey Borodin. v12-0001-Introduce-transaction_timeout.patch Description: Binary data
Re: Add isCatalogRel in rmgrdesc
On Tue, Dec 12, 2023 at 6:15 PM Michael Paquier wrote: > > On Tue, Dec 12, 2023 at 09:23:46AM +0100, Drouvot, Bertrand wrote: > > Please find attached a patch to add this field in the related rmgrdesc (i.e. > > all the ones that already provide the snapshotConflictHorizon, except the one > > related to xl_heap_visible: indeed a new bit was added in its flag field in > > 6af1793954 > > instead of adding the isCatalogRel bool). > > > > I think it's worth it, as this new field could help diagnose conflict > > issues (if any). +1

-    appendStringInfo(buf, "rel %u/%u/%u; blk %u; snapshotConflictHorizon %u:%u",
+    appendStringInfo(buf, "rel %u/%u/%u; blk %u; snapshotConflictHorizon %u:%u, isCatalogRel %u",
                      xlrec->locator.spcOid, xlrec->locator.dbOid,
                      xlrec->locator.relNumber, xlrec->block,
                      EpochFromFullTransactionId(xlrec->snapshotConflictHorizon),
-                     XidFromFullTransactionId(xlrec->snapshotConflictHorizon));
+                     XidFromFullTransactionId(xlrec->snapshotConflictHorizon),
+                     xlrec->isCatalogRel);

The patch prints isCatalogRel, a bool field, as %u. But other rmgrdesc implementations seem to use different styles. For instance, in spgdesc.c we print the flag name if it's set (newPage and postfixBlkSame are bool fields):

    appendStringInfo(buf, "prefixoff: %u, postfixoff: %u",
                     xlrec->offnumPrefix,
                     xlrec->offnumPostfix);
    if (xlrec->newPage)
        appendStringInfoString(buf, " (newpage)");
    if (xlrec->postfixBlkSame)
        appendStringInfoString(buf, " (same)");

whereas in hashdesc.c we print either 'T' or 'F':

    appendStringInfo(buf, "clear_dead_marking %c, is_primary %c",
                     xlrec->clear_dead_marking ? 'T' : 'F',
                     xlrec->is_primary_bucket_page ? 'T' : 'F');

Is it perhaps worth considering such formats? I prefer to always print the field name, like the current patch and hashdesc.c do, since that is easier to parse. But I'm fine with either way of showing the field value. Regards, -- Masahiko Sawada Amazon Web Services: https://aws.amazon.com