Re: Disable LLVM bitcode generation with pgxs.mk framework.
On 12.03.24 14:38, Xing Guo wrote: When the PostgreSQL server is configured with --with-llvm, the pgxs.mk framework will generate LLVM bitcode for extensions automatically. Sometimes, I don't want to generate bitcode for some extensions. I can turn off this feature by specifying with_llvm=0 in the make command. ``` make with_llvm=0 ``` Would it be possible to add a new switch in the pgxs.mk framework to allow users to disable this feature? E.g., the Makefile looks like: ``` WITH_LLVM=no PG_CONFIG = pg_config PGXS := $(shell $(PG_CONFIG) --pgxs) ``` Can't you just put the very same with_llvm=0 into the makefile?
Re: MERGE ... RETURNING
Hi, some minor issues: [ WITH with_query [, ...] ] MERGE INTO [ ONLY ] target_table_name [ * ] [ [ AS ] target_alias ] USING data_source ON join_condition when_clause [...] [ RETURNING * | output_expression [ [ AS ] output_name ] [, ...] ] here the "WITH" part should have "[ RECURSIVE ]" like: [ WITH [ RECURSIVE ] with_query [, ...] ] + An expression to be computed and returned by the MERGE + command after each row is merged. The expression can use any columns of + the source or target tables, or the + function to return additional information about the action executed. + should be: + An expression to be computed and returned by the MERGE + command after each row is changed. one minor issue: add ` table sq_target; table sq_source; ` before `-- RETURNING` in src/test/regress/sql/merge.sql, so we can easily understand the tests.
Re: meson vs tarballs
On 2024-03-13 We 02:31, Andrew Dunstan wrote: On 2024-03-13 We 02:22, Peter Eisentraut wrote: On 13.03.24 07:11, Andrew Dunstan wrote: I and several colleagues have just been trying to build from a tarball with meson. That seems pretty awful and unfriendly and I didn't see anything about it in the docs. At https://www.postgresql.org/docs/16/install-requirements.html is says: """ Alternatively, PostgreSQL can be built using Meson. This is currently experimental and only works when building from a Git checkout (not from a distribution tarball). """ Ah!. Darn, I missed that. Thanks. Of course, when release 17 comes out this had better not be the case, since we have removed the custom Windows build system. cheers andrew -- Andrew Dunstan EDB: https://www.enterprisedb.com
Re: Support json_errdetail in FRONTEND builds
On Tue, Mar 12, 2024 at 08:38:43PM -0400, Andrew Dunstan wrote: > On 2024-03-12 Tu 14:43, Jacob Champion wrote: >> Two notes that I wanted to point out explicitly: >> - On the other thread, Daniel contributed a destroyStringInfo() >> counterpart for makeStringInfo(), which is optional but seemed useful >> to include here. > > yeah, although maybe worth a different patch. - { - pfree(lex->strval->data); - pfree(lex->strval); - } + destroyStringInfo(lex->strval); I've wanted that a few times, FWIW. I would do a split, mainly for clarity. >> - After this patch, if a frontend client calls json_errdetail() >> without calling freeJsonLexContext(), it will leak memory. > > Not too concerned about this: > > 1. we tend to be a bit more relaxed about that in frontend programs, > especially those not expected to run for long times and especially on error > paths, where in many cases the program will just exit anyway. > > 2. the fix is simple where it's needed. This does not stress me much, either. I can see that OAuth introduces at least two calls of json_errdetail() in libpq, and that would matter there. case JSON_EXPECTED_STRING: -return psprintf(_("Expected string, but found \"%s\"."), -extract_token(lex)); +appendStringInfo(lex->errormsg, + _("Expected string, but found \"%.*s\"."), + toklen, lex->token_start); Maybe this could be wrapped in a macro to avoid copying around token_start and toklen for all the error cases. -- Michael signature.asc Description: PGP signature
Re: meson vs tarballs
On 2024-03-13 We 02:22, Peter Eisentraut wrote: On 13.03.24 07:11, Andrew Dunstan wrote: I and several colleagues have just been trying to build from a tarball with meson. That seems pretty awful and unfriendly and I didn't see anything about it in the docs. At https://www.postgresql.org/docs/16/install-requirements.html is says: """ Alternatively, PostgreSQL can be built using Meson. This is currently experimental and only works when building from a Git checkout (not from a distribution tarball). """ Ah!. Darn, I missed that. Thanks. cheers andrew -- Andrew Dunstan EDB: https://www.enterprisedb.com
Re: Race condition in FetchTableStates() breaks synchronization of subscription tables
On Wed, 13 Mar 2024 at 10:12, Zhijie Hou (Fujitsu) wrote: > > On Wednesday, March 13, 2024 11:49 AMvignesh C wrote: > > On Tue, 12 Mar 2024 at 09:34, Ajin Cherian wrote: > > > > > > > > > > > > On Tue, Mar 12, 2024 at 2:59 PM vignesh C wrote: > > >> > > >> > > >> Thanks, I have created the following Commitfest entry for this: > > >> https://commitfest.postgresql.org/47/4816/ > > >> > > >> Regards, > > >> Vignesh > > > > > > > > > Thanks for the patch, I have verified that the fix works well by > > > following the > > steps mentioned to reproduce the problem. > > > Reviewing the patch, it seems good and is well documented. Just one minor > > comment I had was probably to change the name of the variable > > table_states_valid to table_states_validity. The current name made sense > > when > > it was a bool, but now that it is a tri-state enum, it doesn't fit well. > > > > Thanks for reviewing the patch, the attached v6 patch has the changes for > > the > > same. > > Thanks for the patches. > > I saw a recent similar BF error[1] which seems related to the issue that 0001 > patch is trying to solve. i.e. The table sync worker is somehow not started > after refreshing the publication on the subscriber. I didn't see other > related ERRORs in > the log, so I think the reason is the same as the one being discussed in this > thread, which is the table state invalidation got lost. And the 0001 patch > looks good to me. > > For 0002, instead of avoid resetting the latch, is it possible to let the > logical rep worker wake up the launcher once after attaching ? Waking up of the launch process uses the same latch that is used for subscription creation/modification and apply worker process exit. As the handling of this latch for subscription creation/modification and worker process exit can be done only by ApplyLauncherMain, we will not be able to reset the latch in WaitForReplicationWorkerAttach. I feel waking up the launcher process might not help in this case as currently we will not be able to differentiate between worker attached, subscription creation/modification and apply worker process exit. Regards, Vignesh
Re: meson vs tarballs
On 13.03.24 07:11, Andrew Dunstan wrote: I and several colleagues have just been trying to build from a tarball with meson. That seems pretty awful and unfriendly and I didn't see anything about it in the docs. At https://www.postgresql.org/docs/16/install-requirements.html is says: """ Alternatively, PostgreSQL can be built using Meson. This is currently experimental and only works when building from a Git checkout (not from a distribution tarball). """
Re: Add basic tests for the low-level backup method.
On Wed, Mar 13, 2024 at 01:12:28PM +1300, David Steele wrote: > On 2/29/24 16:55, Michael Paquier wrote: >> +unlink("$backup_dir/postmaster.pid") >> +or BAIL_OUT("unable to unlink $backup_dir/postmaster.pid"); >> +unlink("$backup_dir/postmaster.opts") >> +or BAIL_OUT("unable to unlink $backup_dir/postmaster.opts"); >> +unlink("$backup_dir/global/pg_control") >> +or BAIL_OUT("unable to unlink $backup_dir/global/pg_control"); >> >> RELCACHE_INIT_FILENAME as well? > > I'm not trying to implement the full exclusion list here, just enough to get > the test working. Since exclusions are optional according to the docs I > don't think we need them for a valid tests. Okay. Fine by me at the end. >> +# Rather than writing out backup_label, try to recover the backup without >> +# backup_label to demonstrate that recovery will not work correctly without >> it, >> +# i.e. the canary table will be missing and the cluster will be corrupt. >> Provide >> +# only the WAL segment that recovery will think it needs. >> >> Okay, why not. No objections to this addition. I am a bit surprised >> that this is not scanning some of the logs produced by the startup >> process for particular patterns. > > Not sure what to look for here. There are no distinct messages for crash > recovery. Perhaps there should be? The closest thing I can think of here would be "database system was not properly shut down; automatic recovery in progress" as we don't have InArchiveRecovery, after checking that the canary is missing. If you don't like this suggestion, feel free to say so, of course :) >> +# Save backup_label into the backup directory and recover using the >> primary's >> +# archive. This time recovery will succeed and the canary table will be >> +# present. >> >> Here are well, I think that we should add some log_contains() with >> pre-defined patterns to show that recovery has completed the way we >> want it with a backup_label up to the end-of-backup record. > > Sure, I added a check for the new log message when recovering with a > backup_label. +ok($node_replica->log_contains('completed backup recovery with redo LSN'), + 'verify backup recovery performed with backup_label'); Okay for this choice. I was thinking first about "starting backup recovery with redo LSN", closer to the area where the backup_label is read. -- Michael signature.asc Description: PGP signature
Re: speed up a logical replica setup
On Fri, Mar 8, 2024, at 6:44 AM, Hayato Kuroda (Fujitsu) wrote: > Hmm, right. I considered below improvements. Tomas and Euler, how do you > think? I'm posting a new patchset v28. I changed the way that the check function works. From the usability perspective, it is better to test all conditions and reports all errors (if any) at once. It avoids multiple executions in dry run mode just to figure out all of the issues in the initial phase. I also included tests for it using Shlok's idea [1] although I didn't use v27-0004. Shlok [1] reported that it was failing on Windows since the socket-directory option was added. I added a fix for it. Tomas pointed out the documentation [2] does not provide a good high level explanation about pg_createsubscriber. I expanded the Description section and moved the prerequisites to Nodes section. The prerequisites were grouped into target and source conditions on their own paragraph instead of using a list. It seems more in line with the style of some applications. As I said in a previous email [3], I removed the retain option. [1] https://www.postgresql.org/message-id/canhcyeu4q3dwh9ax9bpojcm4ebbhyfenegoaz8xfgyjmcpz...@mail.gmail.com [2] https://www.postgresql.org/message-id/6423dfeb-a729-45d3-b71e-7bf1b3adb0c9%40enterprisedb.com [3] https://www.postgresql.org/message-id/e390e35e-508e-4eb8-92e4-e6b066407a41%40app.fastmail.com -- Euler Taveira EDB https://www.enterprisedb.com/ v28-0001-pg_createsubscriber-creates-a-new-logical-replic.patch.gz Description: application/gzip v28-0002-Use-last-replication-slot-position-as-replicatio.patch.gz Description: application/gzip v28-0003-port-replace-int-with-string.patch.gz Description: application/gzip
meson vs tarballs
I and several colleagues have just been trying to build from a tarball with meson. `meson setup build .` results in this: [...] Message: checking for file conflicts between source and build directory meson.build:2963:2: ERROR: Problem encountered: Non-clean source code directory detected. To build with meson the source tree may not have an in-place, ./configure style, build configured. You can have both meson and ./configure style builds for the same source tree by building out-of-source / VPATH with configure. Alternatively use a separate check out for meson based builds. Conflicting files in source directory: [huge list of files] The conflicting files need to be removed, either by removing the files listed above, or by running configure and then make maintainer-clean. That seems pretty awful and unfriendly and I didn't see anything about it in the docs. cheers andrew -- Andrew Dunstan EDB: https://www.enterprisedb.com
Re: pg_column_toast_chunk_id: a function to get a chunk ID of a TOASTed value
On Tue, 12 Mar 2024 22:07:17 -0500 Nathan Bossart wrote: > I did some light editing to prepare this for commit. Thank you. I confirmed the test you improved and I am fine with that. Regards, Yugo Nagata > > -- > Nathan Bossart > Amazon Web Services: https://aws.amazon.com -- Yugo NAGATA
Re: Sequence Access Methods, round two
On 12.03.24 00:44, Michael Paquier wrote: Anyway, there is one piece of this patch set that I think has a lot of value outside of the discussion with access methods, which is to redesign pg_sequence_last_value so as it returns a (last_value, is_called) tuple rather than a (last_value). This has the benefit of switching pg_dump to use this function rather than relying on a scan of the heap table used by a sequence to retrieve the state of a sequence dumped. This is the main diff: -appendPQExpBuffer(query, - "SELECT last_value, is_called FROM %s", - fmtQualifiedDumpable(tbinfo)); +/* + * In versions 17 and up, pg_sequence_last_value() has been switched to + * return a tuple with last_value and is_called. + */ +if (fout->remoteVersion >= 17) +appendPQExpBuffer(query, + "SELECT last_value, is_called " + "FROM pg_sequence_last_value('%s')", + fmtQualifiedDumpable(tbinfo)); +else +appendPQExpBuffer(query, + "SELECT last_value, is_called FROM %s", + fmtQualifiedDumpable(tbinfo)); Are there any objections to that? pg_sequence_last_value() is something that we've only been relying on internally for the catalog pg_sequences. I don't understand what the overall benefit of this change is supposed to be. If this route were to be pursued, it should be a different function name. We shouldn't change the signature of an existing function.
Re: Transaction timeout
> On 13 Mar 2024, at 05:23, Alexander Korotkov wrote: > > On Tue, Mar 12, 2024 at 10:28 AM Andrey M. Borodin > wrote: >>> On 11 Mar 2024, at 16:18, Alexander Korotkov wrote: >>> >>> I think if checking psql stderr is problematic, checking just logs is >>> fine. Could we wait for the relevant log messages one by one with >>> $node->wait_for_log() just like 040_standby_failover_slots_sync.pl do? >> >> PFA version with $node->wait_for_log() > > I've slightly revised the patch. I'm going to push it if no objections. One small note: log_offset was updated, but was not used. Thanks! Best regards, Andrey Borodin. v6-0001-Add-TAP-tests-for-timeouts.patch Description: Binary data
Re: POC, WIP: OR-clause support for indexes
On 12/3/2024 22:20, Alexander Korotkov wrote: On Mon, Mar 11, 2024 at 2:43 PM Andrei Lepikhov I think you are right. It is probably a better place than any other to remove duplicates in an array. I just think we should sort and remove duplicates from entry->consts in one pass. Thus, this optimisation should be applied to sortable constants. Ok. New version of the patch set implemented all we have agreed on for now. We can return MAX_SAOP_ARRAY_SIZE constraint and Alena's approach to duplicates deletion for non-sortable cases at the end. Hmm, we already tried to do it at that point. I vaguely recall some issues caused by this approach. Anyway, it should be done as quickly as possible to increase the effect of the optimization. I think there were provided quite strong reasons why this shouldn't be implemented at the parse analysis stage [1], [2], [3]. The canonicalize_qual() looks quite appropriate place for that since it does similar transformations. Ok. Let's discuss these reasons. In Robert's opinion [1,3], we should do the transformation based on the cost model. But in the canonicalize_qual routine, we still make the transformation blindly. Moreover, the second patch reduces the weight of this reason, doesn't it? Maybe we shouldn't think about that as about optimisation but some 'general form of expression'? Peter [2] worries about the possible transformation outcomes at this stage. But remember, we already transform clauses like ROW() IN (...) to a series of ORs here, so it is not an issue. Am I wrong? Why did we discard the attempt with canonicalize_qual on the previous iteration? - The stage of parsing is much more native for building SAOP quals. We can reuse make_scalar_array_op and other stuff, for example. During the optimisation stage, the only list partitioning machinery creates SAOP based on a list of constants. So, in theory, it is possible to implement. But do we really need to make the code more complex? Links. 1. https://www.postgresql.org/message-id/CA%2BTgmoZCgP6FrBQEusn4yaWm02XU8OPeoEMk91q7PRBgwaAkFw%40mail.gmail.com 2. https://www.postgresql.org/message-id/CAH2-Wzm2%3Dnf_JhiM3A2yetxRs8Nd2NuN3JqH%3Dfm_YWYd1oYoPg%40mail.gmail.com 3. https://www.postgresql.org/message-id/CA%2BTgmoaOiwMXBBTYknczepoZzKTp-Zgk5ss1%2BCuVQE-eFTqBmA%40mail.gmail.com -- regards, Andrei Lepikhov Postgres Professional From 86d969f2598a03b1eba84c0c064707de34ee60ac Mon Sep 17 00:00:00 2001 From: Alena Rybakina Date: Fri, 2 Feb 2024 22:01:09 +0300 Subject: [PATCH 1/2] Transform OR clauses to ANY expression. Replace (expr op C1) OR (expr op C2) ... with expr op ANY(ARRAY[C1, C2, ...]) on the preliminary stage of optimization when we are still working with the expression tree. Here C is a constant expression, 'expr' is non-constant expression, 'op' is an operator which returns boolean result and has a commuter (for the case of reverse order of constant and non-constant parts of the expression, like 'CX op expr'). Sometimes it can lead to not optimal plan. But we think it is better to have array of elements instead of a lot of OR clauses. Here is a room for further optimizations on decomposing that array into more optimal parts. Authors: Alena Rybakina , Andrey Lepikhov Reviewed-by: Peter Geoghegan , Ranier Vilela Reviewed-by: Alexander Korotkov , Robert Haas Reviewed-by: jian he --- .../postgres_fdw/expected/postgres_fdw.out| 8 +- doc/src/sgml/config.sgml | 17 + src/backend/nodes/queryjumblefuncs.c | 27 ++ src/backend/parser/parse_expr.c | 416 ++ src/backend/utils/misc/guc_tables.c | 11 + src/backend/utils/misc/postgresql.conf.sample | 1 + src/bin/pg_dump/t/002_pg_dump.pl | 6 +- src/include/nodes/queryjumble.h | 1 + src/include/optimizer/optimizer.h | 1 + src/test/regress/expected/create_index.out| 156 ++- src/test/regress/expected/join.out| 62 ++- src/test/regress/expected/partition_prune.out | 215 - src/test/regress/expected/stats_ext.out | 12 +- src/test/regress/expected/sysviews.out| 3 +- src/test/regress/expected/tidscan.out | 23 +- src/test/regress/sql/create_index.sql | 35 ++ src/test/regress/sql/join.sql | 10 + src/test/regress/sql/partition_prune.sql | 22 + src/test/regress/sql/tidscan.sql | 6 + src/tools/pgindent/typedefs.list | 2 + 20 files changed, 974 insertions(+), 60 deletions(-) diff --git a/contrib/postgres_fdw/expected/postgres_fdw.out b/contrib/postgres_fdw/expected/postgres_fdw.out index 58a603ac56..a965b43cc6 100644 --- a/contrib/postgres_fdw/expected/postgres_fdw.out +++ b/contrib/postgres_fdw/expected/postgres_fdw.out @@ -8838,18 +8838,18 @@ insert into utrtest values (2, 'qux'); -- Check case where the foreign partition is a subplan target rel explain (verbose, costs off) update utrtest
Re: type cache cleanup improvements
On Tue, Mar 12, 2024 at 06:55:41PM +0300, Teodor Sigaev wrote: > Playing around I found one more place which could easily modified with > hash_seq_init_with_hash_value() call. I think that this patch should be split for clarity, as there are a few things that are independently useful. I guess something like that: - Introduction of hash_initial_lookup(), that simplifies 3 places of dynahash.c where the same code is used. The routine should be inlined. - The split in hash_seq_search to force a different type of search is weird, complicating the dynahash interface by hiding what seems like a search mode. Rather than hasHashvalue that's hidden in the middle of HASH_SEQ_STATUS, could it be better to have an entirely different API for the search? That should be a patch on its own, as well. - The typcache changes. -- Michael signature.asc Description: PGP signature
Re: Improve the connection failure error messages
FYI -- some more code has been pushed since this patch was last updated. AFAICT perhaps you'll want to update this patch again for the following new connection messages on HEAD: - slotfuncs.c [1] - slotsync.c [2] -- [1] https://github.com/postgres/postgres/blob/0b84f5c419a300dc1b1a70cf63b9907208e52643/src/backend/replication/slotfuncs.c#L989 [2] https://github.com/postgres/postgres/blob/0b84f5c419a300dc1b1a70cf63b9907208e52643/src/backend/replication/logical/slotsync.c#L1258 Kind Regards, Peter Smith. Fujitsu Australia
Re: Introduce XID age and inactive timeout based replication slot invalidation
On Wed, Mar 6, 2024 at 2:47 PM Bharath Rupireddy wrote: > > On Wed, Mar 6, 2024 at 2:42 PM Bertrand Drouvot > wrote: > > > > Hi, > > > > On Tue, Mar 05, 2024 at 01:44:43PM -0600, Nathan Bossart wrote: > > > On Wed, Mar 06, 2024 at 12:50:38AM +0530, Bharath Rupireddy wrote: > > > > On Mon, Mar 4, 2024 at 2:11 PM Bertrand Drouvot > > > > wrote: > > > >> On Sun, Mar 03, 2024 at 03:44:34PM -0600, Nathan Bossart wrote: > > > >> > Unless I am misinterpreting some details, ISTM we could rename this > > > >> > column > > > >> > to invalidation_reason and use it for both logical and physical > > > >> > slots. I'm > > > >> > not seeing a strong need for another column. > > > >> > > > >> Yeah having two columns was more for convenience purpose. Without the > > > >> "conflict" > > > >> one, a slot conflicting with recovery would be "a logical slot having > > > >> a non NULL > > > >> invalidation_reason". > > > >> > > > >> I'm also fine with one column if most of you prefer that way. > > > > > > > > While we debate on the above, please find the attached v7 patch set > > > > after rebasing. > > > > > > It looks like Bertrand is okay with reusing the same column for both > > > logical and physical slots > > > > Yeah, I'm okay with one column. > > Thanks. v8-0001 is how it looks. Please see the v8 patch set with this change. JFYI, the patch does not apply to the head. There is a conflict in multiple files. thanks Shveta
Re: meson: Specify -Wformat as a common warning flag for extensions
On Tue Mar 12, 2024 at 6:56 PM CDT, Sutou Kouhei wrote: Hi, In "Re: meson: Specify -Wformat as a common warning flag for extensions" on Fri, 08 Mar 2024 10:05:27 -0600, "Tristan Partin" wrote: > Ok, I figured this out. -Wall implies -Wformat=1. We set warning_level > to 1 in the Meson project() call, which implies -Wall, and set -Wall > in CFLAGS for autoconf. That's the reason we don't get issues building > Postgres. A user making use of the pg_config --cflags option, as Sutou > is, *will* run into the aforementioned issues, since we don't > propogate -Wall into pg_config. > > $ gcc $(pg_config --cflags) -E - < /dev/null > /dev/null >cc1: warning: ‘-Wformat-security’ ignored without ‘-Wformat’ >[-Wformat-security] >$ gcc -Wall $(pg_config --cflags) -E - < /dev/null > /dev/null >(nothing printed) Thanks for explaining this. You're right. This is the reason why we don't need this for PostgreSQL itself but we need this for PostgreSQL extensions. Sorry. I should have explained this in the first e-mail... What should we do to proceed this patch? Perhaps adding some more clarification in the comments that I wrote. - # -Wformat-security requires -Wformat, so check for it + # -Wformat-secuirty requires -Wformat. We compile with -Wall in + # Postgres, which includes -Wformat=1. -Wformat is shorthand for + # -Wformat=1. The set of flags which includes -Wformat-security is + # persisted into pg_config --cflags, which is commonly used by + # PGXS-based extensions. The lack of -Wformat in the persisted flags + # will produce a warning on many GCC versions, so even though adding + # -Wformat here is a no-op for Postgres, it silences other use cases. That might be too long-winded though :). -- Tristan Partin Neon (https://neon.tech)
Re: Add new error_action COPY ON_ERROR "log"
On Wed, Mar 13, 2024 at 08:08:42AM +0530, Bharath Rupireddy wrote: > On Wed, Mar 13, 2024 at 5:16 AM Michael Paquier wrote: >> The attached 0003 is what I had in mind: >> - Simplification of the LOG generated with quotes applied around the >> column name, don't see much need to add the relation name, either, for >> consistency and because the knowledge is known in the query. >> - A few more tests. >> - Some doc changes. > > LGMT. So, I've merged those changes into 0001 and 0002. I've applied the extra tests for now, as this was really confusing. Hmm. This NOTICE is really bugging me. It is true that the clients would get more information, but the information is duplicated on the server side because the error context provides the same information as the NOTICE itself: NOTICE: data type incompatibility at line 1 for column "a" CONTEXT: COPY aa, line 1, column a: "a" STATEMENT: copy aa from stdin with (on_error ignore, log_verbosity verbose); -- Michael signature.asc Description: PGP signature
Re: RFC: Logging plan of the running query
On Fri, Feb 16, 2024 at 11:42 PM torikoshia wrote: I'm not so sure about the implementation now, i.e. finding the next node to be executed from the planstate tree, but I'm going to try this approach. Attached a patch which takes this approach. - I saw no way to find the next node to be executed from the planstate tree, so the patch wraps all the ExecProcNode of the planstate tree at CHECK_FOR_INTERRUPTS(). - To prevent overhead of this wrapped function call, unwrap it at the end of EXPLAIN code execution. - I first tried to use ExecSetExecProcNode() for wrapping, but it 'changes' ExecProcNodeMtd of nodes, not 'adds' some process to ExecProcNodeMtd. I'm not sure this is the right approach, but attached patch adds new member ExecProcNodeOriginal to PlanState to preserve original ExecProcNodeMtd. Any comments are welcomed. -- Regards, -- Atsushi Torikoshi NTT DATA Group CorporationFrom a374bf485e6e7237314179313ac7cb61a0ad784b Mon Sep 17 00:00:00 2001 From: Atsushi Torikoshi Date: Wed, 13 Mar 2024 13:47:18 +0900 Subject: [PATCH v38] Add function to log the plan of the query Currently, we have to wait for the query execution to finish to check its plan. This is not so convenient when investigating long-running queries on production environments where we cannot use debuggers. To improve this situation, this patch adds pg_log_query_plan() function that requests to log the plan of the specified backend process. By default, only superusers are allowed to request to log the plans because allowing any users to issue this request at an unbounded rate would cause lots of log messages and which can lead to denial of service. On receipt of the request, EXPLAIN codes for logging is wrapped to every ExecProcNode and when the executor runs one of ExecProcNode, the plan is logged. These EXPLAIN codes are unwrapped when EXPLAIN codes actually run once. In this way, we can avoid adding the overhead which we'll face when adding CHECK_FOR_INTERRUPTS() like mechanisms in somewhere in executor codes. --- contrib/auto_explain/Makefile | 2 + contrib/auto_explain/auto_explain.c | 23 +- contrib/auto_explain/t/001_auto_explain.pl| 35 +++ doc/src/sgml/func.sgml| 50 src/backend/access/transam/xact.c | 17 ++ src/backend/catalog/system_functions.sql | 2 + src/backend/commands/explain.c| 228 +- src/backend/executor/execMain.c | 19 ++ src/backend/storage/ipc/procsignal.c | 4 + src/backend/tcop/postgres.c | 4 + src/backend/utils/init/globals.c | 2 + src/include/catalog/pg_proc.dat | 6 + src/include/commands/explain.h| 9 + src/include/miscadmin.h | 1 + src/include/nodes/execnodes.h | 3 + src/include/storage/procsignal.h | 1 + src/include/tcop/pquery.h | 2 +- src/include/utils/elog.h | 1 + .../injection_points/injection_points.c | 11 + src/test/regress/expected/misc_functions.out | 54 - src/test/regress/sql/misc_functions.sql | 41 +++- 21 files changed, 473 insertions(+), 42 deletions(-) diff --git a/contrib/auto_explain/Makefile b/contrib/auto_explain/Makefile index efd127d3ca..64fe0e0573 100644 --- a/contrib/auto_explain/Makefile +++ b/contrib/auto_explain/Makefile @@ -8,6 +8,8 @@ PGFILEDESC = "auto_explain - logging facility for execution plans" TAP_TESTS = 1 +EXTRA_INSTALL = src/test/modules/injection_points + ifdef USE_PGXS PG_CONFIG = pg_config PGXS := $(shell $(PG_CONFIG) --pgxs) diff --git a/contrib/auto_explain/auto_explain.c b/contrib/auto_explain/auto_explain.c index 677c135f59..e041b10b0e 100644 --- a/contrib/auto_explain/auto_explain.c +++ b/contrib/auto_explain/auto_explain.c @@ -401,26 +401,9 @@ explain_ExecutorEnd(QueryDesc *queryDesc) es->format = auto_explain_log_format; es->settings = auto_explain_log_settings; - ExplainBeginOutput(es); - ExplainQueryText(es, queryDesc); - ExplainQueryParameters(es, queryDesc->params, auto_explain_log_parameter_max_length); - ExplainPrintPlan(es, queryDesc); - if (es->analyze && auto_explain_log_triggers) -ExplainPrintTriggers(es, queryDesc); - if (es->costs) -ExplainPrintJITSummary(es, queryDesc); - ExplainEndOutput(es); - - /* Remove last line break */ - if (es->str->len > 0 && es->str->data[es->str->len - 1] == '\n') -es->str->data[--es->str->len] = '\0'; - - /* Fix JSON to output an object */ - if (auto_explain_log_format == EXPLAIN_FORMAT_JSON) - { -es->str->data[0] = '{'; -es->str->data[es->str->len - 1] = '}'; - } + ExplainAssembleLogOutput(es, queryDesc, auto_explain_log_format, + auto_explain_log_triggers, + auto_explain_log_parameter_max_length); /* * Note: we rely on the existing logging of context or diff --git a/con
Re: A failure in t/038_save_logical_slots_shutdown.pl
On Tue, Mar 12, 2024 at 8:46 PM Bharath Rupireddy wrote: > > On Tue, Mar 12, 2024 at 6:05 PM Amit Kapila wrote: > > > > The patch looks mostly good to me. I have changed the comments and > > commit message in the attached. I am planning to push this tomorrow > > unless you or others have any comments on it. > > LGTM. > Pushed. -- With Regards, Amit Kapila.
RE: Race condition in FetchTableStates() breaks synchronization of subscription tables
On Wednesday, March 13, 2024 11:49 AMvignesh C wrote: > On Tue, 12 Mar 2024 at 09:34, Ajin Cherian wrote: > > > > > > > > On Tue, Mar 12, 2024 at 2:59 PM vignesh C wrote: > >> > >> > >> Thanks, I have created the following Commitfest entry for this: > >> https://commitfest.postgresql.org/47/4816/ > >> > >> Regards, > >> Vignesh > > > > > > Thanks for the patch, I have verified that the fix works well by following > > the > steps mentioned to reproduce the problem. > > Reviewing the patch, it seems good and is well documented. Just one minor > comment I had was probably to change the name of the variable > table_states_valid to table_states_validity. The current name made sense when > it was a bool, but now that it is a tri-state enum, it doesn't fit well. > > Thanks for reviewing the patch, the attached v6 patch has the changes for the > same. Thanks for the patches. I saw a recent similar BF error[1] which seems related to the issue that 0001 patch is trying to solve. i.e. The table sync worker is somehow not started after refreshing the publication on the subscriber. I didn't see other related ERRORs in the log, so I think the reason is the same as the one being discussed in this thread, which is the table state invalidation got lost. And the 0001 patch looks good to me. For 0002, instead of avoid resetting the latch, is it possible to let the logical rep worker wake up the launcher once after attaching ? [1] https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=francolin&dt=2024-03-11%2000%3A52%3A42 Best Regards, Hou zj
Re: POC: Extension for adding distributed tracing - pg_tracing
On Tue, 12 Mar 2024, 23:45 Anthonin Bonnefoy, < anthonin.bonne...@datadoghq.com> wrote: > > > - I don't think it's a good idea to do memory allocations in the middle > of a > > PG_CATCH. If the error was due to out-of-memory, you'll throw another > error. > Good point. I was wondering what were the risks of generating spans > for errors. I will try to find a better way to handle this. > The usual approach is to have pre-allocated memory. This must actually be written (zeroed usually) or it might be lazily allocated only on page fault. And it can't be copy on write memory allocated once in postmaster startup then inherited by fork. That imposes an overhead for every single postgres backend. So maybe there's a better solution.
Re: Introduce XID age and inactive timeout based replication slot invalidation
On Tue, Mar 12, 2024 at 10:10 PM Bharath Rupireddy wrote: > > On Mon, Mar 11, 2024 at 3:44 PM Amit Kapila wrote: > > > > Yes, your understanding is correct. I wanted us to consider having new > > parameters like 'inactive_replication_slot_timeout' to be at > > slot-level instead of GUC. I think this new parameter doesn't seem to > > be the similar as 'max_slot_wal_keep_size' which leads to truncation > > of WAL at global and then invalidates the appropriate slots. OTOH, the > > 'inactive_replication_slot_timeout' doesn't appear to have a similar > > global effect. > > last_inactive_at is tracked for each slot using which slots get > invalidated based on inactive_replication_slot_timeout. It's like > max_slot_wal_keep_size invalidating slots based on restart_lsn. In a > way, both are similar, right? > There is some similarity but 'max_slot_wal_keep_size' leads to truncation of WAL which in turn leads to invalidation of slots. Here, I am also trying to be cautious in adding a GUC unless it is required or having a slot-level parameter doesn't serve the need. Having said that, I see that there is an argument that we should follow the path of 'max_slot_wal_keep_size' GUC and there is some value to it but still I think avoiding a new GUC for inactivity in the slot would outweigh. -- With Regards, Amit Kapila.
Re: remaining sql/json patches
On Tue, Mar 12, 2024 at 5:37 PM Amit Langote wrote: > > > SELECT JSON_EXISTS(jsonb '{"customer_name": "test", "salary":1000, > "department_id":1}', '$ ? (@.department_id == $dept_id && @.salary == > $sal)' PASSING 1000 AS sal, 1 as dept_id); > json_exists > - > t > (1 row) > > Does that make sense? > > Yes, got it. Thanks for the clarification. -- Regards, Himanshu Upadhyaya EnterpriseDB: http://www.enterprisedb.com
Re: perl: unsafe empty pattern behavior
Andrew Dunstan writes: > On 2024-03-12 Tu 18:59, Tom Lane wrote: >> I wonder whether perlcritic has sufficiently deep understanding of >> Perl code that it could find these hazards. > Yeah, that was my thought too. I'd start with ProhibitComplexRegexes.pm > as a template. Oooh. Taking a quick look at the source code: https://metacpan.org/dist/Perl-Critic/source/lib/Perl/Critic/Policy/RegularExpressions/ProhibitComplexRegexes.pm it seems like it'd be pretty trivial to convert that from "complain if regex contains more than N characters" to "complain if regex contains zero characters". regards, tom lane
Re: Introduce XID age and inactive timeout based replication slot invalidation
On Tue, Mar 12, 2024 at 9:11 PM Bertrand Drouvot wrote: > > On Tue, Mar 12, 2024 at 05:51:43PM +0530, Amit Kapila wrote: > > On Tue, Mar 12, 2024 at 1:24 PM Bertrand Drouvot > > wrote: > > > so why to prevent it for > > these new parameters? This will unnecessarily create inconsistency in > > the invalidation behavior. > > Yeah, but I think wal removal has a direct impact on the slot usuability which > is probably not the case with the new XID and Timeout ones. > BTW, is XID the based parameter 'max_slot_xid_age' not have similarity with 'max_slot_wal_keep_size'? I think it will impact the rows we removed based on xid horizons. Don't we need to consider it while vacuum computing the xid horizons in ComputeXidHorizons() similar to what we do for WAL w.r.t 'max_slot_wal_keep_size'? -- With Regards, Amit Kapila.
Re: perl: unsafe empty pattern behavior
On 2024-03-12 Tu 18:59, Tom Lane wrote: Jeff Davis writes: On Tue, 2024-03-12 at 18:53 +0100, Alvaro Herrera wrote: I also tried grepping (for things like qr{}, qr[], qr||, qr!!) and didn't find anything beyond what you have ... but I only looked for the "qr" literal, not other ways to get regexes. I think that's fine. qr// seems the most dangerous, because it seems to behave differently in different versions of perl. I wonder whether perlcritic has sufficiently deep understanding of Perl code that it could find these hazards. I already checked, and found that there's no built-in filter for this (at least not in the perlcritic version I have), but maybe we could write one? The rules seem to be plug-in modules, so you can make your own in principle. Yeah, that was my thought too. I'd start with ProhibitComplexRegexes.pm as a template. If nobody else does it I'll have a go, but it might take a while. cheers andrew -- Andrew Dunstan EDB: https://www.enterprisedb.com
Re: [EXTERNAL] Re: Add non-blocking version of PQcancel
Jelte Fennema-Nio writes: > On Tue, 12 Mar 2024 at 19:28, Alvaro Herrera wrote: >>> Hmm, buildfarm member kestrel (which uses >>> -fsanitize=undefined,alignment) failed: >> Hm, I tried using the same compile flags, couldn't reproduce. > Okay, it passed now it seems so I guess this test is flaky somehow. Two more intermittent failures: https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=bushmaster&dt=2024-03-13%2003%3A15%3A09 https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=taipan&dt=2024-03-13%2003%3A15%3A31 These animals all belong to Andres' flotilla, but other than that I'm not seeing a connection. I suspect it's basically just a timing dependency. Have you thought about the fact that a cancel request is a no-op if it arrives after the query's done? regards, tom lane
Re: Introduce XID age and inactive timeout based replication slot invalidation
On Tue, Mar 12, 2024 at 8:55 PM Bharath Rupireddy wrote: > > On Mon, Mar 11, 2024 at 11:26 AM Amit Kapila wrote: > > > > > Hm. I get the concern. Are you okay with having inavlidation_reason > > > separately for both logical and physical slots? In such a case, > > > logical slots that got invalidated on the standby will have duplicate > > > info in conflict_reason and invalidation_reason, is this fine? > > > > > > > If we have duplicate information in two columns that could be > > confusing for users. BTW, isn't the recovery conflict occur only > > because of rows_removed and wal_level_insufficient reasons? The > > wal_removed or the new reasons you are proposing can't happen because > > of recovery conflict. Am, I missing something here? > > My understanding aligns with yours that the rows_removed and > wal_level_insufficient invalidations can occur only upon recovery > conflict. > > FWIW, a test named 'synchronized slot has been invalidated' in > 040_standby_failover_slots_sync.pl inappropriately uses > conflict_reason = 'wal_removed' logical slot on standby. As per the > above understanding, it's inappropriate to use conflict_reason here > because wal_removed invalidation doesn't conflict with recovery. > > > > Another idea is to make 'conflict_reason text' as a 'conflicting > > > boolean' again (revert 007693f2a3), and have 'invalidation_reason > > > text' for both logical and physical slots. So, whenever 'conflicting' > > > is true, one can look at invalidation_reason for the reason for > > > conflict. How does this sound? > > > > > > > So, does this mean that conflicting will only be true for some of the > > reasons (say wal_level_insufficient, rows_removed, wal_removed) and > > logical slots but not for others? I think that will also not eliminate > > the duplicate information as user could have deduced that from single > > column. > > So, how about we turn conflict_reason to only report the reasons that > actually cause conflict with recovery for logical slots, something > like below, and then have invalidation_cause as a generic column for > all sorts of invalidation reasons for both logical and physical slots? > If our above understanding is correct then coflict_reason will be a subset of invalidation_reason. If so, whatever way we arrange this information, there will be some sort of duplicity unless we just have one column 'invalidation_reason' and update the docs to interpret it correctly for conflicts. -- With Regards, Amit Kapila.
Re: Race condition in FetchTableStates() breaks synchronization of subscription tables
On Tue, 12 Mar 2024 at 09:34, Ajin Cherian wrote: > > > > On Tue, Mar 12, 2024 at 2:59 PM vignesh C wrote: >> >> >> Thanks, I have created the following Commitfest entry for this: >> https://commitfest.postgresql.org/47/4816/ >> >> Regards, >> Vignesh > > > Thanks for the patch, I have verified that the fix works well by following > the steps mentioned to reproduce the problem. > Reviewing the patch, it seems good and is well documented. Just one minor > comment I had was probably to change the name of the variable > table_states_valid to table_states_validity. The current name made sense when > it was a bool, but now that it is a tri-state enum, it doesn't fit well. Thanks for reviewing the patch, the attached v6 patch has the changes for the same. Regards, Vignesh From ab68e343b4b3f7ec09f758089a9f90a16f21de42 Mon Sep 17 00:00:00 2001 From: Vignesh C Date: Mon, 5 Feb 2024 14:55:48 +0530 Subject: [PATCH v6 2/2] Apply worker will get started after 180 seconds by the launcher in case the apply worker exits immediately after startup. Apply worker was getting started after 180 seconds tiemout of the launcher in case the apply worker exits immediately after startup. This was happening because the launcher process's latch was getting reset after the apply worker was started, which resulted in the launcher to wait for the next 180 seconds timeout before starting the apply worker. Fixed this issue by not resetting the latch, as this latch will be set for subscription modifications and apply worker exit. We should check if the new worker needs to be started or not and reset the latch in ApplyLauncherMain. --- src/backend/replication/logical/launcher.c | 17 +++-- 1 file changed, 7 insertions(+), 10 deletions(-) diff --git a/src/backend/replication/logical/launcher.c b/src/backend/replication/logical/launcher.c index 66070e9131..f2545482c8 100644 --- a/src/backend/replication/logical/launcher.c +++ b/src/backend/replication/logical/launcher.c @@ -185,7 +185,6 @@ WaitForReplicationWorkerAttach(LogicalRepWorker *worker, BackgroundWorkerHandle *handle) { BgwHandleStatus status; - int rc; for (;;) { @@ -220,16 +219,14 @@ WaitForReplicationWorkerAttach(LogicalRepWorker *worker, /* * We need timeout because we generally don't get notified via latch * about the worker attach. But we don't expect to have to wait long. + * Since this latch is also used for subscription creation/modification + * and apply worker process exit signal handling, the latch must not be + * reset here. We should check if the new worker needs to be started or + * not and reset the latch in ApplyLauncherMain. */ - rc = WaitLatch(MyLatch, - WL_LATCH_SET | WL_TIMEOUT | WL_EXIT_ON_PM_DEATH, - 10L, WAIT_EVENT_BGWORKER_STARTUP); - - if (rc & WL_LATCH_SET) - { - ResetLatch(MyLatch); - CHECK_FOR_INTERRUPTS(); - } + (void) WaitLatch(MyLatch, + WL_TIMEOUT | WL_EXIT_ON_PM_DEATH, + 10L, WAIT_EVENT_BGWORKER_STARTUP); } } -- 2.34.1 From 47c6cd886902693074ece4b2a1188dce6f066bf8 Mon Sep 17 00:00:00 2001 From: Vignesh C Date: Thu, 1 Feb 2024 09:46:40 +0530 Subject: [PATCH v6 1/2] Table sync missed for newly added tables because subscription relation cache invalidation was not handled in certain concurrent scenarios. Table sync was missed if the invalidation of table sync occurs while the non ready tables were getting prepared. This occurrs because the table state was being set to valid at the end of non ready table list preparation irrespective of any inavlidations occurred while preparing the list. Fixed it by changing the boolean variable to an tri-state enum and by setting table state to valid only if no invalidations have been occurred while the list is being prepared. --- src/backend/replication/logical/tablesync.c | 25 + src/tools/pgindent/typedefs.list| 1 + 2 files changed, 22 insertions(+), 4 deletions(-) diff --git a/src/backend/replication/logical/tablesync.c b/src/backend/replication/logical/tablesync.c index ee06629088..eefe620e5e 100644 --- a/src/backend/replication/logical/tablesync.c +++ b/src/backend/replication/logical/tablesync.c @@ -123,7 +123,14 @@ #include "utils/syscache.h" #include "utils/usercontext.h" -static bool table_states_valid = false; +typedef enum +{ + SYNC_TABLE_STATE_NEEDS_REBUILD, + SYNC_TABLE_STATE_REBUILD_STARTED, + SYNC_TABLE_STATE_VALID, +} SyncingTablesState; + +static SyncingTablesState table_states_validity = SYNC_TABLE_STATE_NEEDS_REBUILD; static List *table_states_not_ready = NIL; static bool FetchTableStates(bool *started_tx); @@ -273,7 +280,7 @@ wait_for_worker_state_change(char expected_state) void invalidate_syncing_table_states(Datum arg, int cacheid, uint32 hashvalue) { - table_states_valid = false; + table_states_validity = SYNC_TABLE_STATE_NEEDS_REBUILD; } /* @@ -1568,13 +1575,15 @@ FetchTableStates(bool *started_tx) *started_tx = false; - if
Re: pg_column_toast_chunk_id: a function to get a chunk ID of a TOASTed value
I did some light editing to prepare this for commit. -- Nathan Bossart Amazon Web Services: https://aws.amazon.com >From 38c78bd92a7b4d4600e6f0dbe58283c21ea87d50 Mon Sep 17 00:00:00 2001 From: Nathan Bossart Date: Tue, 12 Mar 2024 22:04:25 -0500 Subject: [PATCH v10 1/1] Add pg_column_toast_chunk_id(). This function returns the chunk_id of an on-disk TOASTed value, or NULL if the value is un-TOASTed or not on-disk. This is useful for identifying which values are actually TOASTed and for investigating "unexpected chunk number" errors. XXX: REQUIRES CATVERSION BUMP Author: Yugo Nagata Reviewed-by: Jian He Discussion: https://postgr.es/m/20230329105507.d764497456eeac1ca491b5bd%40sraoss.co.jp --- doc/src/sgml/func.sgml | 17 src/backend/utils/adt/varlena.c | 41 src/include/catalog/pg_proc.dat | 3 ++ src/test/regress/expected/misc_functions.out | 16 src/test/regress/sql/misc_functions.sql | 12 ++ 5 files changed, 89 insertions(+) diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml index 0bb7aeb40e..fe9c1a38b9 100644 --- a/doc/src/sgml/func.sgml +++ b/doc/src/sgml/func.sgml @@ -28552,6 +28552,23 @@ postgres=# SELECT '0/0'::pg_lsn + pd.segment_number * ps.setting::int + :offset + + + + pg_column_toast_chunk_id + +pg_column_toast_chunk_id ( "any" ) +oid + + +Shows the chunk_id of an on-disk +TOASTed value. Returns NULL +if the value is un-TOASTed or not on-disk. See + for more information about +TOAST. + + + diff --git a/src/backend/utils/adt/varlena.c b/src/backend/utils/adt/varlena.c index 543afb66e5..8d28dd42ce 100644 --- a/src/backend/utils/adt/varlena.c +++ b/src/backend/utils/adt/varlena.c @@ -5105,6 +5105,47 @@ pg_column_compression(PG_FUNCTION_ARGS) PG_RETURN_TEXT_P(cstring_to_text(result)); } +/* + * Return the chunk_id of the on-disk TOASTed value. Return NULL if the value + * is un-TOASTed or not on-disk. + */ +Datum +pg_column_toast_chunk_id(PG_FUNCTION_ARGS) +{ + int typlen; + struct varlena *attr; + struct varatt_external toast_pointer; + + /* On first call, get the input type's typlen, and save at *fn_extra */ + if (fcinfo->flinfo->fn_extra == NULL) + { + /* Lookup the datatype of the supplied argument */ + Oid argtypeid = get_fn_expr_argtype(fcinfo->flinfo, 0); + + typlen = get_typlen(argtypeid); + if (typlen == 0) /* should not happen */ + elog(ERROR, "cache lookup failed for type %u", argtypeid); + + fcinfo->flinfo->fn_extra = MemoryContextAlloc(fcinfo->flinfo->fn_mcxt, + sizeof(int)); + *((int *) fcinfo->flinfo->fn_extra) = typlen; + } + else + typlen = *((int *) fcinfo->flinfo->fn_extra); + + if (typlen != -1) + PG_RETURN_NULL(); + + attr = (struct varlena *) DatumGetPointer(PG_GETARG_DATUM(0)); + + if (!VARATT_IS_EXTERNAL_ONDISK(attr)) + PG_RETURN_NULL(); + + VARATT_EXTERNAL_GET_POINTER(toast_pointer, attr); + + PG_RETURN_OID(toast_pointer.va_valueid); +} + /* * string_agg - Concatenates values and returns string. * diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat index 291ed876fc..443ca854a6 100644 --- a/src/include/catalog/pg_proc.dat +++ b/src/include/catalog/pg_proc.dat @@ -7447,6 +7447,9 @@ { oid => '2121', descr => 'compression method for the compressed datum', proname => 'pg_column_compression', provolatile => 's', prorettype => 'text', proargtypes => 'any', prosrc => 'pg_column_compression' }, +{ oid => '8393', descr => 'chunk ID of on-disk TOASTed value', + proname => 'pg_column_toast_chunk_id', provolatile => 's', prorettype => 'oid', + proargtypes => 'any', prosrc => 'pg_column_toast_chunk_id' }, { oid => '2322', descr => 'total disk space usage for the specified tablespace', proname => 'pg_tablespace_size', provolatile => 'v', prorettype => 'int8', diff --git a/src/test/regress/expected/misc_functions.out b/src/test/regress/expected/misc_functions.out index d5f61dfad9..e0ba9fdafa 100644 --- a/src/test/regress/expected/misc_functions.out +++ b/src/test/regress/expected/misc_functions.out @@ -703,3 +703,19 @@ SELECT has_function_privilege('regress_current_logfile', (1 row) DROP ROLE regress_current_logfile; +-- pg_column_toast_chunk_id +CREATE TABLE test_chunk_id (a TEXT, b TEXT STORAGE EXTERNAL); +INSERT INTO test_chunk_id VALUES ('x', repeat('x', 8192)); +SELECT t.relname AS toastrel FROM pg_class c + LEFT JOIN pg_class t ON c.reltoastrelid = t.oid + WHERE c.relname = 'test_chunk_id' +\gset +SELECT pg_column_toast_chunk_id(a) IS NULL, + pg_column_toast_chunk_id(b) IN (SELECT chunk_id FROM pg_toast.:toastrel) + FROM test_chunk_id; + ?column? | ?column? +--+-- + t| t +(1 row) + +DROP TABLE test_chunk_id; diff --git a/src/test/regress/sql/misc_functions.sql b/src/test/regress/sql/misc_f
Re: Infinite loop in XLogPageRead() on standby
At Mon, 11 Mar 2024 16:43:32 +0900 (JST), Kyotaro Horiguchi wrote in > Oh, I once saw the fix work, but seems not to be working after some > point. The new issue was a corruption of received WAL records on the > first standby, and it may be related to the setting. I identified the cause of the second issue. When I tried to replay the issue, the second standby accidentally received the old timeline's last page-spanning record till the end while the first standby was promoting (but it had not been read by recovery). In addition to that, on the second standby, there's a time window where the timeline increased but the first segment of the new timeline is not available yet. In this case, the second standby successfully reads the page-spanning record in the old timeline even after the second standby noticed that the timeline ID has been increased, thanks to the robustness of XLogFileReadAnyTLI(). I think the primary change to XLogPageRead that I suggested is correct (assuming the use of wal_segment_size instead of the constant). However, still XLogFileReadAnyTLI() has a chance to read the segment from the old timeline after the second standby notices a timeline switch, leading to the second issue. The second issue was fixed by preventing XLogFileReadAnyTLI from reading segments from older timelines than those suggested by the latest timeline history. (In other words, disabling the "AnyTLI" part). I recall that there was a discussion for commit 4bd0ad9e44, about the objective of allowing reading segments from older timelines than the timeline history suggests. In my faint memory, we concluded to postpone making the decision to remove the feature due to uncertainity about the objective. If there's no clear reason to continue using XLogFileReadAnyTLI(), I suggest we stop its use and instead adopt XLogFileReadOnTLHistory(), which reads segments that align precisely with the timeline history. Of course, regardless of the changes above, if recovery on the second standby had reached the end of the page-spanning record before redirection to the first standby, it would need pg_rewind to connect to the first standby. regards. -- Kyotaro Horiguchi NTT Open Source Software Center
Re: Add new error_action COPY ON_ERROR "log"
On Wed, Mar 13, 2024 at 5:16 AM Michael Paquier wrote: > > On Tue, Mar 12, 2024 at 12:54:29PM +0530, Bharath Rupireddy wrote: > > +1. But, do you want to add them now or later as a separate > > patch/discussion altogether? > > The attached 0003 is what I had in mind: > - Simplification of the LOG generated with quotes applied around the > column name, don't see much need to add the relation name, either, for > consistency and because the knowledge is known in the query. > - A few more tests. > - Some doc changes. LGMT. So, I've merged those changes into 0001 and 0002. > >> Wouldn't it be better to squash the patches together, by the way? > > > > I guess not. They are two different things IMV. > > Well, 0001 is sitting doing nothing because the COPY code does not > make use of it internally. Yes. That's why I left a note in the commit message that a future commit will use it. -- Bharath Rupireddy PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com From 77b9f28121d6531f40d96f7c00ecdb860550b67f Mon Sep 17 00:00:00 2001 From: Bharath Rupireddy Date: Wed, 13 Mar 2024 02:20:42 + Subject: [PATCH v7 1/2] Add LOG_VERBOSITY option to COPY command This commit adds a new option LOG_VERBOSITY to set the verbosity of logged messages by COPY command. A value of 'verbose' can be used to emit more informative messages by the command, while the value of 'default (which is the default) can be used to not log any additional messages. More values such as 'terse', 'row_details' etc. can be added based on the need to the LOG_VERBOSITY option. An upcoming commit for emitting more info on soft errors by COPY FROM command with ON_ERROR 'ignore' uses this. Author: Bharath Rupireddy Reviewed-by: Michael Paquier, Masahiko Sawada Reviewed-by: Atsushi Torikoshi Discussion: https://www.postgresql.org/message-id/CALj2ACXNA0focNeriYRvQQaCGc4CsTuOnFbzF9LqTKNWxuJdhA%40mail.gmail.com --- doc/src/sgml/ref/copy.sgml | 14 +++ src/backend/commands/copy.c | 38 + src/bin/psql/tab-complete.c | 6 - src/include/commands/copy.h | 10 src/test/regress/expected/copy2.out | 8 ++ src/test/regress/sql/copy2.sql | 2 ++ src/tools/pgindent/typedefs.list| 1 + 7 files changed, 78 insertions(+), 1 deletion(-) diff --git a/doc/src/sgml/ref/copy.sgml b/doc/src/sgml/ref/copy.sgml index 55764fc1f2..eba9b8f64e 100644 --- a/doc/src/sgml/ref/copy.sgml +++ b/doc/src/sgml/ref/copy.sgml @@ -45,6 +45,7 @@ COPY { table_name [ ( column_name [, ...] ) | * } ON_ERROR 'error_action' ENCODING 'encoding_name' +LOG_VERBOSITY [ mode ] @@ -415,6 +416,19 @@ COPY { table_name [ ( + +LOG_VERBOSITY + + + Sets the verbosity of some of the messages logged by a + COPY command. + A mode value of + verbose can be used to emit more informative messages. + default will not log any additional messages. + + + + WHERE diff --git a/src/backend/commands/copy.c b/src/backend/commands/copy.c index 056b6733c8..23eb8c9c79 100644 --- a/src/backend/commands/copy.c +++ b/src/backend/commands/copy.c @@ -428,6 +428,36 @@ defGetCopyOnErrorChoice(DefElem *def, ParseState *pstate, bool is_from) return COPY_ON_ERROR_STOP; /* keep compiler quiet */ } +/* + * Extract a CopyLogVerbosityChoice value from a DefElem. + */ +static CopyLogVerbosityChoice +defGetCopyLogVerbosityChoice(DefElem *def, ParseState *pstate) +{ + char *sval; + + /* + * If no parameter value given, assume the default value. + */ + if (def->arg == NULL) + return COPY_LOG_VERBOSITY_DEFAULT; + + /* + * Allow "default", or "verbose" values. + */ + sval = defGetString(def); + if (pg_strcasecmp(sval, "default") == 0) + return COPY_LOG_VERBOSITY_DEFAULT; + if (pg_strcasecmp(sval, "verbose") == 0) + return COPY_LOG_VERBOSITY_VERBOSE; + + ereport(ERROR, + (errcode(ERRCODE_INVALID_PARAMETER_VALUE), + errmsg("COPY LOG_VERBOSITY \"%s\" not recognized", sval), + parser_errposition(pstate, def->location))); + return COPY_LOG_VERBOSITY_DEFAULT; /* keep compiler quiet */ +} + /* * Process the statement option list for COPY. * @@ -454,6 +484,7 @@ ProcessCopyOptions(ParseState *pstate, bool freeze_specified = false; bool header_specified = false; bool on_error_specified = false; + bool log_verbosity_specified = false; ListCell *option; /* Support external use for option sanity checking */ @@ -613,6 +644,13 @@ ProcessCopyOptions(ParseState *pstate, on_error_specified = true; opts_out->on_error = defGetCopyOnErrorChoice(defel, pstate, is_from); } + else if (strcmp(defel->defname, "log_verbosity") == 0) + { + if (log_verbosity_specified) +errorConflictingDefElem(defel, pstate); + log_verbosity_specified = true; + opts_out->log_verbosity = defGetCopyLogVerbosityChoice(defel, pstate); + } else ereport(ERROR, (errcode(ERRCODE_
Re: Improve eviction algorithm in ReorderBuffer
On Wed, Mar 13, 2024 at 12:48 PM Masahiko Sawada wrote: > > On Wed, Mar 13, 2024 at 10:15 AM Peter Smith wrote: > > > > On Tue, Mar 12, 2024 at 4:23 PM Masahiko Sawada > > wrote: > > > > > > On Fri, Mar 8, 2024 at 12:58 PM Peter Smith wrote: > > > > > > ... > > > > > > 5. > > > > > > + * > > > > > > + * If 'indexed' is true, we create a hash table to track of each > > > > > > node's > > > > > > + * index in the heap, enabling to perform some operations such as > > > > > > removing > > > > > > + * the node from the heap. > > > > > > */ > > > > > > binaryheap * > > > > > > -binaryheap_allocate(int capacity, binaryheap_comparator compare, > > > > > > void *arg) > > > > > > +binaryheap_allocate(int capacity, binaryheap_comparator compare, > > > > > > + bool indexed, void *arg) > > > > > > > > > > > > BEFORE > > > > > > ... enabling to perform some operations such as removing the node > > > > > > from the heap. > > > > > > > > > > > > SUGGESTION > > > > > > ... to help make operations such as removing nodes more efficient. > > > > > > > > > > > > > > > > But these operations literally require the indexed binary heap as we > > > > > have an assertion: > > > > > > > > > > void > > > > > binaryheap_remove_node_ptr(binaryheap *heap, bh_node_type d) > > > > > { > > > > > bh_nodeidx_entry *ent; > > > > > > > > > > Assert(!binaryheap_empty(heap) && heap->bh_has_heap_property); > > > > > Assert(heap->bh_indexed); > > > > > > > > > > > > > I didn’t quite understand -- the operations mentioned are "operations > > > > such as removing the node", but binaryheap_remove_node() also removes > > > > a node from the heap. So I still felt the comment wording of the patch > > > > is not quite correct. > > > > > > Now I understand your point. That's a valid point. > > > > > > > > > > > Now, if the removal of a node from an indexed heap can *only* be done > > > > using binaryheap_remove_node_ptr() then: > > > > - the other removal functions (binaryheap_remove_*) probably need some > > > > comments to make sure nobody is tempted to call them directly for an > > > > indexed heap. > > > > - maybe some refactoring and assertions are needed to ensure those > > > > *cannot* be called directly for an indexed heap. > > > > > > > > > > If the 'index' is true, the caller can not only use the existing > > > functions but also newly added functions such as > > > binaryheap_remove_node_ptr() and binaryheap_update_up() etc. How about > > > something like below? > > > > > > > You said: "can not only use the existing functions but also..." > > > > Hmm. Is that right? IIUC those existing "remove" functions should NOT > > be called directly if the heap was "indexed" because they'll delete > > the node from the heap OK, but any corresponding index for that > > deleted node will be left lying around -- i.e. everything gets out of > > sync. This was the reason for my original concern. > > > > All existing binaryheap functions should be available even if the > binaryheap is 'indexed'. For instance, with the patch, > binaryheap_remote_node() is: > > void > binaryheap_remove_node(binaryheap *heap, int n) > { > int cmp; > > Assert(!binaryheap_empty(heap) && heap->bh_has_heap_property); > Assert(n >= 0 && n < heap->bh_size); > > /* compare last node to the one that is being removed */ > cmp = heap->bh_compare(heap->bh_nodes[--heap->bh_size], >heap->bh_nodes[n], >heap->bh_arg); > > /* remove the last node, placing it in the vacated entry */ > replace_node(heap, n, heap->bh_nodes[heap->bh_size]); > > /* sift as needed to preserve the heap property */ > if (cmp > 0) > sift_up(heap, n); > else if (cmp < 0) > sift_down(heap, n); > } > > The replace_node(), sift_up() and sift_down() update node's index as > well if the binaryheap is indexed. When deleting the node from the > binaryheap, it will also delete its index from the hash table. > I see now. Thanks for the information. ~~~ Some more review comments for v8-0002 == 1. +/* + * Remove the node's index from the hash table if the heap is indexed. + */ +static bool +delete_nodeidx(binaryheap *heap, bh_node_type node) +{ + if (!binaryheap_indexed(heap)) + return false; + + return bh_nodeidx_delete(heap->bh_nodeidx, node); +} I wasn't sure if having this function was a good idea. Yes, it makes code more readable, but I felt the heap code ought to be as efficient as possible so maybe it is better for the index check to be done at the caller, instead of incurring any overhead of function calls that might do nothing. SUGGESTION if (binaryheap_indexed(heap)) found = bh_nodeidx_delete(heap->bh_nodeidx, node); ~~~ 2. +/* + * binaryheap_update_up + * + * Sift the given node up after the node's key is updated. The caller must + * ensure that the given node is in the heap. O(log n) worst case. + * + * This function can be used only if the heap is indexed. + */ +void
Re: CI speed improvements for FreeBSD
On Wed, Mar 13, 2024 at 4:50 AM Maxim Orlov wrote: > I looked at the changes and I liked them. Here are my thoughts: Thanks for looking! Pushed.
Re: Improve eviction algorithm in ReorderBuffer
On Wed, Mar 13, 2024 at 10:15 AM Peter Smith wrote: > > On Tue, Mar 12, 2024 at 4:23 PM Masahiko Sawada wrote: > > > > On Fri, Mar 8, 2024 at 12:58 PM Peter Smith wrote: > > > > ... > > > > > 5. > > > > > + * > > > > > + * If 'indexed' is true, we create a hash table to track of each > > > > > node's > > > > > + * index in the heap, enabling to perform some operations such as > > > > > removing > > > > > + * the node from the heap. > > > > > */ > > > > > binaryheap * > > > > > -binaryheap_allocate(int capacity, binaryheap_comparator compare, > > > > > void *arg) > > > > > +binaryheap_allocate(int capacity, binaryheap_comparator compare, > > > > > + bool indexed, void *arg) > > > > > > > > > > BEFORE > > > > > ... enabling to perform some operations such as removing the node > > > > > from the heap. > > > > > > > > > > SUGGESTION > > > > > ... to help make operations such as removing nodes more efficient. > > > > > > > > > > > > > But these operations literally require the indexed binary heap as we > > > > have an assertion: > > > > > > > > void > > > > binaryheap_remove_node_ptr(binaryheap *heap, bh_node_type d) > > > > { > > > > bh_nodeidx_entry *ent; > > > > > > > > Assert(!binaryheap_empty(heap) && heap->bh_has_heap_property); > > > > Assert(heap->bh_indexed); > > > > > > > > > > I didn’t quite understand -- the operations mentioned are "operations > > > such as removing the node", but binaryheap_remove_node() also removes > > > a node from the heap. So I still felt the comment wording of the patch > > > is not quite correct. > > > > Now I understand your point. That's a valid point. > > > > > > > > Now, if the removal of a node from an indexed heap can *only* be done > > > using binaryheap_remove_node_ptr() then: > > > - the other removal functions (binaryheap_remove_*) probably need some > > > comments to make sure nobody is tempted to call them directly for an > > > indexed heap. > > > - maybe some refactoring and assertions are needed to ensure those > > > *cannot* be called directly for an indexed heap. > > > > > > > If the 'index' is true, the caller can not only use the existing > > functions but also newly added functions such as > > binaryheap_remove_node_ptr() and binaryheap_update_up() etc. How about > > something like below? > > > > You said: "can not only use the existing functions but also..." > > Hmm. Is that right? IIUC those existing "remove" functions should NOT > be called directly if the heap was "indexed" because they'll delete > the node from the heap OK, but any corresponding index for that > deleted node will be left lying around -- i.e. everything gets out of > sync. This was the reason for my original concern. > All existing binaryheap functions should be available even if the binaryheap is 'indexed'. For instance, with the patch, binaryheap_remote_node() is: void binaryheap_remove_node(binaryheap *heap, int n) { int cmp; Assert(!binaryheap_empty(heap) && heap->bh_has_heap_property); Assert(n >= 0 && n < heap->bh_size); /* compare last node to the one that is being removed */ cmp = heap->bh_compare(heap->bh_nodes[--heap->bh_size], heap->bh_nodes[n], heap->bh_arg); /* remove the last node, placing it in the vacated entry */ replace_node(heap, n, heap->bh_nodes[heap->bh_size]); /* sift as needed to preserve the heap property */ if (cmp > 0) sift_up(heap, n); else if (cmp < 0) sift_down(heap, n); } The replace_node(), sift_up() and sift_down() update node's index as well if the binaryheap is indexed. When deleting the node from the binaryheap, it will also delete its index from the hash table. Regards, -- Masahiko Sawada Amazon Web Services: https://aws.amazon.com
Re: [PoC] Improve dead tuple storage for lazy vacuum
On Tue, Mar 12, 2024 at 7:34 PM John Naylor wrote: > > On Mon, Mar 11, 2024 at 3:13 PM Masahiko Sawada wrote: > > > > On Mon, Mar 11, 2024 at 12:20 PM John Naylor > > wrote: > > > > > > On Thu, Mar 7, 2024 at 10:35 PM Masahiko Sawada > > > wrote: > > > > + ts->context = CurrentMemoryContext; > > > > > > As far as I can tell, this member is never accessed again -- am I > > > missing something? > > > > You're right. It was used to re-create the tidstore in the same > > context again while resetting it, but we no longer support the reset > > API. Considering it again, would it be better to allocate the iterator > > struct in the same context as we store the tidstore struct? > > That makes sense. > > > > + /* DSA for tidstore will be detached at the end of session */ > > > > > > No other test module pins the mapping, but that doesn't necessarily > > > mean it's wrong. Is there some advantage over explicitly detaching? > > > > One small benefit of not explicitly detaching dsa_area in > > tidstore_destroy() would be simplicity; IIUC if we want to do that, we > > need to remember the dsa_area using (for example) a static variable, > > and free it if it's non-NULL. I've implemented this idea in the > > attached patch. > > Okay, I don't have a strong preference at this point. I'd keep the update on that. > > > > +-- Add tids in random order. > > > > > > I don't see any randomization here. I do remember adding row_number to > > > remove whitespace in the output, but I don't remember a random order. > > > On that subject, the row_number was an easy trick to avoid extra > > > whitespace, but maybe we should just teach the setting function to > > > return blocknumber rather than null? > > > > Good idea, fixed. > > + test_set_block_offsets > + > + 2147483647 > + 0 > + 4294967294 > + 1 > + 4294967295 > > Hmm, was the earlier comment about randomness referring to this? I'm > not sure what other regression tests do in these cases, or how > relibale this is. If this is a problem we could simply insert this > result into a temp table so it's not output. I didn't address the comment about randomness. I think that we will have both random TIDs tests and fixed TIDs tests in test_tidstore as we discussed, and probably we can do both tests with similar steps; insert TIDs into both a temp table and tidstore and check if the tidstore returned the results as expected by comparing the results to the temp table. Probably we can have a common pl/pgsql function that checks that and raises a WARNING or an ERROR. Given that this is very similar to what we did in test_radixtree, why do we really want to implement it using a pl/pgsql function? When we discussed it before, I found the current way makes sense. But given that we're adding more tests and will add more tests in the future, doing the tests in C will be more maintainable and faster. Also, I think we can do the debug-build array stuff in the test_tidstore code instead. > > > > +Datum > > > +tidstore_create(PG_FUNCTION_ARGS) > > > +{ > > > ... > > > + tidstore = TidStoreCreate(max_bytes, dsa); > > > > > > +Datum > > > +tidstore_set_block_offsets(PG_FUNCTION_ARGS) > > > +{ > > > > > > + TidStoreSetBlockOffsets(tidstore, blkno, offs, noffs); > > > > > > These names are too similar. Maybe the test module should do > > > s/tidstore_/test_/ or similar. > > > > Agreed. > > Mostly okay, although a couple look a bit generic now. I'll leave it > up to you if you want to tweak things. > > > > In general, the .sql file is still very hard-coded. Functions are > > > created that contain a VALUES statement. Maybe it's okay for now, but > > > wanted to mention it. Ideally, we'd have some randomized tests, > > > without having to display it. That could be in addition to (not > > > replacing) the small tests we have that display input. (see below) > > > > > > > Agreed to add randomized tests in addition to the existing tests. > > I'll try something tomorrow. > > > Sounds a good idea. In fact, if there are some bugs in tidstore, it's > > likely that even initdb would fail in practice. However, it's a very > > good idea that we can test the tidstore anyway with such a check > > without a debug-build array. > > > > Or as another idea, I wonder if we could keep the debug-build array in > > some form. For example, we use the array with the particular build > > flag and set a BF animal for that. That way, we can test the tidstore > > in more real cases. > > I think the purpose of a debug flag is to help developers catch > mistakes. I don't think it's quite useful enough for that. For one, it > has the same 1GB limitation as vacuum's current array. For another, > it'd be a terrible way to debug moving tidbitmap.c from its hash table > to use TID store -- AND/OR operations and lossy pages are pretty much > undoable with a copy of vacuum's array. Valid points. As I mentioned above, if we im
Re: Improve eviction algorithm in ReorderBuffer
On Tue, Mar 12, 2024 at 4:23 PM Masahiko Sawada wrote: > > On Fri, Mar 8, 2024 at 12:58 PM Peter Smith wrote: > > ... > > > > 5. > > > > + * > > > > + * If 'indexed' is true, we create a hash table to track of each node's > > > > + * index in the heap, enabling to perform some operations such as > > > > removing > > > > + * the node from the heap. > > > > */ > > > > binaryheap * > > > > -binaryheap_allocate(int capacity, binaryheap_comparator compare, void > > > > *arg) > > > > +binaryheap_allocate(int capacity, binaryheap_comparator compare, > > > > + bool indexed, void *arg) > > > > > > > > BEFORE > > > > ... enabling to perform some operations such as removing the node from > > > > the heap. > > > > > > > > SUGGESTION > > > > ... to help make operations such as removing nodes more efficient. > > > > > > > > > > But these operations literally require the indexed binary heap as we > > > have an assertion: > > > > > > void > > > binaryheap_remove_node_ptr(binaryheap *heap, bh_node_type d) > > > { > > > bh_nodeidx_entry *ent; > > > > > > Assert(!binaryheap_empty(heap) && heap->bh_has_heap_property); > > > Assert(heap->bh_indexed); > > > > > > > I didn’t quite understand -- the operations mentioned are "operations > > such as removing the node", but binaryheap_remove_node() also removes > > a node from the heap. So I still felt the comment wording of the patch > > is not quite correct. > > Now I understand your point. That's a valid point. > > > > > Now, if the removal of a node from an indexed heap can *only* be done > > using binaryheap_remove_node_ptr() then: > > - the other removal functions (binaryheap_remove_*) probably need some > > comments to make sure nobody is tempted to call them directly for an > > indexed heap. > > - maybe some refactoring and assertions are needed to ensure those > > *cannot* be called directly for an indexed heap. > > > > If the 'index' is true, the caller can not only use the existing > functions but also newly added functions such as > binaryheap_remove_node_ptr() and binaryheap_update_up() etc. How about > something like below? > You said: "can not only use the existing functions but also..." Hmm. Is that right? IIUC those existing "remove" functions should NOT be called directly if the heap was "indexed" because they'll delete the node from the heap OK, but any corresponding index for that deleted node will be left lying around -- i.e. everything gets out of sync. This was the reason for my original concern. > * If 'indexed' is true, we create a hash table to track each node's > * index in the heap, enabling to perform some operations such as > * binaryheap_remove_node_ptr() etc. > Yeah, something like that... I'll wait for the next patch version before commenting further. -- Kind Regards, Peter Smith. Fujitsu Australia
Re: Support json_errdetail in FRONTEND builds
On 2024-03-12 Tu 14:43, Jacob Champion wrote: Hello, Both the incremental JSON [1] and OAuth [2] patchsets would be improved by json_errdetail(), which was removed from FRONTEND builds in b44669b2ca: The routine providing a detailed error message based on the error code is made backend-only, the existing code being unsafe to use in the frontend as the error message may finish by being palloc'd or point to a static string, so there is no way to know if the memory of the message should be pfree'd or not. Attached is a patch to undo this, by attaching any necessary allocations to the JsonLexContext so the caller doesn't have to keep track of them. This is based on the first patch of the OAuth patchset, which additionally needs json_errdetail() to be safe to use from libpq itself. Alvaro pointed out offlist that we don't have to go that far to re-enable this function for the utilities, so this patch is a sort of middle ground between what we have now and what OAuth implements. (There is some additional minimization that could be done to this patch, but I'm hoping to keep the code structure consistent between the two, if the result is acceptable.) Seems reasonable. Two notes that I wanted to point out explicitly: - On the other thread, Daniel contributed a destroyStringInfo() counterpart for makeStringInfo(), which is optional but seemed useful to include here. yeah, although maybe worth a different patch. - After this patch, if a frontend client calls json_errdetail() without calling freeJsonLexContext(), it will leak memory. Not too concerned about this: 1. we tend to be a bit more relaxed about that in frontend programs, especially those not expected to run for long times and especially on error paths, where in many cases the program will just exit anyway. 2. the fix is simple where it's needed. cheers andrew -- Andrew Dunstan EDB: https://www.enterprisedb.com
Re: [PROPOSAL] Skip test citext_utf8 on Windows
On 2024-03-11 Mo 22:50, Thomas Munro wrote: On Tue, Mar 12, 2024 at 2:56 PM Andrew Dunstan wrote: On 2024-03-11 Mo 04:21, Oleg Tselebrovskiy wrote: Greetings, everyone! While running "installchecks" on databases with UTF-8 encoding the test citext_utf8 fails because of Turkish dotted I like this: SELECT 'i'::citext = 'İ'::citext AS t; t --- - t + f (1 row) I tried to replicate the test's results by hand and with any collation that I tried (including --locale="Turkish") this test failed Also an interesing result of my tesing. If you initialize you DB with -E utf-8 --locale="Turkish" and then run select LOWER('İ'); the output will be this: lower --- İ (1 row) Which I find strange since lower() uses collation that was passed (default in this case but still) Wouldn't we be better off finding a Windows fix for this, instead of sweeping it under the rug? Given the sorry state of our Windows locale support, I've started wondering about deleting it and telling users to adopt our nascent built-in support or ICU[1]. This other thread [2] says the sorting is intransitive so I don't think it really meets our needs anyway. [1] https://www.postgresql.org/message-id/flat/CA%2BhUKGJhV__g_TJ0jVqPbnTuqT%2B%2BM6KFv2wj%2B9AV-cABNCXN6Q%40mail.gmail.com#bc35c0b88962ff8c24c27aecc1bca72e [2] https://www.postgresql.org/message-id/flat/1407a2c0-062b-4e4c-b728-438fdff5cb07%40manitou-mail.org Makes more sense than just hacking the tests to avoid running them on Windows. (I also didn't much like doing it by parsing the version string, although I know there's at least one precedent for doing that.) cheers andrew -- Andrew Dunstan EDB: https://www.enterprisedb.com
Re: Transaction timeout
On Tue, Mar 12, 2024 at 10:28 AM Andrey M. Borodin wrote: > > On 11 Mar 2024, at 16:18, Alexander Korotkov wrote: > > > > I think if checking psql stderr is problematic, checking just logs is > > fine. Could we wait for the relevant log messages one by one with > > $node->wait_for_log() just like 040_standby_failover_slots_sync.pl do? > > PFA version with $node->wait_for_log() I've slightly revised the patch. I'm going to push it if no objections. -- Regards, Alexander Korotkov v5-0001-Add-TAP-tests-for-timeouts.patch Description: Binary data
Re: Add basic tests for the low-level backup method.
On 2/29/24 16:55, Michael Paquier wrote: On Thu, Feb 29, 2024 at 10:30:52AM +1300, David Steele wrote: On 11/12/23 08:21, David Steele wrote: As promised in [1], attached are some basic tests for the low-level backup method. Added to the 2024-03 CF. There is already 040_standby_failover_slots_sync.pl in recovery/ that uses the number of your test script. You may want to bump it, that's a nit. Renamed to 042_low_level_backup.pl. +unlink("$backup_dir/postmaster.pid") + or BAIL_OUT("unable to unlink $backup_dir/postmaster.pid"); +unlink("$backup_dir/postmaster.opts") + or BAIL_OUT("unable to unlink $backup_dir/postmaster.opts"); +unlink("$backup_dir/global/pg_control") + or BAIL_OUT("unable to unlink $backup_dir/global/pg_control"); RELCACHE_INIT_FILENAME as well? I'm not trying to implement the full exclusion list here, just enough to get the test working. Since exclusions are optional according to the docs I don't think we need them for a valid tests. +# Rather than writing out backup_label, try to recover the backup without +# backup_label to demonstrate that recovery will not work correctly without it, +# i.e. the canary table will be missing and the cluster will be corrupt. Provide +# only the WAL segment that recovery will think it needs. Okay, why not. No objections to this addition. I am a bit surprised that this is not scanning some of the logs produced by the startup process for particular patterns. Not sure what to look for here. There are no distinct messages for crash recovery. Perhaps there should be? +# Save backup_label into the backup directory and recover using the primary's +# archive. This time recovery will succeed and the canary table will be +# present. Here are well, I think that we should add some log_contains() with pre-defined patterns to show that recovery has completed the way we want it with a backup_label up to the end-of-backup record. Sure, I added a check for the new log message when recovering with a backup_label. Regards, -DavidFrom 076429a578ece3c213525220a4961cb5b56c190a Mon Sep 17 00:00:00 2001 From: David Steele Date: Wed, 13 Mar 2024 00:01:55 + Subject: Add basic tests for the low-level backup method. There are currently no tests for the low-level backup method. pg_backup_start() and pg_backup_stop() are called but not exercised in any real fashion. There is a lot more that can be done, but this at least provides some basic tests and provides a place for future improvement. --- src/test/recovery/t/042_low_level_backup.pl | 108 1 file changed, 108 insertions(+) create mode 100644 src/test/recovery/t/042_low_level_backup.pl diff --git a/src/test/recovery/t/042_low_level_backup.pl b/src/test/recovery/t/042_low_level_backup.pl new file mode 100644 index 00..22668bdc0d --- /dev/null +++ b/src/test/recovery/t/042_low_level_backup.pl @@ -0,0 +1,108 @@ + +# Copyright (c) 2023, PostgreSQL Global Development Group + +# Test low-level backup method by using pg_backup_start() and pg_backup_stop() +# to create backups. + +use strict; +use warnings; + +use File::Copy qw(copy); +use File::Path qw(remove_tree); +use PostgreSQL::Test::Cluster; +use PostgreSQL::Test::Utils; +use Test::More; + +# Start primary node with archiving +my $node_primary = PostgreSQL::Test::Cluster->new('primary'); +$node_primary->init(has_archiving => 1, allows_streaming => 1); +$node_primary->start; + +# Start backup +my $backup_name = 'backup1'; +my $psql = $node_primary->background_psql('postgres'); + +$psql->query_safe("SET client_min_messages TO WARNING"); +$psql->set_query_timer_restart(); +$psql->query_safe("select pg_backup_start('test label')"); + +# Copy files +my $backup_dir = $node_primary->backup_dir() . '/' . $backup_name; + +PostgreSQL::Test::RecursiveCopy::copypath( + $node_primary->data_dir(), $backup_dir); + +# Cleanup some files/paths that should not be in the backup. There is no +# attempt to all the exclusions done by pg_basebackup here, in part because +# they are not required, but also to keep the test simple. +# +# Also remove pg_control because it needs to be copied later and it will be +# obvious if the copy fails. +unlink("$backup_dir/postmaster.pid") + or BAIL_OUT("unable to unlink $backup_dir/postmaster.pid"); +unlink("$backup_dir/postmaster.opts") + or BAIL_OUT("unable to unlink $backup_dir/postmaster.opts"); +unlink("$backup_dir/global/pg_control") + or BAIL_OUT("unable to unlink $backup_dir/global/pg_control"); + +remove_tree("$backup_dir/pg_wal", {keep_root => 1}) + or BAIL_OUT("unable to unlink contents of $backup_dir/pg_wal"); + +# Create a table that will be used to verify that recovery started at the +# correct location, rather than the location recorded in pg_control +$psql->query_safe("create table canary (id int)"); + +# Advance the checkpoint location in pg_control past the location where the +# backup started. Switch the WAL to make it really ob
Re: Disable LLVM bitcode generation with pgxs.mk framework.
> On Tue, Mar 12, 2024 at 10:40 PM Daniel Gustafsson wrote: > > > On 12 Mar 2024, at 14:38, Xing Guo wrote: > > > Would it be possible to add a new switch in the pgxs.mk framework to > > allow users to disable this feature? > > Something like that doesn't seem unreasonable I think. Thanks. I added a new option NO_LLVM_BITCODE to pgxs. I'm not sure if the name is appropriate. > -- > Daniel Gustafsson > From e19a724fad4949bef9bc4d0f8e58719607d979be Mon Sep 17 00:00:00 2001 From: Xing Guo Date: Wed, 13 Mar 2024 07:56:46 +0800 Subject: [PATCH v1] Add NO_LLVM_BITCODE option to pgxs. This patch adds a new option NO_LLVM_BITCODE to pgxs to allow user to disable LLVM bitcode generation completely even if the PostgreSQL installation is configured with --with-llvm. --- doc/src/sgml/extend.sgml | 9 + src/makefiles/pgxs.mk| 6 ++ 2 files changed, 15 insertions(+) diff --git a/doc/src/sgml/extend.sgml b/doc/src/sgml/extend.sgml index 218940ee5c..6fe69746c2 100644 --- a/doc/src/sgml/extend.sgml +++ b/doc/src/sgml/extend.sgml @@ -1719,6 +1719,15 @@ include $(PGXS) + + NO_LLVM_BITCODE + + +don't generate LLVM bitcode even if the current PostgreSQL installation is configured with --with-llvm + + + + EXTRA_CLEAN diff --git a/src/makefiles/pgxs.mk b/src/makefiles/pgxs.mk index 0de3737e78..ec6a3c1f09 100644 --- a/src/makefiles/pgxs.mk +++ b/src/makefiles/pgxs.mk @@ -53,6 +53,8 @@ # that don't need their build products to be installed # NO_INSTALLCHECK -- don't define an installcheck target, useful e.g. if # tests require special configuration, or don't use pg_regress +# NO_LLVM_BITCODE -- don't generate LLVM bitcode even if the current +# PostgreSQL installation is configured with --with-llvm # EXTRA_CLEAN -- extra files to remove in 'make clean' # PG_CPPFLAGS -- will be prepended to CPPFLAGS # PG_CFLAGS -- will be appended to CFLAGS @@ -218,6 +220,10 @@ endef all: $(PROGRAM) $(DATA_built) $(HEADER_allbuilt) $(SCRIPTS_built) $(addsuffix $(DLSUFFIX), $(MODULES)) $(addsuffix .control, $(EXTENSION)) +ifdef NO_LLVM_BITCODE +with_llvm := no +endif # NO_LLVM_BITCODE + ifeq ($(with_llvm), yes) all: $(addsuffix .bc, $(MODULES)) $(patsubst %.o,%.bc, $(OBJS)) endif -- 2.44.0
Re: meson: Specify -Wformat as a common warning flag for extensions
Hi, In "Re: meson: Specify -Wformat as a common warning flag for extensions" on Fri, 08 Mar 2024 10:05:27 -0600, "Tristan Partin" wrote: > Ok, I figured this out. -Wall implies -Wformat=1. We set warning_level > to 1 in the Meson project() call, which implies -Wall, and set -Wall > in CFLAGS for autoconf. That's the reason we don't get issues building > Postgres. A user making use of the pg_config --cflags option, as Sutou > is, *will* run into the aforementioned issues, since we don't > propogate -Wall into pg_config. > > $ gcc $(pg_config --cflags) -E - < /dev/null > /dev/null > cc1: warning: ‘-Wformat-security’ ignored without ‘-Wformat’ > [-Wformat-security] > $ gcc -Wall $(pg_config --cflags) -E - < /dev/null > /dev/null > (nothing printed) Thanks for explaining this. You're right. This is the reason why we don't need this for PostgreSQL itself but we need this for PostgreSQL extensions. Sorry. I should have explained this in the first e-mail... What should we do to proceed this patch? Thanks, -- kou
Re: Add new error_action COPY ON_ERROR "log"
On Tue, Mar 12, 2024 at 12:54:29PM +0530, Bharath Rupireddy wrote: > +1. But, do you want to add them now or later as a separate > patch/discussion altogether? The attached 0003 is what I had in mind: - Simplification of the LOG generated with quotes applied around the column name, don't see much need to add the relation name, either, for consistency and because the knowledge is known in the query. - A few more tests. - Some doc changes. >> Wouldn't it be better to squash the patches together, by the way? > > I guess not. They are two different things IMV. Well, 0001 is sitting doing nothing because the COPY code does not make use of it internally. -- Michael From c45474726e084faf876a319485995ce84eef8293 Mon Sep 17 00:00:00 2001 From: Bharath Rupireddy Date: Mon, 11 Mar 2024 11:37:49 + Subject: [PATCH v6 1/3] Add LOG_VERBOSITY option to COPY command This commit adds a new option LOG_VERBOSITY to set the verbosity of logged messages by COPY command. A value of 'verbose' can be used to emit more informative messages by the command, while the value of 'default (which is the default) can be used to not log any additional messages. More values such as 'terse', 'row_details' etc. can be added based on the need to the LOG_VERBOSITY option. An upcoming commit for emitting more info on soft errors by COPY FROM command with ON_ERROR 'ignore' uses this. Author: Bharath Rupireddy Reviewed-by: Michael Paquier, Masahiko Sawada Reviewed-by: Atsushi Torikoshi Discussion: https://www.postgresql.org/message-id/CALj2ACXNA0focNeriYRvQQaCGc4CsTuOnFbzF9LqTKNWxuJdhA%40mail.gmail.com --- src/include/commands/copy.h | 10 src/backend/commands/copy.c | 38 + src/bin/psql/tab-complete.c | 6 - src/test/regress/expected/copy2.out | 8 ++ src/test/regress/sql/copy2.sql | 2 ++ doc/src/sgml/ref/copy.sgml | 14 +++ src/tools/pgindent/typedefs.list| 1 + 7 files changed, 78 insertions(+), 1 deletion(-) diff --git a/src/include/commands/copy.h b/src/include/commands/copy.h index b3da3cb0be..99d183fa4d 100644 --- a/src/include/commands/copy.h +++ b/src/include/commands/copy.h @@ -40,6 +40,15 @@ typedef enum CopyOnErrorChoice COPY_ON_ERROR_IGNORE, /* ignore errors */ } CopyOnErrorChoice; +/* + * Represents verbosity of logged messages by COPY command. + */ +typedef enum CopyLogVerbosityChoice +{ + COPY_LOG_VERBOSITY_DEFAULT = 0, /* logs no additional messages, default */ + COPY_LOG_VERBOSITY_VERBOSE, /* logs additional messages */ +} CopyLogVerbosityChoice; + /* * A struct to hold COPY options, in a parsed form. All of these are related * to formatting, except for 'freeze', which doesn't really belong here, but @@ -73,6 +82,7 @@ typedef struct CopyFormatOptions bool *force_null_flags; /* per-column CSV FN flags */ bool convert_selectively; /* do selective binary conversion? */ CopyOnErrorChoice on_error; /* what to do when error happened */ + CopyLogVerbosityChoice log_verbosity; /* verbosity of logged messages */ List *convert_select; /* list of column names (can be NIL) */ } CopyFormatOptions; diff --git a/src/backend/commands/copy.c b/src/backend/commands/copy.c index 056b6733c8..23eb8c9c79 100644 --- a/src/backend/commands/copy.c +++ b/src/backend/commands/copy.c @@ -428,6 +428,36 @@ defGetCopyOnErrorChoice(DefElem *def, ParseState *pstate, bool is_from) return COPY_ON_ERROR_STOP; /* keep compiler quiet */ } +/* + * Extract a CopyLogVerbosityChoice value from a DefElem. + */ +static CopyLogVerbosityChoice +defGetCopyLogVerbosityChoice(DefElem *def, ParseState *pstate) +{ + char *sval; + + /* + * If no parameter value given, assume the default value. + */ + if (def->arg == NULL) + return COPY_LOG_VERBOSITY_DEFAULT; + + /* + * Allow "default", or "verbose" values. + */ + sval = defGetString(def); + if (pg_strcasecmp(sval, "default") == 0) + return COPY_LOG_VERBOSITY_DEFAULT; + if (pg_strcasecmp(sval, "verbose") == 0) + return COPY_LOG_VERBOSITY_VERBOSE; + + ereport(ERROR, + (errcode(ERRCODE_INVALID_PARAMETER_VALUE), + errmsg("COPY LOG_VERBOSITY \"%s\" not recognized", sval), + parser_errposition(pstate, def->location))); + return COPY_LOG_VERBOSITY_DEFAULT; /* keep compiler quiet */ +} + /* * Process the statement option list for COPY. * @@ -454,6 +484,7 @@ ProcessCopyOptions(ParseState *pstate, bool freeze_specified = false; bool header_specified = false; bool on_error_specified = false; + bool log_verbosity_specified = false; ListCell *option; /* Support external use for option sanity checking */ @@ -613,6 +644,13 @@ ProcessCopyOptions(ParseState *pstate, on_error_specified = true; opts_out->on_error = defGetCopyOnErrorChoice(defel, pstate, is_from); } + else if (strcmp(defel->defname, "log_verbosity") == 0) + { + if (log_verbosity_specified) +errorConflictingDefElem(defel, pstate); + log_verbosity_specified = true; + opts_out->
Re: perl: unsafe empty pattern behavior
Jeff Davis writes: > On Tue, 2024-03-12 at 18:53 +0100, Alvaro Herrera wrote: >> I also tried grepping (for things >> like qr{}, qr[], qr||, qr!!) and didn't find anything beyond what you >> have ... but I only looked for the "qr" literal, not other ways to >> get regexes. > I think that's fine. qr// seems the most dangerous, because it seems to > behave differently in different versions of perl. I wonder whether perlcritic has sufficiently deep understanding of Perl code that it could find these hazards. I already checked, and found that there's no built-in filter for this (at least not in the perlcritic version I have), but maybe we could write one? The rules seem to be plug-in modules, so you can make your own in principle. regards, tom lane
Re: perl: unsafe empty pattern behavior
On Tue, 2024-03-12 at 18:53 +0100, Alvaro Herrera wrote: > I suggest that pg_dump/t/002_pg_dump.pl could use a verification that > the ->{regexp} thing is not empty. I'm not sure how exactly to test for an empty pattern. The problem is, we don't really want to test for an empty pattern, because /(?^:)/ is fine. The problem is //, which gets turned into an actual pattern (perhaps empty or perhaps not), and by the time it's in the %tests hash, I think it's too late to distinguish. Again, I'm not a perl expert, so suggestions welcome. > I also tried grepping (for things > like qr{}, qr[], qr||, qr!!) and didn't find anything beyond what you > have ... but I only looked for the "qr" literal, not other ways to > get > regexes. I think that's fine. qr// seems the most dangerous, because it seems to behave differently in different versions of perl. Grepping for regexes in perl code is an "interesting" exercise. Regards, Jeff Davis
Re: [EXTERNAL] Re: Add non-blocking version of PQcancel
On Tue, 12 Mar 2024 at 19:28, Alvaro Herrera wrote: > > On 2024-Mar-12, Alvaro Herrera wrote: > > > Hmm, buildfarm member kestrel (which uses > > -fsanitize=undefined,alignment) failed: > > > > # Running: libpq_pipeline -r 700 cancel port=49975 host=/tmp/dFh46H7YGc > > dbname='postgres' > > test cancellations... > > libpq_pipeline:260: query did not fail when it was expected > > Hm, I tried using the same compile flags, couldn't reproduce. Okay, it passed now it seems so I guess this test is flaky somehow. The error message and the timing difference between failed and succeeded buildfarm run clearly indicates that the pg_sleep ran its 180 seconds to completion (so cancel was never processed for some reason). **failed case** 282/285 postgresql:libpq_pipeline / libpq_pipeline/001_libpq_pipeline ERROR 191.56s exit status 1 **succeeded case** 252/285 postgresql:libpq_pipeline / libpq_pipeline/001_libpq_pipeline OK 10.01s 21 subtests passed I don't see any obvious reason for how this test can be flaky, but I'll think a bit more about it tomorrow.
Re: Spurious pgstat_drop_replslot() call
On Tue, Mar 12, 2024 at 12:00:00PM +0300, Alexander Lakhin wrote: > Sorry for a bit off-topic, but I've remembered an anomaly with a similar > assert: > https://www.postgresql.org/message-id/17947-b9554521ad963c9c%40postgresql.org Thanks for the reminder. The invalidation path with the stats drop is only in 16~. > Maybe you would find it worth considering while working in this area... > (I've just run that reproducer on b36fbd9f8 and confirmed that the > assertion failure is still here.) Indeed, something needs to happen. I am not surprised that it still reproduces; nothing has changed with the locking of the stats entries. :/ -- Michael signature.asc Description: PGP signature
Recent 027_streaming_regress.pl hangs
Hi, Several animals are timing out while waiting for catchup, sporadically. I don't know why. The oldest example I have found so far by clicking around is: https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=skink&dt=2024-02-23%2015%3A44%3A35 So perhaps something was committed ~3 weeks ago triggered this. There are many examples since, showing as recoveryCheck failures. Apparently they are all on animals wrangled by Andres. Hmm. I think some/all share a physical host, they seem to have quite high run time variance, and they're using meson.
Re: Vectored I/O in bulk_write.c
One more observation while I'm thinking about bulk_write.c... hmm, it writes the data out and asks the checkpointer to fsync it, but doesn't call smgrwriteback(). I assume that means that on Linux the physical writeback sometimes won't happen until the checkpointer eventually calls fsync() sequentially, one segment file at a time. I see that it's difficult to decide how to do that though; unlike checkpoints, which have rate control/spreading, bulk writes could more easily flood the I/O subsystem in a burst. I expect most non-Linux systems' write-behind heuristics to fire up for bulk sequential writes, but on Linux where most users live, there is no write-behind heuristic AFAIK (on the most common file systems anyway), so you have to crank that handle if you want it to wake up and smell the coffee before it hits internal limits, but then you have to decide how fast to crank it. This problem will come into closer focus when we start talking about streaming writes. For the current non-streaming bulk_write.c coding, I don't have any particular idea of what to do about that, so I'm just noting the observation here. Sorry for the sudden wall of text/monologue; this is all a sort of reaction to bulk_write.c that I should perhaps have sent to the bulk_write.c thread, triggered by a couple of debugging sessions.
Re: On disable_cost
David Rowley writes: > So maybe the fix could be to set disable_cost to something like > 1.0e110 and adjust compare_path_costs_fuzzily to not apply the > fuzz_factor for paths >= disable_cost. However, I wonder if that > risks the costs going infinite after a couple of cartesian joins. Perhaps. It still does nothing for Robert's point that once we're forced into using a "disabled" plan type, it'd be better if the disabled-ness didn't skew subsequent planning choices. On the whole I agree that getting rid of disable_cost entirely would be the way to go, if we can replace that with a separate boolean without driving up the cost of add_path too much. regards, tom lane
Re: un-revert the MAINTAIN privilege and the pg_maintain predefined role
On Thu, Mar 07, 2024 at 10:50:00AM -0600, Nathan Bossart wrote: > Given all of this code was previously reviewed and committed, I am planning > to forge ahead and commit this early next week, provided no objections or > additional feedback materialize. Jeff Davis and I spent some additional time looking at this patch. There are existing inconsistencies among the privilege checks for the various maintenance commands, and the MAINTAIN privilege just builds on the status quo, with one exception. In the v1 patch, I proposed skipping privilege checks when VACUUM recurses to TOAST tables, which means that a user may be able to process a TOAST table for which they've concurrent lost privileges on the main relation (since each table is vacuumed in a separate transaction). It's easy enough to resolve this inconsistency by sending down the parent OID when recursing to a TOAST table and using that for the privilege checks. AFAICT this avoids any kind of cache lookup hazards because we hold a session lock on the main relation in this case. I've done this in the attached v2. -- Nathan Bossart Amazon Web Services: https://aws.amazon.com >From 1d58f64b1dcded53bd95c926c5476e129352c753 Mon Sep 17 00:00:00 2001 From: Nathan Bossart Date: Tue, 12 Mar 2024 10:55:33 -0500 Subject: [PATCH v2 1/1] Reintroduce MAINTAIN privilege and pg_maintain predefined role. Roles with MAINTAIN on a relation may run VACUUM, ANALYZE, REINDEX, REFRESH MATERIALIZE VIEW, CLUSTER, and LOCK TABLE on the relation. Roles with privileges of pg_maintain may run those same commands on all relations. This was previously committed for v16, but it was reverted in commit 151c22deee due to concerns about search_path tricks that could be used to escalate privileges to the table owner. Commits 2af07e2f74, 59825d1639, and c7ea3f4229 resolved these concerns by restricting search_path when running maintenance commands. XXX: NEEDS CATVERSION BUMP Reviewed-by: Jeff Davis Discussion: https://postgr.es/m/20240305161235.GA3478007%40nathanxps13 --- doc/src/sgml/ddl.sgml | 35 -- doc/src/sgml/func.sgml| 2 +- .../sgml/ref/alter_default_privileges.sgml| 4 +- doc/src/sgml/ref/analyze.sgml | 6 +- doc/src/sgml/ref/cluster.sgml | 10 +- doc/src/sgml/ref/grant.sgml | 3 +- doc/src/sgml/ref/lock.sgml| 4 +- .../sgml/ref/refresh_materialized_view.sgml | 5 +- doc/src/sgml/ref/reindex.sgml | 23 ++-- doc/src/sgml/ref/revoke.sgml | 2 +- doc/src/sgml/ref/vacuum.sgml | 6 +- doc/src/sgml/user-manag.sgml | 12 ++ src/backend/catalog/aclchk.c | 15 +++ src/backend/commands/analyze.c| 13 +- src/backend/commands/cluster.c| 43 +-- src/backend/commands/indexcmds.c | 34 ++--- src/backend/commands/lockcmds.c | 2 +- src/backend/commands/matview.c| 3 +- src/backend/commands/tablecmds.c | 18 +-- src/backend/commands/vacuum.c | 76 +++- src/backend/postmaster/autovacuum.c | 1 + src/backend/utils/adt/acl.c | 8 ++ src/bin/pg_dump/dumputils.c | 1 + src/bin/pg_dump/t/002_pg_dump.pl | 2 +- src/bin/psql/tab-complete.c | 6 +- src/include/catalog/pg_authid.dat | 5 + src/include/commands/tablecmds.h | 5 +- src/include/commands/vacuum.h | 5 +- src/include/nodes/parsenodes.h| 3 +- src/include/utils/acl.h | 5 +- .../expected/cluster-conflict-partition.out | 8 +- .../specs/cluster-conflict-partition.spec | 2 +- .../perl/PostgreSQL/Test/AdjustUpgrade.pm | 11 ++ src/test/regress/expected/cluster.out | 7 ++ src/test/regress/expected/create_index.out| 4 +- src/test/regress/expected/dependency.out | 22 ++-- src/test/regress/expected/privileges.out | 116 ++ src/test/regress/expected/rowsecurity.out | 34 ++--- src/test/regress/sql/cluster.sql | 5 + src/test/regress/sql/dependency.sql | 2 +- src/test/regress/sql/privileges.sql | 67 ++ 41 files changed, 456 insertions(+), 179 deletions(-) diff --git a/doc/src/sgml/ddl.sgml b/doc/src/sgml/ddl.sgml index 9d7e2c756b..8616a8e9cc 100644 --- a/doc/src/sgml/ddl.sgml +++ b/doc/src/sgml/ddl.sgml @@ -1868,8 +1868,8 @@ ALTER TABLE products RENAME TO items; INSERT, UPDATE, DELETE, TRUNCATE, REFERENCES, TRIGGER, CREATE, CONNECT, TEMPORARY, - EXECUTE, USAGE, SET - and ALTER SYSTEM. + EXECUTE, USAGE, SET, + ALTER SYSTEM, and MAINTAIN. The privileges applicable to a particular object vary depending on the object's type (table, function, etc.). More detail about the meanings
Re: On disable_cost
On Wed, 13 Mar 2024 at 08:55, Robert Haas wrote: > But in the absence of that, we need some way to privilege the > non-disabled paths over the disabled ones -- and I'd prefer to have > something more principled than disable_cost, if we can work out the > details. The primary place I see issues with disabled_cost is caused by STD_FUZZ_FACTOR. When you add 1.0e10 to a couple of modestly costly paths, it makes them appear fuzzily the same in cases where one could be significantly cheaper than the other. If we were to bump up the disable_cost it would make this problem worse. I think we do still need some way to pick the cheapest disabled path when there are no other options. One way would be to set fuzz_factor to 1.0 when either of the paths costs in compare_path_costs_fuzzily() is >= disable_cost. clamp_row_est() does cap row estimates at MAXIMUM_ROWCOUNT (1e100), so I think there is some value of disable_cost that could almost certainly ensure we don't reach it because the path is truly expensive rather than disabled. So maybe the fix could be to set disable_cost to something like 1.0e110 and adjust compare_path_costs_fuzzily to not apply the fuzz_factor for paths >= disable_cost. However, I wonder if that risks the costs going infinite after a couple of cartesian joins. David
Re: remaining sql/json patches
About 0002: I think we should just drop it. Look at the changes it produces in the plans for aliases XMLTABLE: > @@ -1556,7 +1556,7 @@ SELECT f.* FROM xmldata, LATERAL > xmltable('/ROWS/ROW[COUNTRY_NAME="Japan" or COU > Output: f."COUNTRY_NAME", f."REGION_ID" > -> Seq Scan on public.xmldata > Output: xmldata.data > - -> Table Function Scan on "xmltable" f > + -> Table Function Scan on "XMLTABLE" f > Output: f."COUNTRY_NAME", f."REGION_ID" > Table Function Call: XMLTABLE(('/ROWS/ROW[COUNTRY_NAME="Japan" or > COUNTRY_NAME="India"]'::text) PASSING (xmldata.data) COLUMNS "COUNTRY_NAME" > text, "REGION_ID" integer) > Filter: (f."COUNTRY_NAME" = 'Japan'::text) Here in text-format EXPLAIN, we already have the alias next to the "xmltable" moniker, when an alias is present. This matches the query itself as well as the labels used in the "Output:" display. If an alias is not present, then this says just 'Table Function Scan on "xmltable"' and the rest of the plans refers to this as "xmltable", so it's also fine. > @@ -1591,7 +1591,7 @@ SELECT f.* FROM xmldata, LATERAL > xmltable('/ROWS/ROW[COUNTRY_NAME="Japan" or COU > "Parent Relationship": "Inner", > > + > "Parallel Aware": false, > > + > "Async Capable": false, > > + > - "Table Function Name": "xmltable", > > + > + "Table Function Name": "XMLTABLE", > > + > "Alias": "f", > > + > "Output": ["f.\"COUNTRY_NAME\"", "f.\"REGION_ID\""], > > + > "Table Function Call": > "XMLTABLE(('/ROWS/ROW[COUNTRY_NAME=\"Japan\" or > COUNTRY_NAME=\"India\"]'::text) PASSING (xmldata.data) COLUMNS > \"COUNTRY_NAME\" text, \"REGION_ID\" integer)",+ This is the JSON-format explain. Notice that the "Alias" member already shows the alias "f", so the only thing this change is doing is uppercasing the "xmltable" to "XMLTABLE". We're not really achieving anything here. I think the only salvageable piece from this, **if anything**, is making the "xmltable" literal string into uppercase. That might bring a little clarity to the fact that this is a keyword and not a user-introduced name. In your 0003 I think this would only have relevance in this query, +-- JSON_TABLE() with alias +EXPLAIN (COSTS OFF, VERBOSE) +SELECT * FROM + JSON_TABLE( + jsonb 'null', 'lax $[*]' PASSING 1 + 2 AS a, json '"foo"' AS "b c" + COLUMNS ( + id FOR ORDINALITY, + "int" int PATH '$', + "text" text PATH '$' + )) json_table_func; + QUERY PLAN +-- -- + Table Function Scan on "JSON_TABLE" json_table_func + Output: id, "int", text + Table Function Call: JSON_TABLE('null'::jsonb, '$[*]' AS json_table_path_0 PASSING 3 AS a, '"foo"'::jsonb AS "b c" COLUMNS (id FOR ORDINALITY, "int" integer PATH '$', text text PATH '$') PLAN (json_table_path_0)) +(3 rows) and I'm curious to see what this would output if this was to be run without the 0002 patch. If I understand things correctly, the alias would be displayed anyway, meaning 0002 doesn't get us anything. Please do add a test with EXPLAIN (FORMAT JSON) in 0003. Thanks -- Álvaro Herrera PostgreSQL Developer — https://www.EnterpriseDB.com/ "La vida es para el que se aventura"
Re: typo in paths.h
On Wed, 13 Mar 2024 at 07:58, Cary Huang wrote: > I noticed that the comment for declaring create_tidscan_paths() in > src/include/optimizer/paths.h has a typo. The function is implemented in > tidpath.c, not tidpath.h as stated, which does not exist. Thank you. Pushed. David
Re: On disable_cost
On Tue, Mar 12, 2024 at 3:36 PM Tom Lane wrote: > Yeah. I keep thinking that the right solution is to not generate > disabled paths in the first place if there are any other ways to > produce the same relation. That has obvious order-of-operations > problems though, and I've not been able to make it work. I've expressed the same view in the past. It would be nice not to waste planner effort on paths that we're just going to throw away, but I'm not entirely sure what you mean by "obvious order-of-operations problems." To me, it seems like what we'd need is to be able to restart the whole planner process if we run out of steam before we get done. For example, suppose we're planning a 2-way join where index and index-only scans are disabled, sorts are disabled, and nested loops and hash joins are disabled. There's no problem generating just the non-disabled scan types at the baserel level, but when we reach the join, we're going to find that the only non-disabled join type is a merge join, and we're also going to find that we have no paths that provide pre-sorted input, so we need to sort, which we're also not allowed to do. If we could give up at that point and restart planning, disabling all of the plan-choice constraints and now creating all paths for each RelOptInfo, then everything would, I believe, be just fine. We'd end up needing neither disable_cost nor the mechanism proposed by this patch. But in the absence of that, we need some way to privilege the non-disabled paths over the disabled ones -- and I'd prefer to have something more principled than disable_cost, if we can work out the details. -- Robert Haas EDB: http://www.enterprisedb.com
Re: On disable_cost
Robert Haas writes: > On Tue, Mar 12, 2024 at 1:32 PM Tom Lane wrote: >> BTW, having written that paragraph, I wonder if we couldn't get >> the same end result with a nearly one-line change that consists of >> making disable_cost be IEEE infinity. > I don't think so, because I think that what will happen in that case > is that we'll pick a completely random plan if we can't pick a plan > that avoids incurring disable_cost. Every plan that contains one > disabled node anywhere in the plan tree will look like it has exactly > the same cost as any other such plan. Good point. > IMHO, this is actually one of the problems with disable_cost as it > works today. Yeah. I keep thinking that the right solution is to not generate disabled paths in the first place if there are any other ways to produce the same relation. That has obvious order-of-operations problems though, and I've not been able to make it work. regards, tom lane
typo in paths.h
Hello I noticed that the comment for declaring create_tidscan_paths() in src/include/optimizer/paths.h has a typo. The function is implemented in tidpath.c, not tidpath.h as stated, which does not exist. Made a small patch to correct it. Thank you Cary Huang - HighGo Software Inc. (Canada) mailto:cary.hu...@highgo.ca http://www.highgo.ca correct_source_filename.patch Description: Binary data
Broken EXPLAIN output for SubPlan in MERGE
While playing around with EXPLAIN and SubPlans, I noticed that there's a bug in how this is handled for MERGE. For example: drop table if exists src, tgt, ref; create table src (a int, b text); create table tgt (a int, b text); create table ref (a int); explain (verbose, costs off) merge into tgt t using (select (select r.a from ref r where r.a = s.a) a, b from src s) s on t.a = s.a when not matched then insert values (s.a, s.b); QUERY PLAN --- Merge on public.tgt t -> Merge Left Join Output: t.ctid, s.a, s.b, s.ctid Merge Cond: (((SubPlan 1)) = t.a) -> Sort Output: s.a, s.b, s.ctid, ((SubPlan 1)) Sort Key: ((SubPlan 1)) -> Seq Scan on public.src s Output: s.a, s.b, s.ctid, (SubPlan 1) SubPlan 1 -> Seq Scan on public.ref r Output: r.a Filter: (r.a = s.a) -> Sort Output: t.ctid, t.a Sort Key: t.a -> Seq Scan on public.tgt t Output: t.ctid, t.a SubPlan 2 -> Seq Scan on public.ref r_1 Output: r_1.a Filter: (r_1.a = t.ctid) The final filter condition "(r_1.a = t.ctid)" is incorrect, and should be "(r_1.a = s.a)". What's happening is that the right hand side of that filter expression is an input Param node which get_parameter() tries to display by calling find_param_referent() and then drilling down through the ancestor node (the ModifyTable node) to try to find the real name of the variable (s.a). However, that isn't working properly for MERGE because the inner_plan and inner_tlist of the corresponding deparse_namespace aren't set correctly. Actually the inner_tlist is correct, but the inner_plan is set to the ModifyTable node, whereas it needs to be the outer child node -- in a MERGE, any references to the source relation will be INNER_VAR references to the targetlist of the join node immediately under the ModifyTable node. So I think we want to do something like the attached. Regards, Dean diff --git a/src/backend/utils/adt/ruleutils.c b/src/backend/utils/adt/ruleutils.c new file mode 100644 index 2a1ee69..2231752 --- a/src/backend/utils/adt/ruleutils.c +++ b/src/backend/utils/adt/ruleutils.c @@ -4988,8 +4988,11 @@ set_deparse_plan(deparse_namespace *dpns * For a WorkTableScan, locate the parent RecursiveUnion plan node and use * that as INNER referent. * - * For MERGE, make the inner tlist point to the merge source tlist, which - * is same as the targetlist that the ModifyTable's source plan provides. + * For MERGE, pretend the ModifyTable's source plan (its outer plan) is + * INNER referent. This is the join from the target relation to the data + * source, and all INNER_VAR Vars in other parts of the query refer to its + * targetlist. + * * For ON CONFLICT .. UPDATE we just need the inner tlist to point to the * excluded expression's tlist. (Similar to the SubqueryScan we don't want * to reuse OUTER, it's used for RETURNING in some modify table cases, @@ -5004,17 +5007,17 @@ set_deparse_plan(deparse_namespace *dpns dpns->inner_plan = find_recursive_union(dpns, (WorkTableScan *) plan); else if (IsA(plan, ModifyTable)) - dpns->inner_plan = plan; - else - dpns->inner_plan = innerPlan(plan); - - if (IsA(plan, ModifyTable)) { if (((ModifyTable *) plan)->operation == CMD_MERGE) - dpns->inner_tlist = dpns->outer_tlist; + dpns->inner_plan = outerPlan(plan); else - dpns->inner_tlist = ((ModifyTable *) plan)->exclRelTlist; + dpns->inner_plan = plan; } + else + dpns->inner_plan = innerPlan(plan); + + if (IsA(plan, ModifyTable) && ((ModifyTable *) plan)->operation == CMD_INSERT) + dpns->inner_tlist = ((ModifyTable *) plan)->exclRelTlist; else if (dpns->inner_plan) dpns->inner_tlist = dpns->inner_plan->targetlist; else diff --git a/src/test/regress/expected/merge.out b/src/test/regress/expected/merge.out new file mode 100644 index e3ebf46..1a6f6ad --- a/src/test/regress/expected/merge.out +++ b/src/test/regress/expected/merge.out @@ -1473,6 +1473,56 @@ WHEN MATCHED AND t.a < 10 THEN DROP TABLE ex_msource, ex_mtarget; DROP FUNCTION explain_merge(text); +-- EXPLAIN SubPlans and InitPlans +CREATE TABLE src (a int, b int, c int, d int); +CREATE TABLE tgt (a int, b int, c int, d int); +CREATE TABLE ref (ab int, cd int); +EXPLAIN (verbose, costs off) +MERGE INTO tgt t +USING (SELECT *, (SELECT count(*) FROM ref r + WHERE r.ab = s.a + s.b + AND r.cd = s.c - s.d) cnt + FROM src s) s +ON t.a = s.a AND t.b < s.cnt +WHEN MATCHED AND t.c > s.cnt THEN + UPDATE SET (b, c) = (SELECT s.b, s.cnt); + QUERY PLAN +-
Support json_errdetail in FRONTEND builds
Hello, Both the incremental JSON [1] and OAuth [2] patchsets would be improved by json_errdetail(), which was removed from FRONTEND builds in b44669b2ca: >The routine providing a detailed error message based on the error code >is made backend-only, the existing code being unsafe to use in the >frontend as the error message may finish by being palloc'd or point to a >static string, so there is no way to know if the memory of the message >should be pfree'd or not. Attached is a patch to undo this, by attaching any necessary allocations to the JsonLexContext so the caller doesn't have to keep track of them. This is based on the first patch of the OAuth patchset, which additionally needs json_errdetail() to be safe to use from libpq itself. Alvaro pointed out offlist that we don't have to go that far to re-enable this function for the utilities, so this patch is a sort of middle ground between what we have now and what OAuth implements. (There is some additional minimization that could be done to this patch, but I'm hoping to keep the code structure consistent between the two, if the result is acceptable.) Two notes that I wanted to point out explicitly: - On the other thread, Daniel contributed a destroyStringInfo() counterpart for makeStringInfo(), which is optional but seemed useful to include here. - After this patch, if a frontend client calls json_errdetail() without calling freeJsonLexContext(), it will leak memory. Thanks, --Jacob [1] https://www.postgresql.org/message-id/682c8fff-355c-a04f-57ac-81055c4ccda8%40dunslane.net [2] https://www.postgresql.org/message-id/CAOYmi%2BkKNZCL7uz-LHyBggM%2BfEcf4285pFWwm7spkUb8irY7mQ%40mail.gmail.com 0001-common-jsonapi-support-json_errdetail-in-FRONTEND.patch Description: Binary data
Re: [EXTERNAL] Re: Add non-blocking version of PQcancel
On 2024-Mar-12, Alvaro Herrera wrote: > Hmm, buildfarm member kestrel (which uses > -fsanitize=undefined,alignment) failed: > > # Running: libpq_pipeline -r 700 cancel port=49975 host=/tmp/dFh46H7YGc > dbname='postgres' > test cancellations... > libpq_pipeline:260: query did not fail when it was expected Hm, I tried using the same compile flags, couldn't reproduce. -- Álvaro Herrera 48°01'N 7°57'E — https://www.EnterpriseDB.com/ "Pido que me den el Nobel por razones humanitarias" (Nicanor Parra)
Re: On disable_cost
On Tue, Mar 12, 2024 at 1:32 PM Tom Lane wrote: > BTW, having written that paragraph, I wonder if we couldn't get > the same end result with a nearly one-line change that consists of > making disable_cost be IEEE infinity. Years ago we didn't want > to rely on IEEE float semantics in this area, but nowadays I don't > see why we shouldn't. I don't think so, because I think that what will happen in that case is that we'll pick a completely random plan if we can't pick a plan that avoids incurring disable_cost. Every plan that contains one disabled node anywhere in the plan tree will look like it has exactly the same cost as any other such plan. IMHO, this is actually one of the problems with disable_cost as it works today. I think the semantics that we want are: if it's possible to pick a plan where nothing is disabled, then pick the cheapest such plan; if not, pick the cheapest plan overall. But treating disable_cost doesn't really do that. It does the first part -- picking the cheapest plan where nothing is disabled -- but it doesn't do the second part, because once you add disable_cost into the cost of some particular plan node, it screws up the rest of the planning, because the cost estimates for the disabled nodes have no bearing in reality. Fast-start plans, for example, will look insanely good compared to what would be the case in normal planning (and we lean too much toward fast-start plans even normally). (I don't think we should care how MANY disabled nodes appear in a plan, particularly. This is a more arguable point. Is a plan with 1 disabled node and 10% more cost better or worse than a plan with 2 disabled nodes and 10% less cost? I'd argue that counting the number of disabled nodes isn't particularly meaningful.) -- Robert Haas EDB: http://www.enterprisedb.com
Re: [EXTERNAL] Re: Add non-blocking version of PQcancel
Hmm, buildfarm member kestrel (which uses -fsanitize=undefined,alignment) failed: # Running: libpq_pipeline -r 700 cancel port=49975 host=/tmp/dFh46H7YGc dbname='postgres' test cancellations... libpq_pipeline:260: query did not fail when it was expected https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=kestrel&dt=2024-03-12%2016%3A41%3A27 -- Álvaro Herrera PostgreSQL Developer — https://www.EnterpriseDB.com/ "The saddest aspect of life right now is that science gathers knowledge faster than society gathers wisdom." (Isaac Asimov)
Re: perl: unsafe empty pattern behavior
On 2024-Mar-12, Jeff Davis wrote: > Do not use perl empty patterns like // or qr// or s//.../, the behavior > is too surprising for perl non-experts. Yeah, nasty. > There are a few such uses in > our tests; patch attached. Unfortunately, there is no obvious way to > automatically detect them so I am just relying on grep. I'm sure there > are others here who know more about perl than I do, so > suggestions/corrections are welcome. I suggest that pg_dump/t/002_pg_dump.pl could use a verification that the ->{regexp} thing is not empty. I also tried grepping (for things like qr{}, qr[], qr||, qr!!) and didn't find anything beyond what you have ... but I only looked for the "qr" literal, not other ways to get regexes. -- Álvaro Herrera PostgreSQL Developer — https://www.EnterpriseDB.com/ Bob [Floyd] used to say that he was planning to get a Ph.D. by the "green stamp method," namely by saving envelopes addressed to him as 'Dr. Floyd'. After collecting 500 such letters, he mused, a university somewhere in Arizona would probably grant him a degree. (Don Knuth)
Re: On disable_cost
Robert Haas writes: > On Thu, Aug 3, 2023 at 5:22 AM Jian Guo wrote: >> I have write an initial patch to retire the disable_cost GUC, which labeled >> a flag on the Path struct instead of adding up a big cost which is hard to >> estimate. Though it involved in tons of plan changes in regression tests, I >> have tested on some simple test cases such as eagerly generate a two-stage >> agg plans and it worked. Could someone help to review? > I took a look at this patch today. I believe that overall this may > well be an approach worth pursuing. However, more work is going to be > needed. Here are some comments: > 1. You stated that it changes lots of plans in the regression tests, > but you haven't provided any sort of analysis of why those plans > changed. I'm kind of surprised that there would be "tons" of plan > changes. You (or someone) should look into why that's happening. I've not read the patch, but given this description I would expect there to be *zero* regression changes --- I don't think we have any test cases that depend on disable_cost being finite. If there's more than zero changes, that very likely indicates a bug in the patch. Even if there are places where the output legitimately changes, you need to justify each one and make sure that you didn't invalidate the intent of that test case. BTW, having written that paragraph, I wonder if we couldn't get the same end result with a nearly one-line change that consists of making disable_cost be IEEE infinity. Years ago we didn't want to rely on IEEE float semantics in this area, but nowadays I don't see why we shouldn't. regards, tom lane
Re: UUID v7
On Tue, 12 Mar 2024 at 18:18, Sergey Prokhorenko wrote: > Andrey and you simply did not read the RFC a little further down in the text: You're totally right, sorry about that. Maybe it would be good to move those subsections around a bit in the RFC though, so that anything related to only one method is included in the section for that method.
perl: unsafe empty pattern behavior
Moved from discussion on -committers: https://postgr.es/m/0ef325fa06e7a1605c4e119c4ecb637c67e5fb4e.ca...@j-davis.com Summary: Do not use perl empty patterns like // or qr// or s//.../, the behavior is too surprising for perl non-experts. There are a few such uses in our tests; patch attached. Unfortunately, there is no obvious way to automatically detect them so I am just relying on grep. I'm sure there are others here who know more about perl than I do, so suggestions/corrections are welcome. Long version: Some may know this already, but we just discovered the dangers of using empty patterns in perl: "If the PATTERN evaluates to the empty string, the last successfully matched regular expression is used instead... If no match has previously succeeded, this will (silently) act instead as a genuine empty pattern (which will always match)." https://perldoc.perl.org/perlop#The-empty-pattern-// In other words, if you have code like: if ('xyz' =~ //) { print "'xyz' matches //\n"; } The match will succeed and print, because there's no previous pattern, so // is a "genuine" empty pattern, which is treated like /.*/ (I think?). Then, if you add some other code before it: if ('abc' =~ /abc/) { print "'abc' matches /abc/\n"; } if ('xyz' =~ //) { print "'xyz' matches //\n"; } The first match will succeed, but the second match will fail, because // is treated like /abc/. On reflection, that does seem very perl-like. But it can cause surprising action-at-a-distance if not used carefully, especially for those who aren't experts in perl. It's much safer to just not use the empty pattern. If you use qr// instead: https://perldoc.perl.org/perlop#qr/STRING/msixpodualn like: if ('abc' =~ qr/abc/) { print "'abc' matches /abc/\n"; } if ('xyz' =~ qr//) { print "'xyz' matches //\n"; } Then the second match may succeed or may fail, and it's not clear from the documentation what precise circumstances matter. It seems to fail on older versions of perl (like 5.16.3) and succeed on newer versions (5.38.2). However, it may also depend on when the qr// is [re]compiled, or regex flags, or locale, or may just be undefined. Regards, Jeff Davis From be5aa677e37180a8c1b0faebcceab5506b1c8130 Mon Sep 17 00:00:00 2001 From: Jeff Davis Date: Mon, 11 Mar 2024 16:44:56 -0700 Subject: [PATCH v1] perl: avoid empty regex patterns Empty patterns have special behavior that uses the last successful pattern match. This behavior can be surprising, so remove empty patterns and instead match against exactly what is intended (e.g. /^$/ or /.*/). Unfortunately there's not an easy way to check for this in an automated way, so it's likely that some cases have been missed and will be missed in the future. This commit just cleans up known instances. Discussion: https://postgr.es/m/1548559.1710188...@sss.pgh.pa.us --- src/bin/pg_upgrade/t/003_logical_slots.pl | 4 ++-- src/bin/pg_upgrade/t/004_subscription.pl| 2 +- src/bin/psql/t/001_basic.pl | 8 src/bin/psql/t/010_tab_completion.pl| 2 +- src/test/recovery/t/037_invalid_database.pl | 2 +- 5 files changed, 9 insertions(+), 9 deletions(-) diff --git a/src/bin/pg_upgrade/t/003_logical_slots.pl b/src/bin/pg_upgrade/t/003_logical_slots.pl index 83d71c3084..256dfd53b1 100644 --- a/src/bin/pg_upgrade/t/003_logical_slots.pl +++ b/src/bin/pg_upgrade/t/003_logical_slots.pl @@ -78,7 +78,7 @@ command_checks_all( [ qr/max_replication_slots \(1\) must be greater than or equal to the number of logical replication slots \(2\) on the old cluster/ ], - [qr//], + [qr/^$/], 'run of pg_upgrade where the new cluster has insufficient max_replication_slots' ); ok( -d $newpub->data_dir . "/pg_upgrade_output.d", @@ -118,7 +118,7 @@ command_checks_all( [ qr/Your installation contains logical replication slots that can't be upgraded./ ], - [qr//], + [qr/^$/], 'run of pg_upgrade of old cluster with slots having unconsumed WAL records' ); diff --git a/src/bin/pg_upgrade/t/004_subscription.pl b/src/bin/pg_upgrade/t/004_subscription.pl index df5d6dffbc..c8ee2390d1 100644 --- a/src/bin/pg_upgrade/t/004_subscription.pl +++ b/src/bin/pg_upgrade/t/004_subscription.pl @@ -68,7 +68,7 @@ command_checks_all( [ qr/max_replication_slots \(0\) must be greater than or equal to the number of subscriptions \(1\) on the old cluster/ ], - [qr//], + [qr/^$/], 'run of pg_upgrade where the new cluster has insufficient max_replication_slots' ); diff --git a/src/bin/psql/t/001_basic.pl b/src/bin/psql/t/001_basic.pl index 9f0b6cf8ca..ce875ce316 100644 --- a/src/bin/psql/t/001_basic.pl +++ b/src/bin/psql/t/001_basic.pl @@ -412,23 +412,23 @@ my $perlbin = $^X; $perlbin =~ s!\\!/!g if $PostgreSQL::Test::Utils::windows_os; my $pipe_cmd = "$perlbin -pe '' >$g_file"; -psql_like($node, "SELECT 'one' \\g | $pipe_cmd", qr//, "one command \\g"); +psql_like($node, "SELECT 'one' \\g
Re: Introduce XID age and inactive timeout based replication slot invalidation
On Mon, Mar 11, 2024 at 4:09 PM Amit Kapila wrote: > > I don't see how it will be easier for the user to choose the default > value of 'max_slot_xid_age' compared to 'max_slot_wal_keep_size'. But, > I agree similar to 'max_slot_wal_keep_size', 'max_slot_xid_age' can be > another parameter to allow vacuum to proceed removing the rows which > otherwise it wouldn't have been as those would be required by some > slot. Now, if this understanding is correct, we should probably make > this invalidation happen by (auto)vacuum after computing the age based > on this new parameter. Currently, the patch computes the XID age in the checkpointer using the next XID (gets from ReadNextFullTransactionId()) and slot's xmin and catalog_xmin. I think the checkpointer is best because it already invalidates slots for wal_removed cause, and flushes all replication slots to disk. Moving this new invalidation functionality into some other background process such as autovacuum will not only burden that process' work but also mix up the unique functionality of that background process. Having said above, I'm open to ideas from others as I'm not so sure if there's any issue with checkpointer invalidating the slots for new reasons. -- Bharath Rupireddy PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
Re: UUID v7
Hi Jelte, I am one of the contributors to this RFC. Andrey's patch corresponds exactly to Fixed-Length Dedicated Counter Bits (Method 1). Andrey and you simply did not read the RFC a little further down in the text: __ The following sub-topics cover topics related solely with creating reliable fixed-length dedicated counters: - Fixed-Length Dedicated Counter Seeding: - Implementations utilizing the fixed-length counter method randomly initialize the counter with each new timestamp tick. However, when the timestamp has not increased, the counter is instead incremented by the desired increment logic. When utilizing a randomly seeded counter alongside Method 1, the random value MAY be regenerated with each counter increment without impacting sortability. The downside is that Method 1 is prone to overflows if a counter of adequate length is not selected or the random data generated leaves little room for the required number of increments. Implementations utilizing fixed-length counter method MAY also choose to randomly initialize a portion of the counter rather than the entire counter. For example, a 24 bit counter could have the 23 bits in least-significant, right-most, position randomly initialized. The remaining most significant, left-most counter bit is initialized as zero for the sole purpose of guarding against counter rollovers. - - Fixed-Length Dedicated Counter Length: - Select a counter bit-length that can properly handle the level of timestamp precision in use. For example, millisecond precision generally requires a larger counter than a timestamp with nanosecond precision. General guidance is that the counter SHOULD be at least 12 bits but no longer than 42 bits. Care must be taken to ensure that the counter length selected leaves room for sufficient entropy in the random portion of the UUID after the counter. This entropy helps improve the unguessability characteristics of UUIDs created within the batch. - The following sub-topics cover rollover handling with either type of counter method: - ... - - Counter Rollover Handling: - Counter rollovers MUST be handled by the application to avoid sorting issues. The general guidance is that applications that care about absolute monotonicity and sortability should freeze the counter and wait for the timestamp to advance which ensures monotonicity is not broken. Alternatively, implementations MAY increment the timestamp ahead of the actual time and reinitialize the counter. Sergey prokhorenkosergeyprokhore...@yahoo.com.au On Tuesday, 12 March 2024 at 06:36:13 pm GMT+3, Jelte Fennema-Nio wrote: On Mon, 11 Mar 2024 at 19:27, Andrey M. Borodin wrote: > Sorry for this long and vague explanation, if it still seems too uncertain we > can have a chat or something like that. I don't think this number picking > stuff deserve to be commented, because it still is quite close to random. RFC > gives us too much freedom of choice. I thought your explanation was quite clear and I agree that this approach makes the most sense. I sent an email to the RFC authors to ask for their feedback with you (Andrey) in the CC, because even though it makes the most sense it does not comply with the either of method 1 or 2 as described in the RFC.
Re: [EXTERNAL] Re: Add non-blocking version of PQcancel
On 2024-Mar-12, Jelte Fennema-Nio wrote: > On Tue, 12 Mar 2024 at 10:19, Alvaro Herrera wrote: > > Here's a last one for the cfbot. > > Thanks for committing the first 3 patches btw. Attached a tiny change > to 0001, which adds "(backing struct for PGcancelConn)" to the comment > on pg_cancel_conn. Thanks, I included it. I hope there were no other changes, because I didn't verify :-) but if there were, please let me know to incorporate them. I made a number of other small changes, mostly to the documentation, nothing fundamental. (Someday we should stop using to document the libpq functions and use refentry's instead ... it'd be useful to have manpages for these functions.) One thing I don't like very much is release_conn_addrinfo(), which is called conditionally in two places but unconditionally in other places. Maybe it'd make more sense to put this conditionality inside the function itself, possibly with a "bool force" flag to suppress that in the cases where it is not desired. In pqConnectDBComplete, we cast the PGconn * to PGcancelConn * in order to call PQcancelPoll, which is a bit abusive, but I don't know how to do better. Maybe we just accept this ... but if PQcancelStart is the only way to have pqConnectDBStart called from a cancel connection, maybe it'd be saner to duplicate pqConnectDBStart for cancel conns. Thanks! -- Álvaro Herrera 48°01'N 7°57'E — https://www.EnterpriseDB.com/
Re: Introduce XID age and inactive timeout based replication slot invalidation
On Mon, Mar 11, 2024 at 3:44 PM Amit Kapila wrote: > > Yes, your understanding is correct. I wanted us to consider having new > parameters like 'inactive_replication_slot_timeout' to be at > slot-level instead of GUC. I think this new parameter doesn't seem to > be the similar as 'max_slot_wal_keep_size' which leads to truncation > of WAL at global and then invalidates the appropriate slots. OTOH, the > 'inactive_replication_slot_timeout' doesn't appear to have a similar > global effect. last_inactive_at is tracked for each slot using which slots get invalidated based on inactive_replication_slot_timeout. It's like max_slot_wal_keep_size invalidating slots based on restart_lsn. In a way, both are similar, right? > The other thing we should consider is what if the > checkpoint happens at a timeout greater than > 'inactive_replication_slot_timeout'? In such a case, the slots get invalidated upon the next checkpoint as the (current_checkpointer_timeout - last_inactive_at) will then be greater than inactive_replication_slot_timeout. > Shall, we consider doing it via > some other background process or do we think checkpointer is the best > we can have? The same problem exists if we do it with some other background process. I think the checkpointer is best because it already invalidates slots for wal_removed cause, and flushes all replication slots to disk. Moving this new invalidation functionality into some other background process such as autovacuum will not only burden that process' work but also mix up the unique functionality of that background process. Having said above, I'm open to ideas from others as I'm not so sure if there's any issue with checkpointer invalidating the slots for new reasons. -- Bharath Rupireddy PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
Re: Statistics Import and Export
> > No, we should be keeping the lock until the end of the transaction > (which in this case would be just the one statement, but it would be the > whole statement and all of the calls in it). See analyze.c:268 or > so, where we call relation_close(onerel, NoLock); meaning we're closing > the relation but we're *not* releasing the lock on it- it'll get > released at the end of the transaction. > > If that's the case, then changing the two table_close() statements to NoLock should resolve any remaining concern.
Re: type cache cleanup improvements
Got it. Here is an updated patch where I added a corresponding comment. Thank you! Playing around I found one more place which could easily modified with hash_seq_init_with_hash_value() call. -- Teodor Sigaev E-mail: teo...@sigaev.ru WWW: http://www.sigaev.ru/diff --git a/src/backend/utils/cache/attoptcache.c b/src/backend/utils/cache/attoptcache.c index af978ccd4b1..28980620662 100644 --- a/src/backend/utils/cache/attoptcache.c +++ b/src/backend/utils/cache/attoptcache.c @@ -44,12 +44,10 @@ typedef struct /* * InvalidateAttoptCacheCallback - * Flush all cache entries when pg_attribute is updated. + * Flush cache entry (or entries) when pg_attribute is updated. * * When pg_attribute is updated, we must flush the cache entry at least - * for that attribute. Currently, we just flush them all. Since attribute - * options are not currently used in performance-critical paths (such as - * query execution), this seems OK. + * for that attribute. */ static void InvalidateAttoptCacheCallback(Datum arg, int cacheid, uint32 hashvalue) @@ -57,7 +55,16 @@ InvalidateAttoptCacheCallback(Datum arg, int cacheid, uint32 hashvalue) HASH_SEQ_STATUS status; AttoptCacheEntry *attopt; - hash_seq_init(&status, AttoptCacheHash); + /* + * By convection, zero hash value is passed to the callback as a sign + * that it's time to invalidate the cache. See sinval.c, inval.c and + * InvalidateSystemCachesExtended(). + */ + if (hashvalue == 0) + hash_seq_init(&status, AttoptCacheHash); + else + hash_seq_init_with_hash_value(&status, AttoptCacheHash, hashvalue); + while ((attopt = (AttoptCacheEntry *) hash_seq_search(&status)) != NULL) { if (attopt->opts) @@ -70,6 +77,17 @@ InvalidateAttoptCacheCallback(Datum arg, int cacheid, uint32 hashvalue) } } +/* + * Hash function compatible with two-arg system cache hash function. + */ +static uint32 +relatt_cache_syshash(const void *key, Size keysize) +{ + const AttoptCacheKey* ckey = key; + + return GetSysCacheHashValue2(ATTNUM, ckey->attrelid, ckey->attnum); +} + /* * InitializeAttoptCache * Initialize the attribute options cache. @@ -82,9 +100,17 @@ InitializeAttoptCache(void) /* Initialize the hash table. */ ctl.keysize = sizeof(AttoptCacheKey); ctl.entrysize = sizeof(AttoptCacheEntry); + + /* + * AttoptCacheEntry takes hash value from the system cache. For + * AttoptCacheHash we use the same hash in order to speedup search by hash + * value. This is used by hash_seq_init_with_hash_value(). + */ + ctl.hash = relatt_cache_syshash; + AttoptCacheHash = hash_create("Attopt cache", 256, &ctl, - HASH_ELEM | HASH_BLOBS); + HASH_ELEM | HASH_FUNCTION); /* Make sure we've initialized CacheMemoryContext. */ if (!CacheMemoryContext)
Re: CI speed improvements for FreeBSD
Hi! I looked at the changes and I liked them. Here are my thoughts: 0001: 1. I think, this is a good idea to use RAM. Since, it's still a UFS, and we lose nothing in terms of testing, but win in speed significantly. 2. Change from "swapoff -a || true" to "swapoff -a" is legit in my view, since it's better to explicitly fail than silent any possible problem. 3. Man says that lowercase suffixes should be used for the mdconfig. But in fact, you can use either lowercase or an appercase. Yep, it's in the mdconfig.c: "else if (*p == 'g' || *p == 'G')". 0002: 1. The resource usage should be a bit higher, this is for sure. But, if I'm not missing something, not drastically. Anyway, I do not know how to measure this increase to get concrete values. 2. And think of a potential benefits of increasing the number of test jobs: more concurrent processes, more interactions, better test coverage. Here are my runs: FreeBSD @master https://cirrus-ci.com/task/4934701194936320 Run test_world 05:56 FreeBSD @master + 0001 https://cirrus-ci.com/task/5921385306914816 Run test_world 05:06 FreeBSD @master + 0001, + 0002 https://cirrus-ci.com/task/5635288945393664 Run test_world 02:20 For comparison Debian @master https://cirrus-ci.com/task/5143705577848832 Run test_world 02:38 In the overall, I consider this changes useful. CI run faster, with better test coverage in exchange for presumably slight increase in resource usage, but I don't think this increase should be significant. -- Best regards, Maxim Orlov.
Re: Introduce XID age and inactive timeout based replication slot invalidation
On Tue, Mar 12, 2024 at 9:11 PM Bertrand Drouvot wrote: > > > AFAIR, we don't prevent similar invalidations due to > > 'max_slot_wal_keep_size' for sync slots, > > Right, we'd invalidate them on the standby should the standby sync slot > restart_lsn > exceeds the limit. Right. Help me understand this a bit - is the wal_removed invalidation going to conflict with recovery on the standby? Per the discussion upthread, I'm trying to understand what invalidation reasons will exactly cause conflict with recovery? Is it just rows_removed and wal_level_insufficient invalidations? My understanding on the conflict with recovery and invalidation reason has been a bit off track. Perhaps, we need to clarify these two things in the docs for the end users as well? -- Bharath Rupireddy PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
Re: [EXTERNAL] Re: Add non-blocking version of PQcancel
On Tue, 12 Mar 2024 at 15:04, Jelte Fennema-Nio wrote: > Attached a tiny change to 0001 One more tiny comment change, stating that pg_cancel is used by the deprecated PQcancel function. v36-0001-libpq-Add-encrypted-and-non-blocking-query-cance.patch Description: Binary data v36-0002-Start-using-new-libpq-cancel-APIs.patch Description: Binary data
Re: Introduce XID age and inactive timeout based replication slot invalidation
On Tue, Mar 12, 2024 at 5:51 PM Amit Kapila wrote: > > > Would that make sense to "simply" discard/prevent those kind of > > invalidations > > for "synced" slot on standby? I mean, do they make sense given the fact that > > those slots are not usable until the standby is promoted? > > AFAIR, we don't prevent similar invalidations due to > 'max_slot_wal_keep_size' for sync slots, so why to prevent it for > these new parameters? This will unnecessarily create inconsistency in > the invalidation behavior. Right. +1 to keep the behaviour consistent for all invalidations. However, an assertion that inactive_timeout isn't set for synced slots on the standby isn't a bad idea because we rely on the fact that walsenders aren't started for synced slots. Again, I think it misses the consistency in the invalidation behaviour. -- Bharath Rupireddy PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
Re: UUID v7
On Tue, 12 Mar 2024 at 07:32, Michael Paquier wrote: > Sure, there is no problem in discussing a patch to implement a > behavior. But I disagree about taking a risk in merging something > that could become non-compliant with the approved RFC, if the draft is > approved at the end, of course. This just strikes me as a bad idea. I agree that we shouldn't release UUIDv7 support if the RFC describing that is not yet approved. But I do think it would be a shame if e.g. the RFC got approved 2 weeks after Postgres its feature freeze. Which would then mean we'd have to wait another 1.5 years before actually using uuidv7. Would it be a reasonable compromise to still merge the patch for PG17 (assuming the code is good to merge with regards to the current draft RFC), but revert the commit if the RFC is not approved before some deadline before the release date (e.g. before the first release candidate)?
Re: Introduce XID age and inactive timeout based replication slot invalidation
Hi, On Tue, Mar 12, 2024 at 05:51:43PM +0530, Amit Kapila wrote: > On Tue, Mar 12, 2024 at 1:24 PM Bertrand Drouvot > wrote: > > > > On Fri, Mar 08, 2024 at 10:42:20PM +0530, Bharath Rupireddy wrote: > > > On Wed, Mar 6, 2024 at 4:49 PM Amit Kapila > > > wrote: > > > > > > > > You might want to consider its interaction with sync slots on standby. > > > > Say, there is no activity on slots in terms of processing the changes > > > > for slots. Now, we won't perform sync of such slots on standby showing > > > > them inactive as per your new criteria where as same slots could still > > > > be valid on primary as the walsender is still active. This may be more > > > > of a theoretical point as in running system there will probably be > > > > some activity but I think this needs some thougths. > > > > > > I believe the xmin and catalog_xmin of the sync slots on the standby > > > keep advancing depending on the slots on the primary, no? If yes, the > > > XID age based invalidation shouldn't be a problem. > > > > > > I believe there are no walsenders started for the sync slots on the > > > standbys, right? If yes, the inactive timeout based invalidation also > > > shouldn't be a problem. Because, the inactive timeouts for a slot are > > > tracked only for walsenders because they are the ones that typically > > > hold replication slots for longer durations and for real replication > > > use. We did a similar thing in a recent commit [1]. > > > > > > Is my understanding right? Do you still see any problems with it? > > > > Would that make sense to "simply" discard/prevent those kind of > > invalidations > > for "synced" slot on standby? I mean, do they make sense given the fact that > > those slots are not usable until the standby is promoted? > > > > AFAIR, we don't prevent similar invalidations due to > 'max_slot_wal_keep_size' for sync slots, Right, we'd invalidate them on the standby should the standby sync slot restart_lsn exceeds the limit. > so why to prevent it for > these new parameters? This will unnecessarily create inconsistency in > the invalidation behavior. Yeah, but I think wal removal has a direct impact on the slot usuability which is probably not the case with the new XID and Timeout ones. That's why I thought about handling them differently (but I'm also fine if that's not the case). Regards, -- Bertrand Drouvot PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
Re: UUID v7
On Mon, 11 Mar 2024 at 19:27, Andrey M. Borodin wrote: > Sorry for this long and vague explanation, if it still seems too uncertain we > can have a chat or something like that. I don't think this number picking > stuff deserve to be commented, because it still is quite close to random. RFC > gives us too much freedom of choice. I thought your explanation was quite clear and I agree that this approach makes the most sense. I sent an email to the RFC authors to ask for their feedback with you (Andrey) in the CC, because even though it makes the most sense it does not comply with the either of method 1 or 2 as described in the RFC.
Re: Introduce XID age and inactive timeout based replication slot invalidation
On Mon, Mar 11, 2024 at 11:26 AM Amit Kapila wrote: > > > Hm. I get the concern. Are you okay with having inavlidation_reason > > separately for both logical and physical slots? In such a case, > > logical slots that got invalidated on the standby will have duplicate > > info in conflict_reason and invalidation_reason, is this fine? > > > > If we have duplicate information in two columns that could be > confusing for users. BTW, isn't the recovery conflict occur only > because of rows_removed and wal_level_insufficient reasons? The > wal_removed or the new reasons you are proposing can't happen because > of recovery conflict. Am, I missing something here? My understanding aligns with yours that the rows_removed and wal_level_insufficient invalidations can occur only upon recovery conflict. FWIW, a test named 'synchronized slot has been invalidated' in 040_standby_failover_slots_sync.pl inappropriately uses conflict_reason = 'wal_removed' logical slot on standby. As per the above understanding, it's inappropriate to use conflict_reason here because wal_removed invalidation doesn't conflict with recovery. > > Another idea is to make 'conflict_reason text' as a 'conflicting > > boolean' again (revert 007693f2a3), and have 'invalidation_reason > > text' for both logical and physical slots. So, whenever 'conflicting' > > is true, one can look at invalidation_reason for the reason for > > conflict. How does this sound? > > > > So, does this mean that conflicting will only be true for some of the > reasons (say wal_level_insufficient, rows_removed, wal_removed) and > logical slots but not for others? I think that will also not eliminate > the duplicate information as user could have deduced that from single > column. So, how about we turn conflict_reason to only report the reasons that actually cause conflict with recovery for logical slots, something like below, and then have invalidation_cause as a generic column for all sorts of invalidation reasons for both logical and physical slots? ReplicationSlotInvalidationCause cause = slot_contents.data.invalidated; if (slot_contents.data.database == InvalidOid || cause == RS_INVAL_NONE || cause != RS_INVAL_HORIZON || cause != RS_INVAL_WAL_LEVEL) { nulls[i++] = true; } else { Assert(cause == RS_INVAL_HORIZON || cause == RS_INVAL_WAL_LEVEL); values[i++] = CStringGetTextDatum(SlotInvalidationCauses[cause]); } -- Bharath Rupireddy PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
Re: POC, WIP: OR-clause support for indexes
On Mon, Mar 11, 2024 at 2:43 PM Andrei Lepikhov wrote: > On 11/3/2024 18:31, Alexander Korotkov wrote: > >> I'm not convinced about this limit. The initial reason was to combine > >> long lists of ORs into the array because such a transformation made at > >> an early stage increases efficiency. > >> I understand the necessity of this limit in the array decomposition > >> routine but not in the creation one. > > > > The comment near MAX_SAOP_ARRAY_SIZE says that this limit is because > > N^2 algorithms could be applied to arrays. Are you sure that's not > > true for our case? > When you operate an array, indeed. But when we transform ORs to an > array, not. Just check all the places in the optimizer and even the > executor where we would pass along the list of ORs. This is why I think > we should use this optimization even more intensively for huge numbers > of ORs in an attempt to speed up the overall query. Ok. > >>> 3) Better save the original order of clauses by putting hash entries and > >>> untransformable clauses to the same list. A lot of differences in > >>> regression tests output have gone. > >> I agree that reducing the number of changes in regression tests looks > >> better. But to achieve this, you introduced a hack that increases the > >> complexity of the code. Is it worth it? Maybe it would be better to make > >> one-time changes in tests instead of getting this burden on board. Or > >> have you meant something more introducing the node type? > > > > For me the reason is not just a regression test. The current code > > keeps the original order of quals as much as possible. The OR > > transformation code reorders quals even in cases when it doesn't > > eventually apply any optimization. I don't think that's acceptable. > > However, less hackery ways for this is welcome for sure. > Why is it unacceptable? Can the user implement some order-dependent > logic with clauses, and will it be correct? > Otherwise, it is a matter of taste, and generally, this decision is up > to you. I think this is an important property that the user sees the quals in the plan in the same order as they were in the query. And if some transformations are applied, then the order is saved as much as possible. I don't think we should sacrifice this property without strong reasons. A bit of code complexity is definitely not that reason for me. > >>> We don't make array values unique. That might make query execution > >>> performance somewhat worse, and also makes selectivity estimation > >>> worse. I suggest Andrei and/or Alena should implement making array > >>> values unique. > >> The fix Alena has made looks correct. But I urge you to think twice: > >> The optimizer doesn't care about duplicates, so why do we do it? > >> What's more, this optimization is intended to speed up queries with long > >> OR lists. Using the list_append_unique() comparator on such lists could > >> impact performance. I suggest sticking to the common rule and leaving > >> the responsibility on the user's shoulders. > > > > I don't see why the optimizer doesn't care about duplicates for OR > > lists. As I showed before in an example, it successfully removes the > > duplicate. So, currently OR transformation clearly introduces a > > regression in terms of selectivity estimation. I think we should > > evade that. > I think you are right. It is probably a better place than any other to > remove duplicates in an array. I just think we should sort and remove > duplicates from entry->consts in one pass. Thus, this optimisation > should be applied to sortable constants. Ok. > >> At least, we should do this optimization later, in one pass, with > >> sorting elements before building the array. But what if we don't have a > >> sort operator for the type? > > > > It was probably discussed before, but can we do our work later? There > > is a canonicalize_qual() which calls find_duplicate_ors(). This is > > the place where currently duplicate OR clauses are removed. Could our > > OR-to-ANY transformation be just another call from > > canonicalize_qual()? > Hmm, we already tried to do it at that point. I vaguely recall some > issues caused by this approach. Anyway, it should be done as quickly as > possible to increase the effect of the optimization. I think there were provided quite strong reasons why this shouldn't be implemented at the parse analysis stage [1], [2], [3]. The canonicalize_qual() looks quite appropriate place for that since it does similar transformations. Links. 1. https://www.postgresql.org/message-id/CA%2BTgmoZCgP6FrBQEusn4yaWm02XU8OPeoEMk91q7PRBgwaAkFw%40mail.gmail.com 2. https://www.postgresql.org/message-id/CAH2-Wzm2%3Dnf_JhiM3A2yetxRs8Nd2NuN3JqH%3Dfm_YWYd1oYoPg%40mail.gmail.com 3. https://www.postgresql.org/message-id/CA%2BTgmoaOiwMXBBTYknczepoZzKTp-Zgk5ss1%2BCuVQE-eFTqBmA%40mail.gmail.com -- Regards, Alexander Korotkov
Re: A failure in t/038_save_logical_slots_shutdown.pl
On Tue, Mar 12, 2024 at 6:05 PM Amit Kapila wrote: > > The patch looks mostly good to me. I have changed the comments and > commit message in the attached. I am planning to push this tomorrow > unless you or others have any comments on it. LGTM. -- Bharath Rupireddy PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
Re: Disable LLVM bitcode generation with pgxs.mk framework.
> On 12 Mar 2024, at 14:38, Xing Guo wrote: > Would it be possible to add a new switch in the pgxs.mk framework to > allow users to disable this feature? Something like that doesn't seem unreasonable I think. -- Daniel Gustafsson
Re: confusing `case when` column name
> On 12 Mar 2024, at 15:19, Tom Lane wrote: > users are probably going to end up applying an AS clause most of > the time if they care about the column name at all. If anything, we could perhaps add a short note in the CASE documentation about column naming, the way it's phrased now a new user could very easily assume it would be "case". -- Daniel Gustafsson
Re: On disable_cost
On Thu, Aug 3, 2023 at 5:22 AM Jian Guo wrote: > I have write an initial patch to retire the disable_cost GUC, which labeled a > flag on the Path struct instead of adding up a big cost which is hard to > estimate. Though it involved in tons of plan changes in regression tests, I > have tested on some simple test cases such as eagerly generate a two-stage > agg plans and it worked. Could someone help to review? I took a look at this patch today. I believe that overall this may well be an approach worth pursuing. However, more work is going to be needed. Here are some comments: 1. You stated that it changes lots of plans in the regression tests, but you haven't provided any sort of analysis of why those plans changed. I'm kind of surprised that there would be "tons" of plan changes. You (or someone) should look into why that's happening. 2. The change to compare_path_costs_fuzzily() seems incorrect to me. When path1->is_disabled && path2->is_disabled, costs should be compared just as they are when neither path is disabled. Instead, the patch treats any two such paths as having equal cost. That seems catastrophically bad. Maybe it accounts for some of those plan changes, although that would only be true if those plans were created while using some disabling GUC. 3. Instead of adding is_disabled at the end of the Path structure, I suggest adding it between param_info and parallel_aware. I think if you do that, the space used by the new field will use up padding bytes that are currently included in the struct, instead of making it longer. 4. A critical issue for any patch of this type is performance. This concern was raised earlier on this thread, but your path doesn't address it. There's no performance analysis or benchmarking included in your email. One idea that I have is to write the cost-comparison test like this: if (unlikely(path1->is_disabled || path2->is_disabled)) { if (!path1->is_disabled) return COSTS_BETTER1; if (!path2->is_disabled) return COSTS_BETTER2; /* if both disabled, fall through */ } I'm not sure that would be enough to prevent the patch from adding noticeably to the cost of path comparison, but maybe it would help. 5. The patch changes only compare_path_costs_fuzzily() but I wonder whether compare_path_costs() and compare_fractional_path_costs() need similar surgery. Whether they do or don't, there should likely be some comments explaining the situation. 6. In fact, the patch changes no comments at all, anywhere. I'm not sure how many comment changes a patch like this needs to make, but the answer definitely isn't "none". 7. The patch doesn't actually remove disable_cost. I guess it should. 8. When you submit a patch, it's a good idea to also add it on commitfest.postgresql.org. It doesn't look like that was done in this case. -- Robert Haas EDB: http://www.enterprisedb.com
Re: confusing `case when` column name
"David G. Johnston" writes: > On Tuesday, March 12, 2024, adjk...@126.com wrote: >> Nee we change the title of the case-when output column? > Choosing either a or b as the label seems wrong and probably worth changing > to something that has no meaning and encourages the application of a column > alias. Yeah, we won't get any kudos for changing a rule that's stood for 25 years, even if it's not very good. This is one of the places where it's just hard to make a great choice automatically, and users are probably going to end up applying an AS clause most of the time if they care about the column name at all. regards, tom lane
Re: [EXTERNAL] Re: Add non-blocking version of PQcancel
On Tue, 12 Mar 2024 at 10:19, Alvaro Herrera wrote: > Here's a last one for the cfbot. Thanks for committing the first 3 patches btw. Attached a tiny change to 0001, which adds "(backing struct for PGcancelConn)" to the comment on pg_cancel_conn. From d340fde6883a249fd7c1a90033675a3b5edb603e Mon Sep 17 00:00:00 2001 From: Jelte Fennema-Nio Date: Thu, 14 Dec 2023 13:39:09 +0100 Subject: [PATCH v35 2/2] Start using new libpq cancel APIs A previous commit introduced new APIs to libpq for cancelling queries. This replaces the usage of the old APIs in most of the codebase with these newer ones. This specifically leaves out changes to psql and pgbench as those would need a much larger refactor to be able to call them, due to the new functions not being signal-safe. --- contrib/dblink/dblink.c | 30 +++-- contrib/postgres_fdw/connection.c | 105 +++--- .../postgres_fdw/expected/postgres_fdw.out| 15 +++ contrib/postgres_fdw/sql/postgres_fdw.sql | 7 ++ src/fe_utils/connect_utils.c | 11 +- src/test/isolation/isolationtester.c | 29 ++--- 6 files changed, 145 insertions(+), 52 deletions(-) diff --git a/contrib/dblink/dblink.c b/contrib/dblink/dblink.c index 19a362526d2..98dcca3e6fd 100644 --- a/contrib/dblink/dblink.c +++ b/contrib/dblink/dblink.c @@ -1346,22 +1346,32 @@ PG_FUNCTION_INFO_V1(dblink_cancel_query); Datum dblink_cancel_query(PG_FUNCTION_ARGS) { - int res; PGconn *conn; - PGcancel *cancel; - char errbuf[256]; + PGcancelConn *cancelConn; + char *msg; dblink_init(); conn = dblink_get_named_conn(text_to_cstring(PG_GETARG_TEXT_PP(0))); - cancel = PQgetCancel(conn); + cancelConn = PQcancelCreate(conn); - res = PQcancel(cancel, errbuf, 256); - PQfreeCancel(cancel); + PG_TRY(); + { + if (!PQcancelBlocking(cancelConn)) + { + msg = pchomp(PQcancelErrorMessage(cancelConn)); + } + else + { + msg = "OK"; + } + } + PG_FINALLY(); + { + PQcancelFinish(cancelConn); + } + PG_END_TRY(); - if (res == 1) - PG_RETURN_TEXT_P(cstring_to_text("OK")); - else - PG_RETURN_TEXT_P(cstring_to_text(errbuf)); + PG_RETURN_TEXT_P(cstring_to_text(msg)); } diff --git a/contrib/postgres_fdw/connection.c b/contrib/postgres_fdw/connection.c index 4931ebf5915..dcc13dc3b24 100644 --- a/contrib/postgres_fdw/connection.c +++ b/contrib/postgres_fdw/connection.c @@ -133,7 +133,7 @@ static void pgfdw_inval_callback(Datum arg, int cacheid, uint32 hashvalue); static void pgfdw_reject_incomplete_xact_state_change(ConnCacheEntry *entry); static void pgfdw_reset_xact_state(ConnCacheEntry *entry, bool toplevel); static bool pgfdw_cancel_query(PGconn *conn); -static bool pgfdw_cancel_query_begin(PGconn *conn); +static bool pgfdw_cancel_query_begin(PGconn *conn, TimestampTz endtime); static bool pgfdw_cancel_query_end(PGconn *conn, TimestampTz endtime, bool consume_input); static bool pgfdw_exec_cleanup_query(PGconn *conn, const char *query, @@ -1315,36 +1315,104 @@ pgfdw_cancel_query(PGconn *conn) endtime = TimestampTzPlusMilliseconds(GetCurrentTimestamp(), CONNECTION_CLEANUP_TIMEOUT); - if (!pgfdw_cancel_query_begin(conn)) + if (!pgfdw_cancel_query_begin(conn, endtime)) return false; return pgfdw_cancel_query_end(conn, endtime, false); } static bool -pgfdw_cancel_query_begin(PGconn *conn) +pgfdw_cancel_query_begin(PGconn *conn, TimestampTz endtime) { - PGcancel *cancel; - char errbuf[256]; + bool timed_out = false; + bool failed = false; + PGcancelConn *cancel_conn = PQcancelCreate(conn); - /* - * Issue cancel request. Unfortunately, there's no good way to limit the - * amount of time that we might block inside PQgetCancel(). - */ - if ((cancel = PQgetCancel(conn))) + + if (!PQcancelStart(cancel_conn)) { - if (!PQcancel(cancel, errbuf, sizeof(errbuf))) + PG_TRY(); { ereport(WARNING, (errcode(ERRCODE_CONNECTION_FAILURE), errmsg("could not send cancel request: %s", - errbuf))); - PQfreeCancel(cancel); - return false; + pchomp(PQcancelErrorMessage(cancel_conn); } - PQfreeCancel(cancel); + PG_FINALLY(); + { + PQcancelFinish(cancel_conn); + } + PG_END_TRY(); + return false; } - return true; + /* In what follows, do not leak any PGcancelConn on an error. */ + PG_TRY(); + { + while (true) + { + TimestampTz now = GetCurrentTimestamp(); + long cur_timeout; + PostgresPollingStatusType pollres = PQcancelPoll(cancel_conn); + int waitEvents = WL_LATCH_SET | WL_TIMEOUT | WL_EXIT_ON_PM_DEATH; + + if (pollres == PGRES_POLLING_OK) + { +break; + } + + /* If timeout has expired, give up, else get sleep time. */ + cur_timeout = TimestampDifferenceMilliseconds(now, endtime); + if (cur_timeout <= 0) + { +timed_out = true; +failed = true; +goto exit; + } + + switch (pollres) + { +case PGRES_POLLING_READING: + waitEvents |= WL_SOCKET_READABLE; + break; +case PGRES_POL
Re: CF entries for 17 to be reviewed
Hi, > > Aleksander, I would greatly appreciate if you join me in managing CF. > > Together we can move more stuff :) > > Currently, I'm going through "SQL Commands". And so far I had not come to > > "Performance" and "Server Features" at all... So if you can handle updating > > statuses of that sections - that would be great. > > Server Features: > > [...] I noticed that "Avoid mixing custom and OpenSSL BIO functions" had two entries [1][2]. I closed [1] and marked it as "Withdrawn" due to lack of a better status. Maybe we need an additional "Duplicate" status. [1]: https://commitfest.postgresql.org/47/4834/ [2]: https://commitfest.postgresql.org/47/4835/ -- Best regards, Aleksander Alekseev
Re: Commitfest Manager for March
Hi Andrey, > > If you need any help please let me know. > > Aleksander, I would greatly appreciate if you join me in managing CF. > Together we can move more stuff :) > Currently, I'm going through "SQL Commands". And so far I had not come to > "Performance" and "Server Features" at all... So if you can handle updating > statuses of that sections - that would be great. OK, I'll take care of the "Performance" and "Server Features" sections. I submitted my summaries of the entries triaged so far to the corresponding thread [1]. [1]: https://www.postgresql.org/message-id/CAJ7c6TN9SnYdq%3DkfP-txgo5AaT%2Bt9YU%2BvQHfLBZqOBiHwoipAg%40mail.gmail.com -- Best regards, Aleksander Alekseev
Disable LLVM bitcode generation with pgxs.mk framework.
Hi hackers, When the PostgreSQL server is configured with --with-llvm, the pgxs.mk framework will generate LLVM bitcode for extensions automatically. Sometimes, I don't want to generate bitcode for some extensions. I can turn off this feature by specifying with_llvm=0 in the make command. ``` make with_llvm=0 ``` Would it be possible to add a new switch in the pgxs.mk framework to allow users to disable this feature? E.g., the Makefile looks like: ``` WITH_LLVM=no PG_CONFIG = pg_config PGXS := $(shell $(PG_CONFIG) --pgxs) ``` Best Regards, Xing
Re: [PATCH] LockAcquireExtended improvement
On Mon, Mar 11, 2024 at 11:11 PM Jingxian Li wrote: > Your version changes less code than mine by pushing the nowait flag down > into ProcSleep(). This looks fine in general, except for a little advice, > which I don't think there is necessary to add 'waiting' suffix to the > process name in function WaitOnLock with dontwait being true, as follows: That could be done, but in my opinion it's not necessary. The waiting suffix will appear only very briefly, and probably only in relatively rare cases. It doesn't seem worth adding code to avoid it. -- Robert Haas EDB: http://www.enterprisedb.com
Re: CF entries for 17 to be reviewed
Hi, > > On 6 Mar 2024, at 18:49, Andrey M. Borodin wrote: > > > > Here are statuses for "Refactoring" section: > > [...] > Aleksander, I would greatly appreciate if you join me in managing CF. > Together we can move more stuff :) > Currently, I'm going through "SQL Commands". And so far I had not come to > "Performance" and "Server Features" at all... So if you can handle updating > statuses of that sections - that would be great. Server Features: * Fix partitionwise join with partially-redundant join clauses I see there is a good discussion in the progress here. Doesn't seem to be needing more reviewers at the moment. * ALTER TABLE SET ACCESS METHOD on partitioned tables Ditto. * Patch to implement missing join selectivity estimation for range types The patch doesn't apply and has been "Waiting on Author" for a few months now. Could be a candidate for closing with RwF. * logical decoding and replication of sequences, take 2 According to Tomas Vondra the patch is not ready for PG17. The patch is marked as "Waiting on Author". Although it could be withrowed for now, personally I see no problem with keeping it WoA until the PG18 cycle begins. * Add the ability to limit the amount of memory that can be allocated to backends v20231226 doesn't apply. The patch needs love from somebody interested in it. * Multi-version ICU By a quick look the patch doesn't apply and was moved between several commitfests, last time in "Waiting on Author" state. Could be a candidate for closing with RwF. * Post-special Page Storage TDE support A large patch divided into 28 (!) parts. Currently needs a rebase. Which shouldn't necessarily stop a reviewer looking for a challenging task. * Built-in collation provider for "C" and "C.UTF-8" Peter E left some feedback today, so I changed the status to "Waiting on Author" * ltree hash functions Marked as RfC and cfbot seems to be happy with the patch. Could use some attention from a committer? * UUID v7 The patch is in good shape. Michael argued that the patch should be merged when RFC is approved. No action seems to be needed until then. * (to be continued) -- Best regards, Aleksander Alekseev
Re: Make attstattarget nullable
On 3/12/24 13:47, Peter Eisentraut wrote: > On 06.03.24 22:34, Tomas Vondra wrote: >> 0001 >> >> >> 1) I think this bit in ALTER STATISTICS docs is wrong: >> >> - > class="parameter">new_target >> + SET STATISTICS { > class="parameter">integer | DEFAULT } >> >> because it means we now have list entries for name, ..., new_name, >> new_schema, and then suddenly "SET STATISTICS { integer | DEFAULT }". >> That's a bit weird. > > Ok, how would you change it? List out the full clauses of the other > variants under Parameters as well? I'd go with a parameter, essentially exactly as it used to be, except for adding the DEFAULT option. So the list would define new_target, and mention DEFAULT as a special value. > We have similar inconsistencies on other ALTER reference pages, so I'm > not sure what the preferred approach is. > Yeah, the other reference pages may have some inconsistencies too, but let's not add more. >> 2) The newtarget handling in AlterStatistics seems rather confusing. Why >> does it get set to -1 just to ignore the value later? For a while I was >> 99% sure ALTER STATISTICS ... SET STATISTICS DEFAULT will set the field >> to -1. Maybe ditching the first if block and directly checking >> stmt->stxstattarget before setting repl_val/repl_null would be better? > > But we also need to continue accepting -1 for default on input. The > current code achieves that, the proposed variant would not. > OK, I did not realize that. But then maybe this should be explained in a comment before the new "if" block, because people won't realize why it needs to be this way. > Note that this patch matches the equivalent patch for attstattarget > (4f622503d6d), which uses the same logic. We could change it if we have > a better idea, but then we should change both. > >> 0002 >> >> >> 1) I think InsertPgAttributeTuples comment probably needs to document >> what the new tupdesc_extra parameter does. > > Yes, I'll update that comment. > OK. regards -- Tomas Vondra EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: Add the ability to limit the amount of memory that can be allocated to backends.
Hi, > I took a closer look at this patch over the last couple days, and I did > a bunch of benchmarks to see how expensive the accounting is. The good > news is that I haven't observed any overhead - I did two simple tests, > that I think would be particularly sensitive to this: > > [...] Just wanted to let you know that v20231226 doesn't apply. The patch needs love from somebody interested in it. Best regards, Aleksander Alekseev (wearing a co-CFM hat)
Re: confusing `case when` column name
On Tuesday, March 12, 2024, adjk...@126.com wrote: > > Nee we change the title of the case-when output column? > > Choosing either a or b as the label seems wrong and probably worth changing to something that has no meaning and encourages the application of a column alias. David J.
Re: [PATCHES] Post-special page storage TDE support
Hi David, > I have finished the reworking of this particular patch series, and have tried > to > organize this in such a way that it will be easily reviewable. It is > constructed progressively to be able to follow what is happening here. As > such, > each individual commit is not guaranteed to compile on its own, so the whole > series would need to be applied before it works. (It does pass CI tests.) > > Here is a brief roadmap of the patches; some of them have additional details > in > the commit message describing a little more about them. > > These two patches do some refactoring of existing code to make a common place > to > modify the definitions: > > v3-0001-refactor-Create-PageUsableSpace-to-represent-spac.patch > v3-0002-refactor-Make-PageGetUsablePageSize-routine.patch > > These two patches add the ReservedPageSize variable and teach PageInit to use > to > adjust sizing accordingly: > > v3-0003-feature-Add-ReservedPageSize-variable.patch > v3-0004-feature-Adjust-page-sizes-at-PageInit.patch > > This patch modifies the definitions of 4 symbols to be computed based on > PageUsableSpace: > > v3-0005-feature-Create-Calc-Limit-and-Dynamic-forms-for-f.patch > > These following 4 patches are mechanical replacements of all existing uses of > these symbols; this provides both visibility into where the existing symbol is > used as well as distinguishing between parts that care about static allocation > vs dynamic usage. The only non-mechanical change is to remove the definition > of > the old symbol so we can be guaranteed that all uses have been considered: > > v3-0006-chore-Split-MaxHeapTuplesPerPage-into-Limit-and-D.patch > v3-0007-chore-Split-MaxIndexTuplesPerPage-into-Limit-and-.patch > v3-0008-chore-Split-MaxHeapTupleSize-into-Limit-and-Dynam.patch > v3-0009-chore-Split-MaxTIDsPerBTreePage-into-Limit-and-Dy.patch > > The following patches are related to required changes to support dynamic toast > limits: > > v3-0010-feature-Add-hook-for-setting-reloptions-defaults-.patch > v3-0011-feature-Dynamically-calculate-toast_tuple_target.patch > v3-0012-feature-Add-Calc-options-for-toast-related-pieces.patch > v3-0013-chore-Replace-TOAST_MAX_CHUNK_SIZE-with-ClusterTo.patch > v3-0014-chore-Translation-updates-for-TOAST_MAX_CHUNK_SIZ.patch > > In order to calculate some of the sizes, we need to include nbtree.h > internals, > but we can't use in front-end apps, so we separate out the pieces we care > about > into a separate include and use that: > > v3-0015-chore-Split-nbtree.h-structure-defs-into-an-inter.patch > > This is the meat of the patch; provide a common location for these > block-size-related constants to be computed using the infra that has been set > up > so far. Also ensure that we are properly initializing this in front end and > back end code. A tricky piece here is we have two separate include files for > blocksize.h; one which exposes externs as consts for optimizations, and one > that > blocksize.c itself uses without consts, which it uses to create/initialized > the > vars: > > v3-0016-feature-Calculate-all-blocksize-constants-in-a-co.patch > > Add ControlFile and GUC support for reserved_page_size: > > v3-0017-feature-ControlFile-GUC-support-for-reserved_page.patch > > Add initdb support for reserving page space: > > v3-0018-feature-Add-reserved_page_size-to-initdb-bootstra.patch > > Fixes for pg_resetwal: > > v3-0019-feature-Updates-for-pg_resetwal.patch > > The following 4 patches mechanically replace the Dynamic form to use the new > Cluster variables: > > v3-0020-chore-Rename-MaxHeapTupleSizeDynamic-to-ClusterMa.patch > v3-0021-chore-Rename-MaxHeapTuplesPerPageDynamic-to-Clust.patch > v3-0022-chore-Rename-MaxIndexTuplesPerPageDynamic-to-Clus.patch > v3-0023-chore-Rename-MaxTIDsPerBTreePageDynamic-to-Cluste.patch > > Two pieces of optimization required for visibility map: > > v3-0024-optimization-Add-support-for-fast-non-division-ba.patch > v3-0025-optimization-Use-fastdiv-code-in-visibility-map.patch > > Update bufpage.h comments: > > v3-0026-doc-update-bufpage-docs-w-reserved-space-data.patch > > Fixes for bloom to use runtime size: > > v3-0027-feature-Teach-bloom-about-PageUsableSpace.patch > > Fixes for FSM to use runtime size: > > v3-0028-feature-teach-FSM-about-reserved-page-space.patch > > I hope this makes sense for reviewing, I know it's a big job, so breaking > things up a little more and organizing will hopefully help. Just wanted to let you know that the patchset seems to need a rebase, according to cfbot. Best regards, Aleksander Alekseev (wearing a co-CFM hat)
Re: Make attstattarget nullable
On 06.03.24 22:34, Tomas Vondra wrote: 0001 1) I think this bit in ALTER STATISTICS docs is wrong: - new_target + SET STATISTICS { integer | DEFAULT } because it means we now have list entries for name, ..., new_name, new_schema, and then suddenly "SET STATISTICS { integer | DEFAULT }". That's a bit weird. Ok, how would you change it? List out the full clauses of the other variants under Parameters as well? We have similar inconsistencies on other ALTER reference pages, so I'm not sure what the preferred approach is. 2) The newtarget handling in AlterStatistics seems rather confusing. Why does it get set to -1 just to ignore the value later? For a while I was 99% sure ALTER STATISTICS ... SET STATISTICS DEFAULT will set the field to -1. Maybe ditching the first if block and directly checking stmt->stxstattarget before setting repl_val/repl_null would be better? But we also need to continue accepting -1 for default on input. The current code achieves that, the proposed variant would not. Note that this patch matches the equivalent patch for attstattarget (4f622503d6d), which uses the same logic. We could change it if we have a better idea, but then we should change both. 0002 1) I think InsertPgAttributeTuples comment probably needs to document what the new tupdesc_extra parameter does. Yes, I'll update that comment.